This document provides an introduction to the concepts behind Quality of Service (QoS). For a description of how to use Symbian's QoS API, see About the QoS API.
As the demand for network services increases in scope and variety, network carriers need a means of sharing network resources among subscribers. Some subscribers demand high bandwidth for services such as streaming video; others are satisfied with voice telephony and messaging. The service provider needs a means of dealing effectively with these competing demands and charging for them appropriately, and needs tools to limit the rate and volume of traffic entering and passing through the network. These tools need to be sophisticated enough to adapt to changes in the environment so that the resources available to the network at any time can be allocated effectively.
The allocation of shared network resources, together with the tools to manage that allocation, is known as Quality of Service (QoS). QoS has to be implemented at every stage of the network for the overall QoS measurement and control to be effective. The aim of the QoS determination is threefold:
The measurement must reflect the performance of the network as the end user perceives it.
The measurement must be useful for network management and the allocation of shared resources.
The measurement must accurately reflect the cost of carrying the traffic so that any level of QoS can be charged for at a reasonable rate.
The Quality of Service (QoS) of any communications network is defined, in the most general sense, by three things: bandwidth, latency and reliability (other defining factors such as coverage, security and interoperability are taken as a given).
To put a human perspective on these three factors, consider the following situations.
As the Spanish ambassador to the court of the English king, Henry VII, in 1495, you are expected to be in regular communication with your superiors in Madrid. Bandwidth is not a problem as long as your letters are light enough to be carried by a horse. Reliability is not too poor, despite the occasional messenger falling ill on the road or ship foundering. Latency is by far the biggest problem: the average letter takes three weeks or more, so it can easily be two months before you get a reply. This makes effective diplomacy a tortuous process at best.
The telegraph and Morse code provided a dramatic reduction in latency, but at the cost of reduced bandwidth. To overcome this, the language of 'telegraphese' was invented as an early form of data compression. A journalist receiving the telegram "REPORT ADEN WARWISE" would know that it meant "File a report about the military and civilian preparedness in Aden for the imminent war". A journalist replying "ADEN UNWARWISE" might trigger a 1000-word editorial attacking the government for its lack of foresight. This ad hoc data compression (without error correction) was forced on users by the low bandwidth and occasionally led to misunderstandings.
Even if there is sufficient bandwidth and the network latency is satisfactory, poor reliability can make the service unacceptable. Chronically unreliable telephone systems in remote locations exhibit this problem: the bandwidth and latency are sufficient for the purpose, but getting a working line can mean waiting hours or even days. Even sophisticated telephone systems can suffer from this problem. In the immediate aftermath of the 9/11 attacks on the World Trade Center buildings, the telephone network was so swamped with call attempts that the network controllers had to deny dial tones (in random rotation) to up to a third of the lines.
Any QoS measurement must take a balanced account of all three factors.
Even though QoS is a technical solution, the emphasis on the human factor is important, particularly in the case of a Universal Mobile Telecommunications System (UMTS), because it is the end user who decides what QoS they want and whether they are getting what they asked for.
When implementing QoS within a UMTS, it is useful to divide the traffic that can be carried into different classes, each with its own characteristics. As well as bandwidth, latency and reliability, there are other practical characteristics that the network needs to adapt to if it is to carry the traffic effectively. Two of these are whether the traffic comes in bursts and whether the upstream and downstream traffic characteristics are symmetric or asymmetric.
For QoS purposes, the traffic carried by a UMTS is divided into four classes: Conversational, Streaming, Interactive and Background.
Conversational: This class covers voice telephony. It is characterised by a fixed and relatively small bandwidth and a small latency, and it is comparatively tolerant of transmission errors. The upstream and downstream data rates are symmetrical.
Streaming: This class of traffic is continuous data such as streaming video. Some errors can be tolerated (say, dropped lines in a single video frame). The data rates are high and generally asymmetrical.
Interactive: This class of traffic handles user request/server response traffic such as web browsing. It is medium bandwidth, must be reliable and is bursty, asymmetric data. Unlike the other classes, different priorities can be attached to interactive traffic.
Background: This class of traffic is low-volume, occasional traffic such as e-mail. A reasonable delay is acceptable to the user but reliability must be high.
The following table illustrates, roughly, the characteristics of these classes:
|
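The qualitative descriptions of the four classes can also be captured as a small data structure. The following is a minimal Python sketch for illustration only; it is not part of the Symbian QoS API. The names TrafficClass and CLASSES and the coarse bandwidth labels are assumptions, and any characteristic that the descriptions above do not state is left as None.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TrafficClass:
    """Hypothetical record of the characteristics described in the text."""
    name: str
    example: str
    bandwidth: str                    # coarse label taken from the prose
    delay_sensitive: Optional[bool]   # None where the text does not say
    error_tolerant: bool
    symmetric: Optional[bool]
    bursty: Optional[bool]


# Values are read off the class descriptions; None means "not stated".
CLASSES = {
    "conversational": TrafficClass(
        "Conversational", "voice telephony", "low, fixed",
        delay_sensitive=True, error_tolerant=True,
        symmetric=True, bursty=None),
    "streaming": TrafficClass(
        "Streaming", "streaming video", "high",
        delay_sensitive=None, error_tolerant=True,
        symmetric=False, bursty=None),
    "interactive": TrafficClass(
        "Interactive", "web browsing", "medium",
        delay_sensitive=None, error_tolerant=False,
        symmetric=False, bursty=True),
    "background": TrafficClass(
        "Background", "e-mail", "low",
        delay_sensitive=False, error_tolerant=False,
        symmetric=None, bursty=None),
}

# Classes for which reliability matters more than tolerance of errors.
must_be_reliable = [c.name for c in CLASSES.values() if not c.error_tolerant]
```

Here must_be_reliable contains the Interactive and Background classes, which matches the descriptions above.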
This section describes the attributes that determine QoS. The example used is UMTS QoS, because it can provide a guaranteed level of service as well as a maximum level of service. The attributes for the upstream and the downstream exist independently.
|
Suppose that a user is using a client application to connect to a network, and the user has asked the client to deliver more data than the data rate agreed with the network. In this case the client can do several things with the excess data:
Discard the excess data. In this case the network is not involved and it is up to the client to decide whether the user is informed of this decision.
Slow down the data rate to the rate that has been agreed with the network. This is called 'traffic smoothing'. When the average data rate is within limits but occasional bursts of data are outside the agreed limits, this is called 'burst smoothing'. The client then has an opportunity to inform the user that this is taking place. The client-network relationship is maintained at the agreed level, so there is no requirement for the network to tell the client of any degradation of the service.
Offer the data to the network but mark some of it as excess so that the network server can choose whether to discard it. If the server is not under heavy load it may choose to handle this data anyway. This technique is called 'traffic policing'. In this case the client-network agreement has been broken and, depending on the configuration, the network server may or may not tell the client of the consequences of this.
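The difference between the three options can be seen in a small simulation. This is a minimal Python sketch under simplifying assumptions: time is divided into equal intervals, data is counted in abstract units, the agreed rate is a fixed number of units per interval, and the smoothing backlog is unbounded. The names Strategy and handle_offered_load are hypothetical and are not part of any Symbian API.

```python
from enum import Enum


class Strategy(Enum):
    DISCARD = "discard"   # drop the excess at the client
    SMOOTH = "smooth"     # delay the excess (traffic smoothing)
    POLICE = "police"     # send the excess, marked (traffic policing)


def handle_offered_load(offered, limit, strategy):
    """Return (per_interval, backlog).

    offered  -- units the user wants to send in each interval
    limit    -- agreed units per interval
    Each per_interval entry is (conformant_units, marked_excess_units).
    backlog is the data still waiting at the client at the end.
    """
    per_interval = []
    backlog = 0
    for amount in offered:
        if strategy is Strategy.DISCARD:
            # Anything above the limit is silently dropped by the client.
            per_interval.append((min(amount, limit), 0))
        elif strategy is Strategy.SMOOTH:
            # Excess is held back and sent in later intervals.
            backlog += amount
            out = min(backlog, limit)
            backlog -= out
            per_interval.append((out, 0))
        else:
            # Excess is offered to the network but marked as excess,
            # so the server may discard it if it is under load.
            conformant = min(amount, limit)
            per_interval.append((conformant, amount - conformant))
    return per_interval, backlog


offered = [5, 15, 5]
discarded = handle_offered_load(offered, 10, Strategy.DISCARD)
smoothed = handle_offered_load(offered, 10, Strategy.SMOOTH)
policed = handle_offered_load(offered, 10, Strategy.POLICE)
```

With a limit of 10 units per interval and a burst of 15 in the second interval, discarding loses 5 units, smoothing carries those 5 units over into the third interval, and policing sends them immediately but marked as excess.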
The main technique for traffic smoothing is the 'token bucket algorithm'.
It is necessary to regulate the data rate within a network to avoid bursts of data overloading the system. In QoS terms, a client might have committed to (i.e. paid for) a particular guaranteed and maximum bit rate. The token bucket algorithm is used to ensure that the client keeps within these limits.
An analogy is a parcel conveyor. The parcels have irregular sizes and arrive at random. To get on the conveyor they have to fit through an aperture and they must not touch another parcel, as illustrated below:
Parcel A is too big to get through the aperture and is discarded. Parcel B, arriving shortly after A, will fit through the aperture, but as there is another parcel already on the conveyor, parcel B has to wait. As the conveyor belt slowly moves to the right, eventually enough space opens up to fit parcel B on the conveyor. Parcel C, however, has to wait behind parcel B, and it has to wait longer than B did for enough space to open up to allow C to fit.
The system provides a way of taking an unthrottled stream and converting it into a stream with a maximum throughput limit. Substitute packet for parcel, packet size for parcel size, token bucket size for aperture width and bit rate for conveyor speed and you have the picture.
The token bucket algorithm uses the following definitions:
|
Conformance with the agreed level of service can be defined as: "Data is conformant if the amount of data submitted during any arbitrarily chosen time period T does not exceed (b+rT)".
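The conformance rule can be checked directly for a recorded sequence of packet arrivals. The following Python sketch assumes that r is the token rate and b is the token bucket size, and that arrivals are given as (time, size) pairs in time order; the function name is_conformant is hypothetical. It tests every run of consecutive packets, because the tightest window containing packets i to j has length equal to the time between them.

```python
def is_conformant(arrivals, r, b):
    """Check that no time window carries more than b + r*T units.

    arrivals -- list of (time, size) pairs, sorted by time
    r        -- token rate (units of data per unit of time)
    b        -- token bucket size (units of data)
    """
    for i in range(len(arrivals)):
        total = 0
        for j in range(i, len(arrivals)):
            total += arrivals[j][1]
            # Shortest period T that contains packets i..j.
            window = arrivals[j][0] - arrivals[i][0]
            if total > b + r * window:
                return False
    return True


steady = [(0, 300), (1, 300), (2, 100)]
burst = [(0, 400), (0, 200)]
steady_ok = is_conformant(steady, r=100, b=500)
burst_ok = is_conformant(burst, r=100, b=500)
```

With r = 100 and b = 500, the steady sequence is conformant, while the burst of 600 units at a single instant exceeds b and is not.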
The token bucket counter is computed as follows:
As each packet arrives, the length of the packet is noted and t is set to rT, where T is the time since the last packet was processed. The decision as to whether to discard, queue or pass the packet on to the network is then made as follows:
|
If there are packets in the queue, then at regular intervals of time T the value of t is increased by rT, up to a maximum of b, and the test described in the table above is repeated.
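The behaviour described in this section can be sketched in a few lines of Python. This is an illustrative sketch, not the Symbian implementation. It makes several assumptions: r is the token rate, b is the bucket size and t is the token bucket counter; the bucket starts full; t grows by rT up to b both when a packet arrives and on each timer tick; a packet longer than b is discarded (parcel A in the analogy); a packet no longer than t is passed and t is reduced by its length; any other packet is queued in arrival order. The class and method names are hypothetical.

```python
from collections import deque


class TokenBucket:
    """Minimal token bucket shaper (illustrative sketch)."""

    def __init__(self, rate, bucket_size):
        self.r = rate            # token rate
        self.b = bucket_size     # maximum value of the counter
        self.t = bucket_size     # assumption: the bucket starts full
        self.last = 0            # time of the last update
        self.queue = deque()     # packets waiting for tokens

    def _refill(self, now):
        # Increase t by r*T, where T is the time since the last update,
        # but never beyond the bucket size b.
        self.t = min(self.b, self.t + self.r * (now - self.last))
        self.last = now

    def submit(self, length, now):
        """Decide what happens to a newly arrived packet."""
        self._refill(now)
        if length > self.b:
            return "discard"     # can never fit, like parcel A
        if not self.queue and length <= self.t:
            self.t -= length
            return "pass"
        self.queue.append(length)  # waits behind earlier packets
        return "queue"

    def tick(self, now):
        """Timer callback: release queued packets that now fit."""
        self._refill(now)
        released = []
        while self.queue and self.queue[0] <= self.t:
            length = self.queue.popleft()
            self.t -= length
            released.append(length)
        return released


tb = TokenBucket(rate=100, bucket_size=500)
results = [tb.submit(600, 0), tb.submit(400, 0), tb.submit(300, 0)]
after_one = tb.tick(1)
after_two = tb.tick(2)
```

With a rate of 100 and a bucket size of 500, the 600-unit packet is discarded, the 400-unit packet passes at once, and the 300-unit packet waits in the queue until two time units of tokens have accumulated.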
The architecture of the Symbian QoS support allows QoS functionality to be provided to network flows based on policies. There are three kinds of policy: Flowspec policies, Modulespec policies and Extension policies.
|