To understand bandwidth adaptation mechanisms for multimedia applications, it is necessary first to understand the impact of bandwidth variations on received media quality.
4.2.1 Downloading and Streaming
In contrast to data communication or to simple media downloading, real-time media streaming is often subject to strict delay constraints. The main difference with respect to a download application is that media playback starts while data is still being received, so playback can be interrupted if the decoder runs out of data to decode.
Typical streaming applications operate as follows:
1. Data request. A request is sent to the media sender so that data streaming to the receiver starts.
2. Client buffer loading. As data starts to reach the client, decoding does not start immediately. Instead the client waits to have “enough” data to start decoding.
3. Playback. Once there is sufficient data available at the client, playback starts. From that point on, only relatively minor adjustments in the playback timing are possible, so the rate at which media is played back (e.g., the number of video frames per second) needs to remain nearly constant.¹ An example of playback adaptation can be found in [39] and in Chapter 16.
There are multiple strategies for clients to determine that enough data is available to begin decoding. For example, a target total number of bits may have to be buffered before playback starts. Alternatively, a predetermined buffering time (e.g., a few seconds) could be chosen so that users always experience the same latency before playback starts. Finally, a more practical approach may be to wait until the number of bits that represents a selected playback time has been received (e.g., the number of bits needed to encode a video segment with a predetermined duration). Details can be found in Chapter 14.
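As a concrete illustration, the three start-up criteria above can be sketched as follows. This is a minimal sketch; the `Chunk` type, function name, and threshold fields are illustrative assumptions, not part of the text.

```python
from dataclasses import dataclass

@dataclass
class Chunk:
    bits: int        # size of the received media chunk, in bits
    duration: float  # playback time the chunk represents, in seconds

def can_start_playback(buffered, strategy, threshold):
    """Decide whether playback may begin for a list of buffered chunks.

    strategy -- "bits":       total buffered bits reach a target
                "wall_time":  buffering has lasted a fixed wall-clock time
                "media_time": buffered chunks cover a target playback duration
    (All names and the threshold dictionary layout are hypothetical.)
    """
    if strategy == "bits":
        return sum(c.bits for c in buffered) >= threshold["target_bits"]
    if strategy == "wall_time":
        return threshold["elapsed"] >= threshold["target_seconds"]
    if strategy == "media_time":
        return sum(c.duration for c in buffered) >= threshold["target_seconds"]
    raise ValueError(f"unknown strategy: {strategy}")
```

Note how the "media_time" criterion matches the text's "more practical approach": it targets a fixed playback duration rather than a fixed byte count, so start-up behavior is independent of the media bit rate.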
Note that the primary concern in many applications is playback latency, rather than storage at the receiver. Thus, in applications such as streaming to a computer, where there is plenty of memory available, the amount of data loaded before playback may still be kept small to limit the initial latency in the system. Note also that latency is particularly important when the user is expected to switch media sources frequently, as the latency penalty will be incurred every time the user switches.
Regardless of which technique is chosen to preload the buffer, at the time when playback starts the decoder will have a certain number of bits available for decoding. This available data translates into a playback duration (e.g., if there are N compressed frames in the decoder buffer and K frames/second are decoded, then the decoder will be able to play from the buffer for N/K seconds). Thus the amount of data available in the decoder buffer at a given time tells us for how long playback could proceed, even if no data were received from the network.

¹More precisely, the number of frames per second received needs to be consistent with what the receiver expects to play back. Thus, adaptation mechanisms that involve both transmitter and receiver are possible (e.g., so that fewer frames per second are transmitted and played back when bandwidth availability is reduced).
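The buffer-duration relation just described (N buffered frames at K frames/second sustain N/K seconds of playback) is trivial to state in code; the function name here is ours:

```python
def buffer_playback_seconds(n_frames: int, fps: float) -> float:
    """Playback time covered by the decoder buffer if no further data
    arrives: N compressed frames decoded at K frames/second last N/K seconds."""
    return n_frames / fps

# e.g., 75 buffered frames at 25 frames/second sustain 3.0 seconds of playback
```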
4.2.2 Available Bandwidth and Media Quality
To understand the need for bandwidth adaptation, consider what would happen if reductions in channel bandwidth were not matched by reductions in the source coding rate. Assume a constant media playback rate, for example, a fixed number of frames per second in a video application. When network bandwidth drops, if the number of bits per frame does not change, then the number of frames per second received is bound to decrease. Since the receiver continues playing frames at the same rate, eventually there will be no frames left for playback in the receiver buffer and playback will be interrupted. This is a decoder buffer underflow.
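The underflow scenario can be made concrete with a small simulation. This is a sketch under simplifying assumptions (one-second time steps, frame-granularity accounting); the function name is ours:

```python
def first_underflow(arrival_fps, playback_fps, preroll_frames):
    """Simulate decoder buffer occupancy (in frames) in one-second steps.

    arrival_fps    -- frames delivered by the network in each second
    playback_fps   -- constant consumption rate at the decoder
    preroll_frames -- frames buffered before playback starts
    Returns the first second at which the buffer is starved, or None.
    """
    buffered = preroll_frames
    for t, arrived in enumerate(arrival_fps):
        if buffered < playback_fps:   # not enough frames to play this second
            return t                  # underflow: playback stalls here
        buffered += arrived - playback_fps
    return None
```

With a 1-second preroll at 30 frames/second, a drop in delivery from 30 to 15 frames/second drains the buffer within two seconds, exactly the starvation described above.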
Thus, as network bandwidth fluctuates, bandwidth adaptation is needed to ensure that playback is not interrupted. Roughly speaking, this requires that the number of frames/second provided by the network matches (on average) the number of frames/second consumed for playback at the receiver. The general goal of bandwidth adaptation mechanisms will be to manage the quality of the frames transmitted so that when the available bandwidth is reduced, the rate (and hence the quality) of transmitted frames is also reduced. In essence, the goal is to avoid service interruptions by lowering the media quality in a “graceful” manner.
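A minimal form of such graceful adaptation is to choose, from a set of available encoding rates, the highest one the current bandwidth can sustain, falling back to the lowest tier rather than stalling. The rate ladder and function name below are illustrative assumptions:

```python
def select_encoding_rate(available_bps, rate_ladder):
    """Pick the highest encoded bit rate the channel can sustain.

    available_bps -- current estimate of channel bandwidth, bits/second
    rate_ladder   -- bit rates at which the content can be served
    Falls back to the lowest tier when bandwidth is below every rate,
    trading quality for continuity ("graceful" degradation).
    """
    feasible = [r for r in sorted(rate_ladder) if r <= available_bps]
    return feasible[-1] if feasible else min(rate_ladder)
```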
We next provide a more detailed discussion of the delay constraints that are present in a real-time media communications system.
4.2.3 Delay-Constrained Transmission
Consider, as an example of delay-constrained transmission, a real-time video transmission system where all operations have to be completed within a predetermined time interval. The end-to-end delay from a video source to a destination contains the following five components, as illustrated in Figure 4.1:
• Encoding delay Te: the delay required to capture and encode a video frame.
• Encoder buffer delay Teb: the time the encoded media data corresponding to a given frame spends in the transmission buffer. Note that this delay could be zero if the channel bandwidth is higher than the bit rate produced by the encoder; that is, data transmission would start as soon as video data is placed in the buffer.
FIGURE 4.1: Delay components of a communication system.
• Transmission channel delay Tc: the delay for encoded data being transmitted through the network, caused by transmission, congestion, and possible retransmission over a lossy channel.
• Decoder buffer delay Tdb: the time encoded data waits in the decoder buffer before decoding. This delay allows smoothing out the variations in transmission delay and in rates across frames.
• Decoding delay Td: the delay of the decoding process and final display.
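The sum of these five components is implicit in the description above; in the text's notation, the end-to-end delay is simply additive:

```latex
T_{\text{end-to-end}} = T_e + T_{eb} + T_c + T_{db} + T_d
```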
Both encoding and decoding delays are usually fixed, so we focus primarily on the remaining delay terms. Note that for pre-encoded media the encoding delay does not apply, as the video has already been encoded off-line and is ready for transmission. In this case we can consider the encoder buffer to be of infinite size, containing the complete encoded sequence.
In summary, the main constraint in the system is the status of the decoder buffer: as long as the decoder buffer contains data, decoding can proceed. Thus bandwidth adaptation mechanisms should be designed with the objective of ensuring that the decoder buffer is not “starved” of data. In some cases, accurate and timely information about the decoder buffer state is available, which can then be used to make bandwidth adjustments (this would be the case, for example, if the client makes bandwidth adaptation decisions). In other cases, exact information may not be available, but the state of the buffer can be estimated using, for example, estimates of available bandwidth.
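When no direct receiver feedback is available, a sender can maintain such an estimate from its own bandwidth estimate. The sketch below assumes a simple fluid model (media drains at one second of playback per wall-clock second and fills at the ratio of estimated bandwidth to media bit rate); the model and names are our assumptions:

```python
def estimated_buffer_seconds(preroll_s, elapsed_s, est_bandwidth_bps, media_rate_bps):
    """Sender-side estimate of the decoder buffer level, in seconds of media.

    preroll_s         -- media time buffered before playback started
    elapsed_s         -- wall-clock seconds since playback started
    est_bandwidth_bps -- estimated channel bandwidth, bits/second
    media_rate_bps    -- bit rate of the encoded media, bits/second
    """
    fill_rate = est_bandwidth_bps / media_rate_bps  # media seconds gained per second
    return max(0.0, preroll_s + elapsed_s * (fill_rate - 1.0))
```

An estimate hitting zero signals impending starvation, which is precisely when a bandwidth adaptation decision (such as lowering the encoding rate) should be triggered.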
Different applications may have different delay requirements. For example, for live interactive video, a round-trip delay between 150 and 400 ms is usually required, while an initial play-out delay of up to a few seconds is acceptable for noninteractive video streaming. Once selected, end-to-end delay requirements impose a constraint on the encoding rate for each frame.
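To illustrate this last point: on a constant-rate channel, once the delay budget is fixed, each frame's encoded size is bounded by the bits the channel can move in the time remaining after the fixed delay terms. This is an illustrative calculation under that simple channel model, not the book's formula; names are ours:

```python
def max_bits_per_frame(channel_bps, delay_budget_s, fixed_delays_s):
    """Upper bound on a frame's encoded size so its transmission fits
    within the end-to-end delay budget, on a constant-rate channel.

    channel_bps    -- channel bandwidth, bits/second
    delay_budget_s -- total allowed end-to-end delay, seconds
    fixed_delays_s -- encoding/decoding and other fixed delay terms, seconds
    """
    transmit_time = delay_budget_s - fixed_delays_s
    if transmit_time <= 0:
        raise ValueError("delay budget leaves no time for transmission")
    return channel_bps * transmit_time
```

For instance, a 1 Mbit/s channel with a 400 ms budget and 100 ms of fixed delays leaves roughly 300,000 bits per frame, which is why tight interactive budgets force lower encoding rates than relaxed streaming budgets.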