
    Considering Bluetooth's Subband Codec (SBC) for Wideband Speech and Audio on the Internet

    The Bluetooth Special Interest Group (SIG) has standardized the subband coding (SBC) audio codec to connect headphones via wireless Bluetooth links. SBC compresses audio at high fidelity while having an ultra-low algorithmic delay. To make SBC suitable for the Internet, we extend it with a time and packet loss concealment (PLC) algorithm based on ITU's G.711 Appendix I. The design is novel in the interface between codec and speech receiver: we developed a new approach to distributing the functionality of a speech receiver between codec and application. Our approach leads to easier implementations of high-quality VoIP applications. We conducted subjective and objective listening tests of the audio quality of SBC and PLC in order to determine an optimal coding mode and the trade-off between coding mode and packet loss rate. More precisely, we conducted MUSHRA listening tests for selected sample items. These test results are then compared with the results of multiple objective assessment algorithms (ITU P.862 PESQ, ITU BS.1387-1 PEAQ, Creusere's algorithm). We found that a combination of the PEAQ basic and advanced values best matches the subjective MUSHRA results after third-order linear regression. The linear regression has a coefficient of determination of R² = 0.907². By comparison, our individual human ratings show a correlation of about R = 0.9 with our averaged human rating results. Using the combination of both PEAQ algorithms, we calculate hundreds of thousands of objective audio quality ratings, varying audio content and algorithmic parameters of SBC and PLC. The results show which sets of parameter values are best suited to a bandwidth- and delay-constrained link. The transmission quality of SBC is enhanced significantly by selecting optimal encoding parameters as compared to the default parameter sets given in the standard.
Finally, we present preliminary objective test results comparing the audio codecs SBC, CELT, APT-X and ULD for speech and audio transmission. They all allow mono and stereo transmission of music at ultra-low coding delays (<10 ms), which is especially useful for distributed ensemble performances over the Internet.
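
The regression step in this abstract (mapping objective scores onto subjective MUSHRA ratings and reporting a coefficient of determination) can be illustrated with a minimal sketch. It assumes a single objective predictor per item, whereas the abstract combines the PEAQ basic and advanced values in a way it does not specify. The function names and the sample numbers are hypothetical and are not taken from the paper.

```python
def fit_polynomial(xs, ys, degree=3):
    """Least-squares polynomial fit via the normal equations (pure Python).

    Returns coefficients ordered from the constant term upwards.
    """
    n = degree + 1
    # Augmented normal-equation matrix [A^T A | A^T y].
    m = [[sum(x ** (i + j) for x in xs) for j in range(n)]
         + [sum(y * x ** i for x, y in zip(xs, ys))]
         for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(n):
            if row != col:
                factor = m[row][col] / m[col][col]
                m[row] = [a - factor * b for a, b in zip(m[row], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def predict(coeffs, x):
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** k for k, c in enumerate(coeffs))


def r_squared(ys, preds):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Made-up illustration data: objective scores on an ODG-like scale
    # (-4 = very annoying, 0 = imperceptible) and mean MUSHRA ratings (0..100).
    objective = [-3.8, -3.1, -2.5, -1.9, -1.2, -0.6, -0.1]
    mushra = [12.0, 25.0, 41.0, 55.0, 73.0, 86.0, 95.0]
    coeffs = fit_polynomial(objective, mushra, degree=3)
    fitted = [predict(coeffs, x) for x in objective]
    print("R^2 =", r_squared(mushra, fitted))
```

Read this way, the reported R² = 0.907² corresponds to a correlation of about 0.907 between fitted and subjective scores, that is, R² ≈ 0.82.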

    Scalable Speech Coding for IP Networks

    The emergence of Voice over Internet Protocol (VoIP) has posed new challenges to the development of speech codecs. The key issue in transporting real-time voice packets over IP networks is the lack of any guarantee of reasonable speech quality under packet delay or loss. Most of the widely used narrowband codecs depend on the Code Excited Linear Prediction (CELP) coding technique. CELP uses long-term prediction across frame boundaries, which causes error propagation when packets are lost and requires transmitting redundant information to mitigate the problem. The internet Low Bit-rate Codec (iLBC) employs frame-independent coding and is therefore inherently robust to packet loss. However, the original iLBC lacks some key features of speech codecs for IP networks: rate flexibility, scalability, and wideband support. This dissertation presents novel scalable narrowband and wideband speech codecs for IP networks using a frame-independent coding scheme based on the iLBC. Rate flexibility is added to the iLBC by employing the discrete cosine transform (DCT) and scalable algebraic vector quantization (AVQ), and by allocating different numbers of bits to the AVQ. Bit-rate scalability is obtained by adding an enhancement layer to the core layer of the multi-rate iLBC. The enhancement layer encodes the weighted iLBC coding error in the modified DCT (MDCT) domain. The proposed wideband codec employs bandwidth extension to give existing narrowband codecs wideband coding functionality. The wavelet transform is also used to further enhance the performance of the proposed codec. The performance evaluation results show that the proposed codec provides high robustness to packet loss and achieves equivalent or higher speech quality than state-of-the-art codecs under clean channel conditions.

    Energy-efficient wireless communication

    In this chapter we present an energy-efficient, highly adaptive network interface architecture and a novel data link layer protocol for wireless networks that provides Quality of Service (QoS) support for diverse traffic types. Due to the dynamic nature of wireless networks, adaptations in bandwidth scheduling and error control are necessary to achieve energy efficiency and an acceptable quality of service. In our approach we apply adaptability through all layers of the protocol stack and provide feedback to the applications. In this way the applications can adapt the data streams, and the network protocols can adapt the communication parameters.

    Quality of media traffic over Lossy internet protocol networks: Measurement and improvement.

    Voice over Internet Protocol (VoIP) is an active area of research in the world of communication. The high revenue made by telecommunication companies is a motivation to develop solutions that transmit voice over media other than the traditional circuit-switched network. However, while IP networks can carry data traffic very well due to their best-effort nature, they are not designed to carry real-time applications such as voice. As such, several degradations can happen to the speech signal before it reaches its destination. Therefore, it is important for legal, commercial, and technical reasons to measure the quality of VoIP applications accurately and non-intrusively. Several methods have been proposed to measure speech quality: some are subjective, some are intrusive, and others are non-intrusive. One of the non-intrusive methods is the E-model standardised by the International Telecommunication Union-Telecommunication Standardisation Sector (ITU-T). Although the E-model is non-intrusive, it depends on time-consuming, expensive and hard-to-conduct subjective tests to calibrate its parameters; consequently it is applicable to only a limited number of conditions and speech coders. It is also less accurate than intrusive methods such as Perceptual Evaluation of Speech Quality (PESQ) because it does not consider the contents of the received signal. In this thesis an approach to extend the E-model based on PESQ is proposed. Using this method the E-model can be extended to new network conditions and applied to new speech coders without the need for subjective tests. The modified E-model calibrated using PESQ is compared with the E-model calibrated using subjective tests to prove its effectiveness.
During this extension the relation between quality estimation using the E-model and PESQ is investigated, and a correction formula is proposed to correct the deviation in speech quality estimation. A second extension, intended to improve the accuracy of the E-model relative to PESQ, looks into the content of the degraded signal and classifies each packet loss as either Voiced or Unvoiced based on the received surrounding packets. The accuracy of the proposed method is evaluated by comparing the estimate of the new method, which takes packet class into consideration, with the measurement provided by PESQ as a more accurate, intrusive method for measuring speech quality. The two extensions are combined to offer a method for estimating the quality of VoIP applications accurately and non-intrusively, without the need for time-consuming, expensive, and hard-to-conduct subjective tests. Finally, the applicability of the E-model or the modified E-model to measuring the quality of services in Service Oriented Computing (SOC) is illustrated.
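
For orientation, the standard (unmodified) E-model arithmetic that this thesis builds on can be sketched as follows. The sketch uses the R-to-MOS conversion and the effective equipment impairment formula from ITU-T G.107. The simplified rating `R = 93.2 - Id - Ie,eff + A` is a commonly used shortcut that assumes default values for all other parameters. The thesis's PESQ-based calibration and its voiced/unvoiced correction are not reproduced here, and the numbers in the usage section are illustrative only.

```python
def r_to_mos(r):
    """Convert an E-model rating R to an estimated MOS (ITU-T G.107 mapping)."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    mos = 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7e-6
    # The cubic dips marginally below 1 for very small R, so clamp it.
    return max(1.0, mos)


def effective_equipment_impairment(ie, bpl, ppl_percent, burst_r=1.0):
    """Ie,eff = Ie + (95 - Ie) * Ppl / (Ppl / BurstR + Bpl).

    ie:          codec impairment factor without loss
    bpl:         codec packet-loss robustness factor
    ppl_percent: packet loss probability in percent
    burst_r:     burst ratio (1.0 means random, independent loss)
    """
    return ie + (95.0 - ie) * ppl_percent / (ppl_percent / burst_r + bpl)


def r_factor(ie_eff, delay_impairment=0.0, advantage=0.0, base=93.2):
    """Simplified rating; `base` assumes default values for all other terms."""
    return base - delay_impairment - ie_eff + advantage


if __name__ == "__main__":
    # Illustrative codec parameters, not calibrated values for any real codec.
    ie, bpl = 10.0, 20.0
    for loss in (0.0, 1.0, 3.0, 5.0):
        ie_eff = effective_equipment_impairment(ie, bpl, loss)
        print(loss, round(r_to_mos(r_factor(ie_eff)), 2))
```

Calibrating `ie` and `bpl` for a new codec is the step that normally requires subjective tests, which is the dependency the thesis proposes to replace with PESQ measurements.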

    A novel multimedia adaptation architecture and congestion control mechanism designed for real-time interactive applications

    The increasing use of interactive multimedia applications over the Internet has created a problem of congestion, because a majority of these applications do not respond to congestion indicators. This leads to resource starvation for responsive flows and ultimately to excessive delay and losses for all flows, and therefore to loss of quality. The result is unfair sharing of network resources and an increased risk of network ‘congestion collapse’. Current Congestion Control Mechanisms such as ‘TCP-Friendly Rate Control’ (TFRC) have been able to achieve a ‘fair share’ of network resource when competing with responsive flows such as TCP, but TFRC’s method of congestion response (i.e. reducing the Packet Rate) is not ideally matched to interactive multimedia applications, which maintain a fixed Frame Rate. This mismatch of the two rates (Packet Rate and Frame Rate) leads to buffering of frames at the Sender Buffer, resulting in delay and loss, and an unacceptable reduction of quality or complete loss of service for the end-user. To address this issue, this thesis proposes a novel Congestion Control Mechanism referred to as ‘TCP-friendly rate control – Fine Grain Scalable’ (TFGS) for interactive multimedia applications. This new approach allows multimedia frames (data) to be sent as soon as they are generated, so that they reach the destination as quickly as possible, in order to provide an isochronous interactive service. This is done by maintaining the Packet Rate of the Congestion Control Mechanism (CCM) at a level equivalent to the Frame Rate of the Multimedia Encoder. The response to congestion is to truncate the Packet Size, hence reducing the overall bitrate of the multimedia stream.
This functionality of the Congestion Control Mechanism is referred to as Packet Size Truncation (PST), and takes advantage of adaptive multimedia encoding, such as Fine Grain Scalable (FGS), where the multimedia frame is encoded in order of significance, from Most to Least Significant Bits. The Multimedia Adaptation Manager (MAM) truncates the multimedia frame to the size indicated by the Packet Size Truncation function of the CCM, accurately mapping user demand to available network resource. Additionally, Fine Grain Scalable encoding can offer scalability at byte-level granularity, providing a true match to available network resources. This approach achieves a ‘fair share’ of network resource when competing with responsive flows (similar to the TFRC CCM), but it also provides an isochronous service, which is of crucial benefit to real-time interactive services. Furthermore, results illustrate that an increased number of interactive multimedia flows (such as voice) can be carried over congested networks whilst maintaining a quality level equivalent to that of a standard landline telephone. This is because the loss and delay arising from the buffering of frames at the Sender Buffer is completely removed. Packets sent maintain a fixed inter-packet-gap-spacing (IPGS), so a majority of packets arrive at the receiving end at tight time intervals. This avoids the need for large Playout (de-jitter) Buffer sizes and adaptive Playout Buffer configurations. As a result, delay is reduced and the interactivity and Quality of Experience (QoE) of the multimedia application are improved.
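
The Packet Size Truncation idea described above (keep the packet rate equal to the frame rate and shrink each packet to fit the allowed bitrate) can be sketched as below. This is only an illustration under stated assumptions: the function names, the `header_bytes` parameter, and the rule that a frame is never cut below a minimum base layer are hypothetical and are not taken from the thesis, which does not give the algorithm in this abstract.

```python
def truncated_packet_size(allowed_bitrate_bps, frame_rate_hz,
                          base_layer_bytes, header_bytes=0):
    """Payload budget per packet when one packet is sent per frame.

    allowed_bitrate_bps: rate granted by the congestion controller (assumed
                         to come from a TFRC-style throughput estimate)
    frame_rate_hz:       encoder frame rate, which is also the packet rate
    base_layer_bytes:    assumed minimum decodable payload (hypothetical floor)
    header_bytes:        per-packet protocol overhead to subtract
    """
    budget = int(allowed_bitrate_bps / (8 * frame_rate_hz)) - header_bytes
    return max(base_layer_bytes, budget)


def truncate_frame(frame, allowed_bitrate_bps, frame_rate_hz,
                   base_layer_bytes, header_bytes=0):
    """Cut a significance-ordered frame (most significant bytes first)."""
    size = truncated_packet_size(allowed_bitrate_bps, frame_rate_hz,
                                 base_layer_bytes, header_bytes)
    # Slicing never extends the frame, so an uncongested link sends it whole.
    return frame[:size]


if __name__ == "__main__":
    frame = bytes(160)  # one hypothetical 160-byte scalable frame
    for rate in (64000, 32000, 16000, 2000):
        sent = truncate_frame(frame, rate, frame_rate_hz=50, base_layer_bytes=10)
        print(rate, len(sent))
```

Because the truncation acts on every frame as it is produced, nothing queues at the sender, which is the property the abstract credits for removing sender-buffer delay.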

    Quality aspects of Internet telephony

    Internet telephony has had a tremendous impact on how people communicate. Many now maintain contact using some form of Internet telephony. The motivation for this work has therefore been to address the quality aspects of real-world Internet telephony for both fixed and wireless telecommunication. The focus has been on the quality aspects of voice communication, since poor quality often leads to user dissatisfaction. The scope of the work has been broad in order to address the main factors within IP-based voice communication. The first four chapters of this dissertation constitute the background material. The first chapter outlines where Internet telephony is deployed today. It also motivates the topics and techniques used in this research. The second chapter provides the background on Internet telephony, including signalling, speech coding and voice Internetworking. The third chapter focuses solely on quality measures for packetised voice systems, and the fourth chapter is devoted to the history of voice research. The appendix of this dissertation constitutes the research contributions. It includes an examination of the access network, focusing on how calls are multiplexed in wired and wireless systems. For the wireless case, we then consider how to hand over calls from 802.11 networks to the cellular infrastructure. We then consider the Internet backbone, where most of our work is devoted to measurements specifically for Internet telephony. These measurements have been applied to estimating telephony arrival processes, measuring call quality, and quantifying the trend in Internet telephony quality over several years. We also consider the end systems, since they are responsible for reconstructing a voice stream given loss and delay constraints. Finally, we estimate voice quality using the ITU proposal PESQ and the packet loss process. The main contribution of this work is a systematic examination of Internet telephony.
We describe several methods to enable adaptable solutions for maintaining consistent voice quality. We have also found that relatively small technical changes can lead to substantial user quality improvements. A second contribution of this work is a suite of software tools designed to ascertain voice quality in IP networks. Some of these tools are in use within commercial systems today.