12 research outputs found

    Scalable Speech Coding for IP Networks

    The emergence of Voice over Internet Protocol (VoIP) has posed new challenges to the development of speech codecs. The key issue in transporting real-time voice packets over IP networks is the lack of any guarantee of reasonable speech quality under packet delay or loss. Most widely used narrowband codecs depend on the Code Excited Linear Prediction (CELP) coding technique. CELP uses long-term prediction across frame boundaries, which causes error propagation when packets are lost and requires redundant information to be transmitted to mitigate the problem. The internet Low Bit-rate Codec (iLBC) employs frame-independent coding and is therefore inherently robust to packet loss. However, the original iLBC lacks some key features of speech codecs for IP networks: rate flexibility, scalability, and wideband support. This dissertation presents novel scalable narrowband and wideband speech codecs for IP networks using a frame-independent coding scheme based on the iLBC. Rate flexibility is added to the iLBC by employing the discrete cosine transform (DCT) and scalable algebraic vector quantization (AVQ), and by allocating different numbers of bits to the AVQ. Bit-rate scalability is obtained by adding an enhancement layer to the core layer of the multi-rate iLBC. The enhancement layer encodes the weighted iLBC coding error in the modified DCT (MDCT) domain. The proposed wideband codec employs a bandwidth extension technique to extend existing narrowband codecs with wideband coding functionality. The wavelet transform is also used to further enhance the performance of the proposed codec. The performance evaluation results show that the proposed codec is highly robust to packet loss and achieves speech quality equivalent to or higher than that of state-of-the-art codecs under clean channel conditions.

    Low Delay Sparse and Mixed Excitation CELP Coders for Wideband Speech Coding

    Code Excited Linear Prediction (CELP) algorithms are proposed for compression of speech in the 8 kHz band at switched or variable bit rate, with algorithmic delay not exceeding 2 ms. Two structures of low-delay CELP coders are analyzed: low-delay sparse excitation and mixed excitation CELP. Sparse excitation is based on MP-MLQ and multilayer models. The mixed excitation CELP algorithm stems from the narrowband G.728 standard. As opposed to the G.728 LD-CELP coder, the mixed excitation codebook consists of pseudorandom vectors and sequences obtained with Long-Term Prediction (LTP). Variable rate coding consists in maximizing the vector dimension while keeping the required speech quality. Good speech quality (MOS = 3.9 according to the PESQ algorithm) is obtained at an average bit rate of 33.5 kbit/s.

    Subjective quality evaluation (proposal of a reference system for wideband codecs)

    The evolution of technology has led to the design of very sophisticated speech and audio codecs. Accordingly, competition in audio device manufacturing has increased, and quality of service has become crucial for telecommunications operators. The quality of codecs is assessed through objective and subjective measures, the latter being the most reliable since the quality perceived by users is inherently subjective. Nevertheless, subjective tests require anchor signals: artificial signals that reproduce the perceptual impairments of codecs in such a way that the amount of degradation can be easily controlled. The reference system currently standardized by the International Telecommunication Union is the Modulated Noise Reference Unit (MNRU), which simulates the quantization noise of the first generation of waveform codecs. With the evolution of codecs, the MNRU system has become obsolete, and the aim is to design a new reference system of anchor signals better suited to current codecs. Assuming that speech and audio quality is multidimensional, we first identified four perceptual dimensions using two dimensionality reduction techniques: MFA (Multiple Factor Analysis) and 3-way MDS (MultiDimensional Scaling). From the identified dimensions, namely bandwidth limitation, background noise, echo/reverberation, and speech distortion, we modeled and validated anchor signals for the first three and proposed two models of anchor signals for the fourth.
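As a rough illustration of what an MNRU anchor signal is, the sketch below applies the core relation from ITU-T P.810, y(n) = x(n)·[1 + 10^(−Q/20)·N(n)], where N(n) is unit-variance Gaussian noise and Q is the signal-to-modulated-noise ratio in dB. This is only a minimal sketch: the function names `mnru` and `snr_db` are made up for this example, the band-limiting filters of the standardized unit are omitted, and it is not the anchor system proposed in the thesis.

```python
import math
import random


def mnru(signal, q_db, rng=None):
    """Add signal-modulated Gaussian noise at a ratio of q_db decibels.

    Simplified: the filtering stages of the real reference unit are left out.
    """
    rng = rng or random.Random(0)  # fixed seed so the sketch is repeatable
    gain = 10.0 ** (-q_db / 20.0)
    # The noise term is multiplied by the signal, so it vanishes in silence.
    return [x * (1.0 + gain * rng.gauss(0.0, 1.0)) for x in signal]


def snr_db(clean, degraded):
    """Measured ratio of signal power to added-noise power, in dB."""
    signal_power = sum(x * x for x in clean)
    noise_power = sum((y - x) ** 2 for x, y in zip(clean, degraded))
    return 10.0 * math.log10(signal_power / noise_power)


# Hypothetical test signal: one second of a 440 Hz tone sampled at 8 kHz.
clean = [math.sin(2.0 * math.pi * 440.0 * n / 8000.0) for n in range(8000)]
degraded = mnru(clean, q_db=20.0)
measured = snr_db(clean, degraded)  # expected to land close to 20 dB
```

Lowering `q_db` gives a more degraded anchor, which is how a single parameter controls the amount of impairment in a listening test.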

    VOIP WITH ADAPTIVE RATE IN MULTI-TRANSMISSION RATE WIRELESS LANS

    Voice over Internet Protocol (VoIP) is a popular communication technology that plays a vital role in terms of cost reduction and flexibility. However, like any emerging technology, VoIP still has some open issues, namely providing good Quality of Service (QoS), capacity, and security. This study focuses on the QoS issue of VoIP, specifically in Wireless Local Area Networks (WLANs). IEEE 802.11 is the most popular wireless LAN standard, and it offers different transmission rates for wireless channels. Different transmission rates are associated with varying available bandwidth, which influences the transmission of VoIP traffic.

    A novel non-intrusive objective method to predict voice quality of service in LTE networks.

    This research aimed to introduce a novel approach for non-intrusive objective measurement of voice Quality of Service (QoS) in LTE networks. In pursuing this aim, the thesis established a thorough understanding of how voice traffic is handled in LTE networks, of the LTE network architecture and its similarities to and differences from its predecessors and traditional ground IP networks, and, most importantly, of the QoS-affecting parameters that are exclusive to LTE environments. Mean Opinion Score (MOS) is the scoring system used to measure the QoS of voice traffic, and it can be measured subjectively (as originally intended). Subjective QoS measurement methods are costly and time-consuming; therefore, objective methods such as Perceptual Evaluation of Speech Quality (PESQ) were developed to address these limitations. These objective methods correlate highly with subjective MOS scores. However, they either require individual calculation of many network parameters or are intrusive, requiring access to both the reference signal and the degraded signal for comparison by software. The current objective methods are therefore not suitable for real-time measurement and prediction scenarios. A major contribution of the research was identifying LTE-specific QoS-affecting parameters. No previous work combines these parameters to assess their impact on QoS. The experiment was configured in a hardware-in-the-loop environment. This configuration could serve as a platform for future research that requires simulation of voice traffic in LTE environments. The key contribution of this research is a novel non-intrusive objective method for QoS measurement and prediction using neural networks. A comparative analysis is presented that examines the performance of four neural network algorithms for non-intrusive measurement and prediction of voice quality over LTE networks. In conclusion, the Bayesian Regularization algorithm with 4 neurons in the hidden layer and a symmetric sigmoid transfer function was identified as the best solution, with a Mean Square Error (MSE) of 0.001 and a regression value of 0.998 measured on the testing data set.

    ADAPTIVE SPEECH QUALITY IN VOICE-OVER-IP COMMUNICATIONS

    The quality of VoIP communication relies significantly on the network that transports the voice packets because this network does not usually guarantee the available bandwidth, delay, and loss that are critical for real-time voice traffic. The solution proposed here is to manage the voice-over-IP stream dynamically, changing parameters as needed to assure quality. The main objective of this dissertation is to develop an adaptive speech encoding system that can be applied to conventional (telephony-grade) and wideband voice communications. This comprehensive study includes the investigation and development of three key components of the system. First, to manage VoIP quality dynamically, a tool is needed to measure real-time changes in quality. The E-model, which exists for narrowband communication, is extended to a single computational technique that measures speech quality for narrowband and wideband VoIP codecs. This part of the dissertation also develops important theoretical work in the area of wideband telephony. The second system component is a variable speech-encoding algorithm. Although VoIP performance is affected by multiple codec- and network-based factors, only three factors can be managed dynamically: voice payload size, speech compression, and jitter buffer management. Using an existing adaptive jitter-buffer algorithm, voice packet-size and compression variation are studied as they affect speech quality under different network conditions. This study explains the relationships among multiple parameters as they affect speech transmission and its resulting quality. Then, based on these two components, the third system component is a novel adaptive-rate control algorithm that establishes the interaction between a VoIP sender and receiver, and manages voice quality in real time. Simulations demonstrate that the system provides better average voice quality than traditional VoIP.
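For readers unfamiliar with the E-model mentioned in this abstract, the sketch below shows the standard narrowband mapping from the transmission rating factor R to an estimated MOS, as given in ITU-T G.107. It is only a minimal sketch of that published mapping; the function name `r_to_mos` is chosen for this example, and the wideband extension developed in the dissertation is not reproduced here.

```python
def r_to_mos(r):
    """Map an E-model rating factor R to an estimated MOS (narrowband G.107 mapping)."""
    if r <= 0.0:
        return 1.0   # floor of the scale
    if r >= 100.0:
        return 4.5   # ceiling of the scale
    # Cubic mapping between the two limits.
    return 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7.0e-6


# Example ratings; 93.2 is the R value obtained with G.107 default parameters.
for rating in (50.0, 70.0, 80.0, 93.2):
    print(f"R = {rating:5.1f} -> estimated MOS = {r_to_mos(rating):.2f}")
```

An adaptive sender could, in principle, recompute R from measured delay and loss and compare the resulting MOS against a target, but the control logic itself is specific to the dissertation and is not sketched here.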

    Study of time-frequency transforms for high-quality, low-delay audio coding

    In recent years there has been a phenomenal increase in the number of products and applications that make use of audio coding formats. Among the most successful audio coding schemes are MPEG-1 Layer III (mp3), MPEG-2 Advanced Audio Coding (AAC), and its evolution, MPEG-4 High Efficiency-Advanced Audio Coding (HE-AAC). More recently, perceptual audio coding has been adapted to low-delay operation so that it becomes suitable for conversational applications. A filter bank such as the Modified Discrete Cosine Transform (MDCT) is traditionally a central component of perceptual audio coding, and its adaptation to low-delay audio coding has become an important research topic. Low-delay transforms have been developed to retain the performance of standard audio coding while dramatically reducing the associated algorithmic delay. This work presents some elements that better accommodate the delay-reduction constraint. Among the contributions is a low-delay block-switching tool that allows a direct transition between long and short transforms without inserting a transition window. The same principle has been extended to define new perfect-reconstruction conditions for the MDCT with relaxed constraints compared to the original definition. As a consequence, a seamless reconstruction method has been derived that increases the flexibility of transform coding schemes by making it possible to select the transform for a frame independently of its neighbouring frames. Finally, based on this new approach, a new low-delay window design procedure has been derived to obtain an analytic definition of a new family of transforms, permitting high quality with a substantial reduction in coding delay. The performance of the proposed transforms has been thoroughly evaluated, and an evaluation framework involving an objective measurement of the optimal transform sequence is proposed. It confirms the relevance of the proposed transforms for audio coding. In addition, the new approaches have been successfully applied to recent standardisation work items, such as the low-delay audio coding developed at MPEG (LD-AAC and ELD-AAC), and they have been evaluated in numerous subjective tests, showing a significant quality improvement for transient signals. The new low-delay window design has been adopted in G.718, a scalable speech and audio codec standardized by ITU-T, and has demonstrated its benefit in terms of delay reduction while maintaining the audio quality of a traditional MDCT. Low-delay audio coding using newly defined windows for the MDCT and a new window-switching scheme.
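The "original definition" of MDCT perfect reconstruction that this abstract refers to can be illustrated with a small experiment. The sketch below uses a sine window, which satisfies the classic Princen-Bradley condition w[n]² + w[n+N]² = 1, and shows that windowed MDCT analysis followed by windowed inverse MDCT and overlap-add returns the input. It is a minimal sketch of the conventional scheme only: the function names and the frame size are chosen for this example, the direct O(N²) sums are for clarity rather than speed, and the relaxed conditions and low-delay windows proposed in the thesis are not reproduced.

```python
import math


def sine_window(N):
    """Symmetric window of length 2N with w[n]^2 + w[n+N]^2 = 1."""
    return [math.sin(math.pi * (n + 0.5) / (2 * N)) for n in range(2 * N)]


def mdct(frame, window):
    """Windowed MDCT: 2N time samples -> N coefficients."""
    N = len(frame) // 2
    return [
        sum(window[n] * frame[n]
            * math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
            for n in range(2 * N))
        for k in range(N)
    ]


def imdct(coeffs, window):
    """Windowed inverse MDCT: N coefficients -> 2N aliased time samples."""
    N = len(coeffs)
    return [
        window[n] * (2.0 / N)
        * sum(coeffs[k] * math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
              for k in range(N))
        for n in range(2 * N)
    ]


def analysis_synthesis(signal, N):
    """Run MDCT and inverse MDCT with 50% overlap-add; aliasing cancels between frames."""
    window = sine_window(N)
    tail = (-len(signal)) % N  # zero-pad up to a whole number of hops
    padded = [0.0] * N + list(signal) + [0.0] * (tail + N)
    out = [0.0] * len(padded)
    for start in range(0, len(padded) - 2 * N + 1, N):
        block = imdct(mdct(padded[start:start + 2 * N], window), window)
        for n, value in enumerate(block):
            out[start + n] += value
    return out[N:N + len(signal)]


# Hypothetical test signal: a sinusoid on top of a ramp.
x = [math.sin(0.3 * i) + 0.1 * i for i in range(50)]
y = analysis_synthesis(x, N=8)
max_error = max(abs(a - b) for a, b in zip(x, y))  # should be at rounding-error level
```

Each frame on its own is not invertible, since 2N samples map to N coefficients; reconstruction relies on the neighbouring frame cancelling the time-domain aliasing. That dependence between adjacent frames is what the abstract's relaxed conditions aim to loosen.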
