409 research outputs found

    Compressed Domain Packet Loss Concealment of Sinusoidally Coded Speech

    In this paper we consider the problem of packet loss concealment for Voice over IP (VoIP). The speech signal is compressed at the transmitter using a sinusoidal coding scheme operating at 8 kbit/s. At the receiver, packet loss concealment operates directly on the quantized sinusoidal parameters and is based on time-scaling of the packets surrounding the missing ones. Subjective listening tests show promising results, indicating the potential of sinusoidal speech coding for VoIP.

    Using Autoregressive Models for Real-Time Packet Loss Concealment in Networked Music Performance Applications

    In Networked Music Performances (NMP), concealing the effects of lost or late packets on the quality of the playback audio stream is of pivotal importance to mitigate the impact of the resulting audio artifacts. Traditional packet loss concealment techniques implemented in standard audio codecs can be leveraged only at the price of an increased mouth-to-ear latency, which may easily exceed the strict delay requirements of NMP interactions. This paper investigates the adoption of a low-complexity prediction technique based on autoregressive models to fill audio gaps caused by missing packets. Numerical results show that the proposed approach outperforms the packet loss concealment methods normally implemented in NMP systems, which typically fill audio gaps with silence or with a repetition of the last received audio segment.
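
    The autoregressive idea in this abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the default model order, and the choice of the autocorrelation method with a Levinson-Durbin solver are all assumptions made for illustration. The sketch fits an AR model to the last received samples and runs the predictor forward to synthesize the missing ones.

```python
def autocorrelation(x, max_lag):
    """Biased autocorrelation r[0..max_lag] of the sample list x."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k))
            for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the Yule-Walker equations for predictor coefficients a[1..order]
    such that x[n] is approximated by sum_k a[k] * x[n - k]."""
    a = [0.0] * (order + 1)          # a[0] is unused
    err = r[0]                       # prediction error energy
    for i in range(1, order + 1):
        if err <= 1e-12:             # nothing left to predict, stop early
            break
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err                # reflection coefficient
        prev = a[:]
        a[i] = k
        for j in range(1, i):
            a[j] = prev[j] - k * prev[i - j]
        err *= 1.0 - k * k
    return a[1:]


def ar_extrapolate(history, coeffs, n_missing):
    """Run the predictor forward; coeffs[0] multiplies the newest sample.
    Assumes len(history) >= len(coeffs)."""
    buf = list(history)
    for _ in range(n_missing):
        buf.append(sum(c * buf[-1 - k] for k, c in enumerate(coeffs)))
    return buf[len(history):]


def conceal_gap(history, n_missing, order=8):
    """Hypothetical helper: fit an AR model to `history` (the last received
    samples) and synthesize `n_missing` replacement samples."""
    r = autocorrelation(history, order)
    coeffs = levinson_durbin(r, order)
    return ar_extrapolate(history, coeffs, n_missing)
```

    Because the prediction uses only samples that have already arrived, a scheme of this kind adds no look-ahead delay, which is the property the abstract emphasizes for NMP.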

    A Time-Frequency Generative Adversarial based method for Audio Packet Loss Concealment

    Packet loss is a major cause of voice quality degradation in VoIP transmissions with serious impact on intelligibility and user experience. This paper describes a system based on a generative adversarial approach, which aims to repair the lost fragments during the transmission of audio streams. Inspired by the powerful image-to-image translation capability of Generative Adversarial Networks (GANs), we propose bin2bin, an improved pix2pix framework to achieve the translation task from magnitude spectrograms of audio frames with lost packets, to noncorrupted speech spectrograms. In order to better maintain the structural information after spectrogram translation, this paper introduces the combination of two STFT-based loss functions, mixed with the traditional GAN objective. Furthermore, we employ a modified PatchGAN structure as discriminator and we lower the concealment time by a proper initialization of the phase reconstruction algorithm. Experimental results show that the proposed method has obvious advantages when compared with the current state-of-the-art methods, as it can better handle both high packet loss rates and large gaps. Comment: Accepted at EUSIPCO - 31st European Signal Processing Conference, 202

    Bilateral Waveform Similarity Overlap-and-Add Based Packet Loss Concealment for Voice over IP

    This paper investigates a bilateral waveform similarity overlap-and-add algorithm for concealing lost voice packets. Because packet loss can cause semantic misunderstanding, it has become one of the most important problems in speech communication. The approach builds on the waveform similarity measure of the overlap-and-add algorithm and uses bilateral information to improve speech signal reconstruction. The waveform similarity overlap-and-add (WSOLA) technique has been shown to be an effective algorithm for packet loss concealment (PLC) in real-time communication, and it is widely applied to length adaptation and packet loss concealment of speech signals. Time-scale modification of audio signals is one of the most important research topics in data communication, especially in voice over IP (VoIP). The proposed bilateral WSOLA (BWSOLA) is derived from WSOLA. Instead of exploiting speech data from only one direction, the proposed method reconstructs the lost voice data from both the preceding and the following data. Related algorithms have been developed to obtain the optimal reconstruction estimate. Experimental results show that, measured by PESQ and MOS, the quality of the speech signal reconstructed by bilateral WSOLA is much better than that of standard WSOLA and GWSOLA across different packet loss rates and loss lengths. The improvement comes from the bilateral information, and BWSOLA outperforms the traditional approaches especially for long-duration data loss.
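
    The bilateral idea, using data on both sides of the gap, can be sketched as a simple crossfade. This is an illustrative assumption and not the BWSOLA algorithm itself: it omits the waveform-similarity search that WSOLA uses to align segments, and the function name and linear weighting are hypothetical.

```python
def bilateral_fill(before, after, gap_len):
    """Fill a gap of `gap_len` samples by crossfading a forward candidate
    (taken from the samples before the gap) with a backward candidate
    (taken from the samples after the gap)."""
    if len(before) < gap_len or len(after) < gap_len:
        raise ValueError("need at least gap_len samples on each side")
    if gap_len == 0:
        return []
    forward = before[-gap_len:]      # reuse the last received segment
    backward = after[:gap_len]       # reuse the first segment after the gap
    filled = []
    for i in range(gap_len):
        # Weight of the backward candidate rises linearly across the gap,
        # so the output leans on the nearer side at each position.
        w = (i + 1) / (gap_len + 1)
        filled.append((1.0 - w) * forward[i] + w * backward[i])
    return filled
```

    A one-directional method corresponds to using only `forward`. Note that using `after` requires waiting for packets that follow the loss, so a bilateral scheme of this kind trades extra delay for better reconstruction.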

    A Hybrid Signal-and-Link-Parametric Approach to Single-Ended Quality Measurement of Packetized Speech

    A hybrid signal-and-link-parametric approach to single-ended quality measurement of packetized speech is proposed. Transmission link parameters are used to determine a base quality for the test signal. The base quality is adjusted by degradation factors calculated from perceptual features extracted from the test signal. The degradation factors are based on Kullback-Leibler distances between a parametric model trained online for the extracted features and reference models of normative speech behavior. The proposed method overcomes the limitations of pure link parametric and pure signal-based methods. Index Terms: Quality measurement, VoIP, packet loss concealment, Kullback-Leibler distance
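
    As a rough illustration of the Kullback-Leibler distance mentioned in this abstract, the sketch below compares a model fitted to observed feature values against a reference model. The abstract does not state the model family, so the single univariate Gaussian used here is an assumption, as are the function names. How the distance is mapped to a degradation factor is not specified in the abstract and is not attempted.

```python
import math


def fit_gaussian(samples):
    """Estimate mean and (population) variance of a list of feature values."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((s - mean) ** 2 for s in samples) / n
    return mean, var


def kl_gaussian(mu_p, var_p, mu_q, var_q):
    """KL divergence D(P || Q) in nats between univariate Gaussians
    P = N(mu_p, var_p) and Q = N(mu_q, var_q); variances must be positive."""
    return 0.5 * (math.log(var_q / var_p)
                  + (var_p + (mu_p - mu_q) ** 2) / var_q
                  - 1.0)
```

    The divergence is zero when the two models coincide and grows as the observed features drift away from the reference, which is what makes it usable as a degradation indicator.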

    Assessment of Recovery Journal-Based Packet Loss Concealment Techniques for Low-Latency MIDI Streaming

    In networked music performances, real-time Packet Loss Concealment (PLC) is a task of pivotal importance to compensate for the detrimental impact of the loss or late delivery of audio portions that often occurs in low-latency audio-streaming scenarios. This paper proposes an open-loop PLC method tailored for MIDI data and compares it to a closed-loop state-of-the-art benchmark in terms of effectiveness of audio recovery and communication overhead. Moreover, implementations aimed at reducing the computational overhead are proposed and compared for both approaches. Results show that the proposed open-loop policy achieves performance similar to that of the closed-loop one, while reducing the number of operations executed at the transmitter side.