409 research outputs found
Compressed Domain Packet Loss Concealment of Sinusoidally Coded Speech
In this paper we consider the problem of packet loss concealment for Voice over IP (VoIP). The speech signal is compressed at the transmitter using a sinusoidal coding scheme working at 8 kbit/s. At the receiver, packet loss concealment is carried out directly on the quantized sinusoidal parameters, based on time-scaling of the packets surrounding the missing ones. Subjective listening tests show promising results, indicating the potential of sinusoidal speech coding for VoIP.
Using Autoregressive Models for Real-Time Packet Loss Concealment in Networked Music Performance Applications
In Networked Music Performances (NMP), concealing the effects of lost/late packets on the quality of the playback audio stream is of pivotal importance to mitigate the impact of the resulting audio artifacts. Traditional packet loss concealment techniques implemented in standard audio codecs can be leveraged only at the price of an increased mouth-to-ear latency, which may easily exceed the strict delay requirements of NMP interactions.
This paper investigates the adoption of a low-complexity prediction technique based on autoregressive models to fill audio gaps caused by missing packets. Numerical results show that the proposed approach outperforms the packet loss concealment methods normally implemented in NMP systems, which typically fill audio gaps with silence or with a repetition of the last received audio segment.
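As a rough illustration of the idea in this abstract, the sketch below fits an autoregressive (AR) model to the most recent received samples and extrapolates it across a gap. It is not the paper's implementation: the least-squares fit, the function names `fit_ar` and `conceal_gap`, and the default model order are assumptions chosen for brevity.

```python
import numpy as np

def fit_ar(history, order):
    """Least-squares fit of x[n] ~= sum_k a[k] * x[n-1-k] (assumed estimator)."""
    x = np.asarray(history, dtype=float)
    # Each row holds the `order` samples preceding one target, most recent first.
    rows = np.array([x[n - order:n][::-1] for n in range(order, len(x))])
    targets = x[order:]
    coeffs, *_ = np.linalg.lstsq(rows, targets, rcond=None)
    return coeffs

def conceal_gap(history, gap_len, order=16):
    """Predict `gap_len` missing samples from the samples received before the gap."""
    x = np.asarray(history, dtype=float)
    a = fit_ar(x, order)
    buf = list(x[-order:])
    out = []
    for _ in range(gap_len):
        recent = np.array(buf[-order:][::-1])  # most recent sample first
        nxt = float(np.dot(a, recent))
        out.append(nxt)
        buf.append(nxt)  # feed the prediction back for the next step
    return np.array(out)

# Usage: a pure sinusoid obeys an exact order-2 recursion, so the gap is recovered.
n = np.arange(264)
sig = np.sin(2 * np.pi * 0.05 * n)
filled = conceal_gap(sig[:200], gap_len=64, order=2)
```

The two baselines named in the abstract would correspond to `np.zeros(gap_len)` (silence) and a copy of the last `gap_len` received samples (repetition).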
A Time-Frequency Generative Adversarial based method for Audio Packet Loss Concealment
Packet loss is a major cause of voice quality degradation in VoIP transmissions with serious impact on intelligibility and user experience. This paper describes a system based on a generative adversarial approach, which aims to repair the lost fragments during the transmission of audio streams. Inspired by the powerful image-to-image translation capability of Generative Adversarial Networks (GANs), we propose bin2bin, an improved pix2pix framework to achieve the translation task from magnitude spectrograms of audio frames with lost packets, to noncorrupted speech spectrograms. In order to better maintain the structural information after spectrogram translation, this paper introduces the combination of two STFT-based loss functions, mixed with the traditional GAN objective. Furthermore, we employ a modified PatchGAN structure as discriminator and we lower the concealment time by a proper initialization of the phase reconstruction algorithm. Experimental results show that the proposed method has obvious advantages when compared with the current state-of-the-art methods, as it can better handle both high packet loss rates and large gaps.
Comment: Accepted at EUSIPCO - 31st European Signal Processing Conference, 202
Bilateral Waveform Similarity Overlap-and-Add Based Packet Loss Concealment for Voice over IP
This paper investigates a bilateral waveform similarity overlap-and-add algorithm for voice packet loss. Since packet loss can cause semantic misunderstanding, it has become one of the most essential problems in speech communication. The investigation is based on a waveform similarity measure using the overlap-and-add algorithm and uses bilateral information to enhance the reconstruction of the speech signal. The waveform similarity overlap-and-add (WSOLA) technique has been shown to be an effective algorithm for packet loss concealment (PLC) in real-time communication, and it is widely applied to both length adaptation and packet loss concealment of speech signals. Time-scale modification of audio signals is one of the most essential research topics in data communication, especially in voice over IP (VoIP). Herein, we propose the bilateral WSOLA (BWSOLA), which is derived from WSOLA. Instead of exploiting speech data from only one direction, the proposed method reconstructs the lost voice data from both the preceding and the following data. The related algorithms have been developed to achieve the optimal reconstruction estimate. The experimental results show that the quality of the speech signal reconstructed by the bilateral WSOLA is much better than that of the standard WSOLA and GWSOLA at different packet loss rates and lengths, using the metrics PESQ and MOS. The significant improvement is obtained from the bilateral information and the proposed method. The proposed BWSOLA outperforms the traditional approaches, especially for long-duration data loss.
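The benefit of using both sides of a gap can be illustrated with a much simpler stand-in than BWSOLA itself. The sketch below builds one candidate by repeating the last period of the preceding data, a second candidate by repeating the first period of the following data, and cross-fades between them. The function name `bilateral_fill`, the linear cross-fade, and the assumption that the period is already known are all illustrative; real WSOLA instead searches for the best-matching segment with a waveform similarity measure.

```python
import numpy as np

def bilateral_fill(before, after, gap_len, period):
    """Fill a gap by cross-fading a forward and a backward periodic extension.

    Simplified illustration only: `period` (in samples) is assumed known.
    """
    before = np.asarray(before, dtype=float)
    after = np.asarray(after, dtype=float)
    reps = -(-gap_len // period)  # ceiling division
    # Forward candidate: repeat the last period before the gap, starting at the gap.
    forward = np.tile(before[-period:], reps)[:gap_len]
    # Backward candidate: repeat the first period after the gap, aligned so that
    # the repetition ends exactly where the following data begins.
    backward = np.tile(after[:period], reps)[-gap_len:]
    # Linear cross-fade: trust the preceding data early, the following data late.
    w = np.linspace(0.0, 1.0, gap_len)
    return (1.0 - w) * forward + w * backward

# Usage: a sinusoid with a 20-sample period and a 50-sample gap.
n = np.arange(200)
sig = np.sin(2 * np.pi * n / 20)
rebuilt = bilateral_fill(sig[:100], sig[150:], gap_len=50, period=20)
```

A one-sided method would return only `forward`; the cross-fade is what lets the end of the filled segment line up with the data that follows the gap.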
Bayesian interpolation in a dynamic sinusoidal model with application to packet-loss concealment
Publication in the conference proceedings of EUSIPCO, Aalborg, Denmark, 201
A Hybrid Signal-and-Link-Parametric Approach to Single-Ended Quality Measurement of Packetized Speech
A hybrid signal-and-link-parametric approach to single-ended quality measurement of packetized speech is proposed. Transmission link parameters are used to determine a base quality for the test signal. The base quality is adjusted by degradation factors calculated from perceptual features extracted from the test signal. The degradation factors are based on Kullback-Leibler distances between a parametric model trained online for the extracted features and reference models of normative speech behavior. The proposed method overcomes the limitations of pure link parametric and pure signal-based methods. Index Terms: Quality measurement, VoIP, packet loss concealment, Kullback-Leibler distance
Assessment of Recovery Journal-Based Packet Loss Concealment Techniques for Low-Latency MIDI Streaming
In networked music performances, real-time Packet Loss Concealment (PLC) is a task of pivotal importance to compensate for the detrimental impact of the loss or late delivery of audio portions that often occurs in low-latency audio-streaming scenarios.
This paper proposes an open-loop PLC method tailored for MIDI data and compares it to a closed-loop state-of-the-art benchmark in terms of effectiveness of audio recovery and communication overhead. Moreover, implementations aimed at reducing the computational overhead are proposed and compared for both approaches. Results show that the proposed open-loop policy achieves performance similar to that of the closed-loop one, while reducing the number of operations executed at the transmitter side.