3,774 research outputs found
Native Multi-Band Audio Coding within Hyper-Autoencoded Reconstruction Propagation Networks
Spectral sub-bands do not portray the same perceptual relevance. In audio
coding, it is therefore desirable to have independent control over each of the
constituent bands so that bitrate assignment and signal reconstruction can be
achieved efficiently. In this work, we present a novel neural audio coding
network that natively supports a multi-band coding paradigm. Our model extends
the idea of compressed skip connections in the U-Net-based codec, allowing for
independent control over both core and high band-specific reconstructions and
bit allocation. Our system reconstructs the full-band signal mainly from the
condensed core-band code, therefore exploiting and showcasing its bandwidth
extension capabilities to its fullest. Meanwhile, the low-bitrate high-band
code helps the high-band reconstruction similarly to MPEG audio codecs'
spectral bandwidth replication. MUSHRA tests show that the proposed model not
only improves the quality of the core band by explicitly assigning more bits to
it but retains a good quality in the high-band as well.
Comment: Accepted to ICASSP 2023. For resources and examples, see
https://saige.sice.indiana.edu/research-projects/HARP-Net
Wavelet-Based Audio Embedding & Audio/Video Compression
With the decline in military spending, the United States relies heavily on stateside support. Communications has never been more important. High-quality audio and video capabilities are a must. Watermarking, traditionally used for copyright protection, is used in a new and exciting way. An efficient wavelet-based watermarking technique embeds audio information into a video signal. Several highly effective compression techniques are applied to compress the resulting audio/video signal in an embedded fashion. This wavelet-based compression algorithm incorporates bit plane coding, first difference coding, and Huffman coding. To demonstrate the potential of this audio-embedding audio/video compression system, an audio signal is embedded into a video signal and the combined signal is compressed. Results show that overall compression rates of 15:1 can be achieved. The video signal is reconstructed with a median PSNR of nearly 33 dB. Finally, the audio signal is extracted without error.
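Of the three coding stages named in this abstract, first difference coding is the simplest to illustrate. The following is a minimal Python sketch of the general technique, not the authors' implementation; the function names and the integer sample values are assumptions made for illustration.

```python
def first_difference_encode(samples):
    """Keep the first sample, then store each sample minus its predecessor."""
    if not samples:
        return []
    encoded = [samples[0]]
    for prev, cur in zip(samples, samples[1:]):
        encoded.append(cur - prev)
    return encoded


def first_difference_decode(encoded):
    """Invert the encoding with a running sum over the stored differences."""
    decoded = []
    total = 0
    for value in encoded:
        total += value
        decoded.append(total)
    return decoded


# Hypothetical, slowly varying coefficient values (illustration only).
coefficients = [10, 12, 11, 11, 15]
residuals = first_difference_encode(coefficients)
restored = first_difference_decode(residuals)
```

For slowly varying inputs the differences cluster around zero, which gives a skewed symbol distribution that an entropy coder such as Huffman coding can represent with fewer bits. The round trip is lossless for integer inputs.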
Cross-layer Perceptual ARQ for Video Communications over 802.11e Wireless Networks
This work presents an application-level perceptual ARQ algorithm for video streaming over 802.11e wireless networks. A simple and effective formula is proposed to combine the perceptual and temporal importance of each packet into a single priority value, which is then used to drive the packet-selection process at each retransmission opportunity. Compared to the standard 802.11 MAC-layer ARQ scheme, the proposed technique delivers higher perceptual quality because it can retransmit only the most perceptually important packets, reducing retransmission bandwidth waste. Video streaming of H.264 test sequences has been simulated with ns in a realistic 802.11e home scenario, in which the various kinds of traffic flows have been assigned to different 802.11e access categories according to the Wi-Fi Alliance WMM specification. Extensive simulations show that the proposed method consistently outperforms the standard link-layer 802.11 retransmission scheme, delivering PSNR gains up to 12 dB while achieving low transmission delay and limited impact on concurrent traffic. Moreover, comparisons with a MAC-level ARQ scheme which adapts the retry limit to the type of frame contained in packets and with an application-level deadline-based priority retransmission scheme show that the PSNR gain offered by the proposed algorithm is significant, up to 5 dB. Additional results obtained in a scenario in which the transmission relies on an intermediate node (i.e., the access point) further confirm the consistency of the perceptual ARQ performance. Finally, results obtained by varying network conditions such as congestion and channel noise levels show the consistency of the improvements achieved by the proposed algorithm.
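The gains in this abstract are reported as PSNR in dB. As a reminder of what that metric measures, here is a minimal Python sketch of the standard definition, PSNR = 10 log10(peak^2 / MSE). It is not taken from the paper; the function name, the flat list representation of pixel values, and the 8-bit peak of 255 are assumptions made for illustration.

```python
import math


def psnr(reference, distorted, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length sample lists."""
    if not reference or len(reference) != len(distorted):
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between the reference and the distorted samples.
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals: PSNR is unbounded
    return 10.0 * math.log10(peak ** 2 / mse)


# Hypothetical pixel values (illustration only).
value_db = psnr([100, 100], [101, 99])
```

Because the scale is logarithmic, a 3 dB gain corresponds to roughly halving the mean squared error.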
Low bit rate digital speech signal processing systems
A robust audio watermarking scheme based on reduced singular value decomposition and distortion removal
This paper presents a blind audio watermarking algorithm based on the reduced singular value decomposition (RSVD). A new observation on one of the resulting unitary matrices is uncovered. The proposed scheme manipulates coefficients based on this observation in order to embed watermark bits. To preserve audio fidelity, a threshold-based distortion control technique is applied, and this is further supplemented by distortion suppression utilizing psychoacoustic principles. Test results on real music signals show that this watermarking scheme is in the range of imperceptibility for human hearing, is accurate, and is robust against MP3 compression at various bit rates as well as other selected attacks. The data payload is high compared to existing audio watermarking schemes.