Coding overcomplete representations of audio using the MCLT
We propose a system for audio coding using the modulated complex
lapped transform (MCLT). In general, it is difficult to encode signals using
overcomplete representations without incurring a penalty in rate-distortion
performance. We show that the penalty can be significantly reduced for
MCLT-based representations, without the need for iterative methods of
sparsity reduction. We achieve that via a magnitude-phase polar quantization
and the use of magnitude and phase prediction. Compared to systems based
on quantization of orthogonal representations such as the modulated lapped
transform (MLT), the new system allows for reduced warbling artifacts and
more precise computation of frequency-domain auditory masking functions.
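As a rough illustration of the magnitude-phase polar quantization mentioned in the abstract above, the following minimal sketch quantizes a complex coefficient by rounding its magnitude and phase separately. The uniform step sizes, the function names `polar_quantize` and `polar_dequantize`, and the parameters `mag_step` and `phase_levels` are assumptions made for illustration; the abstract does not specify the quantizer design, and the magnitude and phase prediction it describes is not shown here.

```python
import cmath
import math


def polar_quantize(z, mag_step=0.1, phase_levels=16):
    """Map a complex coefficient to (magnitude index, phase index).

    Hypothetical uniform quantizers: magnitude is rounded to a multiple
    of mag_step, phase to one of phase_levels equally spaced angles.
    """
    mag_idx = round(abs(z) / mag_step)
    phase_step = 2 * math.pi / phase_levels
    # cmath.phase returns an angle in [-pi, pi]; wrap the index to 0..levels-1.
    phase_idx = round(cmath.phase(z) / phase_step) % phase_levels
    return mag_idx, phase_idx


def polar_dequantize(mag_idx, phase_idx, mag_step=0.1, phase_levels=16):
    """Rebuild the complex coefficient from its two indices."""
    phase_step = 2 * math.pi / phase_levels
    return cmath.rect(mag_idx * mag_step, phase_idx * phase_step)


if __name__ == "__main__":
    coeff = complex(0.33, -0.71)
    indices = polar_quantize(coeff)
    print(indices, polar_dequantize(*indices))
```

Quantizing in polar form keeps the magnitude error independent of the phase error, which is one plausible reason a complex-valued transform such as the MCLT pairs naturally with this scheme.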
Audio Compression using a Modified Vector Quantization algorithm for Mastering Applications
Audio data compression is used to reduce the transmission bandwidth and storage requirements of audio data. It is the second stage in the audio mastering process, with audio equalization being the first stage. Compression algorithms such as BSAC, MP3 and AAC serve as the reference standards in this paper. The challenge in audio compression is compressing the signal at low bit rates. Existing algorithms that work well at low bit rates do not perform as well at higher bit rates, and vice versa. This paper proposes a modified vector quantization algorithm that produces a scalable bit stream with a number of fine layers of audio fidelity. The modified algorithm is used to build a scalable perceptual audio coder. Its quantization and encoding stages are responsible for removing the psychoacoustic and arithmetic redundancies, since practically all the data removed during the prediction phases at the encoder is added back to the audio signal at the decoder. It is therefore the quantization phase that is modified to produce a scalable bit stream. The modified algorithm works well at both lower and higher bit rates. Subjective evaluations were carried out by audio professionals using the MUSHRA test, and the mean normalized scores at various bit rates were recorded and compared with those of the previous algorithms.
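The idea of a scalable bit stream with several fidelity layers can be sketched with successive refinement: each layer quantizes the residual left by the layers before it, so a decoder that reads more layers gets a closer reconstruction. This is only an illustration under stated assumptions. It uses scalar rather than vector quantization, and the names `encode_layers` and `decode_layers`, the halving step size, and the layer count are hypothetical, not the algorithm of the paper above.

```python
def encode_layers(samples, num_layers=4, first_step=0.5):
    """Produce a list of (step, indices) layers, coarse to fine.

    Assumed scheme: each layer uniformly quantizes the residual of the
    previous layers, and the step size halves from layer to layer.
    """
    residual = list(samples)
    layers = []
    step = first_step
    for _ in range(num_layers):
        indices = [round(r / step) for r in residual]
        layers.append((step, indices))
        residual = [r - i * step for r, i in zip(residual, indices)]
        step /= 2
    return layers


def decode_layers(layers, length):
    """Sum the contributions of however many layers were received."""
    out = [0.0] * length
    for step, indices in layers:
        for n, i in enumerate(indices):
            out[n] += i * step
    return out


if __name__ == "__main__":
    x = [0.83, -0.41, 0.07, 0.56]
    layers = encode_layers(x)
    for k in range(1, len(layers) + 1):
        y = decode_layers(layers[:k], len(x))
        print(k, max(abs(a - b) for a, b in zip(x, y)))
```

Truncating the list of layers corresponds to decoding at a lower bit rate; the worst-case error after each layer is bounded by half of that layer's step size.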
High Quality Audio Coding with MDCTNet
We propose a neural audio generative model, MDCTNet, operating in the
perceptually weighted domain of an adaptive modified discrete cosine transform
(MDCT). The architecture of the model captures correlations in both time and
frequency directions with recurrent layers (RNNs). An audio coding system is
obtained by training MDCTNet on a diverse set of fullband monophonic audio
signals at 48 kHz sampling, conditioned by a perceptual audio encoder. In a
subjective listening test with ten excerpts chosen to be balanced across
content types, yet stressful for both codecs, the mean performance of the
proposed system for 24 kb/s variable bitrate (VBR) is similar to that of Opus
at twice the bitrate. Comment: Five pages, five figures
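For readers unfamiliar with the transform named in the abstract above, here is a minimal sketch of a plain fixed-size MDCT with a sine window, using direct summation for clarity rather than speed. The abstract's adaptive transform, perceptual weighting, and neural model are not reproduced; the function names and the block size are assumptions chosen for illustration.

```python
import math


def sine_window(N):
    """Sine window of length 2N; satisfies w[n]^2 + w[n+N]^2 = 1."""
    return [math.sin(math.pi * (n + 0.5) / (2 * N)) for n in range(2 * N)]


def mdct(block):
    """Transform 2N (already windowed) samples into N coefficients."""
    N = len(block) // 2
    n0 = 0.5 + N / 2
    return [
        sum(block[n] * math.cos(math.pi / N * (n + n0) * (k + 0.5))
            for n in range(2 * N))
        for k in range(N)
    ]


def imdct(coeffs):
    """Map N coefficients back to 2N time-aliased samples (scaled by 2/N)."""
    N = len(coeffs)
    n0 = 0.5 + N / 2
    return [
        (2.0 / N) * sum(coeffs[k] * math.cos(math.pi / N * (n + n0) * (k + 0.5))
                        for k in range(N))
        for n in range(2 * N)
    ]


def analysis_synthesis(signal, N):
    """Window, transform, invert, window again, and overlap-add with hop N."""
    w = sine_window(N)
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - 2 * N + 1, N):
        block = [signal[start + n] * w[n] for n in range(2 * N)]
        y = imdct(mdct(block))
        for n in range(2 * N):
            out[start + n] += y[n] * w[n]
    return out


if __name__ == "__main__":
    x = [math.sin(0.3 * i) + 0.1 * i for i in range(32)]
    y = analysis_synthesis(x, N=4)
    print(max(abs(a - b) for a, b in zip(x[4:28], y[4:28])))
```

Each block alone is aliased in time, but the aliasing terms of neighbouring blocks cancel in the overlap-add, so every sample covered by two blocks is recovered up to rounding error. Only the first and last N samples, which are covered by a single block, are not.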
Advanced Television and Signal Processing Program
Contains an introduction and reports on two research projects. Advanced Television Research Program
Native Multi-Band Audio Coding within Hyper-Autoencoded Reconstruction Propagation Networks
Spectral sub-bands do not portray the same perceptual relevance. In audio
coding, it is therefore desirable to have independent control over each of the
constituent bands so that bitrate assignment and signal reconstruction can be
achieved efficiently. In this work, we present a novel neural audio coding
network that natively supports a multi-band coding paradigm. Our model extends
the idea of compressed skip connections in the U-Net-based codec, allowing for
independent control over both core and high band-specific reconstructions and
bit allocation. Our system reconstructs the full-band signal mainly from the
condensed core-band code, therefore exploiting and showcasing its bandwidth
extension capabilities to its fullest. Meanwhile, the low-bitrate high-band
code helps the high-band reconstruction similarly to MPEG audio codecs'
spectral bandwidth replication. MUSHRA tests show that the proposed model not
only improves the quality of the core band by explicitly assigning more bits to
it but retains a good quality in the high-band as well. Comment: Accepted to ICASSP 2023. For resources and examples, see
https://saige.sice.indiana.edu/research-projects/HARP-Net