Coding overcomplete representations of audio using the MCLT
We propose a system for audio coding using the modulated complex
lapped transform (MCLT). In general, it is difficult to encode signals using
overcomplete representations without incurring a penalty in rate-distortion
performance. We show that the penalty can be significantly reduced for
MCLT-based representations, without the need for iterative methods of
sparsity reduction. We achieve that via a magnitude-phase polar quantization
and the use of magnitude and phase prediction. Compared to systems based
on quantization of orthogonal representations such as the modulated lapped
transform (MLT), the new system allows for reduced warbling artifacts and
more precise computation of frequency-domain auditory masking functions.
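The magnitude-phase polar quantization the abstract mentions can be sketched as follows. This is an illustrative toy quantizer (uniform magnitude steps, uniform phase bins), not the paper's actual design, and it omits the magnitude and phase prediction stages; all function names and parameters are hypothetical.

```python
import numpy as np

def polar_quantize(coeffs, mag_step=0.5, n_phase_levels=16):
    """Quantize complex transform coefficients in polar form:
    uniform scalar quantization of magnitude, uniform quantization
    of phase on [0, 2*pi). Illustrative sketch only."""
    mag = np.abs(coeffs)
    phase = np.angle(coeffs)
    phase_step = 2 * np.pi / n_phase_levels
    q_mag = np.round(mag / mag_step).astype(int)          # magnitude index
    q_phase = (np.round(phase / phase_step).astype(int)
               % n_phase_levels)                          # phase index
    return q_mag, q_phase

def polar_dequantize(q_mag, q_phase, mag_step=0.5, n_phase_levels=16):
    """Reconstruct complex coefficients from polar indices."""
    phase_step = 2 * np.pi / n_phase_levels
    return q_mag * mag_step * np.exp(1j * q_phase * phase_step)
```

Quantizing magnitude and phase separately (rather than real and imaginary parts) lets the coder allocate precision to the perceptually dominant magnitude component, which is one reason the polar form suits MCLT coefficients.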
Digital Signal Processing
Contains research objectives and reports on sixteen research projects.U.S. Navy - Office of Naval Research (Contract N00014-75-C-0852)National Science Foundation FellowshipNational Science Foundation (Grant ENG76-24117)U.S. Navy - Office of Naval Research (Contract N00014-77-C-0257)U.S. Air Force (Contract F19628-80-C-0002)U.S. Navy - Office of Naval Research (Contract N00014-75-C-0951)Schlumberger-Doll Research Center FellowshipHertz Foundation FellowshipGovernment of Pakistan ScholarshipU.S. Navy - Office of Naval Research (Contract N00014-77-C-0196
Expediting TTS Synthesis with Adversarial Vocoding
Recent approaches in text-to-speech (TTS) synthesis employ neural network
strategies to vocode perceptually-informed spectrogram representations directly
into listenable waveforms. Such vocoding procedures create a computational
bottleneck in modern TTS pipelines. We propose an alternative approach which
utilizes generative adversarial networks (GANs) to learn mappings from
perceptually-informed spectrograms to simple magnitude spectrograms which can
be heuristically vocoded. Through a user study, we show that our approach
significantly outperforms naïve vocoding strategies while being hundreds of
times faster than neural network vocoders used in state-of-the-art TTS systems.
We also show that our method can be used to achieve state-of-the-art results in
unsupervised synthesis of individual words of speech.
Comment: Published as a conference paper at INTERSPEECH 201
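A common heuristic for vocoding a simple magnitude spectrogram, of the kind the abstract refers to, is Griffin-Lim phase recovery. The sketch below is a minimal implementation assuming the magnitude came from an STFT with the same parameters; the parameter values are illustrative and not taken from the paper.

```python
import numpy as np
from scipy.signal import stft, istft

def griffin_lim(mag, n_iter=32, nperseg=512, noverlap=384):
    """Heuristically recover a waveform from a magnitude spectrogram
    by alternating ISTFT/STFT projections (Griffin-Lim). Assumes
    `mag` was produced by scipy's stft with matching parameters."""
    rng = np.random.default_rng(0)
    phase = np.exp(2j * np.pi * rng.random(mag.shape))  # random initial phase
    for _ in range(n_iter):
        _, x = istft(mag * phase, nperseg=nperseg, noverlap=noverlap)
        _, _, spec = stft(x, nperseg=nperseg, noverlap=noverlap)
        phase = np.exp(1j * np.angle(spec))  # keep phase, reimpose magnitude
    _, x = istft(mag * phase, nperseg=nperseg, noverlap=noverlap)
    return x
```

Each iteration inverts the current complex spectrogram, re-analyses the result, and keeps only the new phase while restoring the target magnitude; the procedure needs no trained model, which is what makes this style of vocoding fast relative to neural vocoders.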
Comparison of Signal Reconstruction Methods for the Azimuth Discrimination and Resynthesis Algorithm
The Azimuth Discrimination and Resynthesis (ADRess) algorithm has been shown to produce high-quality sound
source separation results for intensity-panned stereo recordings. There are, however, artifacts such as phasiness
which become apparent in the separated signals under certain conditions. This is largely because only the
magnitude spectra for the separated sources are estimated. Each source is then resynthesised using the phase
information obtained from the original mixture. This paper describes the nature and origin of the associated artifacts
and proposes alternative techniques for resynthesising the separated signals. A comparison of each technique is then
presented.
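The baseline resynthesis the abstract describes, combining each source's estimated magnitude spectrum with the phase of the original mixture, can be sketched as below. This is a generic illustration of that scheme, not the ADRess implementation; the function name and STFT parameters are hypothetical, and the source magnitude must share the mixture's STFT parameters.

```python
import numpy as np
from scipy.signal import stft, istft

def resynth_with_mixture_phase(source_mag, mixture, nperseg=1024, noverlap=768):
    """Resynthesise a separated source from its estimated magnitude
    spectrogram, borrowing the phase of the original mixture.
    `source_mag` must match the shape of the mixture's STFT."""
    _, _, mix_spec = stft(mixture, nperseg=nperseg, noverlap=noverlap)
    mix_phase = np.angle(mix_spec)
    # Pair the estimated magnitudes with the mixture phase, then invert.
    _, source = istft(source_mag * np.exp(1j * mix_phase),
                      nperseg=nperseg, noverlap=noverlap)
    return source
```

Because the mixture phase generally does not match the true source phase, overlapping analysis frames interfere during the inverse transform, which is the origin of the phasiness artifacts the paper examines.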