5,269 research outputs found

    Audio Analysis/synthesis System

    Get PDF
    A method and apparatus for the automatic analysis, synthesis and modification of audio signals, based on an overlap-add sinusoidal model, is disclosed. Automatic analysis of amplitude, frequency and phase parameters of the model is achieved using an analysis-by-synthesis procedure which incorporates successive approximation, yielding synthetic waveforms which are very good approximations to the original waveforms and are perceptually identical to the original sounds. A generalized overlap-add sinusoidal model is introduced which can modify audio signals without objectionable artifacts. In addition, a new approach to pitch-scale modification allows for the use of arbitrary spectral envelope estimates and addresses the problems of high-frequency loss and noise amplification encountered with prior art methods. The overlap-add synthesis method provides the ability to synthesize sounds with computational efficiency rivaling that of synthesis using the discrete short-time Fourier transform (DSTFT) while eliminating the modification artifacts associated with that method.Georgia Tech Research Corporatio

    A variable rate speech compressor for mobile applications

    Get PDF
    One of the most promising speech coder at the bit rate of 9.6 to 4.8 kbits/s is CELP. Code Excited Linear Prediction (CELP) has been dominating 9.6 to 4.8 kbits/s region during the past 3 to 4 years. Its set back however, is its expensive implementation. As an alternative to CELP, the Base-Band CELP (CELP-BB) was developed which produced good quality speech comparable to CELP and a single chip implementable complexity as reported previously. Its robustness was also improved to tolerate errors up to 1.0 pct. and maintain intelligibility up to 5.0 pct. and more. Although, CELP-BB produces good quality speech at around 4.8 kbits/s, it has a fundamental problem when updating the pitch filter memory. A sub-optimal solution is proposed for this problem. Below 4.8 kbits/s, however, CELP-BB suffers from noticeable quantization noise as a result of the large vector dimensions used. Efficient representation of speech below 4.8 kbits/s is reported by introducing Sinusoidal Transform Coding (STC) to represent the LPC excitation which is called Sine Wave Excited LPC (SWELP). In this case, natural sounding good quality synthetic speech is obtained at around 2.4 kbits/s

    Analysis, Visualization, and Transformation of Audio Signals Using Dictionary-based Methods

    Get PDF
    date-added: 2014-01-07 09:15:58 +0000 date-modified: 2014-01-07 09:15:58 +0000date-added: 2014-01-07 09:15:58 +0000 date-modified: 2014-01-07 09:15:58 +000

    Scalable and perceptual audio compression

    Get PDF
    This thesis deals with scalable perceptual audio compression. Two scalable perceptual solutions as well as a scalable to lossless solution are proposed and investigated. One of the scalable perceptual solutions is built around sinusoidal modelling of the audio signal whilst the other is built on a transform coding paradigm. The scalable coders are shown to scale both in a waveform matching manner as well as a psychoacoustic manner. In order to measure the psychoacoustic scalability of the systems investigated in this thesis, the similarity between the original signal\u27s psychoacoustic parameters and that of the synthesized signal are compared. The psychoacoustic parameters used are loudness, sharpness, tonahty and roughness. This analysis technique is a novel method used in this thesis and it allows an insight into the perceptual distortion that has been introduced by any coder analyzed in this manner
    • …
    corecore