Search CORE

375 research outputs found

Scalable and perceptual audio compression

Author: Raad Mohammed
Publication venue: School of Electrical, Computer and Telecommunications Engineering
Publication date: 01/01/2003
Field of study

This thesis deals with scalable perceptual audio compression. Two scalable perceptual solutions as well as a scalable to lossless solution are proposed and investigated. One of the scalable perceptual solutions is built around sinusoidal modelling of the audio signal whilst the other is built on a transform coding paradigm. The scalable coders are shown to scale both in a waveform matching manner as well as a psychoacoustic manner. In order to measure the psychoacoustic scalability of the systems investigated in this thesis, the similarity between the original signal\u27s psychoacoustic parameters and that of the synthesized signal are compared. The psychoacoustic parameters used are loudness, sharpness, tonahty and roughness. This analysis technique is a novel method used in this thesis and it allows an insight into the perceptual distortion that has been introduced by any coder analyzed in this manner

Research Online

Proceedings of the Second International Mobile Satellite Conference (IMSC 1990)

Author: Huck R. W.
Rafferty William
Reekie D. Hugh M.
Publication venue
Publication date
Field of study

Presented here are the proceedings of the Second International Mobile Satellite Conference (IMSC), held June 17-20, 1990 in Ottawa, Canada. Topics covered include future mobile satellite communications concepts, aeronautical applications, modulation and coding, propagation and experimental systems, mobile terminal equipment, network architecture and control, regulatory and policy considerations, vehicle antennas, and speech compression

NASA Technical Reports Server

New techniques in signal coding

Author: Watkins Craig Robert
Publication venue
Publication date: 30/05/2018
Field of study

The Australian National University

Time and frequency domain algorithms for speech coding

Author: Francis S.C. Yeoh (7201616)
Publication venue
Publication date: 01/01/1983
Field of study

The promise of digital hardware economies (due to recent advances in VLSI technology), has focussed much attention on more complex and sophisticated speech coding algorithms which offer improved quality at relatively low bit rates. This thesis describes the results (obtained from computer simulations) of research into various efficient (time and frequency domain) speech encoders operating at a transmission bit rate of 16 Kbps. In the time domain, Adaptive Differential Pulse Code Modulation (ADPCM) systems employing both forward and backward adaptive prediction were examined. A number of algorithms were proposed and evaluated, including several variants of the Stochastic Approximation Predictor (SAP). A Backward Block Adaptive (BBA) predictor was also developed and found to outperform the conventional stochastic methods, even though its complexity in terms of signal processing requirements is lower. A simplified Adaptive Predictive Coder (APC) employing a single tap pitch predictor considered next provided a slight improvement in performance over ADPCM, but with rather greater complexity. The ultimate test of any speech coding system is the perceptual performance of the received speech. Recent research has indicated that this may be enhanced by suitable control of the noise spectrum according to the theory of auditory masking. Various noise shaping ADPCM configurations were examined, and it was demonstrated that a proposed pre-/post-filtering arrangement which exploits advantageously the predictor-quantizer interaction, leads to the best subjective performance in both forward and backward prediction systems. Adaptive quantization is instrumental to the performance of ADPCM systems. Both the forward adaptive quantizer (AQF) and the backward oneword memory adaptation (AQJ) were examined. In addition, a novel method of decreasing quantization noise in ADPCM-AQJ coders, which involves the application of correction to the decoded speech samples, provided reduced output noise across the spectrum, with considerable high frequency noise suppression. More powerful (and inevitably more complex) frequency domain speech coders such as the Adaptive Transform Coder (ATC) and the Sub-band Coder (SBC) offer good quality speech at 16 Kbps. To reduce complexity and coding delay, whilst retaining the advantage of sub-band coding, a novel transform based split-band coder (TSBC) was developed and found to compare closely in performance with the SBC. To prevent the heavy side information requirement associated with a large number of bands in split-band coding schemes from impairing coding accuracy, without forgoing the efficiency provided by adaptive bit allocation, a method employing AQJs to code the sub-band signals together with vector quantization of the bit allocation patterns was also proposed. Finally, 'pipeline' methods of bit allocation and step size estimation (using the Fast Fourier Transform (FFT) on the input signal) were examined. Such methods, although less accurate, are nevertheless useful in limiting coding delay associated with SRC schemes employing Quadrature Mirror Filters (QMF)

Loughborough University Institutional Repository

Engineering evaluations and studies. Volume 3: Exhibit C

Author: Cheng U.
Dodds J. G.
Holmes J. K.
Huth G. K.
Iwasaki R. S.
Maronde R. G.
Nilsen P. W.
Polydoros A.
Roberts D.
Udalov S.
Publication venue
Publication date
Field of study

High rate multiplexes asymmetry and jitter, data-dependent amplitude variations, and transition density are discussed

NASA Technical Reports Server

Orthogonal transform feasibility study

Author: Robinson G. S.
Publication venue
Publication date
Field of study

The application of various orthogonal transformations to communication was investigated, with particular emphasis placed on speech and visual signal processing. The fundamentals of the one- and two-dimensional orthogonal transforms and their application to speech and visual signals are treated in detail

NASA Technical Reports Server

Improved compactly computable objective measures for predicting the acceptiability of speech communications systems

Author: Barnwell Thomas Pinkney
Publication venue: Georgia Institute of Technology
Publication date: 01/01/1983
Field of study

Issued as Monthly status reports [1-7], and Final report, Project no. E-21-61

Scholarly Materials And Research @ Georgia Tech

Emotion Recognition from Speech with Acoustic, Non-Linear and Wavelet-based Features Extracted in Different Acoustic Conditions

Author: Vásquez Correa Juan Camilo
Publication venue: Medellín, Colombia
Publication date: 01/01/2016
Field of study

ABSTRACT: In the last years, there has a great progress in automatic speech recognition. The challenge now it is not only recognize the semantic content in the speech but also the called "paralinguistic" aspects of the speech, including the emotions, and the personality of the speaker. This research work aims in the development of a methodology for the automatic emotion recognition from speech signals in non-controlled noise conditions. For that purpose, different sets of acoustic, non-linear, and wavelet based features are used to characterize emotions in different databases created for such purpose

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Biblioteca Digital del Sistema de Bibliotecas de la Universidad de Antioquia