Search CORE

22 research outputs found

Phonetic Classification Using Hierarchical, Feed-forward, Spectro-temporal Patch-based Architectures

Author: Bouvrie Jake
Chikkerur Sharat
Ezzat Tony
Kouh Minjoon
Poggio Tomaso
Rifkin Ryan
Schutte Ken
Publication date: 01/01/2007

A preliminary set of experiments is described in which a biologically-inspired computer vision system (Serre, Wolf et al. 2005; Serre 2006; Serre, Oliva et al. 2006; Serre, Wolf et al. 2006) designed for visual object recognition was applied to the task of phonetic classification. During learning, the system processed 2-D wideband magnitude spectrograms directly as images, producing a set of 2-D spectro-temporal patch dictionaries at different spectro-temporal positions, orientations, scales, and of varying complexity. During testing, features were computed by comparing the stored patches with patches from novel spectrograms. Classification was performed using a regularized least squares classifier (Rifkin, Yeo et al. 2003; Rifkin, Schutte et al. 2007) trained on the features computed by the system. On a 20-class TIMIT vowel classification task, the model features achieved a best result of 58.74% error, compared to 48.57% error using state-of-the-art MFCC-based features trained using the same classifier. This suggests that hierarchical, feed-forward, spectro-temporal patch-based architectures may be useful for phonetic analysis.
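The patch-comparison step in the abstract above can be illustrated with a minimal sketch. The abstract does not say how stored patches are compared with new spectrogram patches, so the Gaussian similarity, the max over window positions, and the names `patch_response`, `patch_features`, and `sigma` are all assumptions made for illustration, not the authors' implementation.

```python
import math


def patch_response(spectrogram, patch, sigma=1.0):
    """Best Gaussian similarity between `patch` and any same-sized window.

    Assumption: similarity is exp(-squared distance / (2 * sigma^2)) and the
    response is the maximum over all window positions in the spectrogram.
    """
    rows, cols = len(spectrogram), len(spectrogram[0])
    p_rows, p_cols = len(patch), len(patch[0])
    best = 0.0
    for r in range(rows - p_rows + 1):
        for c in range(cols - p_cols + 1):
            # Squared Euclidean distance between the patch and this window.
            dist2 = sum(
                (spectrogram[r + i][c + j] - patch[i][j]) ** 2
                for i in range(p_rows)
                for j in range(p_cols)
            )
            best = max(best, math.exp(-dist2 / (2.0 * sigma ** 2)))
    return best


def patch_features(spectrogram, dictionary, sigma=1.0):
    """One feature per stored patch: its best match in the spectrogram."""
    return [patch_response(spectrogram, p, sigma) for p in dictionary]
```

A feature vector produced this way has one entry per dictionary patch and could then be passed to any classifier, such as the regularized least squares classifier the abstract names.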

CiteSeerX

DSpace@MIT

Autoregressive Modelling of Hilbert Envelopes for Wide-band Audio Coding

Author: Ganapathy Sriram
Garudadri Harinath
Hermansky Hynek
Motlicek Petr
Publication date: 11/02/2010

Frequency Domain Linear Prediction (FDLP) is a technique for approximating the temporal envelopes of a signal using autoregressive models. In this paper, we propose a wide-band audio coding system exploiting FDLP. Specifically, FDLP is applied on critically sampled sub-bands to model the Hilbert envelopes. The residual of the linear prediction forms the Hilbert carrier, which is transmitted along with the envelope parameters. This process is reversed at the decoder to reconstruct the signal. In the objective and subjective quality evaluations, the FDLP-based audio codec at 66 kbps provides competitive results compared to state-of-the-art codecs at similar bit-rates.
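The core idea of FDLP, fitting an autoregressive model to a frequency-domain representation so that the model's "spectrum" traces the temporal envelope, can be sketched as follows. This is an illustrative single-band version under stated assumptions: it uses an orthonormal DCT-II, the autocorrelation method with Levinson-Durbin recursion, and no sub-band decomposition, quantization, or carrier coding. The function names and the default model order are hypothetical and not taken from the paper.

```python
import cmath
import math


def dct_ii(x):
    """Orthonormal DCT-II of a real sequence (direct O(n^2) evaluation)."""
    n = len(x)
    coeffs = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        total = sum(x[t] * math.cos(math.pi * (t + 0.5) * k / n) for t in range(n))
        coeffs.append(scale * total)
    return coeffs


def levinson_durbin(r, order):
    """Solve the autocorrelation normal equations.

    Returns the error filter [1, a1, ..., a_order] and the prediction error.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a, err


def fdlp_envelope(x, order=20):
    """Approximate the squared temporal envelope of x with an AR model.

    Linear prediction is run on the DCT coefficients, so the resulting
    all-pole 'spectrum' is a function of time rather than frequency.
    Assumes x is not identically zero.
    """
    c = dct_ii(x)
    n = len(c)
    r = [sum(c[k] * c[k + lag] for k in range(n - lag)) for lag in range(order + 1)]
    a, err = levinson_durbin(r, order)
    envelope = []
    for t in range(n):
        omega = math.pi * (t + 0.5) / n  # time index mapped onto [0, pi]
        response = sum(a[j] * cmath.exp(-1j * omega * j) for j in range(order + 1))
        envelope.append(err / abs(response) ** 2)
    return envelope
```

In a codec along the lines the abstract describes, the AR coefficients would serve as the envelope parameters and the prediction residual would carry the remaining fine structure; neither the quantization nor the bit allocation is modelled here.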

Infoscience - École polytechnique fédérale de Lausanne

Recognition of Noisy Speech: A Comparative Survey of Robust Model Architecture and Feature Enhancement

Publication venue: Springer
Publication date: 24/05/2009

Springer - Publisher Connector

Multi-stream adaptive evidence combination for noise robust ASR

Author: Bourlard Hervé
Glotin Hervé
Hagen Astrid
Morris Andrew
Publication venue: IDIAP
Publication date: 10/03/2006

In this paper we develop different mathematical models in the framework of the multi-stream paradigm for noise robust ASR, and discuss their close relationship with human speech perception. Largely inspired by Fletcher's "product-of-errors" rule in psychoacoustics, multi-band ASR aims for robustness to data mismatch through the exploitation of spectral redundancy, while making minimum assumptions about noise type. Previous ASR tests have shown that independent sub-band processing can lead to decreased recognition performance with clean speech. We have overcome this problem by considering every combination of data sub-bands as an independent data stream. After introducing the background to multi-band ASR, we show how this "full combination" approach can be formalised, in the context of HMM/ANN based ASR, by introducing a latent variable to specify which data sub-bands in each data frame are free from data mismatch. This enables us to decompose the posterior probability for each phoneme into a reliability weighted integral over all possible positions of clean data. This approach offers great potential for adaptation to rapidly changing and unpredictable noise
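The "full combination" decomposition described above can be sketched as a discrete weighted sum: each subset of sub-bands is treated as a candidate set of clean bands, and the phoneme posterior is the reliability-weighted sum of the posteriors obtained from each subset. In this sketch the subset weights are derived from per-band reliabilities under an independence assumption, which is a simplification introduced here and not necessarily the paper's weighting scheme; the function names and the dictionary layout are likewise hypothetical.

```python
from itertools import combinations


def subset_weights(band_reliability):
    """Weight of each sub-band subset being exactly the set of clean bands.

    Assumption: bands are clean independently, with probability
    band_reliability[b] for band b. Weights over all 2^n subsets sum to 1.
    """
    n = len(band_reliability)
    weights = {}
    for size in range(n + 1):
        for subset in combinations(range(n), size):
            w = 1.0
            for b in range(n):
                p = band_reliability[b]
                w *= p if b in subset else (1.0 - p)
            weights[subset] = w
    return weights


def full_combination_posterior(subset_posteriors, band_reliability):
    """Combine per-subset phoneme posteriors into one posterior vector.

    subset_posteriors maps each tuple of band indices (including the empty
    tuple, which stands for the prior) to a list of class posteriors.
    """
    weights = subset_weights(band_reliability)
    n_classes = len(next(iter(subset_posteriors.values())))
    combined = [0.0] * n_classes
    for subset, w in weights.items():
        for k, p in enumerate(subset_posteriors[subset]):
            combined[k] += w * p
    return combined
```

With two bands, a reliability vector such as `[1.0, 0.0]` places all weight on the subset containing only the first band, so the combined posterior reduces to that subset's posterior.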

Infoscience - École polytechnique fédérale de Lausanne

A combined evaluation of established and new approaches for speech recognition in varied reverberation conditions

Author: Illina Irina
Sivasankaran Sunit
Vincent Emmanuel
Publication venue: Elsevier BV
Publication date: 08/02/2017

Robustness to reverberation is a key concern for distant-microphone ASR. Various approaches have been proposed, including single-channel or multichannel dereverberation, robust feature extraction, alternative acoustic models, and acoustic model adaptation. However, to the best of our knowledge, a detailed study of these techniques in varied reverberation conditions is still missing in the literature. In this paper, we conduct a series of experiments to assess the impact of various dereverberation and acoustic model adaptation approaches on ASR performance in the range of reverberation conditions found in real domestic environments. We consider both established approaches such as weighted prediction error (WPE) dereverberation and newer approaches such as learning hidden unit contribution (LHUC) adaptation, whose performance has not been reported before in this context, and we employ them in combination. Our results indicate that performing WPE dereverberation on a reverberated test speech utterance and decoding with a deep neural network (DNN) acoustic model trained on multi-condition reverberated speech with feature-space maximum likelihood linear regression (fMLLR) transformed features outperforms more recent approaches and helps significantly reduce the word error rate (WER).
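For readers unfamiliar with the metric, word error rate is conventionally computed as the word-level edit distance between the reference and the recognizer output, divided by the number of reference words. The sketch below shows that standard definition; the function name and the whitespace tokenization are assumptions for illustration and say nothing about the scoring tools used in the study.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the processed reference
    # prefix and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i in range(1, len(ref) + 1):
        curr = [i] + [0] * len(hyp)
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            curr[j] = min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.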

Crossref

INRIA a CCSD electronic archive server

HAL-Rennes 1

Multi-stream adaptive evidence combination for noise robust ASR

Author: Andrew Morris
Astrid Hagen
Hervé Bourlard
Hervé Glotin
Publication venue: Elsevier BV

Crossref

Solving Demodulation as an Optimization Problem

Author: Gregory Sell
Malcolm Slaney
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)

Crossref

Multi-task hidden Markov modeling of spectrogram feature from radar high-resolution range profiles

Author: AR Webb
B Chen
B Pei
BED Kingsbury
F Zhu
J Chai
J Paisley
J Winn
J Zwart
JL Walker
K Copsey
K Ni
L Du
LR Rabiner
M-D Xing
MI Jordan
MJ Beal
R Caruana
R Vander Heiden
RA Mitchell
SZ Gürbüz
TT Wong
WG Carrara
X-J Liao
Y Teh
Publication venue: Springer Science and Business Media LLC

Crossref

Recommended from our members

New techniques for vibration condition monitoring: Volterra kernel and Kolmogorov-Smirnov

Author: Andrade Francisco Arruda Raposo
Publication venue: Brunel University School of Engineering and Design PhD Theses
Publication date: 01/01/1999

This thesis was submitted for the degree of Doctor of Philosophy and was awarded by Brunel University. This research presents a complete review of the signal processing techniques used today in vibration-based industrial condition monitoring and diagnostics. It also introduces two novel techniques to this field, namely the Kolmogorov-Smirnov test and the Volterra series, which have not yet been applied to vibration-based condition monitoring. The first technique, the Kolmogorov-Smirnov test, relies on a statistical comparison of the cumulative probability distribution functions (CDFs) of two time series. It must be emphasised that this is not a moment technique: it uses the whole CDF in the comparison process. The second tool suggested in this research is the Volterra series, a non-linear signal processing technique that can be used to model a time series. The parameters of this model are used for condition monitoring applications. Finally, this work also presents a comprehensive comparative study between these new methods and the existing techniques. This study is based on results from numerical and experimental applications of each technique discussed here. The concluding remarks include suggestions on how the novel techniques proposed here can be improved. Brunel University Department of Mechanical Engineering and CAPES, Fundacao Coordenacao de Aperfeicoamento de Pessoal de Nivel Superior.
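The CDF comparison at the heart of the two-sample Kolmogorov-Smirnov test can be sketched briefly: the statistic is the largest absolute gap between the two empirical CDFs. The code below computes only that statistic, leaves out significance levels, and is a generic illustration and not the thesis's procedure; the function name is hypothetical.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Largest absolute difference between two empirical CDFs."""
    if not sample_a or not sample_b:
        raise ValueError("both samples must be non-empty")
    a_sorted, b_sorted = sorted(sample_a), sorted(sample_b)
    # The empirical CDFs only change at observed values, so it is enough
    # to evaluate the gap at every distinct value from either sample.
    points = sorted(set(a_sorted) | set(b_sorted))
    largest_gap = 0.0
    for v in points:
        cdf_a = bisect_right(a_sorted, v) / len(a_sorted)
        cdf_b = bisect_right(b_sorted, v) / len(b_sorted)
        largest_gap = max(largest_gap, abs(cdf_a - cdf_b))
    return largest_gap
```

In a condition-monitoring setting, one sample might be a vibration record from a machine in known good condition and the other a newly acquired record; a statistic near 0 indicates similar amplitude distributions, while a value near 1 indicates almost no overlap.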

Brunel University Research Archive