13,354 research outputs found
Band-pass filtering of the time sequences of spectral parameters for robust wireless speech recognition
In this paper we address the problem of automatic speech recognition when wireless speech communication systems are involved. In this context, three main sources of distortion should be considered: acoustic environment, speech coding and transmission errors. Whilst the first one has already received a lot of attention, the last two deserve further investigation in our opinion. We have found that band-pass filtering of the recognition features improves ASR performance when distortions due to these particular communication systems are present. Furthermore, we have evaluated two alternative configurations at different bit error rates (BER) typical of these channels: band-pass filtering the LP-MFCC parameters, and a modification of RASTA-PLP using a sharper low-pass section. These configurations perform consistently better than LP-MFCC and RASTA-PLP, respectively.
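As a rough illustration of what band-pass filtering the time sequences of spectral parameters can look like, the sketch below filters each feature coefficient along the frame axis with the classic RASTA-style transfer function. This is an assumption made for illustration: the abstract does not give the filter actually used, and the function name `bandpass_trajectories` and the default pole value are hypothetical choices.

```python
import numpy as np

def bandpass_trajectories(features, pole=0.94):
    """Band-pass filter each feature coefficient along time (RASTA-style).

    features: array of shape (num_frames, num_coeffs).
    Assumed transfer function (not taken from the paper):
        H(z) = 0.1 * (2 + z^-1 - z^-3 - 2 z^-4) / (1 - pole * z^-1)
    """
    features = np.asarray(features, dtype=float)
    numer = 0.1 * np.array([2.0, 1.0, 0.0, -1.0, -2.0])  # FIR (high-pass) part
    out = np.zeros_like(features)
    for t in range(features.shape[0]):
        acc = np.zeros(features.shape[1])
        # FIR part: weighted sum of the current and four previous frames.
        for k, b in enumerate(numer):
            if t - k >= 0:
                acc += b * features[t - k]
        # Recursive (low-pass) part: leaky integration of the previous output.
        if t > 0:
            acc += pole * out[t - 1]
        out[t] = acc
    return out
```

Because the numerator coefficients sum to zero, a constant offset in a feature trajectory (for example, a fixed channel coloration in the cepstral domain) decays towards zero in the output, while faster, speech-rate modulations pass through.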
A Subband-Based SVM Front-End for Robust ASR
This work proposes a novel support vector machine (SVM) based robust
automatic speech recognition (ASR) front-end that operates on an ensemble of
the subband components of high-dimensional acoustic waveforms. The key issues
of selecting the appropriate SVM kernels for classification in frequency
subbands and the combination of individual subband classifiers using ensemble
methods are addressed. The proposed front-end is compared with state-of-the-art
ASR front-ends in terms of robustness to additive noise and linear filtering.
Experiments performed on the TIMIT phoneme classification task demonstrate the
benefits of the proposed subband based SVM front-end: it outperforms the
standard cepstral front-end in the presence of noise and linear filtering for
signal-to-noise ratio (SNR) below 12 dB. A combination of the proposed
front-end with a conventional front-end such as MFCC yields further
improvements over the individual front-ends across the full range of noise
levels.
A Framework for Bioacoustic Vocalization Analysis Using Hidden Markov Models
Using Hidden Markov Models (HMMs) as a recognition framework for automatic classification of animal vocalizations has a number of benefits, including the ability to handle duration variability through nonlinear time alignment, the ability to incorporate complex language or recognition constraints, and easy extendibility to continuous recognition and detection domains. In this work, we apply HMMs to several different species and bioacoustic tasks using generalized spectral features that can be easily adjusted across species and HMM network topologies suited to each task. This experimental work includes a simple call type classification task using one HMM per vocalization for repertoire analysis of Asian elephants, a language-constrained song recognition task using syllable models as base units for ortolan bunting vocalizations, and a stress stimulus differentiation task in poultry vocalizations using a non-sequential model via a one-state HMM with Gaussian mixtures. Results show strong performance across all tasks and illustrate the flexibility of the HMM framework for a variety of species, vocalization types, and analysis tasks.
Improving the robustness of the usual FBE-based ASR front-end
All speech recognition systems require some form of signal representation that parametrically models the
temporal evolution of the spectral envelope. Current parameterizations involve, either explicitly or implicitly, a
set of energies from frequency bands which are often distributed in a mel scale. The computation of those filterbank
energies (FBE) always includes smoothing of basic spectral measurements and non-linear amplitude
compression. A variety of linear transformations are typically applied to this time-frequency representation prior
to the Hidden Markov Model (HMM) pattern-matching stage of recognition. In the paper, we will discuss some
robustness issues involved in both the computation of the FBEs and the posterior linear transformations,
presenting alternative techniques that can improve robustness in additive noise conditions. In particular, the root
non-linearity, a voicing-dependent FBE computation technique and a time&frequency filtering (tiffing)
technique will be considered. Recognition results for the Aurora database will be shown to illustrate the potential
application of these alternative techniques for enhancing the robustness of speech recognition systems.
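To make the contrast between logarithmic and root amplitude compression of filter-bank energies concrete, here is a minimal sketch. The function names, the flooring constant and the exponent `gamma=0.1` are assumptions chosen for illustration; the abstract does not state the values used in the paper.

```python
import numpy as np

def log_compress(fbe, floor=1e-10):
    """Conventional log compression; energies are floored to avoid log(0)."""
    return np.log(np.maximum(fbe, floor))

def root_compress(fbe, gamma=0.1):
    """Root compression: raise each non-negative energy to a small power."""
    return np.power(np.maximum(fbe, 0.0), gamma)

# Hypothetical filter-bank energies for one frame, including a near-silent band.
frame = np.array([1e-8, 1e-2, 1.0, 1e2])
log_features = log_compress(frame)
root_features = root_compress(frame)
```

The logarithm maps very small energies to large negative values, so low-energy bands dominated by noise can swing widely; the root function is bounded below by zero, which is one commonly cited reason for considering it in additive noise.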
IIR Adaptive Filters for Detection of Gravitational Waves from Coalescing Binaries
In this paper we propose a new strategy for gravitational-wave detection
from coalescing binaries, using IIR Adaptive Line Enhancer (ALE) filters. This
strategy is a classical hierarchical strategy in which the ALE filters have the
role of triggers, used to select data chunks which may contain gravitational
events, to be further analyzed with more refined optimal techniques, like the
classical Matched Filter Technique. After a direct comparison of the
performance of ALE filters with the Wiener-Kolmogorov optimum filters
(matched filters), necessary to discuss their performance and to evaluate the
statistical limitation in their use as triggers, we performed a series of
tests, demonstrating that these filters are quite promising both for the
relatively small computational power needed and for the robustness of the
algorithms used. The performed tests have shown a weak point of ALE filters,
that we fixed by introducing a further strategy, based on a dynamic bank of ALE
filters, running simultaneously, but started after fixed delay times. The
results of this global trigger strategy seem to be very promising, and can be
already used in the present interferometers, since it has the great advantage
of requiring a quite small computational power and can easily run in real-time,
in parallel with other data analysis algorithms.
Comment: Accepted at SPIE: "Astronomical Telescopes and Instrumentation". 9
pages, 3 figures