249 research outputs found
Bio-inspired broad-class phonetic labelling
Recent studies have shown that the correct labeling of phonetic classes may help current Automatic Speech Recognition (ASR) when combined with classical parsing automata based on Hidden Markov Models (HMM).Through the present paper a method for Phonetic Class Labeling (PCL) based on bio-inspired speech processing is described. The methodology is based in the automatic detection of formants and formant trajectories after a careful separation of the vocal and glottal components of speech and in the operation of CF (Characteristic Frequency) neurons in the cochlear nucleus and cortical complex of the human auditory apparatus. Examples of phonetic class labeling are given and the applicability of the method to Speech Processing is discussed
Model-based Speech Enhancement for Intelligibility Improvement in Binaural Hearing Aids
Speech intelligibility is often severely degraded among hearing impaired
individuals in situations such as the cocktail party scenario. The performance
of the current hearing aid technology has been observed to be limited in these
scenarios. In this paper, we propose a binaural speech enhancement framework
that takes into consideration the speech production model. The enhancement
framework proposed here is based on the Kalman filter that allows us to take
the speech production dynamics into account during the enhancement process. The
usage of a Kalman filter requires the estimation of clean speech and noise
short term predictor (STP) parameters, and the clean speech pitch parameters.
In this work, a binaural codebook-based method is proposed for estimating the
STP parameters, and a directional pitch estimator based on the harmonic model
and maximum likelihood principle is used to estimate the pitch parameters. The
proposed method for estimating the STP and pitch parameters jointly uses the
information from left and right ears, leading to a more robust estimation of
the filter parameters. Objective measures such as PESQ and STOI have been used
to evaluate the enhancement framework in different acoustic scenarios
representative of the cocktail party scenario. We have also conducted
subjective listening tests on a set of nine normal hearing subjects, to
evaluate the performance in terms of intelligibility and quality improvement.
The listening tests show that the proposed algorithm, even with access to only
a single channel noisy observation, significantly improves the overall speech
quality, and the speech intelligibility by up to 15%.Comment: after revisio
A channel-selection criterion for suppressing reverberation in cochlear implants
This is the published version, also available here: http://dx.doi.org/10.1121/1.3559683.Little is known about the extent to which reverberation affects speech intelligibility by cochlear implant (CI) listeners. Experiment 1 assessed CI usersâ performance using Institute of Electrical and Electronics Engineers (IEEE) sentences corrupted with varying degrees of reverberation. Reverberation times of 0.30, 0.60, 0.80, and 1.0 s were used. Results indicated that for all subjects tested, speech intelligibility decreased exponentially with an increase in reverberation time. A decaying-exponential model provided an excellent fit to the data. Experiment 2 evaluated (offline) a speech coding strategy for reverberation suppression using a channel-selection criterion based on the signal-to-reverberant ratio (SRR) of individual frequency channels. The SRR reflects implicitly the ratio of the energies of the signal originating from the early (and direct) reflections and the signal originating from the late reflections. Channels with SRR larger than a preset threshold were selected, while channels with SRR smaller than the threshold were zeroed out. Results in a highly reverberant scenario indicated that the proposed strategy led to substantial gains (over 60 percentage points) in speech intelligibility over the subjectsâ daily strategy. Further analysis indicated that the proposed channel-selection criterion reduces the temporal envelope smearing effects introduced by reverberation and also diminishes the self-masking effects responsible for flattened formants
- âŚ