New methods for robust speech recognition
Ankara: Department of Electrical and Electronics Engineering and the Institute of Engineering and Science of Bilkent University, 1995. Thesis (Ph.D.), Bilkent University, 1995. Includes bibliographical references (leaves 86-92).
New methods of feature extraction, end-point detection and speech enhancement
are developed for a robust speech recognition system.
The methods of feature extraction and end-point detection are based on
wavelet analysis or subband analysis of the speech signal. Two new sets of speech
feature parameters, SUBLSF’s and SUBCEP’s, are introduced. Both parameter
sets are based on subband analysis. The SUBLSF feature parameters are obtained
via linear predictive analysis on subbands. These speech feature parameters
can produce better results than the full-band parameters when the noise is
colored. The SUBCEP parameters are based on wavelet analysis or equivalently
the multirate subband analysis of the speech signal. The SUBCEP parameters
also provide robust recognition performance by appropriately deemphasizing the
frequency bands corrupted by noise. It is experimentally observed that the
subband analysis based feature parameters are more robust than the commonly
used full-band analysis based parameters in the presence of car noise.
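The idea behind subband-based features can be illustrated with a minimal sketch: split a frame's spectrum into a few bands and take the log energy of each band, so that a band corrupted by colored noise affects only its own coefficients rather than the whole feature vector. This is a toy stand-in for the SUBLSF/SUBCEP parameters, not the thesis's actual algorithm; the equal-width DFT-bin bands and the function name are assumptions for illustration.

```python
import math

def subband_log_energies(frame, num_bands=4):
    """Toy subband feature: split the DFT magnitude spectrum of a frame
    into equal-width bands and return the log energy of each band.
    (Illustrative stand-in for subband-analysis features; the real
    SUBLSF/SUBCEP parameters use LP analysis / wavelet filterbanks.)"""
    n = len(frame)
    half = n // 2
    # Naive DFT magnitudes for bins 0..half-1 (fine for a short sketch).
    mags = []
    for k in range(half):
        re = sum(frame[t] * math.cos(2 * math.pi * k * t / n) for t in range(n))
        im = -sum(frame[t] * math.sin(2 * math.pi * k * t / n) for t in range(n))
        mags.append(math.hypot(re, im))
    band_width = half // num_bands
    energies = []
    for b in range(num_bands):
        band = mags[b * band_width:(b + 1) * band_width]
        # Small floor keeps the log finite for silent bands.
        energies.append(math.log(sum(m * m for m in band) + 1e-12))
    return energies

# A low-frequency sine: its energy should land in the first subband,
# leaving the other bands free to be de-emphasized if noise sits there.
frame = [math.sin(2 * math.pi * 2 * t / 64) for t in range(64)]
feats = subband_log_energies(frame, num_bands=4)
```

Because each coefficient depends only on its own band, a recognizer can down-weight the bands where car noise concentrates, which is the intuition behind the robustness result reported above.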
α-stable random processes can be used to model the impulsive nature of public-network telecommunication noise. Adaptive filtering algorithms are developed
for α-stable random processes. Adaptive noise cancellation techniques are used to
reduce the mismatch between the training and testing conditions of the recognition
system over telephone lines.
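Adaptive noise cancellation of this kind can be sketched with a sign-error LMS filter, a standard variant whose update uses only the sign of the error and is therefore robust to the heavy-tailed outliers that α-stable noise produces. This is a generic illustration of the technique, not the thesis's specific algorithm; the channel model (a single-tap 0.5 gain) and all parameter values are assumptions.

```python
import math
import random

def sign_lms_cancel(primary, reference, num_taps=4, mu=0.01):
    """Sign-error LMS adaptive noise canceller. The update uses only
    sign(error), which bounds each step and keeps adaptation stable
    when the noise is impulsive (heavy-tailed)."""
    w = [0.0] * num_taps          # adaptive filter taps
    buf = [0.0] * num_taps        # delay line of reference samples
    out = []
    for d, x in zip(primary, reference):
        buf = [x] + buf[:-1]                          # shift in new sample
        y = sum(wi * bi for wi, bi in zip(w, buf))    # estimated noise
        e = d - y                                     # enhanced signal
        s = (e > 0) - (e < 0)                         # sign of the error
        w = [wi + mu * s * bi for wi, bi in zip(w, buf)]
        out.append(e)
    return out, w

# Usage sketch: the reference channel picks up the noise; the primary
# channel carries speech plus a filtered copy of that noise (here a
# single tap of 0.5, an assumption for illustration).
random.seed(0)
ref = [random.uniform(-1.0, 1.0) for _ in range(5000)]
speech = [0.05 * math.sin(0.07 * t) for t in range(5000)]
prim = [s + 0.5 * x for s, x in zip(speech, ref)]
enhanced, taps = sign_lms_cancel(prim, ref)
```

After adaptation the leading tap approaches the true 0.5 gain and the residual is dominated by the speech component, which is the mismatch-reduction effect described above.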
Another important problem in isolated speech recognition is to determine
the boundaries of the speech utterances or words. Precise boundary detection
of utterances improves the performance of speech recognition systems. A new
distance measure based on the subband energy levels is introduced for endpoint
detection.
Erzin, Engin (Ph.D.)
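An endpoint detector of this kind can be sketched as follows: estimate a background template from a few leading noise-only frames, then flag a frame as speech when its subband energy vector deviates from the template by more than a threshold. This is a toy illustration of the idea, not the thesis's distance measure; the L1 distance, the frame/threshold values, and the function name are assumptions.

```python
def detect_endpoints(frames_energy, noise_frames=5, threshold=3.0):
    """Toy endpoint detector. frames_energy is a list of per-frame
    subband energy vectors. The first noise_frames frames are assumed
    to be background; a frame counts as speech when the summed
    per-band distance from the background template exceeds threshold."""
    nb = len(frames_energy[0])
    # Background template: mean energy per subband over the leading frames.
    template = [sum(f[b] for f in frames_energy[:noise_frames]) / noise_frames
                for b in range(nb)]
    flags = [sum(abs(f[b] - template[b]) for b in range(nb)) > threshold
             for f in frames_energy]
    speech = [i for i, is_speech in enumerate(flags) if is_speech]
    # Return (first, last) speech frame index, or None if no speech found.
    return (speech[0], speech[-1]) if speech else None

# Usage sketch: quiet background frames, then three louder frames.
bg = [1.0, 1.0, 1.0, 1.0]
frames = [bg] * 6 + [[6.0, 6.0, 1.0, 1.0]] * 3 + [bg] * 2
bounds = detect_endpoints(frames)
```

Because the distance is computed per subband, narrowband noise raises only the bands it occupies, which is why a subband energy measure can localize utterance boundaries more robustly than a single full-band energy.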
Sonification of guidance data during road crossing for people with visual impairments or blindness
In the last years several solutions were proposed to support people with
visual impairments or blindness during road crossing. These solutions focus on
computer vision techniques for recognizing pedestrian crosswalks and computing
their relative position from the user. Instead, this contribution addresses a
different problem: the design of an auditory interface that can effectively
guide the user during road crossing. Two original auditory guiding modes based
on data sonification are presented and compared with a guiding mode based on
speech messages.
Experimental evaluation shows that there is no guiding mode that is best
suited for all test subjects. The average time to align and cross is not
significantly different among the three guiding modes, and test subjects
distribute their preferences for the best guiding mode almost uniformly among
the three solutions. From the experiments it also emerges that higher effort is
necessary for decoding the sonified instructions if compared to the speech
instructions, and that test subjects require frequent `hints' (in the form of
speech messages). Despite this, more than 2/3 of test subjects prefer one of
the two guiding modes based on sonification. There are two main reasons for
this: firstly, with speech messages it is harder to hear the sound of the
environment, and secondly sonified messages convey information about the
"quantity" of the expected movement.
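The core of such a sonification is a continuous mapping from guidance data to audio parameters, so that the sound itself conveys the quantity of the required correction. The sketch below is a hypothetical mapping (not the paper's actual design): misalignment angle drives pitch and remaining distance drives a beep repetition rate; all parameter names and values are assumptions.

```python
def sonify_guidance(angle_deg, distance_m,
                    f_center=440.0, f_span=220.0, max_rate=8.0):
    """Hypothetical guidance-data sonification: map the misalignment
    angle to pitch (on-axis = centre frequency, left/right = lower/
    higher) and the remaining distance to a beep repetition rate, so
    the user hears *how much* correction is needed, not just a cue."""
    # Clamp the angle to +/-90 degrees and map it linearly to pitch.
    a = max(-90.0, min(90.0, angle_deg))
    freq = f_center + f_span * (a / 90.0)
    # Closer targets beep faster; the rate saturates at max_rate.
    rate = min(max_rate, max_rate / max(distance_m, 1.0))
    return freq, rate
```

Unlike a discrete speech message ("turn slightly left"), this mapping leaves the verbal channel free and degrades gracefully, which matches the trade-off the evaluation above reports: more decoding effort, but less masking of environmental sound.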
Noise exposure and auditory thresholds of German airline pilots: a cross-sectional study
Objective: The cockpit workplace of airline pilots is a noisy environment. This study examines the hearing thresholds of pilots with respect to ambient noise and communication sound.
Methods: The hearing of 487 German pilots was analysed by audiometry in the frequency range of 125 Hz to 16 kHz in varying age groups. Cockpit noise (free-field) data and communication sound (acoustic manikin) measurements were evaluated.
Results: The ambient noise levels in cockpits were found to be between 74 and 80 dB(A), and the sound pressure levels under the headset were found to be between 84 and 88 dB(A). The left-right threshold differences at 3, 4 and 6 kHz show evidence of impaired hearing at the left ear, which worsens with age. In the age groups <40/≥40 years the mean differences at 3 kHz are 2/3 dB, at 4 kHz 2/4 dB and at 6 kHz 1/6 dB. In the pilot group which used mostly the left ear for communication tasks (43 of 45 are in the older age group) the mean difference at 3 kHz is 6 dB, at 4 kHz 7 dB and at 6 kHz 10 dB. The pilots who used the headset only at the right ear also show worse hearing at the left ear of 2 dB at 3 kHz, 3 dB at 4 kHz and at 6 kHz. The frequency-corrected exposure levels under the headset are 7-11 dB(A) higher than the ambient noise, with an averaged signal-to-noise ratio for communication of about 10 dB(A).
Conclusions: The left ear seems to be more susceptible to hearing loss than the right ear. Active noise reduction systems allow for a reduced sound level for the communication signal below the upper exposure action value of 85 dB(A) and allow for a more relaxed working environment for pilots.
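The reported figures (headset level 84-88 dB(A), ambient noise 74-80 dB(A), communication SNR about 10 dB(A)) are consistent with energetic (power) addition of levels, which can be checked with a short sketch. The specific example values below are assumptions chosen from within the reported ranges, not measurements from the study.

```python
import math

def db_sum(levels_db):
    """Energetic (power) sum of sound pressure levels given in dB."""
    return 10.0 * math.log10(sum(10.0 ** (l / 10.0) for l in levels_db))

def signal_level(total_db, noise_db):
    """Level of the communication signal alone, given the total level
    under the headset and the ambient noise level, assuming the two
    add incoherently (power subtraction)."""
    return 10.0 * math.log10(10.0 ** (total_db / 10.0) - 10.0 ** (noise_db / 10.0))

# Example with figures from within the ranges reported in the study:
sig = signal_level(86.0, 76.0)   # total 86 dB(A) under headset, 76 dB(A) ambient
snr = sig - 76.0                 # communication signal-to-noise ratio in dB
```

With these example values the signal alone is about 85.5 dB(A) and the SNR about 9.5 dB, in line with the roughly 10 dB(A) average the study reports.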