New methods for robust speech recognition
Ankara: Department of Electrical and Electronics Engineering and the Institute of Engineering and Science of Bilkent University, 1995. Thesis (Ph.D.), Bilkent University, 1995. Includes bibliographical references (leaves 86-92).
New methods of feature extraction, end-point detection and speech enhancement
are developed for a robust speech recognition system.
The methods of feature extraction and end-point detection are based on
wavelet analysis or subband analysis of the speech signal. Two new sets of speech
feature parameters, SUBLSF’s and SUBCEP’s, are introduced. Both parameter
sets are based on subband analysis. The SUBLSF feature parameters are obtained
via linear predictive analysis on subbands. These speech feature parameters
can produce better results than the full-band parameters when the noise is
colored. The SUBCEP parameters are based on wavelet analysis or equivalently
the multirate subband analysis of the speech signal. The SUBCEP parameters
also provide robust recognition performance by appropriately deemphasizing the
frequency bands corrupted by noise. It is experimentally observed that the
subband analysis based feature parameters are more robust than the commonly
used full-band analysis based parameters in the presence of car noise.
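The idea behind subband-based features can be illustrated with a minimal sketch: split a frame's spectrum into a few bands and take the log energy of each band, so that a band corrupted by colored noise affects only its own coefficients rather than the whole feature vector. This is a toy stand-in for the SUBLSF/SUBCEP parameters, not the thesis's actual algorithm; the equal-width DFT-bin bands and the function name are assumptions for illustration.

```python
import math

def subband_log_energies(frame, num_bands=4):
    """Toy subband feature: split the DFT magnitude spectrum of a frame
    into equal-width bands and return the log energy of each band.
    (Illustrative stand-in for subband-analysis features; the real
    SUBLSF/SUBCEP parameters use LP analysis / wavelet filterbanks.)"""
    n = len(frame)
    half = n // 2
    # Naive DFT magnitudes for bins 0..half-1 (fine for a short sketch).
    mags = []
    for k in range(half):
        re = sum(frame[t] * math.cos(2 * math.pi * k * t / n) for t in range(n))
        im = -sum(frame[t] * math.sin(2 * math.pi * k * t / n) for t in range(n))
        mags.append(math.hypot(re, im))
    band_width = half // num_bands
    energies = []
    for b in range(num_bands):
        band = mags[b * band_width:(b + 1) * band_width]
        # Small floor keeps the log finite for silent bands.
        energies.append(math.log(sum(m * m for m in band) + 1e-12))
    return energies

# A low-frequency sine: its energy should land in the first subband,
# leaving the other bands free to be de-emphasized if noise sits there.
frame = [math.sin(2 * math.pi * 2 * t / 64) for t in range(64)]
feats = subband_log_energies(frame, num_bands=4)
```

Because each coefficient depends only on its own band, a recognizer can down-weight the bands where car noise concentrates, which is the intuition behind the robustness result reported above.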
α-stable random processes can be used to model the impulsive nature of public-network telecommunication noise. Adaptive filtering algorithms are developed
for α-stable random processes. Adaptive noise cancellation techniques are used to
reduce the mismatch between the training and testing conditions of the recognition
system over telephone lines.
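Adaptive noise cancellation of this kind can be sketched with a sign-error LMS filter, a standard variant whose update uses only the sign of the error and is therefore robust to the heavy-tailed outliers that α-stable noise produces. This is a generic illustration of the technique, not the thesis's specific algorithm; the channel model (a single-tap 0.5 gain) and all parameter values are assumptions.

```python
import math
import random

def sign_lms_cancel(primary, reference, num_taps=4, mu=0.01):
    """Sign-error LMS adaptive noise canceller. The update uses only
    sign(error), which bounds each step and keeps adaptation stable
    when the noise is impulsive (heavy-tailed)."""
    w = [0.0] * num_taps          # adaptive filter taps
    buf = [0.0] * num_taps        # delay line of reference samples
    out = []
    for d, x in zip(primary, reference):
        buf = [x] + buf[:-1]                          # shift in new sample
        y = sum(wi * bi for wi, bi in zip(w, buf))    # estimated noise
        e = d - y                                     # enhanced signal
        s = (e > 0) - (e < 0)                         # sign of the error
        w = [wi + mu * s * bi for wi, bi in zip(w, buf)]
        out.append(e)
    return out, w

# Usage sketch: the reference channel picks up the noise; the primary
# channel carries speech plus a filtered copy of that noise (here a
# single tap of 0.5, an assumption for illustration).
random.seed(0)
ref = [random.uniform(-1.0, 1.0) for _ in range(5000)]
speech = [0.05 * math.sin(0.07 * t) for t in range(5000)]
prim = [s + 0.5 * x for s, x in zip(speech, ref)]
enhanced, taps = sign_lms_cancel(prim, ref)
```

After adaptation the leading tap approaches the true 0.5 gain and the residual is dominated by the speech component, which is the mismatch-reduction effect described above.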
Another important problem in isolated speech recognition is to determine
the boundaries of the speech utterances or words. Precise boundary detection
of utterances improves the performance of speech recognition systems. A new
distance measure based on the subband energy levels is introduced for endpoint
detection.
Erzin, Engin (Ph.D.)
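An endpoint detector of this kind can be sketched as follows: estimate a background template from a few leading noise-only frames, then flag a frame as speech when its subband energy vector deviates from the template by more than a threshold. This is a toy illustration of the idea, not the thesis's distance measure; the L1 distance, the frame/threshold values, and the function name are assumptions.

```python
def detect_endpoints(frames_energy, noise_frames=5, threshold=3.0):
    """Toy endpoint detector. frames_energy is a list of per-frame
    subband energy vectors. The first noise_frames frames are assumed
    to be background; a frame counts as speech when the summed
    per-band distance from the background template exceeds threshold."""
    nb = len(frames_energy[0])
    # Background template: mean energy per subband over the leading frames.
    template = [sum(f[b] for f in frames_energy[:noise_frames]) / noise_frames
                for b in range(nb)]
    flags = [sum(abs(f[b] - template[b]) for b in range(nb)) > threshold
             for f in frames_energy]
    speech = [i for i, is_speech in enumerate(flags) if is_speech]
    # Return (first, last) speech frame index, or None if no speech found.
    return (speech[0], speech[-1]) if speech else None

# Usage sketch: quiet background frames, then three louder frames.
bg = [1.0, 1.0, 1.0, 1.0]
frames = [bg] * 6 + [[6.0, 6.0, 1.0, 1.0]] * 3 + [bg] * 2
bounds = detect_endpoints(frames)
```

Because the distance is computed per subband, narrowband noise raises only the bands it occupies, which is why a subband energy measure can localize utterance boundaries more robustly than a single full-band energy.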
Sonification of guidance data during road crossing for people with visual impairments or blindness
In the last years several solutions were proposed to support people with
visual impairments or blindness during road crossing. These solutions focus on
computer vision techniques for recognizing pedestrian crosswalks and computing
their relative position from the user. Instead, this contribution addresses a
different problem: the design of an auditory interface that can effectively
guide the user during road crossing. Two original auditory guiding modes based
on data sonification are presented and compared with a guiding mode based on
speech messages.
Experimental evaluation shows that there is no guiding mode that is best
suited for all test subjects. The average time to align and cross is not
significantly different among the three guiding modes, and test subjects
distribute their preferences for the best guiding mode almost uniformly among
the three solutions. From the experiments it also emerges that higher effort is
necessary for decoding the sonified instructions if compared to the speech
instructions, and that test subjects require frequent `hints' (in the form of
speech messages). Despite this, more than 2/3 of test subjects prefer one of
the two guiding modes based on sonification. There are two main reasons for
this: firstly, with speech messages it is harder to hear the sound of the
environment, and secondly sonified messages convey information about the
"quantity" of the expected movement.
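The core of such a sonification is a continuous mapping from guidance data to audio parameters, so that the sound itself conveys the quantity of the required correction. The sketch below is a hypothetical mapping (not the paper's actual design): misalignment angle drives pitch and remaining distance drives a beep repetition rate; all parameter names and values are assumptions.

```python
def sonify_guidance(angle_deg, distance_m,
                    f_center=440.0, f_span=220.0, max_rate=8.0):
    """Hypothetical guidance-data sonification: map the misalignment
    angle to pitch (on-axis = centre frequency, left/right = lower/
    higher) and the remaining distance to a beep repetition rate, so
    the user hears *how much* correction is needed, not just a cue."""
    # Clamp the angle to +/-90 degrees and map it linearly to pitch.
    a = max(-90.0, min(90.0, angle_deg))
    freq = f_center + f_span * (a / 90.0)
    # Closer targets beep faster; the rate saturates at max_rate.
    rate = min(max_rate, max_rate / max(distance_m, 1.0))
    return freq, rate
```

Unlike a discrete speech message ("turn slightly left"), this mapping leaves the verbal channel free and degrades gracefully, which matches the trade-off the evaluation above reports: more decoding effort, but less masking of environmental sound.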
Noise exposure and auditory thresholds of German airline pilots: a cross-sectional study
Objective: The cockpit workplace of airline pilots is a noisy environment. This study examines the hearing thresholds of pilots with respect to ambient noise and communication sound.
Methods: The hearing of 487 German pilots was analysed by audiometry in the frequency range of 125 Hz to 16 kHz in varying age groups. Cockpit noise (free-field) data and communication sound (acoustic manikin) measurements were evaluated.
Results: The ambient noise levels in cockpits were found to be between 74 and 80 dB(A), and the sound pressure levels under the headset were found to be between 84 and 88 dB(A). The left-right threshold differences at 3, 4 and 6 kHz show evidence of impaired hearing at the left ear, which worsens with age. In the age groups <40/≥40 years the mean differences at 3 kHz are 2/3 dB, at 4 kHz 2/4 dB and at 6 kHz 1/6 dB. In the pilot group which used mostly the left ear for communication tasks (43 of 45 are in the older age group) the mean difference at 3 kHz is 6 dB, at 4 kHz 7 dB and at 6 kHz 10 dB. The pilots who used the headset only at the right ear also show worse hearing at the left ear of 2 dB at 3 kHz, 3 dB at 4 kHz and at 6 kHz. The frequency-corrected exposure levels under the headset are 7-11 dB(A) higher than the ambient noise, with an averaged signal-to-noise ratio for communication of about 10 dB(A).
Conclusions: The left ear seems to be more susceptible to hearing loss than the right ear. Active noise reduction systems allow for a reduced sound level for the communication signal below the upper exposure action value of 85 dB(A) and allow for a more relaxed working environment for pilots.
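The reported figures (headset level 84-88 dB(A), ambient noise 74-80 dB(A), communication SNR about 10 dB(A)) are consistent with energetic (power) addition of levels, which can be checked with a short sketch. The specific example values below are assumptions chosen from within the reported ranges, not measurements from the study.

```python
import math

def db_sum(levels_db):
    """Energetic (power) sum of sound pressure levels given in dB."""
    return 10.0 * math.log10(sum(10.0 ** (l / 10.0) for l in levels_db))

def signal_level(total_db, noise_db):
    """Level of the communication signal alone, given the total level
    under the headset and the ambient noise level, assuming the two
    add incoherently (power subtraction)."""
    return 10.0 * math.log10(10.0 ** (total_db / 10.0) - 10.0 ** (noise_db / 10.0))

# Example with figures from within the ranges reported in the study:
sig = signal_level(86.0, 76.0)   # total 86 dB(A) under headset, 76 dB(A) ambient
snr = sig - 76.0                 # communication signal-to-noise ratio in dB
```

With these example values the signal alone is about 85.5 dB(A) and the SNR about 9.5 dB, in line with the roughly 10 dB(A) average the study reports.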