    New methods for robust speech recognition

    Ankara: Department of Electrical and Electronics Engineering and the Institute of Engineering and Science of Bilkent University, 1995. Thesis (Ph.D.) -- Bilkent University, 1995. Includes bibliographical references (leaves 86-92). New methods of feature extraction, end-point detection and speech enhancement are developed for a robust speech recognition system. The methods of feature extraction and end-point detection are based on wavelet analysis or subband analysis of the speech signal. Two new sets of speech feature parameters, SUBLSFs and SUBCEPs, are introduced. Both parameter sets are based on subband analysis. The SUBLSF feature parameters are obtained via linear predictive analysis on subbands. These speech feature parameters can produce better results than the full-band parameters when the noise is colored. The SUBCEP parameters are based on wavelet analysis or, equivalently, the multirate subband analysis of the speech signal. The SUBCEP parameters also provide robust recognition performance by appropriately deemphasizing the frequency bands corrupted by noise. It is experimentally observed that the subband-analysis-based feature parameters are more robust than the commonly used full-band-analysis-based parameters in the presence of car noise. The α-stable random processes can be used to model the impulsive nature of public-network telecommunication noise. Adaptive filtering methods are developed for α-stable random processes. Adaptive noise cancellation techniques are used to reduce the mismatch between the training and testing conditions of the recognition system over telephone lines. Another important problem in isolated speech recognition is to determine the boundaries of the speech utterances or words. Precise boundary detection of utterances improves the performance of speech recognition systems. A new distance measure based on the subband energy levels is introduced for end-point detection. Erzin, Engin (Ph.D.)
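    The abstract's idea of end-point detection via a subband-energy distance can be sketched as follows. This is a minimal illustration, not the thesis's actual algorithm: the FFT-based band layout, the log-energy distance, and all parameter values are assumptions chosen for clarity.

    ```python
    import numpy as np

    def subband_energies(frame, sr, n_bands=4):
        """Per-frame energies in logarithmically spaced subbands (FFT-based sketch)."""
        spec = np.abs(np.fft.rfft(frame)) ** 2
        freqs = np.fft.rfftfreq(len(frame), 1.0 / sr)
        edges = np.geomspace(100.0, sr / 2.0, n_bands + 1)  # hypothetical band layout
        return np.array([spec[(freqs >= lo) & (freqs < hi)].sum()
                         for lo, hi in zip(edges[:-1], edges[1:])])

    def endpoint_distance(frame, noise_bands, sr):
        """Distance of a frame's subband energies from a background-noise estimate.

        Large values suggest the frame contains speech; near-zero values suggest
        it matches the background noise, marking a candidate utterance boundary.
        """
        e = subband_energies(frame, sr)
        return float(np.sum(np.log1p(e) - np.log1p(noise_bands)))

    sr = 8000
    t = np.arange(256) / sr
    noise = 0.01 * np.random.default_rng(0).standard_normal(256)
    speech = np.sin(2 * np.pi * 440 * t) + noise
    noise_bands = subband_energies(noise, sr)
    # A speech-like frame lies farther from the noise estimate than noise itself.
    assert endpoint_distance(speech, noise_bands, sr) > endpoint_distance(noise, noise_bands, sr)
    ```

    Working in subbands rather than on the full-band energy lets a band dominated by colored noise contribute little to the distance, which is the robustness property the abstract attributes to the proposed measure.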

    Sonification of guidance data during road crossing for people with visual impairments or blindness

    In recent years, several solutions have been proposed to support people with visual impairments or blindness during road crossing. These solutions focus on computer vision techniques for recognizing pedestrian crosswalks and computing their position relative to the user. Instead, this contribution addresses a different problem: the design of an auditory interface that can effectively guide the user during road crossing. Two original auditory guiding modes based on data sonification are presented and compared with a guiding mode based on speech messages. Experimental evaluation shows that no single guiding mode is best suited for all test subjects. The average time to align and cross is not significantly different among the three guiding modes, and test subjects distribute their preferences for the best guiding mode almost uniformly among the three solutions. The experiments also show that decoding the sonified instructions requires more effort than decoding the speech instructions, and that test subjects require frequent `hints' (in the form of speech messages). Despite this, more than two thirds of the test subjects prefer one of the two guiding modes based on sonification. There are two main reasons for this: first, with speech messages it is harder to hear the sounds of the environment, and second, sonified messages convey information about the "quantity" of the expected movement.
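    One way a sonified guiding mode can convey the "quantity" of the expected movement is to map the user's angular offset from the crosswalk axis to a continuously varying pitch. The mapping below is a hypothetical sketch for illustration only; the paper's actual sonification design, reference frequency, and scaling are not specified here and all names are assumptions.

    ```python
    def sonify_alignment(angle_deg, f_center=440.0, semitones_per_deg=0.2):
        """Map angular offset from the crossing axis to a tone frequency (Hz).

        Hypothetical mapping (not the paper's design): perfect alignment plays
        f_center; each degree of offset detunes the tone by a fixed number of
        semitones, so the size of the detuning encodes how far to turn.
        """
        return f_center * 2 ** (semitones_per_deg * angle_deg / 12.0)

    # Perfect alignment reproduces the reference pitch; offsets detune it,
    # and the direction of the detuning indicates which way to turn.
    assert sonify_alignment(0.0) == 440.0
    assert sonify_alignment(10.0) > sonify_alignment(0.0) > sonify_alignment(-10.0)
    ```

    Unlike a periodic speech message, such a continuous mapping leaves the environment audible and gives the user proportional, rather than categorical, feedback.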

    Estimation of glottal closure instants in voiced speech using the DYPSA algorithm

    Published version

    Noise exposure and auditory thresholds of German airline pilots: a cross-sectional study

    Objective: The cockpit workplace of airline pilots is a noisy environment. This study examines the hearing thresholds of pilots with respect to ambient noise and communication sound. Methods: The hearing of 487 German pilots was analysed by audiometry in the frequency range of 125 Hz–16 kHz in varying age groups. Cockpit noise (free-field) data and communication sound (acoustic manikin) measurements were evaluated. Results: The ambient noise levels in cockpits were found to be between 74 and 80 dB(A), and the sound pressure levels under the headset were found to be between 84 and 88 dB(A). The left–right threshold differences at 3, 4 and 6 kHz show evidence of impaired hearing at the left ear, which worsens with age. In the age groups <40/≥40 years, the mean differences are 2/3 dB at 3 kHz, 2/4 dB at 4 kHz and 1/6 dB at 6 kHz. In the pilot group that used mostly the left ear for communication tasks (43 of 45 are in the older age group), the mean difference is 6 dB at 3 kHz, 7 dB at 4 kHz and 10 dB at 6 kHz. The pilots who used the headset only at the right ear also show worse hearing at the left ear: 2 dB at 3 kHz, 3 dB at 4 kHz and at 6 kHz. The frequency-corrected exposure levels under the headset are 7–11 dB(A) higher than the ambient noise, with an averaged signal-to-noise ratio for communication of about 10 dB(A). Conclusions: The left ear seems to be more susceptible to hearing loss than the right ear. Active noise reduction systems allow for a reduced sound level for the communication signal below the upper exposure action value of 85 dB(A) and allow for a more relaxed working environment for pilots.