
    IIR modeling of interpositional transfer functions with a genetic algorithm aided by an adaptive filter for the purpose of altering free-field sound localization

    The psychoacoustic process of sound localization involves complex analysis. There is evidence that both binaural and monaural cues determine the elevation and azimuth angles that locate a sound source, and engineers have successfully used these cues to build mathematical localization systems. Research indicates that spectral cues play an important role in 3-D localization, so it is conceivable to design a filtering system that alters the perceived location of a sound source, either for corrective purposes or to suit listener preference. Such filters, known as Interpositional Transfer Functions (IPTFs), can be formed by dividing Head-Related Transfer Functions (HRTFs) in the z-domain; HRTFs represent the free-field response of the human body to sound as processed by the ears. In filtering applications, IIR filters are often favored over FIR filters because they preserve spectral resolution while requiring far fewer coefficients. Several methods exist for deriving IIR filters from their FIR counterparts, and for complicated filters, genetic algorithms (GAs) have proven effective. The research summarized in this thesis combines prior work in sound localization, genetic algorithms, and adaptive filtering. It represents the initial stage in the development of a practical system, intended for future hardware implementation, that uses a genetic algorithm as its driving engine. Under ideal conditions, the IIR filter design system successfully modeled several IPTF pairs that alter sound localization when applied to non-minimum-phase HRTFs obtained from free-field measurements.
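    A minimal sketch of the two ideas named in the abstract: forming an IPTF by dividing one HRTF response by another, and scoring a candidate IIR approximation as a GA fitness function. The HRTF arrays, filter order, FFT size, and fitness definition below are illustrative assumptions, not the thesis's actual data or algorithm.

    ```python
    import numpy as np
    from scipy.signal import freqz

    def iptf_frequency_response(hrtf_source, hrtf_target, n_fft=512):
        """IPTF = H_target / H_source, computed by division in the frequency domain."""
        H_src = np.fft.rfft(hrtf_source, n_fft)
        H_tgt = np.fft.rfft(hrtf_target, n_fft)
        eps = 1e-12  # guard against division by (near) zero bins
        return H_tgt / (H_src + eps)

    def iir_fitness(b, a, target_response, n_fft=512):
        """Negative mean-squared log-magnitude error; higher is better for a GA."""
        _, H = freqz(b, a, worN=n_fft // 2 + 1)
        err = np.log(np.abs(H) + 1e-12) - np.log(np.abs(target_response) + 1e-12)
        return -np.mean(err ** 2)

    # Toy usage with random stand-ins for measured, non-minimum-phase HRTFs.
    rng = np.random.default_rng(0)
    hrtf_a = rng.standard_normal(256)
    hrtf_b = rng.standard_normal(256)
    target = iptf_frequency_response(hrtf_a, hrtf_b)

    # A GA would evolve (b, a) coefficient vectors; here we only score one candidate.
    b0, a0 = np.array([1.0, 0.5]), np.array([1.0, -0.3])
    print("fitness of candidate IIR:", iir_fitness(b0, a0, target))
    ```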

    Deep Learning for Distant Speech Recognition

    Deep learning is an emerging technology considered one of the most promising directions toward higher levels of artificial intelligence. Among its achievements, building computers that understand speech represents a crucial step toward intelligent machines. Despite the great efforts of past decades, however, natural and robust human-machine speech interaction still appears out of reach, especially when users interact with a distant microphone in noisy and reverberant environments. These disturbances severely hamper the intelligibility of the speech signal, making Distant Speech Recognition (DSR) one of the major open challenges in the field. This thesis addresses that scenario and proposes novel techniques, architectures, and algorithms to improve the robustness of distant-talking acoustic models. We first elaborate on methodologies for realistic data contamination, with particular emphasis on DNN training with simulated data. We then investigate approaches for better exploiting speech contexts, proposing original methodologies for both feed-forward and recurrent neural networks. Lastly, inspired by the idea that cooperation across different DNNs could be key to counteracting the harmful effects of noise and reverberation, we propose a novel deep learning paradigm called a network of deep neural networks. The analysis of these concepts is based on extensive experimental validation conducted on both real and simulated data, considering different corpora, microphone configurations, environments, noise conditions, and ASR tasks. Comment: PhD Thesis, Unitn, 201
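    A minimal sketch of the data-contamination idea mentioned above: convolving clean speech with a room impulse response and adding noise at a chosen SNR to simulate distant-talking recordings for DNN training. The array names, SNR value, and synthetic stand-in signals are assumptions for illustration, not the thesis's actual contamination pipeline.

    ```python
    import numpy as np
    from scipy.signal import fftconvolve

    def contaminate(clean, rir, noise, snr_db):
        """Reverberate `clean` with `rir`, then add `noise` scaled to the target SNR."""
        reverberant = fftconvolve(clean, rir)[: len(clean)]
        noise = noise[: len(reverberant)]
        speech_power = np.mean(reverberant ** 2)
        noise_power = np.mean(noise ** 2) + 1e-12
        gain = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
        return reverberant + gain * noise

    # Toy usage with synthetic signals standing in for real corpora.
    rng = np.random.default_rng(0)
    clean = rng.standard_normal(16000)                                   # 1 s of "speech" at 16 kHz
    rir = np.exp(-np.arange(2000) / 300.0) * rng.standard_normal(2000)   # decaying room impulse response
    noise = rng.standard_normal(16000)
    distant = contaminate(clean, rir, noise, snr_db=10)
    ```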

    Shape and Deformation Analysis of the Human Ear Canal


    Sequential grouping constraints on across-channel auditory processing


    The 4th Conference of PhD Students in Computer Science


    Acoustical measurements on stages of nine U.S. concert halls
