2,863 research outputs found

    Deep Long Short-Term Memory Adaptive Beamforming Networks For Multichannel Robust Speech Recognition

    Full text link
    Far-field speech recognition in noisy and reverberant conditions remains a challenging problem despite recent deep learning breakthroughs. This problem is commonly addressed by acquiring a speech signal from multiple microphones and performing beamforming over them. In this paper, we propose to use a recurrent neural network with long short-term memory (LSTM) architecture to adaptively estimate real-time beamforming filter coefficients to cope with non-stationary environmental noise and dynamic nature of source and microphones positions which results in a set of timevarying room impulse responses. The LSTM adaptive beamformer is jointly trained with a deep LSTM acoustic model to predict senone labels. Further, we use hidden units in the deep LSTM acoustic model to assist in predicting the beamforming filter coefficients. The proposed system achieves 7.97% absolute gain over baseline systems with no beamforming on CHiME-3 real evaluation set.Comment: in 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP

    Features of hearing: applications of machine learning to uncover the building blocks of hearing

    Get PDF
    Recent advances in machine learning have instigated a renewed interest in using machine learning approaches to better understand human sensory processing. This line of research is particularly interesting for speech research since speech comprehension is uniquely human, which complicates obtaining detailed neural recordings. In this thesis, I explore how machine learning can be used to uncover new knowledge about the auditory system, with a focus on discovering robust auditory features. The resulting increased understanding of the noise robustness of human hearing may help to better assist those with hearing loss and improve Automatic Speech Recognition (ASR) systems. First, I show how computational neuroscience and machine learning can be combined to generate hypotheses about auditory features. I introduce a neural feature detection model with a modest number of parameters that is compatible with auditory physiology. By testing feature detector variants in a speech classification task, I confirm the importance of both well-studied and lesser-known auditory features. Second, I investigate whether ASR software is a good candidate model of the human auditory system. By comparing several state-of-the-art ASR systems to the results from humans on a range of psychometric experiments, I show that these ASR systems diverge markedly from humans in at least some psychometric tests. This implies that none of these systems act as a strong proxy for human speech recognition, although some may be useful when asking more narrowly defined questions. For neuroscientists, this thesis exemplifies how machine learning can be used to generate new hypotheses about human hearing, while also highlighting the caveats of investigating systems that may work fundamentally differently from the human brain. For machine learning engineers, I point to tangible directions for improving ASR systems. To motivate the continued cross-fertilization between these fields, a toolbox that allows researchers to assess new ASR systems has been released.Open Acces

    Lookup tables to compute high energy cosmic ray induced atmospheric ionization and changes in atmospheric chemistry

    Full text link
    A variety of events such as gamma-ray bursts and supernovae may expose the Earth to an increased flux of high-energy cosmic rays, with potentially important effects on the biosphere. Existing atmospheric chemistry software does not have the capability of incorporating the effects of substantial cosmic ray flux above 10 GeV . An atmospheric code, the NASA-Goddard Space Flight Center two-dimensional (latitude, altitude) time-dependent atmospheric model (NGSFC), is used to study atmospheric chemistry changes. Using CORSIKA, we have created tables that can be used to compute high energy cosmic ray (10 GeV - 1 PeV) induced atmospheric ionization and also, with the use of the NGSFC code, can be used to simulate the resulting atmospheric chemistry changes. We discuss the tables, their uses, weaknesses, and strengths.Comment: In press: Journal of Cosmology and Astroparticle Physics. 6 figures, 3 tables, two associated data files. Major revisions, including results of a greatly expanded computation, clarification and updated references. In the future we will expand the table to at least EeV levels
    • …
    corecore