31,174 research outputs found

    Efficient Implementation of the Room Simulator for Training Deep Neural Network Acoustic Models

    In this paper, we describe how to efficiently implement an acoustic room simulator to generate large-scale simulated data for training deep neural networks. Although the Google Room Simulator in [1] was shown to be quite effective in reducing Word Error Rates (WERs) for far-field applications by generating simulated far-field training sets, it requires a very large number of large Fast Fourier Transforms (FFTs). The Room Simulator in [1] accounted for approximately 80 percent of Central Processing Unit (CPU) usage in our CPU + Graphics Processing Unit (GPU) training architecture [2]. In this work, we implement efficient OverLap Addition (OLA) based filtering using the open-source FFTW3 library. Further, we investigate the effects of Room Impulse Response (RIR) lengths. Experimentally, we conclude that we can cut the tail portions of RIRs whose power is less than 20 dB below the maximum power without sacrificing speech recognition accuracy. However, we observe that cutting the RIR tail beyond this threshold harms speech recognition accuracy on rerecorded test sets. Using these approaches, we were able to reduce CPU usage for the room simulator portion down to 9.69 percent in the CPU/GPU training architecture. Profiling results show that we obtain a 22.4 times speed-up on a single machine and a 37.3 times speed-up on Google's distributed training infrastructure. Comment: Published at INTERSPEECH 2018 (https://www.isca-speech.org/archive/Interspeech_2018/abstracts/2566.html).
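The two ideas in this abstract, overlap-add filtering and RIR tail truncation, can be illustrated with a minimal sketch. This is not the authors' implementation: it swaps FFTW3 for a small pure-Python radix-2 FFT so that it stays self-contained, and it assumes that "power" means the instantaneous squared sample value, which the abstract does not specify. The names `fft`, `overlap_add`, `truncate_rir`, and `direct_convolve` are hypothetical.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 FFT (unnormalised); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def overlap_add(signal, rir, block=4):
    """Convolve signal with rir block by block, adding the overlapping tails."""
    m = len(rir)
    n_fft = 1
    while n_fft < block + m - 1:  # room for one block's full linear convolution
        n_fft *= 2
    rir_spec = fft(list(rir) + [0.0] * (n_fft - m))  # computed once, reused
    out = [0.0] * (len(signal) + m - 1)
    for start in range(0, len(signal), block):
        seg = list(signal[start:start + block])
        seg_spec = fft(seg + [0.0] * (n_fft - len(seg)))
        prod = [a * b for a, b in zip(seg_spec, rir_spec)]
        y = fft(prod, inverse=True)
        for i in range(len(seg) + m - 1):
            out[start + i] += y[i].real / n_fft  # 1/N scaling of the inverse
    return out


def truncate_rir(rir, threshold_db=-20.0):
    """Drop the tail after the last sample within threshold_db of peak power.

    Assumption: power is the squared sample value.
    """
    peak = max(h * h for h in rir)
    floor = peak * 10 ** (threshold_db / 10.0)
    last = max(i for i, h in enumerate(rir) if h * h >= floor)
    return rir[:last + 1]


def direct_convolve(signal, rir):
    """Reference O(N*M) convolution, used only to check overlap_add."""
    out = [0.0] * (len(signal) + len(rir) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(rir):
            out[i + j] += s * h
    return out


if __name__ == "__main__":
    rir = truncate_rir([1.0, 0.5, 0.05, 0.2, 0.01, 0.001])
    wet = overlap_add([1.0, 2.0, -1.0, 0.5, 3.0], rir, block=2)
    print(len(rir), len(wet))
```

A shorter RIR lowers the required FFT size for a given block length, which is one plausible reason why truncation and overlap-add work well together.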

    Asynchrony in image analysis: using the luminance-to-response-latency relationship to improve segmentation

    We deal with the problem of segmenting static images, a procedure known to be difficult in the case of very noisy patterns. The proposed approach rests on the transformation of a static image into a data flow in which the first image points to be processed are the brighter ones. This solution, inspired by human perception, in which strong luminances elicit reactions from the visual system before weaker ones, has led to the notion of asynchronous processing. The asynchronous processing of image points has required the design of a specific architecture that exploits time differences in the processing of information. The results obtained when very noisy images are segmented demonstrate the strengths of this architecture; they also suggest extensions of the approach to other computer vision problems.
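The transformation of a static image into a brightest-first data flow can be sketched as follows. The abstract does not describe the authors' architecture, so everything here is an assumption: the linear luminance-to-latency mapping, the `max_latency` parameter, and the simple region-growing consumer are hypothetical and serve only to show how arrival order can drive processing.

```python
def to_latency_stream(image, max_latency=100.0):
    """Turn a 2-D luminance grid into (latency, row, col) events.

    Assumed mapping: latency falls linearly with luminance, so the
    brightest pixel arrives at time 0 and a black pixel at max_latency.
    """
    peak = max(max(row) for row in image)
    events = []
    for r, row in enumerate(image):
        for c, lum in enumerate(row):
            if peak > 0:
                latency = max_latency * (1.0 - lum / peak)
            else:
                latency = max_latency
            events.append((latency, r, c))
    events.sort()  # earliest (brightest) events first
    return events


def grow_regions(image, min_luminance):
    """Label pixels in arrival order.

    A pixel adopts the smallest label among its already-arrived
    4-neighbours, or starts a new region if none has arrived yet.
    Regions that meet later are not merged in this sketch.
    """
    labels = {}
    next_label = 0
    for _, r, c in to_latency_stream(image):
        if image[r][c] < min_luminance:
            break  # stream is brightness-ordered, so the rest is darker
        around = ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
        seen = [labels[p] for p in around if p in labels]
        if seen:
            labels[(r, c)] = min(seen)
        else:
            labels[(r, c)] = next_label
            next_label += 1
    return labels


if __name__ == "__main__":
    image = [[9, 8, 0],
             [0, 0, 0],
             [0, 7, 6]]
    print(grow_regions(image, min_luminance=5))
```

Because dark pixels arrive last, a consumer like this can stop early and never touch low-luminance pixels at all.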

    Understanding Mobile Search Task Relevance and User Behaviour in Context

    Improvements in mobile technologies have led to a dramatic change in how and when people access and use information, and are having a profound impact on how users address their daily information needs. Smart phones are rapidly becoming our main method of accessing information and are frequently used to perform `on-the-go' search tasks. As research into information retrieval continues to evolve, evaluating search behaviour in context is relatively new. Previous research has studied the effects of context through either self-reported diary studies or quantitative log analysis; however, neither approach is able to accurately capture context of use at the time of searching. In this study, we aim to gain a better understanding of task relevance and search behaviour via a task-based user study (n=31) employing a bespoke Android app. The app allowed us to accurately capture the user's context when completing tasks at different times of the day over the period of a week. Through analysis of the collected data, we gain a better understanding of how using smart phones on the go impacts search behaviour, search performance and task relevance, and whether or not the actual context is an important factor. Comment: To appear in CHIIR 2019 in Glasgow, UK.

    Independent component approach to the analysis of EEG and MEG recordings

    Multichannel recordings of the electromagnetic fields emerging from neural currents in the brain generate large amounts of data. Suitable feature extraction methods are, therefore, useful to facilitate the representation and interpretation of the data. Recently developed independent component analysis (ICA) has been shown to be an efficient tool for artifact identification and extraction from electroencephalographic (EEG) and magnetoencephalographic (MEG) recordings. In addition, ICA has been applied to the analysis of brain signals evoked by sensory stimuli. This paper reviews our recent results in this field.

    The reentry hypothesis: The putative interaction of the frontal eye field, ventrolateral prefrontal cortex, and areas V4, IT for attention and eye movement

    Attention is known to play a key role in perception, including action selection, object recognition and memory. Despite findings revealing competitive interactions among cell populations, attention remains difficult to explain. The central purpose of this paper is to link a large number of findings in a single computational approach. Our simulation results suggest that attention can be well explained on a network level involving many areas of the brain. We argue that attention is an emergent phenomenon that arises from reentry and competitive interactions. We hypothesize that guided visual search requires the use of an object-specific template in prefrontal cortex to sensitize V4 and IT cells whose preferred stimuli match the target template. This induces a feature-specific bias and provides guidance for eye movements. Prior to an eye movement, a spatially organized reentry from oculomotor centers, specifically the movement cells of the frontal eye field, occurs and modulates the gain of V4 and IT cells. The processes involved are elucidated by quantitatively comparing the time course of simulated neural activity with experimental data. Using visual search tasks as an example, we provide clear and empirically testable predictions for the participation of IT, V4 and the frontal eye field in attention. Finally, we explain a possible physiological mechanism that can lead to non-flat search slopes as the result of a slow, parallel discrimination process.