Efficient Implementation of the Room Simulator for Training Deep Neural Network Acoustic Models
In this paper, we describe how to efficiently implement an acoustic room
simulator to generate large-scale simulated data for training deep neural
networks. Even though Google Room Simulator in [1] was shown to be quite
effective in reducing the Word Error Rates (WERs) for far-field applications by
generating simulated far-field training sets, it requires a very large number
of Fast Fourier Transforms (FFTs) of large size. Room Simulator in [1] used
approximately 80 percent of Central Processing Unit (CPU) usage in our CPU +
Graphics Processing Unit (GPU) training architecture [2]. In this work, we
implement an efficient OverLap Addition (OLA) based filtering using the
open-source FFTW3 library. Further, we investigate the effects of the Room
Impulse Response (RIR) lengths. Experimentally, we conclude that we can cut the
tail portions of RIRs whose power is more than 20 dB below the maximum power
without sacrificing speech recognition accuracy. However, we observe that
cutting the RIR tail beyond this threshold harms speech recognition accuracy
on rerecorded test sets. Using these approaches, we were able to reduce CPU
usage for the room simulator portion down to 9.69 percent in the CPU/GPU training
architecture. Profiling results show that we obtain a 22.4 times speed-up on a
single machine and a 37.3 times speed-up on Google's distributed training
infrastructure.

Comment: Published at INTERSPEECH 2018
(https://www.isca-speech.org/archive/Interspeech_2018/abstracts/2566.html)
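The two techniques the abstract names, overlap-add (OLA) filtering and RIR tail truncation at a power threshold relative to the peak, can be sketched in a few lines of Python. This is an illustrative reconstruction using NumPy and SciPy's `oaconvolve` (not the paper's FFTW3-based implementation), and the exponentially decaying synthetic RIR and white-noise "speech" are stand-ins for real data:

```python
import numpy as np
from scipy.signal import oaconvolve


def truncate_rir(rir, threshold_db=20.0):
    """Cut the RIR tail where power falls more than `threshold_db`
    below the peak power, as in the paper's truncation rule."""
    power_db = 20.0 * np.log10(np.abs(rir) + 1e-12)
    above = np.nonzero(power_db >= power_db.max() - threshold_db)[0]
    return rir[: above[-1] + 1]


# Hypothetical signals: an exponentially decaying RIR and noise "speech".
rng = np.random.default_rng(0)
rir = np.exp(-np.arange(8000) / 800.0) * rng.standard_normal(8000)
speech = rng.standard_normal(16000)

short_rir = truncate_rir(rir, threshold_db=20.0)
reverberant = oaconvolve(speech, short_rir)  # overlap-add filtering
```

`oaconvolve` performs exactly the block-wise overlap-add FFT filtering described, so the speed-up from the shorter RIR comes for free: fewer and smaller FFT blocks per utterance.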
Asynchrony in image analysis: using the luminance-to-response-latency relationship to improve segmentation
We deal with the problem of segmenting static images, a procedure known to be difficult in the case of very
noisy patterns. The proposed approach rests on the transformation of a static image into a data flow in which
the first image points to be processed are the brighter ones. This solution, inspired by human perception, in
which strong luminances elicit reactions from the visual system before weaker ones, has led to the notion of
asynchronous processing. The asynchronous processing of image points has required the design of a specific
architecture that exploits time differences in the processing of information. The results obtained when very
noisy images are segmented demonstrate the strengths of this architecture; they also suggest extensions of
the approach to other computer vision problems.
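The luminance-to-latency transformation described above amounts to converting a static image into a processing order sorted by brightness. A minimal sketch (my own illustration, not the paper's architecture) in Python:

```python
import numpy as np


def luminance_order(image):
    """Coordinates of pixels sorted brightest-first: a stand-in for the
    luminance-to-latency rule, where strong luminances are processed
    before weaker ones."""
    flat = image.ravel().astype(np.int64)     # avoid unsigned wrap-around
    order = np.argsort(-flat, kind="stable")  # descending luminance
    return np.column_stack(np.unravel_index(order, image.shape))


# Tiny hypothetical image: the brightest pixel (200) is processed first.
img = np.array([[10, 200],
                [90, 30]], dtype=np.uint8)
coords = luminance_order(img)
```

The resulting coordinate stream is the "data flow" the abstract refers to; an asynchronous architecture would then consume it in order, so early (bright) points can influence the segmentation before noisy dark points arrive.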
Understanding Mobile Search Task Relevance and User Behaviour in Context
Improvements in mobile technologies have led to a dramatic change in how and
when people access and use information, and are having a profound impact on how
users address their daily information needs. Smart phones are rapidly becoming
our main method of accessing information and are frequently used to perform
`on-the-go' search tasks. While research into information retrieval continues to
evolve, evaluating search behaviour in context is still relatively new. Previous
research has studied the effects of context through either self-reported diary
studies or quantitative log analysis; however, neither approach is able to
accurately capture context of use at the time of searching. In this study, we
aim to gain a better understanding of task relevance and search behaviour via a
task-based user study (n=31) employing a bespoke Android app. The app allowed
us to accurately capture the user's context when completing tasks at different
times of the day over the period of a week. Through analysis of the collected
data, we gain a better understanding of how using smart phones on the go
impacts search behaviour, search performance and task relevance, and whether or
not the actual context is an important factor.

Comment: To appear in CHIIR 2019 in Glasgow, U
Independent component approach to the analysis of EEG and MEG recordings
Multichannel recordings of the electromagnetic fields emerging from neural
currents in the brain generate large amounts of data. Suitable feature
extraction methods are, therefore, useful to facilitate the representation and
interpretation of the data. The recently developed independent component
analysis (ICA) has been shown to be an efficient tool for artifact
identification and extraction from electroencephalographic (EEG) and
magnetoencephalographic (MEG) recordings. In addition, ICA has been applied to
the analysis of brain signals evoked by sensory stimuli. This paper reviews our
recent results in this field.
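The artifact-separation use of ICA can be demonstrated on synthetic data with scikit-learn's `FastICA`. The sources and mixing matrix below are made up for illustration (a fast "neural" rhythm plus a slow square-wave "artifact" such as an eye blink); the review's own methods are not reproduced here:

```python
import numpy as np
from sklearn.decomposition import FastICA

# Two hypothetical sources, linearly mixed at two sensors.
rng = np.random.default_rng(0)
t = np.linspace(0, 8, 2000)
s1 = np.sin(2 * np.pi * 5 * t)               # fast "neural" rhythm
s2 = np.sign(np.sin(2 * np.pi * 0.5 * t))    # slow "artifact" (eye blink)
S = np.c_[s1, s2]

A = np.array([[1.0, 0.5],
              [0.4, 1.0]])                   # made-up sensor mixing matrix
X = S @ A.T                                  # observed multichannel recording

ica = FastICA(n_components=2, random_state=0)
S_est = ica.fit_transform(X)                 # recovered independent components
```

Because ICA recovers sources only up to permutation and sign, identifying which component is the artifact (and zeroing it before re-mixing) is the manual step in EEG/MEG cleaning.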
The reentry hypothesis: The putative interaction of the frontal eye field, ventrolateral prefrontal cortex, and areas V4, IT for attention and eye movement
Attention is known to play a key role in perception, including action selection, object recognition and memory. Despite findings revealing competitive interactions among cell populations, attention remains difficult to explain. The central purpose of this paper is to link up a large number of findings in a single computational approach. Our simulation results suggest that attention can be well explained on a network level involving many areas of the brain. We argue that attention is an emergent phenomenon that arises from reentry and competitive interactions. We hypothesize that guided visual search requires the usage of an object-specific template in prefrontal cortex to sensitize V4 and IT cells whose preferred stimuli match the target template. This induces a feature-specific bias and provides guidance for eye movements. Prior to an eye movement, a spatially organized reentry from oculomotor centers, specifically the movement cells of the frontal eye field, occurs and modulates the gain of V4 and IT cells. The processes involved are elucidated by quantitatively comparing the time course of simulated neural activity with experimental data. Using visual search tasks as an example, we provide clear and empirically testable predictions for the participation of IT, V4 and the frontal eye field in attention. Finally, we explain a possible physiological mechanism that can lead to non-flat search slopes as the result of a slow, parallel discrimination process.
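The core mechanism the abstract describes, a prefrontal feature-specific bias and an FEF reentry signal multiplicatively modulating V4/IT gain, can be caricatured in a few lines. The gain values are invented and this is a toy of the idea, not the paper's network model:

```python
def v4_response(drive, feature_bias, spatial_gain):
    """Toy gain model: feedforward drive is multiplicatively modulated by
    a feature-specific bias (prefrontal template match) and a spatially
    organized reentry signal (FEF movement cells). Illustrative only."""
    return drive * (1.0 + feature_bias) * (1.0 + spatial_gain)


# Two hypothetical V4 cells under identical drive and reentry: only the
# cell whose preferred stimulus matches the target template gets biased.
matching = v4_response(drive=1.0, feature_bias=0.5, spatial_gain=0.3)
nonmatching = v4_response(drive=1.0, feature_bias=0.0, spatial_gain=0.3)
```

The competitive interactions the paper adds on top of this gain stage are what let the biased population suppress its neighbors and win the saccade target.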
Midbrain Dopamine Neurons Signal Belief in Choice Accuracy during a Perceptual Decision
Central to the organization of behavior is the ability to predict the values of outcomes to guide choices. The accuracy of such predictions is honed by a teaching signal that indicates how incorrect a prediction was ("reward prediction error," RPE). In several reinforcement learning contexts, such as Pavlovian conditioning and decisions guided by reward history, this RPE signal is provided by midbrain dopamine neurons. In many situations, however, the stimuli predictive of outcomes are perceptually ambiguous. Perceptual uncertainty is known to influence choices, but it has been unclear whether or how dopamine neurons factor it into their teaching signal. To cope with uncertainty, we extended a reinforcement learning model with a belief state about the perceptually ambiguous stimulus; this model generates an estimate of the probability of choice correctness, termed decision confidence. We show that dopamine responses in monkeys performing a perceptually ambiguous decision task comply with the model's predictions. Consequently, dopamine responses did not simply reflect a stimulus' average expected reward value but were predictive of the trial-to-trial fluctuations in perceptual accuracy. These confidence-dependent dopamine responses emerged prior to the monkeys' choice initiation, raising the possibility that dopamine impacts impending decisions, in addition to encoding a post-decision teaching signal. Finally, by manipulating reward size, we found that dopamine neurons reflect both the upcoming reward size and the confidence in achieving it. Together, our results show that dopamine responses convey teaching signals that are also appropriate for perceptual decisions.
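The belief-state idea can be illustrated with a toy signal-detection model (my simplification, not the paper's fitted model): the stimulus is s ∈ {−1, +1}, the percept is x = s + Gaussian noise, and the choice is sign(x). The posterior probability that the choice was correct is then a logistic function of |x|, and scaling expected reward by it yields confidence-dependent prediction errors:

```python
import math


def choice_confidence(percept, noise_sd=1.0):
    """P(choice correct | percept) when the stimulus is +/-1 with Gaussian
    sensory noise and the choice is sign(percept). Toy belief state."""
    return 1.0 / (1.0 + math.exp(-2.0 * abs(percept) / noise_sd**2))


def outcome_rpe(percept, reward, reward_size=1.0, noise_sd=1.0):
    """RPE against a confidence-weighted (not average) reward expectation."""
    expected = choice_confidence(percept, noise_sd) * reward_size
    return reward - expected


# An ambiguous percept gives chance-level confidence; a clear one near 1,
# so the same reward produces a larger RPE after an ambiguous trial.
ambiguous = choice_confidence(0.0)
clear = choice_confidence(3.0)
```

This captures the key qualitative prediction: for identical average stimulus value, trial-to-trial fluctuations in the percept change the expectation, so dopamine responses should track confidence rather than the stimulus' mean reward value.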