Deep Spoken Keyword Spotting: An Overview
Spoken keyword spotting (KWS) deals with the identification of keywords in
audio streams and has become a fast-growing technology thanks to the paradigm
shift introduced by deep learning a few years ago. This has allowed the rapid
embedding of deep KWS in a myriad of small electronic devices with different
purposes like the activation of voice assistants. Prospects suggest a sustained
growth in terms of social use of this technology. Thus, it is not surprising
that deep KWS has become a hot research topic among speech scientists, who
constantly look for KWS performance improvement and computational complexity
reduction. This context motivates this paper, in which we conduct a literature
review into deep spoken KWS to assist practitioners and researchers who are
interested in this technology. Specifically, this overview has a comprehensive
nature by covering a thorough analysis of deep KWS systems (which includes
speech features, acoustic modeling and posterior handling), robustness methods,
applications, datasets, evaluation metrics, performance of deep KWS systems and
audio-visual KWS. The analysis performed in this paper allows us to identify a
number of directions for future research, including directions adopted from
automatic speech recognition research and directions that are unique to the
problem of spoken KWS.
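The abstract lists posterior handling as one stage of a deep KWS system without describing it. As a rough illustration only, the sketch below assumes a common scheme: per-frame keyword posteriors from an acoustic model are smoothed with a causal moving average and compared against a threshold. The function names, window length, and threshold are hypothetical and are not taken from the paper.

```python
import numpy as np


def smooth_posteriors(posteriors, window):
    """Causal moving average over per-frame keyword posteriors.

    `window` is a hypothetical smoothing length in frames.
    """
    p = np.asarray(posteriors, dtype=float)
    smoothed = np.empty_like(p)
    for t in range(len(p)):
        start = max(0, t - window + 1)  # shorter window at the start of the stream
        smoothed[t] = p[start:t + 1].mean()
    return smoothed


def keyword_detected(posteriors, window=3, threshold=0.5):
    """Report a detection if any smoothed posterior reaches the threshold."""
    smoothed = smooth_posteriors(posteriors, window)
    return bool(smoothed.max() >= threshold)
```

Under these assumptions, a single isolated high-posterior frame is averaged down and does not trigger a detection, while a sustained run of high posteriors does.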
Neural activity classification with machine learning models trained on interspike interval series data
The flow of information through the brain is reflected by the activity
patterns of neural cells. Indeed, these firing patterns are widely used as
input data to predictive models that relate stimuli and animal behavior to the
activity of a population of neurons. However, relatively little attention has
been paid to single-neuron spike trains as predictors of cell or network properties
in the brain. In this work, we introduce an approach to neuronal spike train
data mining which enables effective classification and clustering of neuron
types and network activity states based on single-cell spiking patterns. This
approach is centered around applying state-of-the-art time series
classification/clustering methods to sequences of interspike intervals recorded
from single neurons. We demonstrate good performance of these methods in tasks
involving classification of neuron type (e.g. excitatory vs. inhibitory cells)
and/or neural circuit activity state (e.g. awake vs. REM sleep vs. nonREM sleep
states) on an open-access cortical spiking activity dataset.
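For readers unfamiliar with the input representation, an interspike interval (ISI) series is the sequence of differences between consecutive spike times of one neuron. The sketch below is a minimal illustration, assuming spike times are given in seconds; the helper names and the chosen summary statistics are hypothetical and are not the authors' pipeline.

```python
import numpy as np


def interspike_intervals(spike_times):
    """Return the ISI series for one neuron's spike times (assumed in seconds)."""
    t = np.sort(np.asarray(spike_times, dtype=float))
    return np.diff(t)


def isi_summary(spike_times):
    """Simple ISI features that could be fed to a classifier (illustrative only)."""
    isi = interspike_intervals(spike_times)
    if isi.size == 0:
        raise ValueError("need at least two spikes to form an interval")
    mean_isi = isi.mean()
    return {
        "mean_isi": mean_isi,
        "cv": isi.std() / mean_isi,   # coefficient of variation of the intervals
        "rate": 1.0 / mean_isi,       # mean firing rate implied by the intervals
    }
```

A time series classifier would typically consume the ISI series itself; the summary features here only show the kind of information the intervals carry.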
An end-to-end framework for real-time automatic sleep stage classification.
Sleep staging is a fundamental but time-consuming process in any sleep laboratory. To greatly speed up sleep staging without compromising accuracy, we developed a novel framework for performing real-time automatic sleep stage classification. The client-server architecture adopted here provides an end-to-end solution for anonymizing and efficiently transporting polysomnography data from the client to the server and for receiving sleep stages in an interoperable fashion. The framework intelligently partitions the sleep staging task between the client and server so that multiple low-end clients can work with one server, and it can be deployed both locally and over the cloud. The framework was tested on four datasets comprising ≈1700 polysomnography records (≈12000 hr of recordings) collected from adolescents, young adults, and old adults, involving healthy persons as well as those with medical conditions. We used two independent validation datasets: one comprising patients from a sleep disorders clinic and the other incorporating patients with Parkinson's disease. Using this system, an entire night's sleep was staged with an accuracy on par with expert human scorers but much faster (≈5 s compared with 30-60 min). To illustrate the utility of such real-time sleep staging, we used it to facilitate the automatic delivery of acoustic stimuli at a targeted phase of slow-sleep oscillations to enhance slow-wave sleep.
Holographic Detection and Reduction of Wind Noise
Many devices that include built-in microphone(s) are used in windy situations. Wind noise degrades the quality of audio detected by the microphone(s), causes microphone signal saturation at high wind speeds, causes nonlinear acoustic echo, and reduces the performance of acoustic echo cancellation (AEC). Applications such as voice‐trigger, automatic speech recognition (ASR), and voice over internet protocol (VoIP) communication are negatively impacted by such degradation.
This disclosure describes cost-effective and robust techniques to detect and reduce wind noise. The described techniques deliver optimum removal and detection results by processing the audio signal in a holographic way, that is, by dealing with all related domains including time, frequency, and 3D space. This approach can improve the audio detection performance of any device that incorporates the techniques and can thereby improve the user experience of various applications such as voice-trigger, speech recognition, voice communication, and event detection, even on devices that have limited computational capability.
Sleep Stage Classification: A Deep Learning Approach
Sleep occupies a significant part of human life. The diagnosis of sleep-related disorders is of great importance. To record specific physical and electrical activities of the brain and body, a multi-parameter test called polysomnography (PSG) is normally used. The visual process of sleep stage classification is time-consuming, subjective and costly. To improve the accuracy and efficiency of sleep stage classification, automatic classification algorithms were developed.
In this research work, we focused on the pre-processing (filtering boundaries and de-noising algorithms) and classification steps of automatic sleep stage classification. The main motivation for this work was to develop a pre-processing and classification framework to clean the input EEG signal without manipulating the original data, thus enhancing the learning stage of deep learning classifiers.
For pre-processing EEG signals, a lossless adaptive artefact removal method was proposed. Unlike other works that used artificial noise, we used real EEG data contaminated with EOG and EMG for evaluating the proposed method. The proposed adaptive algorithm led to a significant enhancement in the overall classification accuracy. In the classification area, we evaluated the performance of the most common sleep stage classifiers using a comprehensive set of features extracted from PSG signals. Considering the challenges and limitations of conventional methods, we proposed two deep learning-based methods for classification of sleep stages based on Stacked Sparse AutoEncoder (SSAE) and Convolutional Neural Network (CNN). The proposed methods performed more efficiently by eliminating the need for conventional feature selection and feature extraction steps, respectively. Moreover, although our systems were trained with a smaller number of samples than similar studies, they were able to achieve state-of-the-art accuracy and higher overall sensitivity.