138 research outputs found
Розробка основних елементів інформаційної технології аналізу мімічних проявів для систем інтерактивного вивчення жестової мови
У роботі розглянуто основні елементи інформаційної технології аналізу мімічних проявів для системи інтерактивного вивчення жестової мови. Описано основні компоненти інформаційної технології, її експериментальна реалізація. Проведено аналіз впливу кількості ознак, обсягу навчальної вибірки і типу класифікатора на кількість помилок 1 і 2 роду. Удосконалено існуючі алгоритми ідентифікації мімічних проявів шляхом вибору оптимального набору конструктів мультикласифікаторів.This article describes the elements of information technology for analysis of facial expressions to apply in interactive study of sign language. The main elements of information technology, its structure and experimental implementation has been discussed. Analysis was carried out in order to identify how the classifier error rate depends on type of classifier, number of feature and teach set. Optimal constructs of classifiers were proposed, giving drastic improvement of existing algorithms
DeepCough: A Deep Convolutional Neural Network in A Wearable Cough Detection System
In this paper, we present a system that employs a wearable acoustic sensor
and a deep convolutional neural network for detecting coughs. We evaluate the
performance of our system on 14 healthy volunteers and compare it to that of
other cough detection systems that have been reported in the literature.
Experimental results show that our system achieves a classification sensitivity
of 95.1% and a specificity of 99.5%.Comment: BioCAS-201
Speaker recognition by means of restricted Boltzmann machine adaptation
Restricted Boltzmann Machines (RBMs) have shown success in speaker recognition. In this paper, RBMs are investigated in a framework comprising a universal model training and model adaptation. Taking advantage of RBM unsupervised learning algorithm, a global model is trained based on all available background data. This general speaker-independent model, referred to as URBM, is further adapted to the data of a specific speaker to build speaker-dependent model. In order to show its effectiveness, we have applied this framework to two different tasks. It has been used to discriminatively model target and impostor spectral features for classification. It has been also utilized to produce a vector-based representation for speakers. This vector-based representation, similar to i-vector, can be further used for speaker recognition using either cosine scoring or Probabilistic Linear Discriminant Analysis (PLDA). The evaluation is performed on the core test condition of the NIST SRE 2006 database.Peer ReviewedPostprint (author's final draft
Multiscale approaches to music audio feature learning
Content-based music information retrieval tasks are typically solved with a two-stage approach: features are extracted from music audio signals, and are then used as input to a regressor or classifier. These features can be engineered or learned from data. Although the former approach was dominant in the past, feature learning has started to receive more attention from the MIR community in recent years. Recent results in feature learning indicate that simple algorithms such as K-means can be very effective, sometimes surpassing more complicated approaches based on restricted Boltzmann machines, autoencoders or sparse coding. Furthermore, there has been increased interest in multiscale representations of music audio recently. Such representations are more versatile because music audio exhibits structure on multiple timescales, which are relevant for different MIR tasks to varying degrees. We develop and compare three approaches to multiscale audio feature learning using the spherical K-means algorithm. We evaluate them in an automatic tagging task and a similarity metric learning task on the Magnatagatune dataset
End-to-end Phoneme Sequence Recognition using Convolutional Neural Networks
Most phoneme recognition state-of-the-art systems rely on a classical neural
network classifiers, fed with highly tuned features, such as MFCC or PLP
features. Recent advances in ``deep learning'' approaches questioned such
systems, but while some attempts were made with simpler features such as
spectrograms, state-of-the-art systems still rely on MFCCs. This might be
viewed as a kind of failure from deep learning approaches, which are often
claimed to have the ability to train with raw signals, alleviating the need of
hand-crafted features. In this paper, we investigate a convolutional neural
network approach for raw speech signals. While convolutional architectures got
tremendous success in computer vision or text processing, they seem to have
been let down in the past recent years in the speech processing field. We show
that it is possible to learn an end-to-end phoneme sequence classifier system
directly from raw signal, with similar performance on the TIMIT and WSJ
datasets than existing systems based on MFCC, questioning the need of complex
hand-crafted features on large datasets.Comment: NIPS Deep Learning Workshop, 201
Neonatal Seizure Detection using Convolutional Neural Networks
This study presents a novel end-to-end architecture that learns hierarchical
representations from raw EEG data using fully convolutional deep neural
networks for the task of neonatal seizure detection. The deep neural network
acts as both feature extractor and classifier, allowing for end-to-end
optimization of the seizure detector. The designed system is evaluated on a
large dataset of continuous unedited multi-channel neonatal EEG totaling 835
hours and comprising of 1389 seizures. The proposed deep architecture, with
sample-level filters, achieves an accuracy that is comparable to the
state-of-the-art SVM-based neonatal seizure detector, which operates on a set
of carefully designed hand-crafted features. The fully convolutional
architecture allows for the localization of EEG waveforms and patterns that
result in high seizure probabilities for further clinical examination.Comment: IEEE International Workshop on Machine Learning for Signal Processin
- …