A Comparative Study of Machine Learning Models for Tabular Data Through Challenge of Monitoring Parkinson's Disease Progression Using Voice Recordings
People with Parkinson's disease must be regularly monitored by their
physician to observe how the disease is progressing and potentially adjust
treatment plans to mitigate the symptoms. Monitoring the progression of the
disease through a voice recording captured by the patient at their own home can
make the process faster and less stressful. Using a dataset of voice recordings
of 42 people with early-stage Parkinson's disease over a time span of 6 months,
we applied multiple machine learning techniques to find a correlation between
the voice recording and the patient's motor UPDRS score. We approached this
problem using a multitude of both regression and classification techniques.
Much of this paper is dedicated to mapping the voice data to motor UPDRS scores
using regression techniques in order to obtain a more precise value for unknown
instances. Through this comparative study of various machine learning methods,
we found that some older machine learning methods, such as trees, outperform
cutting-edge deep learning models on numerous tabular datasets.
Comment: Accepted at "HIMS'20 - The 6th Int'l Conf on Health Informatics and
Medical Systems"; https://americancse.org/events/csce2020/conferences/hims2
An Empirical Evaluation of Zero Resource Acoustic Unit Discovery
Acoustic unit discovery (AUD) is a process of automatically identifying a
categorical acoustic unit inventory from speech and producing corresponding
acoustic unit tokenizations. AUD provides an important avenue for unsupervised
acoustic model training in a zero resource setting where expert-provided
linguistic knowledge and transcribed speech are unavailable. Therefore, to
further facilitate the zero-resource AUD process, we demonstrate in this paper
that acoustic feature representations can be significantly improved by (i)
performing linear discriminant analysis (LDA) in an unsupervised self-trained
fashion, and (ii) leveraging resources of other languages through building a
multilingual bottleneck (BN) feature extractor to give effective cross-lingual
generalization. Moreover, we perform comprehensive evaluations of AUD efficacy
on multiple downstream speech applications, and their correlated performance
suggests that AUD evaluations are feasible using different alternative language
resources when only a subset of these evaluation resources is available in
typical zero resource applications.
Comment: 5 pages, 1 figure; Accepted for publication at ICASSP 201
Encoding of phonology in a recurrent neural model of grounded speech
We study the representation and encoding of phonemes in a recurrent neural
network model of grounded speech. We use a model which processes images and
their spoken descriptions, and projects the visual and auditory representations
into the same semantic space. We perform a number of analyses on how
information about individual phonemes is encoded in the MFCC features extracted
from the speech signal, and the activations of the layers of the model. Via
experiments with phoneme decoding and phoneme discrimination we show that
phoneme representations are most salient in the lower layers of the model,
where low-level signals are processed at a fine-grained level, although a large
amount of phonological information is retained at the top recurrent layer. We
further find that the attention mechanism following the top recurrent layer
significantly attenuates encoding of phonology and makes the utterance
embeddings much more invariant to synonymy. Moreover, a hierarchical clustering
of phoneme representations learned by the network shows an organizational
structure of phonemes similar to those proposed in linguistics.
Comment: Accepted at CoNLL 201