Basic Filters for Convolutional Neural Networks Applied to Music: Training or Design?
When convolutional neural networks are used to tackle learning problems based
on music or, more generally, time series data, raw one-dimensional data are
commonly pre-processed to obtain spectrogram or mel-spectrogram coefficients,
which are then used as input to the actual neural network. In this
contribution, we investigate, both theoretically and experimentally, the
influence of this pre-processing step on the network's performance, and ask
whether replacing it with adaptive or learned filters applied directly to the
raw data can improve learning success. The theoretical results show that
mel-spectrogram coefficients can, in principle, be approximately reproduced by
adaptive filters followed by time-averaging. We
also conducted extensive experimental work on the task of singing voice
detection in music. The results of these experiments show that, for
classification based on Convolutional Neural Networks, the features obtained
from adaptive filter banks followed by time-averaging perform better than the
canonical Fourier-transform-based mel-spectrogram coefficients. Alternative
adaptive approaches with center frequencies or time-averaging lengths learned
from training data perform equally well.
Comment: Completely revised version; 21 pages, 4 figures
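The canonical pipeline this paper takes as its baseline can be sketched in plain numpy: frame the raw signal, take magnitude spectra, then pool the frequency bins with a triangular mel filter bank. The filter-bank design below is the standard mel construction, not the paper's adaptive filters, and all parameter values (sample rate, FFT size, hop, number of bands) are illustrative assumptions:

```python
import numpy as np

def hz_to_mel(f):
    # Standard mel scale (O'Shaughnessy formula)
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sr):
    # Triangular filters spaced evenly on the mel scale
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        for k in range(l, c):
            fb[i - 1, k] = (k - l) / max(c - l, 1)
        for k in range(c, r):
            fb[i - 1, k] = (r - k) / max(r - c, 1)
    return fb

def mel_spectrogram(x, sr=16000, n_fft=512, hop=256, n_mels=40):
    # Frame, window, take power spectra, then pool with the mel filter bank
    frames = np.lib.stride_tricks.sliding_window_view(x, n_fft)[::hop]
    spec = np.abs(np.fft.rfft(frames * np.hanning(n_fft), axis=1)) ** 2
    return spec @ mel_filterbank(n_mels, n_fft, sr).T

sr = 16000
t = np.arange(sr) / sr
x = np.sin(2 * np.pi * 440.0 * t)  # 1 s of a 440 Hz tone
M = mel_spectrogram(x, sr)
print(M.shape)  # (n_frames, n_mels)
```

The paper's proposal amounts to replacing the fixed Fourier/mel pooling above with filters whose coefficients (or center frequencies and averaging lengths) are learned from data.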
EEG Classification based on Image Configuration in Social Anxiety Disorder
The problem of detecting the presence of Social Anxiety Disorder (SAD) from
Electroencephalography (EEG) has seen limited study; we address it with a new
approach that seeks to exploit knowledge of the EEG sensors' spatial
configuration. Two classification models, one that ignores the
configuration (model 1) and one that exploits it with different interpolation
methods (model 2), are studied. The performance of the two models is examined
on 34 EEG channels, each decomposed into five frequency bands with a filter
bank. The data are collected from 64 subjects
consisting of healthy controls and patients with SAD. The validity of our
hypothesis that model 2 would significantly outperform model 1 is borne out in
the results, with higher accuracy for model 2 under every machine
learning algorithm we investigated. Convolutional Neural Networks (CNNs) were
found to perform much better than SVMs and kNNs.
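The idea behind model 2 can be illustrated with a minimal sketch: per-channel features are interpolated onto a 2D image grid using the sensors' spatial coordinates, so a CNN can exploit neighborhood structure. The sensor coordinates, grid size, and inverse-distance weighting below are illustrative assumptions, not the paper's actual montage or interpolation methods:

```python
import numpy as np

# Hypothetical 2D sensor coordinates on a unit-disc head layout;
# the paper's actual 34-channel montage may differ.
rng = np.random.default_rng(0)
n_ch = 34
theta = np.linspace(0, 2 * np.pi, n_ch, endpoint=False)
r = 0.3 + 0.6 * (np.arange(n_ch) % 3) / 2.0
pos = np.stack([r * np.cos(theta), r * np.sin(theta)], axis=1)

def channels_to_image(values, pos, size=16):
    # Inverse-distance-weighted interpolation of per-channel values onto a
    # size x size grid -- one way to "exploit sensor configuration": nearby
    # pixels get similar values, giving the CNN spatial structure to learn from.
    xs = np.linspace(-1, 1, size)
    gx, gy = np.meshgrid(xs, xs)
    grid = np.stack([gx.ravel(), gy.ravel()], axis=1)               # (P, 2)
    d = np.linalg.norm(grid[:, None, :] - pos[None, :, :], axis=2)  # (P, n_ch)
    w = 1.0 / (d + 1e-6) ** 2
    img = (w @ values) / w.sum(axis=1)  # convex combination per pixel
    return img.reshape(size, size)

band_power = rng.random(n_ch)  # e.g. one frequency band's power per channel
img = channels_to_image(band_power, pos)
print(img.shape)  # (16, 16)
```

Model 1, by contrast, would feed the 34 per-channel values to the classifier as a flat vector, discarding where the sensors sit on the scalp.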
Novel Fourier Quadrature Transforms and Analytic Signal Representations for Nonlinear and Non-stationary Time Series Analysis
The Hilbert transform (HT) and associated Gabor analytic signal (GAS)
representation are well-known and widely used mathematical formulations for
modeling and analysis of signals in various applications. In this study, to
obtain the quadrature component of a signal as the HT does, we propose novel
discrete Fourier cosine quadrature transforms (FCQTs) and discrete Fourier sine
quadrature transforms (FSQTs), designated together as Fourier quadrature
transforms (FQTs). Using these FQTs, we propose sixteen Fourier-Singh analytic signal
(FSAS) representations with the following properties: (1) the real part of
eight FSAS representations is the original signal and the imaginary part is the
FCQT of the real part; (2) the imaginary part of eight FSAS representations is
the original signal and the real part is the FSQT of the imaginary part;
(3) like the GAS, the Fourier spectrum of all FSAS representations contains
only positive frequencies; however, unlike the GAS, the real and imaginary
parts of the proposed FSAS representations are not orthogonal to each other. The Fourier decomposition
method (FDM) is an adaptive data analysis approach that decomposes a signal
into a small set of Fourier intrinsic band functions, which are AM-FM
components. This study also proposes a new formulation of the FDM using the
discrete cosine transform (DCT) with the GAS and FSAS representations, and
demonstrates its efficacy for improved time-frequency-energy representation and
analysis of nonlinear and non-stationary time series.
Comment: 22 pages, 13 figures
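The GAS construction that this paper compares against can be sketched directly: zeroing the negative-frequency half of the DFT yields a complex signal whose real part is the original and whose imaginary part is its Hilbert transform, with a one-sided spectrum. A minimal numpy sketch (the test signal's frequency and length are illustrative):

```python
import numpy as np

def gabor_analytic_signal(x):
    # Discrete analytic signal: keep DC (and Nyquist, for even length)
    # as-is, double the positive frequencies, zero the negative ones.
    n = len(x)
    X = np.fft.fft(x)
    h = np.zeros(n)
    h[0] = 1.0
    if n % 2 == 0:
        h[n // 2] = 1.0
        h[1:n // 2] = 2.0
    else:
        h[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(X * h)

t = np.arange(1024) / 1024.0
x = np.cos(2 * np.pi * 50 * t)   # pure tone on an integer DFT bin
z = gabor_analytic_signal(x)
# Real part recovers x; imaginary part is its Hilbert transform (a sine here)
print(np.allclose(z.real, x))  # True
```

The proposed FSAS representations share the one-sided-spectrum property shown here but, per the paper, give up the orthogonality between real and imaginary parts that the GAS guarantees.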
Data-driven multivariate and multiscale methods for brain computer interface
This thesis focuses on the development of data-driven multivariate and multiscale methods
for brain computer interface (BCI) systems. The electroencephalogram (EEG), the
most convenient means to measure neurophysiological activity due to its noninvasive nature,
is mainly considered. The nonlinearity and nonstationarity inherent in EEG, and its
multichannel recording nature, require a new set of data-driven multivariate techniques to
estimate features more accurately for enhanced BCI operation. A long-term goal
is also to enable an alternative EEG recording strategy for long-term and portable
monitoring.
Empirical mode decomposition (EMD) and local mean decomposition (LMD), fully
data-driven adaptive tools, are considered to decompose the nonlinear and nonstationary
EEG signal into a set of components which are highly localised in time and frequency. It
is shown that the complex and multivariate extensions of EMD, which can exploit common
oscillatory modes within multivariate (multichannel) data, can be used to accurately
estimate and compare amplitude and phase information among multiple sources,
which is key to feature extraction in BCI systems. A complex extension of local mean
decomposition is also introduced, and its operation is illustrated on two-channel
neuronal spike streams. Common spatial pattern (CSP), a standard feature extraction technique
for BCI applications, is also extended to the complex domain using augmented complex
statistics. Depending on the circularity or noncircularity of a complex signal, one of the
complex CSP algorithms can be chosen to produce the best classification performance
between two different EEG classes.
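The real-valued CSP baseline that the thesis extends to the complex domain can be sketched as a joint diagonalization of the two class covariances: after whitening their sum, the extreme eigenvectors of one whitened class covariance give spatial filters whose output variance is maximal for one class and minimal for the other. A minimal numpy sketch on synthetic data (the channel count, trial sizes, and toy data generator are illustrative assumptions):

```python
import numpy as np

def csp_filters(trials_a, trials_b, n_pairs=1):
    # Classic real-valued CSP via whitening + eigendecomposition.
    def mean_cov(trials):
        # Trace-normalized average spatial covariance per class
        return np.mean([X @ X.T / np.trace(X @ X.T) for X in trials], axis=0)
    Ca, Cb = mean_cov(trials_a), mean_cov(trials_b)
    evals, evecs = np.linalg.eigh(Ca + Cb)
    P = evecs @ np.diag(evals ** -0.5) @ evecs.T   # whitening matrix
    _, V = np.linalg.eigh(P @ Ca @ P.T)            # eigenvalues ascending
    W = V.T @ P                                    # spatial filters as rows
    # Filters from both ends of the spectrum discriminate best
    return np.vstack([W[:n_pairs], W[-n_pairs:]])

# Synthetic 4-channel trials: class A has extra variance on channel 0,
# class B on channel 3 (purely illustrative data).
rng = np.random.default_rng(1)
def make_trials(boost_ch, n=30, ch=4, T=200):
    out = []
    for _ in range(n):
        X = rng.standard_normal((ch, T))
        X[boost_ch] *= 3.0
        out.append(X)
    return out

W = csp_filters(make_trials(0), make_trials(3))
print(W.shape)  # (2, 4)
```

The complex-domain extension described in the thesis replaces these real covariances with augmented complex statistics, which is not shown here.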
Using these complex and multivariate algorithms, two cognitive brain studies are
investigated for a more natural and intuitive design of advanced BCI systems. Firstly, a Yarbus-style auditory selective attention experiment is introduced to measure the user's
attention to one sound source within a mixture of sound stimuli, aimed at improving
the usefulness of hearing instruments such as hearing aids. Secondly, emotion experiments
elicited by taste and taste recall are examined to determine the pleasure or displeasure
evoked by a food, towards the implementation of affective computing. The separation between
the two emotional responses is examined using real- and complex-valued common spatial pattern
methods.
Finally, we introduce a novel approach to brain monitoring based on EEG recordings
from within the ear canal, embedded in a custom-made hearing-aid earplug. The new
platform promises the possibility of both short- and long-term continuous use for standard
brain monitoring and interfacing applications.