Graph Spectral Image Processing
The recent advent of graph signal processing (GSP) has spurred intensive studies
of signals that live naturally on irregular data kernels described by graphs
(e.g., social networks, wireless sensor networks). Though a digital image
contains pixels that reside on a regularly sampled 2D grid, if one can design
an appropriate underlying graph connecting pixels with weights that reflect the
image structure, then one can interpret the image (or image patch) as a signal
on a graph, and apply GSP tools for processing and analysis of the signal in
the graph spectral domain. In this article, we give an overview of recent graph
spectral techniques in GSP specifically for image / video processing. The topics
covered include image compression, image restoration, image filtering and image
segmentation.
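As a rough illustration of the idea in this abstract (treating an image patch as a signal on a pixel graph and moving to the graph spectral domain), the following is a minimal sketch and not the authors' method. The 4-connected grid, the Gaussian intensity weights, the parameter `sigma`, and the function name `patch_graph_laplacian` are all assumptions made for the example.

```python
import numpy as np

def patch_graph_laplacian(patch, sigma=0.5):
    """Combinatorial Laplacian of a 4-connected pixel graph (illustrative choice).

    Edge weights shrink when neighbouring pixel intensities differ, so the
    graph reflects the image structure. `sigma` is a hypothetical parameter.
    """
    h, w = patch.shape
    n = h * w
    W = np.zeros((n, n))
    for r in range(h):
        for c in range(w):
            i = r * w + c
            # Connect each pixel to its right and bottom neighbours.
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr < h and cc < w:
                    j = rr * w + cc
                    diff = patch[r, c] - patch[rr, cc]
                    W[i, j] = W[j, i] = np.exp(-diff ** 2 / (2 * sigma ** 2))
    # L = D - W, with D the diagonal degree matrix.
    return np.diag(W.sum(axis=1)) - W

# A 4x4 patch with a vertical edge between the second and third columns.
patch = np.array([[0.1, 0.1, 0.9, 0.9]] * 4)
L = patch_graph_laplacian(patch)

# The Laplacian eigenvectors act as the graph Fourier basis.
eigvals, eigvecs = np.linalg.eigh(L)
coeffs = eigvecs.T @ patch.ravel()                        # forward transform
reconstructed = (eigvecs @ coeffs).reshape(patch.shape)   # inverse transform
```

Because the basis is orthonormal, the inverse transform recovers the patch exactly; compression or filtering would then act on `coeffs` before inverting.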
Detecting single-trial EEG evoked potential using a wavelet domain linear mixed model: application to error potentials classification
Objective. The main goal of this work is to develop a model for multi-sensor
signals such as MEG or EEG signals, that accounts for the inter-trial
variability, suitable for corresponding binary classification problems. An
important constraint is that the model be simple enough to handle small size
and unbalanced datasets, as often encountered in BCI type experiments.
Approach. The method combines a linear mixed effects statistical model, a wavelet
transform and spatial filtering, and aims at the characterization of localized
discriminant features in multi-sensor signals. After a discrete wavelet transform
and spatial filtering, a projection onto the relevant wavelet and spatial
channels subspaces is used for dimension reduction. The projected signals are
then decomposed as the sum of a signal of interest (i.e. discriminant) and
background noise, using a very simple Gaussian linear mixed model. Main
results. Thanks to the simplicity of the model, the corresponding parameter
estimation problem is simplified. Robust estimates of class-covariance matrices
are obtained from small sample sizes and an effective Bayes plug-in classifier
is derived. The approach is applied to the detection of error potentials in
multichannel EEG data, in a very unbalanced situation (detection of rare
events). Classification results prove the relevance of the proposed approach in
such a context. Significance. The combination of a linear mixed model, a wavelet
transform and spatial filtering for EEG classification is, to the best of our
knowledge, an original approach, which is proven to be effective. This paper
improves on earlier results on similar problems, and the three main ingredients
all play an important role.
A Compact and Discriminative Feature Based on Auditory Summary Statistics for Acoustic Scene Classification
One of the biggest challenges of acoustic scene classification (ASC) is to
find proper features to better represent and characterize environmental sounds.
Environmental sounds generally involve more sound sources while exhibiting less
structure in temporal spectral representations. However, the background of an
acoustic scene exhibits temporal homogeneity in acoustic properties, suggesting
it could be characterized by distribution statistics rather than temporal
details. In this work, we investigated using auditory summary statistics as the
feature for ASC tasks. The inspiration comes from a recent neuroscience study,
which shows the human auditory system tends to perceive sound textures through
time-averaged statistics. Based on these statistics, we further proposed to use
linear discriminant analysis to eliminate redundancies among these statistics
while keeping the discriminative information, providing an extremely compact
representation for acoustic scenes. Experimental results show the outstanding
performance of the proposed feature over the conventional handcrafted features.
Comment: Accepted as a conference paper of Interspeech 201
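To make the notion of time-averaged statistics concrete, here is a minimal sketch under strong simplifying assumptions. It is not the feature set used in the paper: the choice of statistics (per-band mean, variance, skewness, and cross-band correlations), the function name `summary_statistics`, and the synthetic gamma-distributed envelopes are all hypothetical stand-ins.

```python
import numpy as np

def summary_statistics(envelopes):
    """Time-averaged statistics of sub-band envelopes (illustrative subset).

    envelopes: array of shape (n_bands, n_frames).
    Returns per-band mean, variance and skewness followed by the
    upper-triangular cross-band correlation coefficients.
    """
    mean = envelopes.mean(axis=1)
    std = envelopes.std(axis=1)
    centred = envelopes - mean[:, None]
    skew = (centred ** 3).mean(axis=1) / std ** 3
    corr = np.corrcoef(envelopes)
    upper = corr[np.triu_indices(envelopes.shape[0], k=1)]
    return np.concatenate([mean, std ** 2, skew, upper])

# Synthetic stand-in for 4 sub-band envelopes over 500 frames.
rng = np.random.default_rng(0)
envelopes = rng.gamma(shape=2.0, size=(4, 500))
features = summary_statistics(envelopes)

# Reordering the frames leaves the statistics unchanged: they describe
# the distribution of the sound, not its temporal details.
shuffled = envelopes[:, rng.permutation(envelopes.shape[1])]
features_shuffled = summary_statistics(shuffled)
```

A discriminant projection such as the linear discriminant analysis mentioned in the abstract would then be fitted on vectors like `features` collected over many labelled scenes.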
Graph Filters for Signal Processing and Machine Learning on Graphs
Filters are fundamental in extracting information from data. For time series
and image data that reside on Euclidean domains, filters are the crux of many
signal processing and machine learning techniques, including convolutional
neural networks. Increasingly, modern data also reside on networks and other
irregular domains whose structure is better captured by a graph. To process and
learn from such data, graph filters account for the structure of the underlying
data domain. In this article, we provide a comprehensive overview of graph
filters, including the different filtering categories, design strategies for
each type, and trade-offs between different types of graph filters. We discuss
how to extend graph filters into filter banks and graph neural networks to
enhance the representational power; that is, to model a broader variety of
signal classes, data patterns, and relationships. We also showcase the
fundamental role of graph filters in signal processing and machine learning
applications. Our aim is that this article provides a unifying framework for
both beginner and experienced researchers, as well as a common understanding
that promotes collaborations at the intersections of signal processing, machine
learning, and application domains.
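One widely used filter family is the polynomial (finite impulse response) graph filter, y = sum_k h_k S^k x, where S is a graph shift operator such as the adjacency matrix. The sketch below is only an illustration of that one family, not a summary of the article's taxonomy; the function name `polynomial_graph_filter`, the path graph, and the coefficients `h` are assumptions chosen for the example.

```python
import numpy as np

def polynomial_graph_filter(S, x, h):
    """Apply y = sum_k h[k] * S^k x using repeated graph shifts."""
    y = np.zeros(len(x), dtype=float)
    shifted = np.asarray(x, dtype=float).copy()   # S^0 x
    for hk in h:
        y += hk * shifted
        shifted = S @ shifted                     # advance to S^(k+1) x
    return y

# Adjacency matrix of a 4-node path graph, used as the shift operator.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

x = np.array([1.0, 0.0, 0.0, 0.0])   # impulse on the first node
h = [0.5, 0.25, 0.25]                # hypothetical filter taps
y = polynomial_graph_filter(A, x, h)
print(y)   # [0.75 0.25 0.25 0.  ]
```

A filter of order K only mixes values within K hops, which is why the impulse on the first node does not reach the fourth node here.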
Optimization of data-driven filterbank for automatic speaker verification
Most speech processing applications use triangular filters spaced on the
mel scale for feature extraction. In this paper, we propose a new data-driven
filter design method which optimizes filter parameters from given speech
data. First, we introduce a frame-selection based approach for developing a
speech-signal-based frequency warping scale. Then, we propose a new method for
computing the filter frequency responses by using principal component analysis
(PCA). The main advantage of the proposed method over the recently introduced
deep learning based methods is that it requires a very limited amount of
unlabeled speech data. We demonstrate that the proposed filterbank has more
speaker discriminative power than the commonly used mel filterbank as well as
existing data-driven filterbanks. We conduct automatic speaker verification
(ASV) experiments with different corpora using various classifier back-ends. We
show that the acoustic features created with the proposed filterbank are better
than existing mel-frequency cepstral coefficients (MFCCs) and
speech-signal-based frequency cepstral coefficients (SFCCs) in most cases. In
the experiments with VoxCeleb1 and a popular i-vector back-end, we observe a 9.75%
relative improvement in equal error rate (EER) over MFCCs. Similarly, the
relative improvement is 4.43% with the recently introduced x-vector system. We
obtain further improvement by fusing the proposed method with the standard
MFCC-based approach.
Comment: Published in Digital Signal Processing journal (Elsevier
Data-driven multivariate and multiscale methods for brain computer interface
This thesis focuses on the development of data-driven multivariate and multiscale methods
for brain computer interface (BCI) systems. The electroencephalogram (EEG), the
most convenient means to measure neurophysiological activity due to its noninvasive nature,
is mainly considered. The nonlinearity and nonstationarity inherent in EEG and its
multichannel recording nature require a new set of data-driven multivariate techniques to
estimate features more accurately for enhanced BCI operation. Also, a long-term goal
is to enable an alternative EEG recording strategy for achieving long-term and portable
monitoring.
Empirical mode decomposition (EMD) and local mean decomposition (LMD), fully
data-driven adaptive tools, are considered to decompose the nonlinear and nonstationary
EEG signal into a set of components which are highly localised in time and frequency. It
is shown that the complex and multivariate extensions of EMD, which can exploit common
oscillatory modes within multivariate (multichannel) data, can be used to accurately
estimate and compare the amplitude and phase information among multiple sources, a
key to feature extraction in BCI systems. A complex extension of local mean decomposition
is also introduced and its operation is illustrated on two-channel neuronal
spike streams. Common spatial pattern (CSP), a standard feature extraction technique
for BCI applications, is also extended to the complex domain using augmented complex
statistics. Depending on the circularity/noncircularity of a complex signal, one of the
complex CSP algorithms can be chosen to produce the best classification performance
between two different EEG classes.
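For orientation, the standard real-valued CSP that the thesis takes as its starting point can be sketched as below. This is not the complex or augmented extension described above; the helper names `mean_cov` and `csp_filters`, the trace normalization, and the synthetic trials are assumptions made for illustration.

```python
import numpy as np

def mean_cov(trials):
    """Average trace-normalized spatial covariance over trials.

    trials: array of shape (n_trials, n_channels, n_samples).
    """
    covs = [t @ t.T / np.trace(t @ t.T) for t in trials]
    return np.mean(covs, axis=0)

def csp_filters(Ca, Cb):
    """Real-valued CSP: jointly diagonalize two class covariances.

    Returns W (rows are spatial filters) and lam, the fraction of variance
    each filter assigns to class a (class b receives 1 - lam).
    """
    # Whiten the composite covariance Ca + Cb.
    evals, evecs = np.linalg.eigh(Ca + Cb)
    P = evecs @ np.diag(evals ** -0.5) @ evecs.T
    # Diagonalize the whitened class-a covariance.
    lam, B = np.linalg.eigh(P @ Ca @ P.T)
    return B.T @ P, lam

# Synthetic two-class data: class a is stronger on channel 0, class b on channel 1.
rng = np.random.default_rng(0)
scale_a = np.array([3.0, 1.0, 1.0])[:, None]
scale_b = np.array([1.0, 3.0, 1.0])[:, None]
trials_a = rng.standard_normal((20, 3, 200)) * scale_a
trials_b = rng.standard_normal((20, 3, 200)) * scale_b

Ca, Cb = mean_cov(trials_a), mean_cov(trials_b)
W, lam = csp_filters(Ca, Cb)
```

Filters at the two ends of `lam` give the largest variance contrast between the classes, so in practice the log-variance of a few such filtered signals is often used as the classification feature.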
Using these complex and multivariate algorithms, two cognitive brain studies are
investigated for a more natural and intuitive design of advanced BCI systems. Firstly,
a Yarbus-style auditory selective attention experiment is introduced to measure the user's
attention to a sound source among a mixture of sound stimuli, which is aimed at improving
the usefulness of hearing instruments such as hearing aids. Secondly, emotion experiments
elicited by taste and taste recall are examined to determine the pleasure and displeasure
of a food for the implementation of affective computing. The separation between two
emotional responses is examined using real and complex-valued common spatial pattern
methods.
Finally, we introduce a novel approach to brain monitoring based on EEG recordings
from within the ear canal, embedded on a custom-made hearing aid earplug. The new
platform promises the possibility of both short- and long-term continuous use for standard
brain monitoring and interfacing applications.