Discriminative Tandem Features for HMM-based EEG Classification
Abstract: We investigate the use of discriminative feature extractors in tandem configuration with a generative EEG classification system. Existing studies on dynamic EEG classification typically use hidden Markov models (HMMs), which lack discriminative capability. In this paper, a linear and a non-linear classifier are discriminatively trained to produce complementary input features to the conventional HMM system. Two sets of tandem features are derived from the linear discriminant analysis (LDA) projection output and the multilayer perceptron (MLP) class-posterior probability, before being appended to the standard autoregressive (AR) features. Evaluation on a two-class motor-imagery classification task shows that both proposed tandem features yield consistent gains over the AR baseline, resulting in significant relative improvements of 6.2% and 11.2% for the LDA and MLP features, respectively. We also explore the portability of these features across different subjects.
Index Terms: artificial neural network-hidden Markov models, EEG classification, brain-computer interface (BCI)
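The tandem construction, appending a discriminative projection to the generative AR features, can be sketched in a few lines. The toy example below uses random stand-in "EEG frames", Yule-Walker AR coefficients, and a two-class Fisher LDA direction; all data, sizes, and parameters are illustrative, not the paper's setup:

```python
import numpy as np

def ar_features(frame, order=4):
    """AR coefficients of one frame via the Yule-Walker equations."""
    r = np.correlate(frame, frame, mode="full")[len(frame) - 1:][:order + 1]
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    return np.linalg.solve(R, r[1:order + 1])

def fisher_lda_direction(X0, X1):
    """Two-class Fisher LDA direction: w = Sw^{-1} (m1 - m0)."""
    Sw = np.cov(X0.T) + np.cov(X1.T)
    return np.linalg.solve(Sw, X1.mean(0) - X0.mean(0))

rng = np.random.default_rng(0)
frames0 = rng.standard_normal((20, 128))        # toy class-0 "EEG" frames
frames1 = rng.standard_normal((20, 128)) + 0.5  # toy class-1 "EEG" frames

A0 = np.array([ar_features(f) for f in frames0])
A1 = np.array([ar_features(f) for f in frames1])
w = fisher_lda_direction(A0, A1)

# Tandem feature: the AR vector with its LDA projection appended.
tandem = np.array([np.append(a, a @ w) for a in A0])
print(tandem.shape)  # → (20, 5)
```

The MLP variant would append class-posterior probabilities instead of the scalar LDA projection; the concatenation step is the same.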
2-D Prony-Huang Transform: A New Tool for 2-D Spectral Analysis
This work proposes an extension of the 1-D Hilbert Huang transform for the
analysis of images. The proposed method consists in (i) adaptively decomposing
an image into oscillating parts called intrinsic mode functions (IMFs) using a
mode decomposition procedure, and (ii) providing a local spectral analysis of
the obtained IMFs in order to get the local amplitudes, frequencies, and
orientations. For the decomposition step, we propose two robust 2-D mode
decompositions based on non-smooth convex optimization: a "Genuine 2-D"
approach, which constrains the local extrema of the IMFs, and a "Pseudo 2-D"
approach, which separately constrains the extrema of lines, columns, and
diagonals. The spectral analysis step is based on the Prony annihilation
property, applied to small square patches of the IMFs. The resulting 2-D
Prony-Huang transform is validated on simulated and real data.
Comment: 24 pages, 7 figures
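The mode-decomposition step generalises the classical 1-D sifting iteration of the Hilbert-Huang transform, in which an IMF candidate is obtained by subtracting the mean of the upper and lower extremal envelopes. A rough 1-D illustration, with linear envelope interpolation standing in for the splines or the convex programs proposed here (purely a sketch, not the paper's 2-D method):

```python
import numpy as np

def sift_once(x):
    """One 1-D sifting step: subtract the mean of the upper and lower
    envelopes through the local extrema (linear interpolation here,
    where practical EMD uses splines)."""
    t = np.arange(len(x))
    maxima = [i for i in range(1, len(x) - 1) if x[i - 1] < x[i] > x[i + 1]]
    minima = [i for i in range(1, len(x) - 1) if x[i - 1] > x[i] < x[i + 1]]
    upper = np.interp(t, maxima, x[maxima])
    lower = np.interp(t, minima, x[minima])
    return x - 0.5 * (upper + lower)

t = np.linspace(0.0, 1.0, 500)
x = np.sin(2 * np.pi * 25 * t) + np.sin(2 * np.pi * 3 * t)  # fast + slow tone
imf = sift_once(x)  # first rough IMF candidate: mostly the fast tone
```

In the full algorithm this step is iterated until the candidate satisfies the IMF conditions, then subtracted and the process repeated on the residual.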
A Bayesian Network View on Acoustic Model-Based Techniques for Robust Speech Recognition
This article provides a unifying Bayesian network view on various approaches
for acoustic model adaptation, missing feature, and uncertainty decoding that
are well-known in the literature of robust automatic speech recognition. The
representatives of these classes can often be deduced from a Bayesian network
that extends the conventional hidden Markov models used in speech recognition.
These extensions, in turn, can in many cases be motivated from an underlying
observation model that relates clean and distorted feature vectors. By
converting the observation models into a Bayesian network representation, we
formulate the corresponding compensation rules leading to a unified view on
known derivations as well as to new formulations for certain approaches. The
generic Bayesian perspective provided in this contribution thus highlights
structural differences and similarities between the analyzed approaches.
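As a concrete instance of such an observation model, consider additive Gaussian distortion y = x + n: the resulting compensation rule shifts the mean of the clean-speech Gaussian and inflates its variance. A minimal numeric sketch with toy diagonal-covariance values (not taken from the article):

```python
import numpy as np

def log_gauss(y, mu, var):
    """Diagonal-covariance Gaussian log-likelihood."""
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (y - mu) ** 2 / var)

# Clean-speech state model of a toy HMM (diagonal covariance).
mu_x, var_x = np.array([0.0, 1.0]), np.array([1.0, 0.5])
# Additive distortion y = x + n with Gaussian n (toy values).
mu_n, var_n = np.array([0.2, -0.1]), np.array([0.3, 0.3])

y = np.array([0.5, 0.8])  # observed distorted feature vector

plain = log_gauss(y, mu_x, var_x)                       # no compensation
compensated = log_gauss(y, mu_x + mu_n, var_x + var_n)  # shifted mean, inflated variance
```

Uncertainty decoding follows the same pattern: the observation uncertainty enters the state likelihood as extra variance rather than as a point estimate.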
Stereophonic noise reduction using a combined sliding subspace projection and adaptive signal enhancement
A novel stereophonic noise reduction method is proposed. This method is based upon a combination of a subspace approach realized in a sliding-window operation and two-channel adaptive signal enhancement. The signal obtained from the signal subspace is used as the input signal to the adaptive signal enhancer for each channel, instead of noise, as in the ordinary adaptive noise canceling scheme. Simulation results based upon real stereophonic speech contaminated by noise components show that the proposed method gives improved enhancement quality in terms of both segmental gain and cepstral distance performance indices in comparison with conventional nonlinear spectral subtraction approaches.
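The two stages can be sketched roughly as follows, assuming a Hankel-matrix/truncated-SVD subspace step and an NLMS filter as the adaptive enhancer, on one channel only; all parameters are illustrative, not the paper's configuration:

```python
import numpy as np

def subspace_project(frame, rank=2):
    """Project a frame onto its dominant subspace: Hankel matrix,
    truncated SVD, then anti-diagonal averaging back to a signal."""
    L = len(frame) // 2
    H = np.array([frame[i:i + L] for i in range(len(frame) - L + 1)])
    U, s, Vt = np.linalg.svd(H, full_matrices=False)
    Hr = (U[:, :rank] * s[:rank]) @ Vt[:rank]
    out, counts = np.zeros(len(frame)), np.zeros(len(frame))
    for i in range(Hr.shape[0]):
        out[i:i + L] += Hr[i]
        counts[i:i + L] += 1
    return out / counts

def nlms_enhance(d, x, taps=8, mu=0.5):
    """NLMS enhancer: the subspace output x is the filter input and the
    noisy channel d is the desired signal (a signal, not noise, reference)."""
    w, y = np.zeros(taps), np.zeros(len(d))
    for n in range(taps, len(d)):
        u = x[n - taps:n][::-1]
        y[n] = w @ u
        w += mu * (d[n] - y[n]) * u / (u @ u + 1e-8)
    return y

t = np.arange(512)
clean = np.sin(2 * np.pi * t / 32)
noisy = clean + 0.5 * np.random.default_rng(1).standard_normal(512)
enhanced = nlms_enhance(noisy, subspace_project(noisy))
```

In the stereophonic setting, the subspace output would feed one such enhancer per channel, the point being that the reference input carries signal rather than noise.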
Similarity-and-Independence-Aware Beamformer: Method for Target Source Extraction using Magnitude Spectrogram as Reference
This study presents a novel method for source extraction, referred to as the
similarity-and-independence-aware beamformer (SIBF). The SIBF extracts the
target signal using a rough magnitude spectrogram as the reference signal. The
advantage of the SIBF is that it can obtain a more accurate target signal than
the spectrogram generated by target-enhancing methods such as speech
enhancement based on deep neural networks (DNNs). For the extraction, we extend
the framework of deflationary independent component analysis, by
considering the similarity between the reference and extracted target, as well
as the mutual independence of all potential sources. To solve the extraction
problem by maximum-likelihood estimation, we introduce two source model types
that can reflect the similarity. The experimental results from the CHiME3
dataset show that the target signal extracted by the SIBF is more accurate than
the reference signal generated by the DNN.
Index Terms: semiblind source separation, similarity-and-independence-aware
beamformer, deflationary independent component analysis, source model
Comment: Accepted in INTERSPEECH 202
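The idea of steering an ICA-style extraction with a magnitude reference can be caricatured in a few lines. The sketch below is not the SIBF algorithm: it whitens a toy two-source mixture and weights the sample covariance by an (oracle) magnitude reference, so that frames where the reference is large pull the dominant eigenvector toward the target direction; the SIBF's actual source models and maximum-likelihood deflation are more elaborate:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4000
# Toy sources: s0 is the target, with a heavy-tailed amplitude.
s0 = rng.choice([-1.0, 1.0], n) * rng.exponential(1.0, n)
s1 = rng.standard_normal(n)
X = np.array([[1.0, 0.6], [0.4, 1.0]]) @ np.vstack([s0, s1])  # mixtures

# Whitening, the standard ICA preprocessing step.
Xc = X - X.mean(axis=1, keepdims=True)
evals, evecs = np.linalg.eigh(np.cov(Xc))
Z = np.diag(evals ** -0.5) @ evecs.T @ Xc

# Reference-weighted covariance: frames where the rough magnitude
# reference is large dominate. (Oracle reference here; the paper
# obtains it from a DNN-based enhancer.)
ref = np.abs(s0)
C = (Z * ref) @ Z.T / n
w = np.linalg.eigh(C)[1][:, -1]
est = w @ Z  # extracted target, up to sign and scale
```

Even this crude weighting recovers the target direction, which conveys why a merely rough magnitude reference suffices to anchor the extraction.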
Convexity in source separation: Models, geometry, and algorithms
Source separation or demixing is the process of extracting multiple
components entangled within a signal. Contemporary signal processing presents a
host of difficult source separation problems, from interference cancellation to
background subtraction, blind deconvolution, and even dictionary learning.
Despite the recent progress in each of these applications, advances in
high-throughput sensor technology place demixing algorithms under pressure to
accommodate extremely high-dimensional signals, separate an ever larger number
of sources, and cope with more sophisticated signal and mixing models. These
difficulties are exacerbated by the need for real-time action in automated
decision-making systems.
Recent advances in convex optimization provide a simple framework for
efficiently solving numerous difficult demixing problems. This article provides
an overview of the emerging field, explains the theory that governs the
underlying procedures, and surveys algorithms that solve them efficiently. We
aim to equip practitioners with a toolkit for constructing their own demixing
algorithms that work, as well as concrete intuition for why they work.
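As a small worked instance of convex demixing, consider separating a component that is sparse in time from one that is sparse in an orthonormal DCT basis, by alternating soft-thresholding over the two blocks of an l1-regularised least-squares objective. This is a toy morphological-component-style example; all sizes and the regularisation weight are illustrative:

```python
import numpy as np

def soft(x, t):
    """Soft-thresholding, the proximal operator of the l1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

n = 256
i = np.arange(n)
# Orthonormal DCT-II basis (columns indexed by frequency).
D = np.cos(np.pi * (i[:, None] + 0.5) * i[None, :] / n) * np.sqrt(2.0 / n)
D[:, 0] /= np.sqrt(2.0)

rng = np.random.default_rng(3)
smooth = D[:, :5] @ rng.standard_normal(5)     # sparse in the DCT basis
spikes = np.zeros(n)
spikes[rng.choice(n, 8, replace=False)] = 3.0  # sparse in time
y = smooth + spikes

# Alternating exact minimisation of the convex demixing objective
#   0.5 * ||y - s - D c||^2 + lam * (||s||_1 + ||c||_1)
lam, s, c = 0.2, np.zeros(n), np.zeros(n)
for _ in range(200):
    c = soft(D.T @ (y - s), lam)  # smooth block (orthonormal basis: exact prox)
    s = soft(y - D @ c, lam)      # spike block
```

Because each component is incoherent with the other's basis, the convex program assigns the spikes to s and the smooth part to c; the article's geometric viewpoint makes this incoherence condition precise.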
End-to-End Probabilistic Inference for Nonstationary Audio Analysis
Accepted to the Thirty-sixth International Conference on Machine Learning (ICML) 2019
A typical audio signal processing pipeline includes multiple disjoint analysis stages, including calculation of a time-frequency representation followed by spectrogram-based feature analysis. We show how time-frequency analysis and nonnegative matrix factorisation can be jointly formulated as a spectral mixture Gaussian process model with nonstationary priors over the amplitude variance parameters. Further, we formulate this nonlinear model's state-space representation, making it amenable to infinite-horizon Gaussian process regression with approximate inference via expectation propagation, which scales linearly in the number of time steps and quadratically in the state dimensionality. By doing so, we are able to process audio signals with hundreds of thousands of data points. We demonstrate, on various tasks with empirical data, how this inference scheme outperforms more standard techniques that rely on extended Kalman filtering.
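The key computational point, linear cost in the number of time steps via a state-space representation, is the same one exploited by the classical Kalman filter. A generic sketch, using a crude Euler discretisation of a Matern-3/2 state-space model (not the paper's spectral mixture model or its expectation-propagation inference):

```python
import numpy as np

def kalman_filter(y, A, C, Q, R, m, P):
    """Kalman filter: O(T) in the number of steps and polynomial in the
    state dimension, which is what makes state-space GP models scale."""
    means = []
    for yt in y:
        m, P = A @ m, A @ P @ A.T + Q  # predict
        S = C @ P @ C.T + R            # innovation variance
        K = P @ C.T / S                # gain (scalar observation)
        m = m + K.ravel() * (yt - C @ m)
        P = P - np.outer(K, C @ P)
        means.append(m.copy())
    return np.array(means)

# Matern-3/2 GP as a 2-D state-space model, crudely Euler-discretised.
ell, dt, var = 1.0, 0.1, 1.0
lam = np.sqrt(3.0) / ell
F = np.array([[0.0, 1.0], [-lam ** 2, -2.0 * lam]])
A = np.eye(2) + dt * F
Q = np.diag([0.0, 4.0 * lam ** 3 * var * dt])
C = np.array([[1.0, 0.0]])

obs = (np.sin(np.linspace(0.0, 6.0, 60))
       + 0.1 * np.random.default_rng(4).standard_normal(60))
means = kalman_filter(obs, A, C, Q, np.array([[0.1]]), np.zeros(2), np.eye(2))
```

The paper replaces this linear-Gaussian recursion with approximate (expectation-propagation) updates, keeping the same per-step structure and hence the same linear scaling in T.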