
    Discriminative Tandem Features for HMM-based EEG Classification

    Abstract: We investigate the use of discriminative feature extractors in tandem configuration with a generative EEG classification system. Existing studies on dynamic EEG classification typically use hidden Markov models (HMMs), which lack discriminative capability. In this paper, a linear and a non-linear classifier are discriminatively trained to produce complementary input features for the conventional HMM system. Two sets of tandem features are derived from the linear discriminant analysis (LDA) projection output and the multilayer perceptron (MLP) class-posterior probability, before being appended to the standard autoregressive (AR) features. Evaluation on a two-class motor-imagery classification task shows that both proposed tandem features yield consistent gains over the AR baseline, resulting in significant relative improvements of 6.2% and 11.2% for the LDA and MLP features, respectively. We also explore the portability of these features across different subjects. Index Terms: artificial neural network-hidden Markov models, EEG classification, brain-computer interface (BCI)
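
The tandem construction described in this abstract (a discriminative projection appended to a base feature vector) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the function names, the toy 2-D frames standing in for AR features, the small ridge term, and the use of a plain two-class Fisher LDA direction. It is not the authors' implementation, and the MLP posterior branch is not shown.

```python
# Illustrative sketch only: names, toy data, and the ridge term are assumptions,
# not taken from the paper.

def mean(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def scatter(rows, mu):
    """Scatter matrix: sum of outer products of mean-centred vectors."""
    d = len(mu)
    s = [[0.0] * d for _ in range(d)]
    for r in rows:
        diff = [x - m for x, m in zip(r, mu)]
        for i in range(d):
            for j in range(d):
                s[i][j] += diff[i] * diff[j]
    return s

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][k] * x[k] for k in range(i + 1, n))) / m[i][i]
    return x

def fit_lda_direction(class0, class1, ridge=1e-6):
    """Two-class Fisher LDA direction w = Sw^{-1} (mu1 - mu0)."""
    mu0, mu1 = mean(class0), mean(class1)
    s0, s1 = scatter(class0, mu0), scatter(class1, mu1)
    d = len(mu0)
    # Within-class scatter, lightly regularised so the solve stays well posed.
    sw = [[s0[i][j] + s1[i][j] + (ridge if i == j else 0.0) for j in range(d)]
          for i in range(d)]
    return solve(sw, [b - a for a, b in zip(mu0, mu1)])

def tandem_features(frames, w):
    """Append the scalar LDA projection to each base feature vector."""
    return [f + [sum(wi * xi for wi, xi in zip(w, f))] for f in frames]

# Toy frames standing in for AR feature vectors of two classes.
class0 = [[0.0, 0.0], [1.0, 0.2], [0.5, -0.1]]
class1 = [[3.0, 1.0], [4.0, 1.2], [3.5, 0.9]]
w = fit_lda_direction(class0, class1)
augmented = tandem_features(class0 + class1, w)
```

Each augmented vector keeps the original base features and gains one extra dimension, so a downstream generative model sees both the raw and the discriminatively projected information.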

    Non-Gaussian, Non-stationary and Nonlinear Signal Processing Methods - with Applications to Speech Processing and Channel Estimation


    2-D Prony-Huang Transform: A New Tool for 2-D Spectral Analysis

    This work proposes an extension of the 1-D Hilbert-Huang transform for the analysis of images. The proposed method consists in (i) adaptively decomposing an image into oscillating parts called intrinsic mode functions (IMFs) using a mode decomposition procedure, and (ii) providing a local spectral analysis of the obtained IMFs in order to get the local amplitudes, frequencies, and orientations. For the decomposition step, we propose two robust 2-D mode decompositions based on non-smooth convex optimization: a "Genuine 2-D" approach, which constrains the local extrema of the IMFs, and a "Pseudo 2-D" approach, which separately constrains the extrema of lines, columns, and diagonals. The spectral analysis step is based on the Prony annihilation property, which is applied to small square patches of the IMFs. The resulting 2-D Prony-Huang transform is validated on simulated and real data. Comment: 24 pages, 7 figures

    A Bayesian Network View on Acoustic Model-Based Techniques for Robust Speech Recognition

    This article provides a unifying Bayesian network view on various approaches for acoustic model adaptation, missing feature, and uncertainty decoding that are well known in the literature of robust automatic speech recognition. The representatives of these classes can often be deduced from a Bayesian network that extends the conventional hidden Markov models used in speech recognition. These extensions, in turn, can in many cases be motivated from an underlying observation model that relates clean and distorted feature vectors. By converting the observation models into a Bayesian network representation, we formulate the corresponding compensation rules, leading to a unified view on known derivations as well as to new formulations for certain approaches. The generic Bayesian perspective provided in this contribution thus highlights structural differences and similarities between the analyzed approaches.

    Stereophonic noise reduction using a combined sliding subspace projection and adaptive signal enhancement

    A novel stereophonic noise reduction method is proposed. This method is based upon a combination of a subspace approach realized in a sliding window operation and two-channel adaptive signal enhancement. For each channel, the signal obtained from the signal subspace is used as the input to the adaptive signal enhancer, instead of the noise reference used in the ordinary adaptive noise canceling scheme. Simulation results based upon real stereophonic speech contaminated by noise components show that the proposed method gives improved enhancement quality, in terms of both segmental gain and cepstral distance performance indices, in comparison with conventional nonlinear spectral subtraction approaches.

    Similarity-and-Independence-Aware Beamformer: Method for Target Source Extraction using Magnitude Spectrogram as Reference

    This study presents a novel method for source extraction, referred to as the similarity-and-independence-aware beamformer (SIBF). The SIBF extracts the target signal using a rough magnitude spectrogram as the reference signal. The advantage of the SIBF is that it can obtain a target signal that is more accurate than the spectrogram generated by target-enhancing methods such as speech enhancement based on deep neural networks (DNNs). For the extraction, we extend the framework of deflationary independent component analysis by considering the similarity between the reference and the extracted target, as well as the mutual independence of all potential sources. To solve the extraction problem by maximum-likelihood estimation, we introduce two source model types that can reflect the similarity. The experimental results from the CHiME3 dataset show that the target signal extracted by the SIBF is more accurate than the reference signal generated by the DNN. Index Terms: semiblind source separation, similarity-and-independence-aware beamformer, deflationary independent component analysis, source model. Comment: Accepted in INTERSPEECH 202

    Convexity in source separation: Models, geometry, and algorithms

    Source separation or demixing is the process of extracting multiple components entangled within a signal. Contemporary signal processing presents a host of difficult source separation problems, from interference cancellation to background subtraction, blind deconvolution, and even dictionary learning. Despite the recent progress in each of these applications, advances in high-throughput sensor technology place demixing algorithms under pressure to accommodate extremely high-dimensional signals, separate an ever larger number of sources, and cope with more sophisticated signal and mixing models. These difficulties are exacerbated by the need for real-time action in automated decision-making systems. Recent advances in convex optimization provide a simple framework for efficiently solving numerous difficult demixing problems. This article provides an overview of the emerging field, explains the theory that governs the underlying procedures, and surveys algorithms that solve them efficiently. We aim to equip practitioners with a toolkit for constructing their own demixing algorithms that work, as well as concrete intuition for why they work.

    End-to-End Probabilistic Inference for Nonstationary Audio Analysis

    Accepted to the Thirty-sixth International Conference on Machine Learning (ICML) 2019. A typical audio signal processing pipeline includes multiple disjoint analysis stages, including calculation of a time-frequency representation followed by spectrogram-based feature analysis. We show how time-frequency analysis and nonnegative matrix factorisation can be jointly formulated as a spectral mixture Gaussian process model with nonstationary priors over the amplitude variance parameters. Further, we formulate this nonlinear model's state space representation, making it amenable to infinite-horizon Gaussian process regression with approximate inference via expectation propagation, which scales linearly in the number of time steps and quadratically in the state dimensionality. By doing so, we are able to process audio signals with hundreds of thousands of data points. We demonstrate, on various tasks with empirical data, how this inference scheme outperforms more standard techniques that rely on extended Kalman filtering.