8,181 research outputs found

    Discriminative Tandem Features for HMM-based EEG Classification

    Get PDF
    Abstract—We investigate the use of discriminative feature extractors in tandem configuration with generative EEG classification system. Existing studies on dynamic EEG classification typically use hidden Markov models (HMMs) which lack discriminative capability. In this paper, a linear and a non-linear classifier are discriminatively trained to produce complementary input features to the conventional HMM system. Two sets of tandem features are derived from linear discriminant analysis (LDA) projection output and multilayer perceptron (MLP) class-posterior probability, before appended to the standard autoregressive (AR) features. Evaluation on a two-class motor-imagery classification task shows that both the proposed tandem features yield consistent gains over the AR baseline, resulting in significant relative improvement of 6.2% and 11.2 % for the LDA and MLP features respectively. We also explore portability of these features across different subjects. Index Terms- Artificial neural network-hidden Markov models, EEG classification, brain-computer-interface (BCI)

    Exploring the Encoding Layer and Loss Function in End-to-End Speaker and Language Recognition System

    Full text link
    In this paper, we explore the encoding/pooling layer and loss function in the end-to-end speaker and language recognition system. First, a unified and interpretable end-to-end system for both speaker and language recognition is developed. It accepts variable-length input and produces an utterance level result. In the end-to-end system, the encoding layer plays a role in aggregating the variable-length input sequence into an utterance level representation. Besides the basic temporal average pooling, we introduce a self-attentive pooling layer and a learnable dictionary encoding layer to get the utterance level representation. In terms of loss function for open-set speaker verification, to get more discriminative speaker embedding, center loss and angular softmax loss is introduced in the end-to-end system. Experimental results on Voxceleb and NIST LRE 07 datasets show that the performance of end-to-end learning system could be significantly improved by the proposed encoding layer and loss function.Comment: Accepted for Speaker Odyssey 201

    NPLDA: A Deep Neural PLDA Model for Speaker Verification

    Full text link
    The state-of-art approach for speaker verification consists of a neural network based embedding extractor along with a backend generative model such as the Probabilistic Linear Discriminant Analysis (PLDA). In this work, we propose a neural network approach for backend modeling in speaker recognition. The likelihood ratio score of the generative PLDA model is posed as a discriminative similarity function and the learnable parameters of the score function are optimized using a verification cost. The proposed model, termed as neural PLDA (NPLDA), is initialized using the generative PLDA model parameters. The loss function for the NPLDA model is an approximation of the minimum detection cost function (DCF). The speaker recognition experiments using the NPLDA model are performed on the speaker verificiation task in the VOiCES datasets as well as the SITW challenge dataset. In these experiments, the NPLDA model optimized using the proposed loss function improves significantly over the state-of-art PLDA based speaker verification system.Comment: Published in Odyssey 2020, the Speaker and Language Recognition Workshop (VOiCES Special Session). Link to GitHub Implementation: https://github.com/iiscleap/NeuralPlda. arXiv admin note: substantial text overlap with arXiv:2001.0703

    Learning Adaptive Discriminative Correlation Filters via Temporal Consistency Preserving Spatial Feature Selection for Robust Visual Tracking

    Get PDF
    With efficient appearance learning models, Discriminative Correlation Filter (DCF) has been proven to be very successful in recent video object tracking benchmarks and competitions. However, the existing DCF paradigm suffers from two major issues, i.e., spatial boundary effect and temporal filter degradation. To mitigate these challenges, we propose a new DCF-based tracking method. The key innovations of the proposed method include adaptive spatial feature selection and temporal consistent constraints, with which the new tracker enables joint spatial-temporal filter learning in a lower dimensional discriminative manifold. More specifically, we apply structured spatial sparsity constraints to multi-channel filers. Consequently, the process of learning spatial filters can be approximated by the lasso regularisation. To encourage temporal consistency, the filter model is restricted to lie around its historical value and updated locally to preserve the global structure in the manifold. Last, a unified optimisation framework is proposed to jointly select temporal consistency preserving spatial features and learn discriminative filters with the augmented Lagrangian method. Qualitative and quantitative evaluations have been conducted on a number of well-known benchmarking datasets such as OTB2013, OTB50, OTB100, Temple-Colour, UAV123 and VOT2018. The experimental results demonstrate the superiority of the proposed method over the state-of-the-art approaches
    corecore