Text-Independent Speaker Verification Using 3D Convolutional Neural Networks
This paper proposes a novel method for text-independent speaker verification
based on a 3D Convolutional Neural Network (3D-CNN) architecture. One of the
main challenges is the creation of the speaker models. Most
of the previously-reported approaches create speaker models based on averaging
the extracted features from utterances of the speaker, which is known as the
d-vector system. We instead propose adaptive feature learning with 3D-CNNs for
direct speaker model creation: in both the development and enrollment phases,
an identical number of spoken utterances per speaker is fed to the network to
represent the speaker's utterances and create the speaker model. This
simultaneously captures speaker-related information and builds a system that
is more robust to within-speaker variation. We demonstrate that the proposed
method significantly
outperforms the traditional d-vector verification system. Moreover, by
utilizing 3D-CNNs, the proposed system can serve as a one-shot speaker
modeling alternative to the traditional d-vector system.
Comment: Accepted to be published in IEEE International Conference on
Multimedia and Expo (ICME) 201
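For orientation, the averaging-based d-vector baseline that the abstract
contrasts against can be sketched as follows. This is a minimal illustration,
not the authors' implementation: it assumes per-utterance embeddings have
already been extracted by some network, and the function names
(`enroll_speaker`, `verification_score`) and the cosine scoring rule are
assumptions made for the example.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length, guarding against division by zero."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)

def enroll_speaker(utterance_embeddings):
    """Build a speaker model by averaging per-utterance embeddings.

    Hypothetical helper: each row is the embedding of one enrollment
    utterance, produced by some feature extractor not shown here.
    """
    emb = l2_normalize(np.asarray(utterance_embeddings, dtype=float))
    return l2_normalize(emb.mean(axis=0))

def verification_score(speaker_model, test_embedding):
    """Cosine similarity between the speaker model and a test embedding."""
    test = l2_normalize(np.asarray(test_embedding, dtype=float))
    return float(np.dot(l2_normalize(speaker_model), test))
```

A verification decision would then compare the score to a threshold chosen on
held-out data. The abstract's proposal differs in that the network consumes a
fixed number of utterances jointly rather than averaging their features
afterwards.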
Task-Driven Dictionary Learning for Hyperspectral Image Classification with Structured Sparsity Constraints
Sparse representation models a signal as a linear combination of a small
number of dictionary atoms. As a generative model, it requires the dictionary
to be highly redundant in order to ensure both a stable high sparsity level and
a low reconstruction error for the signal. In practice, however, this
requirement is often hard to satisfy because labelled training samples are scarce.
Fortunately, previous research has shown that the requirement for a redundant
dictionary can be less rigorous if simultaneous sparse approximation is
employed, which can be carried out by enforcing various structured sparsity
constraints on the sparse codes of the neighboring pixels. In addition,
numerous works have shown that applying dictionary learning methods to the
sparse representation model can also improve classification performance. In
this paper, we focus on the task-driven dictionary learning algorithm, a
general framework for supervised dictionary learning. We propose to enforce
structured sparsity priors on the task-driven dictionary learning method in
order to improve hyperspectral classification performance. Our approach
benefits from the advantages of both simultaneous sparse representation and
supervised dictionary learning. We enforce two different structured sparsity
priors, the joint and Laplacian sparsity, on the task-driven dictionary
learning method and provide the details of the corresponding optimization
algorithms. Experiments on numerous popular hyperspectral images demonstrate
that the classification performance of our approach is superior to that of
the sparse representation classifier with structured priors and to that of
the task-driven dictionary learning method.
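As a rough illustration of the joint sparsity idea mentioned above, where
neighboring pixels are coded with a shared set of dictionary atoms, the
sketch below uses simultaneous orthogonal matching pursuit, a standard greedy
method. It is not the task-driven algorithm of the paper; the dictionary `D`
is assumed to be given, and the function name `somp` and its parameters are
chosen for the example.

```python
import numpy as np

def somp(D, X, k):
    """Simultaneous OMP: code all columns of X with one shared atom set.

    D : (n_features, n_atoms) dictionary, assumed given.
    X : (n_features, n_pixels) signals, e.g. spectra of neighboring pixels.
    k : maximum number of atoms shared by all columns.
    Returns the code matrix and the sorted indices of the selected atoms.
    """
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    residual = X.copy()
    support = []
    codes = np.zeros((D.shape[1], X.shape[1]))
    for _ in range(k):
        # Score each atom by its correlation with the residuals of all pixels.
        scores = np.linalg.norm(D.T @ residual, axis=1)
        if support:
            scores[support] = -1.0  # never select the same atom twice
        support.append(int(np.argmax(scores)))
        # Refit all coefficients on the current support by least squares.
        sol, *_ = np.linalg.lstsq(D[:, support], X, rcond=None)
        residual = X - D[:, support] @ sol
        codes[:] = 0.0
        codes[support] = sol
        if np.linalg.norm(residual) < 1e-10:
            break
    return codes, sorted(support)
```

Because every column is refit on the same support, the nonzero rows of the
code matrix coincide across pixels, which is the row-sparse pattern that a
joint sparsity prior encourages.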
Semi-supervised Multi-sensor Classification via Consensus-based Multi-View Maximum Entropy Discrimination
In this paper, we consider multi-sensor classification when there is a large
number of unlabeled samples. The problem is formulated under the multi-view
learning framework and a Consensus-based Multi-View Maximum Entropy
Discrimination (CMV-MED) algorithm is proposed. By iteratively maximizing the
stochastic agreement between multiple classifiers on the unlabeled dataset, the
algorithm simultaneously learns multiple high-accuracy classifiers. We
demonstrate that our proposed method can yield improved performance over
previous multi-view learning approaches by comparing performance on three real
multi-sensor data sets.
Comment: 5 pages, 4 figures, Accepted in 40th IEEE International Conference on
Acoustics, Speech and Signal Processing (ICASSP 15
