1,322 research outputs found
Exploiting Low-dimensional Structures to Enhance DNN Based Acoustic Modeling in Speech Recognition
We propose to model the acoustic space of deep neural network (DNN)
class-conditional posterior probabilities as a union of low-dimensional
subspaces. To that end, the training posteriors are used for dictionary
learning and sparse coding. Sparse representation of the test posteriors using
this dictionary enables projection to the space of training data. Relying on
the fact that the intrinsic dimensions of the posterior subspaces are indeed
very small and the matrix of all posteriors belonging to a class has a very low
rank, we demonstrate how low-dimensional structures enable further enhancement
of the posteriors and rectify the spurious errors due to mismatch conditions.
The enhanced acoustic modeling method leads to improvements in continuous
speech recognition task using hybrid DNN-HMM (hidden Markov model) framework in
both clean and noisy conditions, where upto 15.4% relative reduction in word
error rate (WER) is achieved
Learning Sparse Adversarial Dictionaries For Multi-Class Audio Classification
Audio events are quite often overlapping in nature, and more prone to noise
than visual signals. There has been increasing evidence for the superior
performance of representations learned using sparse dictionaries for
applications like audio denoising and speech enhancement. This paper
concentrates on modifying the traditional reconstructive dictionary learning
algorithms, by incorporating a discriminative term into the objective function
in order to learn class-specific adversarial dictionaries that are good at
representing samples of their own class at the same time poor at representing
samples belonging to any other class. We quantitatively demonstrate the
effectiveness of our learned dictionaries as a stand-alone solution for both
binary as well as multi-class audio classification problems.Comment: Accepted in Asian Conference of Pattern Recognition (ACPR-2017
DFDL: Discriminative Feature-oriented Dictionary Learning for Histopathological Image Classification
In histopathological image analysis, feature extraction for classification is
a challenging task due to the diversity of histology features suitable for each
problem as well as presence of rich geometrical structure. In this paper, we
propose an automatic feature discovery framework for extracting discriminative
class-specific features and present a low-complexity method for classification
and disease grading in histopathology. Essentially, our Discriminative
Feature-oriented Dictionary Learning (DFDL) method learns class-specific
features which are suitable for representing samples from the same class while
are poorly capable of representing samples from other classes. Experiments on
three challenging real-world image databases: 1) histopathological images of
intraductal breast lesions, 2) mammalian lung images provided by the Animal
Diagnostics Lab (ADL) at Pennsylvania State University, and 3) brain tumor
images from The Cancer Genome Atlas (TCGA) database, show the significance of
DFDL model in a variety problems over state-of-the-art methodsComment: Accepted to IEEE International Symposium on Biomedical Imaging
(ISBI), 201
Extrinsic Methods for Coding and Dictionary Learning on Grassmann Manifolds
Sparsity-based representations have recently led to notable results in
various visual recognition tasks. In a separate line of research, Riemannian
manifolds have been shown useful for dealing with features and models that do
not lie in Euclidean spaces. With the aim of building a bridge between the two
realms, we address the problem of sparse coding and dictionary learning over
the space of linear subspaces, which form Riemannian structures known as
Grassmann manifolds. To this end, we propose to embed Grassmann manifolds into
the space of symmetric matrices by an isometric mapping. This in turn enables
us to extend two sparse coding schemes to Grassmann manifolds. Furthermore, we
propose closed-form solutions for learning a Grassmann dictionary, atom by
atom. Lastly, to handle non-linearity in data, we extend the proposed Grassmann
sparse coding and dictionary learning algorithms through embedding into Hilbert
spaces.
Experiments on several classification tasks (gender recognition, gesture
classification, scene analysis, face recognition, action recognition and
dynamic texture classification) show that the proposed approaches achieve
considerable improvements in discrimination accuracy, in comparison to
state-of-the-art methods such as kernelized Affine Hull Method and
graph-embedding Grassmann discriminant analysis.Comment: Appearing in International Journal of Computer Visio
- …