2,142 research outputs found
Fast Dictionary Learning for Sparse Representations of Speech Signals
© 2011 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. Published version: IEEE Journal of Selected Topics in Signal Processing 5(5): 1025-1031, Sep 2011. DOI: 10.1109/JSTSP.2011.2157892
Decoding the Encoding of Functional Brain Networks: an fMRI Classification Comparison of Non-negative Matrix Factorization (NMF), Independent Component Analysis (ICA), and Sparse Coding Algorithms
Brain networks in fMRI are typically identified using spatial independent
component analysis (ICA), yet mathematical constraints such as sparse coding
and positivity both provide alternate biologically-plausible frameworks for
generating brain networks. Non-negative Matrix Factorization (NMF) would
suppress negative BOLD signal by enforcing positivity. Spatial sparse coding
algorithms ( Regularized Learning and K-SVD) would impose local
specialization and a discouragement of multitasking, where the total observed
activity in a single voxel originates from a restricted number of possible
brain networks.
The assumptions of independence, positivity, and sparsity to encode
task-related brain networks are compared; the resulting brain networks for
different constraints are used as basis functions to encode the observed
functional activity at a given time point. These encodings are decoded using
machine learning to compare both the algorithms and their assumptions, using
the time series weights to predict whether a subject is viewing a video,
listening to an audio cue, or at rest, in 304 fMRI scans from 51 subjects.
For classifying cognitive activity, the sparse coding algorithm of
Regularized Learning consistently outperformed 4 variations of ICA across
different numbers of networks and noise levels (p0.001). The NMF algorithms,
which suppressed negative BOLD signal, had the poorest accuracy. Within each
algorithm, encodings using sparser spatial networks (containing more
zero-valued voxels) had higher classification accuracy (p0.001). The success
of sparse coding algorithms may suggest that algorithms which enforce sparse
coding, discourage multitasking, and promote local specialization may capture
better the underlying source processes than those which allow inexhaustible
local processes such as ICA
Task-Driven Dictionary Learning
Modeling data with linear combinations of a few elements from a learned
dictionary has been the focus of much recent research in machine learning,
neuroscience and signal processing. For signals such as natural images that
admit such sparse representations, it is now well established that these models
are well suited to restoration tasks. In this context, learning the dictionary
amounts to solving a large-scale matrix factorization problem, which can be
done efficiently with classical optimization tools. The same approach has also
been used for learning features from data for other purposes, e.g., image
classification, but tuning the dictionary in a supervised way for these tasks
has proven to be more difficult. In this paper, we present a general
formulation for supervised dictionary learning adapted to a wide variety of
tasks, and present an efficient algorithm for solving the corresponding
optimization problem. Experiments on handwritten digit classification, digital
art identification, nonlinear inverse image problems, and compressed sensing
demonstrate that our approach is effective in large-scale settings, and is well
suited to supervised and semi-supervised classification, as well as regression
tasks for data that admit sparse representations.Comment: final draft post-refereein
Exploiting Low-dimensional Structures to Enhance DNN Based Acoustic Modeling in Speech Recognition
We propose to model the acoustic space of deep neural network (DNN)
class-conditional posterior probabilities as a union of low-dimensional
subspaces. To that end, the training posteriors are used for dictionary
learning and sparse coding. Sparse representation of the test posteriors using
this dictionary enables projection to the space of training data. Relying on
the fact that the intrinsic dimensions of the posterior subspaces are indeed
very small and the matrix of all posteriors belonging to a class has a very low
rank, we demonstrate how low-dimensional structures enable further enhancement
of the posteriors and rectify the spurious errors due to mismatch conditions.
The enhanced acoustic modeling method leads to improvements in continuous
speech recognition task using hybrid DNN-HMM (hidden Markov model) framework in
both clean and noisy conditions, where upto 15.4% relative reduction in word
error rate (WER) is achieved
- …