Learning Dynamic Feature Selection for Fast Sequential Prediction
We present paired learning and inference algorithms for significantly
reducing computation and increasing speed of the vector dot products in the
classifiers that are at the heart of many NLP components. This is accomplished
by partitioning the features into a sequence of templates which are ordered
such that high confidence can often be reached using only a small fraction of
all features. Parameter estimation is arranged to maximize accuracy and early
confidence in this sequence. Our approach is simpler and better suited to NLP
than other related cascade methods. We present experiments in left-to-right
part-of-speech tagging, named entity recognition, and transition-based
dependency parsing. On the typical benchmarking datasets we can preserve POS
tagging accuracy above 97% and parsing LAS above 88.5% both with over a
five-fold reduction in run-time, and NER F1 above 88 with more than 2x increase
in speed. Comment: Appears in the 53rd Annual Meeting of the Association for
Computational Linguistics, Beijing, China, July 2015
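The inference idea in this abstract (score feature templates in order and stop once confidence is high enough) can be sketched as follows. This is a minimal illustration, not the paper's algorithm: the function name `predict_with_early_stop`, the per-template score dictionaries, and the fixed `margin` stopping rule are all assumptions made for the example, and the paired parameter estimation step is not shown.

```python
from typing import Dict, Sequence, Tuple


def predict_with_early_stop(
    template_scores: Sequence[Dict[str, float]],
    margin: float,
) -> Tuple[str, int]:
    """Accumulate per-template label scores in order and stop early.

    Each element of `template_scores` holds the partial dot product
    contributed by one feature template, keyed by label (hypothetical
    layout). Scoring stops as soon as the best label leads the
    runner-up by at least `margin`.

    Returns the predicted label and the number of templates consumed.
    """
    if not template_scores:
        raise ValueError("need at least one template")
    totals: Dict[str, float] = {}
    used = 0
    for scores in template_scores:
        used += 1
        for label, value in scores.items():
            totals[label] = totals.get(label, 0.0) + value
        ranked = sorted(totals.values(), reverse=True)
        # Confident enough: skip the remaining (more expensive) templates.
        if len(ranked) >= 2 and ranked[0] - ranked[1] >= margin:
            break
    best = max(totals, key=lambda label: totals[label])
    return best, used
```

With a small margin the first template may already decide the label; a larger margin forces more templates to be evaluated, trading speed for confidence.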
Submodular meets Spectral: Greedy Algorithms for Subset Selection, Sparse Approximation and Dictionary Selection
We study the problem of selecting a subset of k random variables from a large
set, in order to obtain the best linear prediction of another variable of
interest. This problem can be viewed in the context of both feature selection
and sparse approximation. We analyze the performance of widely used greedy
heuristics, using insights from the maximization of submodular functions and
spectral analysis. We introduce the submodularity ratio as a key quantity to
help understand why greedy algorithms perform well even when the variables are
highly correlated. Using our techniques, we obtain the strongest known
approximation guarantees for this problem, both in terms of the submodularity
ratio and the smallest k-sparse eigenvalue of the covariance matrix. We further
demonstrate the wide applicability of our techniques by analyzing greedy
algorithms for the dictionary selection problem, and significantly improve the
previously known guarantees. Our theoretical analysis is complemented by
experiments on real-world and synthetic data sets; the experiments show that
the submodularity ratio is a stronger predictor of the performance of greedy
algorithms than other spectral parameters.
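For orientation, the submodularity ratio named in this abstract is usually written as below. The abstract itself does not give the definition, so the notation (a nonnegative set function $f$, a ground set $U$, and a size parameter $k$) is an assumption taken from the common statement of the concept and should be checked against the paper.

```latex
\gamma_{U,k}(f) \;=\;
\min_{\substack{L \subseteq U,\; S \,:\, |S| \le k,\\ S \cap L = \emptyset}}
\frac{\sum_{x \in S} \bigl( f(L \cup \{x\}) - f(L) \bigr)}
     {f(L \cup S) - f(L)}
```

Read informally, the ratio compares the summed gains of adding the elements of $S$ one at a time with the gain of adding all of $S$ at once. Guarantees for greedy selection in this line of work typically take the form $f(S_{\text{greedy}}) \ge (1 - e^{-\gamma}) \cdot \mathrm{OPT}$, so a ratio close to 1 means the greedy choice behaves almost as it would for a submodular objective.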
OMP-type Algorithm with Structured Sparsity Patterns for Multipath Radar Signals
A transmitted, unknown radar signal is observed at the receiver through more
than one path in additive noise. The aim is to recover the waveform of the
intercepted signal and to simultaneously estimate the direction of arrival
(DOA). We propose an approach exploiting the parsimonious time-frequency
representation of the signal by applying a new OMP-type algorithm for
structured sparsity patterns. An important issue is the scalability of the
proposed algorithm since high-dimensional models shall be used for radar
signals. Monte-Carlo simulations for modulated signals illustrate the good
performance of the method even for low signal-to-noise ratios and a gain of 20
dB for the DOA estimation compared to some elementary method.
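The abstract does not spell out its structured-sparsity variant, so the sketch below shows only plain orthogonal matching pursuit (OMP), the baseline such algorithms extend. The names `omp`, `solve`, and `dot`, and the list-of-columns layout of the dictionary, are illustrative assumptions. The least-squares step solves the normal equations directly, which is adequate for small, well-conditioned examples only.

```python
from typing import List, Tuple

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def solve(A: List[Vector], b: Vector) -> Vector:
    """Solve a small square system by Gaussian elimination with pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def omp(atoms: List[Vector], y: Vector, n_nonzero: int) -> Tuple[List[int], Vector]:
    """Plain OMP: `atoms` are the dictionary columns, `y` the observed signal.

    Returns the selected atom indices and their least-squares coefficients.
    """
    residual = y[:]
    support: List[int] = []
    coeffs: Vector = []
    for _ in range(min(n_nonzero, len(atoms))):
        # Greedy step: pick the unused atom most correlated with the residual.
        candidates = [j for j in range(len(atoms)) if j not in support]
        best = max(candidates, key=lambda j: abs(dot(atoms[j], residual)))
        support.append(best)
        # Orthogonal step: refit all selected atoms jointly (normal equations).
        gram = [[dot(atoms[a], atoms[b]) for b in support] for a in support]
        rhs = [dot(atoms[a], y) for a in support]
        coeffs = solve(gram, rhs)
        approx = [
            sum(c * atoms[s][i] for c, s in zip(coeffs, support))
            for i in range(len(y))
        ]
        residual = [yi - ai for yi, ai in zip(y, approx)]
    return support, coeffs
```

A structured variant would replace the single-atom greedy step with a choice over groups or patterns of atoms; how the paper defines those patterns is not stated in the abstract.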
Ultra-high Dimensional Multiple Output Learning With Simultaneous Orthogonal Matching Pursuit: A Sure Screening Approach
We propose a novel application of the Simultaneous Orthogonal Matching
Pursuit (S-OMP) procedure for sparsistent variable selection in ultra-high
dimensional multi-task regression problems. Screening of variables, as
introduced in \cite{fan08sis}, is an efficient and highly scalable way to
remove many irrelevant variables from the set of all variables, while retaining
all the relevant variables. S-OMP can be applied to problems with hundreds of
thousands of variables and once the number of variables is reduced to a
manageable size, a more computationally demanding procedure can be used to
identify the relevant variables for each of the regression outputs. To our
knowledge, this is the first attempt to utilize relatedness of multiple outputs
to perform fast screening of relevant variables. As our main theoretical
contribution, we prove that, asymptotically, S-OMP is guaranteed to reduce an
ultra-high number of variables to below the sample size without losing true
relevant variables. We also provide formal evidence that a modified Bayesian
information criterion (BIC) can be used to efficiently determine the number of
iterations in S-OMP. We further provide empirical evidence on the benefit of
variable selection using multiple regression outputs jointly, as opposed to
performing variable selection for each output separately. The finite sample
performance of S-OMP is demonstrated on extensive simulation studies, and on a
genetic association mapping problem.
Keywords: Adaptive Lasso; Greedy forward regression; Orthogonal matching
pursuit; Multi-output regression; Multi-task learning; Simultaneous orthogonal
matching pursuit; Sure screening; Variable selection
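A generic form of the simultaneous selection step can be written as below. This is a sketch under assumptions rather than the paper's exact procedure: it assumes a design matrix $X$ with columns $x_j$ shared by all $K$ outputs $y_1, \dots, y_K$, and it leaves the norm index $q$ open because formulations of S-OMP differ in how they aggregate correlations across outputs.

```latex
\begin{aligned}
j_t &= \arg\max_{j \notin S_{t-1}}
       \bigl\| x_j^{\top} R^{(t-1)} \bigr\|_q ,
       \qquad R^{(t-1)} = \bigl[ r_1^{(t-1)}, \dots, r_K^{(t-1)} \bigr], \\
S_t &= S_{t-1} \cup \{ j_t \}, \\
r_k^{(t)} &= y_k - X_{S_t} \bigl( X_{S_t}^{\top} X_{S_t} \bigr)^{-1}
             X_{S_t}^{\top} y_k , \qquad k = 1, \dots, K .
\end{aligned}
```

The point of the joint criterion is that a variable weakly correlated with each single output can still be selected when its correlations add up across outputs, which is the benefit of shared screening that the abstract reports.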
Audio Inpainting
(c) 2012 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works. Published version: IEEE Transactions on Audio, Speech, and Language Processing 20(3): 922-932, Mar 2012. DOI: 10.1109/TASL.2011.2168211
Sparse Modeling for Image and Vision Processing
In recent years, a large amount of multi-disciplinary research has been
conducted on sparse models and their applications. In statistics and machine
learning, the sparsity principle is used to perform model selection---that is,
automatically selecting a simple model among a large collection of them. In
signal processing, sparse coding consists of representing data with linear
combinations of a few dictionary elements. Subsequently, the corresponding
tools have been widely adopted by several scientific communities such as
neuroscience, bioinformatics, or computer vision. The goal of this monograph is
to offer a self-contained view of sparse modeling for visual recognition and
image processing. More specifically, we focus on applications where the
dictionary is learned and adapted to data, yielding a compact representation
that has been successful in various contexts. Comment: 205 pages, to appear in
Foundations and Trends in Computer Graphics and Vision
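The two notions in this abstract, sparse coding over a dictionary and learning that dictionary from data, are commonly formalized as below. The symbols are assumptions introduced for illustration: $x_i \in \mathbb{R}^m$ are the signals, $D$ is a dictionary with columns $d_j$, $\alpha_i$ are the sparse codes, and $\lambda > 0$ is the regularization weight. The monograph may use other penalties or constraints.

```latex
\begin{aligned}
\text{sparse coding:} \quad
  & \min_{\alpha \in \mathbb{R}^p} \;
    \tfrac{1}{2} \| x - D\alpha \|_2^2 + \lambda \| \alpha \|_1 , \\
\text{dictionary learning:} \quad
  & \min_{D \in \mathcal{C},\; \alpha_1, \dots, \alpha_n} \;
    \frac{1}{n} \sum_{i=1}^{n}
    \Bigl( \tfrac{1}{2} \| x_i - D\alpha_i \|_2^2
           + \lambda \| \alpha_i \|_1 \Bigr), \\
  & \mathcal{C} = \bigl\{ D \in \mathbb{R}^{m \times p} :
    \| d_j \|_2 \le 1 \ \text{for all } j \bigr\} .
\end{aligned}
```

The norm constraint on the columns prevents the trivial solution in which $D$ grows without bound while the codes shrink toward zero.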
Optimization with Sparsity-Inducing Penalties
Sparse estimation methods are aimed at using or obtaining parsimonious
representations of data or models. They were first dedicated to linear variable
selection but numerous extensions have now emerged such as structured sparsity
or kernel selection. It turns out that many of the related estimation problems
can be cast as convex optimization problems by regularizing the empirical risk
with appropriate non-smooth norms. The goal of this paper is to present from a
general perspective optimization tools and techniques dedicated to such
sparsity-inducing penalties. We cover proximal methods, block-coordinate
descent, reweighted $\ell_2$-penalized techniques, working-set and homotopy
methods, as well as non-convex formulations and extensions, and provide an
extensive set of experiments to compare various algorithms from a computational
point of view.
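Of the tools listed in this abstract, proximal methods are the simplest to illustrate. The sketch below applies proximal gradient descent (often called ISTA) to the lasso objective $\tfrac{1}{2}\|y - Xw\|_2^2 + \lambda \|w\|_1$. The function names, the fixed step size, and the fixed iteration count are assumptions for the example; the step is assumed to be no larger than the reciprocal of the largest eigenvalue of $X^\top X$.

```python
from typing import List

Vector = List[float]


def soft_threshold(v: float, t: float) -> float:
    """Proximal operator of t * |.|: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def ista(X: List[Vector], y: Vector, lam: float, step: float,
         n_iter: int = 200) -> Vector:
    """Minimize 0.5 * ||y - X w||^2 + lam * ||w||_1 by proximal gradient.

    `step` is assumed to satisfy step <= 1 / L, where L is the largest
    eigenvalue of X^T X; no line search or stopping test is performed.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        # Gradient of the smooth part: X^T (X w - y).
        resid = [sum(X[i][j] * w[j] for j in range(p)) - y[i] for i in range(n)]
        grad = [sum(X[i][j] * resid[i] for i in range(n)) for j in range(p)]
        # Gradient step followed by the proximal (soft-thresholding) step.
        w = [soft_threshold(w[j] - step * grad[j], step * lam) for j in range(p)]
    return w
```

When `X` is the identity, the minimizer is the soft-thresholded observation itself, which makes a convenient sanity check for the implementation.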