S2: An Efficient Graph Based Active Learning Algorithm with Application to Nonparametric Classification
This paper investigates the problem of active learning for binary label
prediction on a graph. We introduce a simple and label-efficient algorithm
called S2 for this task. At each step, S2 selects the vertex to be labeled
based on the structure of the graph and all previously gathered labels.
Specifically, S2 queries for the label of the vertex that bisects the *shortest
shortest* path between any pair of oppositely labeled vertices. We present a
theoretical estimate of the number of queries S2 needs in terms of a novel
parametrization of the complexity of binary functions on graphs. We also
present experimental results demonstrating the performance of S2 on both real
and synthetic data. While other graph-based active learning algorithms have
shown promise in practice, our algorithm is the first with both good
performance and theoretical guarantees. Finally, we demonstrate the
implications of the S2 algorithm for the theory of nonparametric active
learning. In particular, we show that S2 achieves near minimax optimal excess
risk for an important class of nonparametric classification problems.
Comment: A version of this paper appears in the Conference on Learning Theory
(COLT) 201
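
The query rule described in the abstract above (bisect the shortest of all shortest paths joining oppositely labeled vertices) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the adjacency-dict representation, the 0/1 label encoding, and the names `shortest_path` and `s2_query` are assumptions made here. The abstract does not say how the first opposite labels are found or what happens once two adjacent vertices disagree, so the sketch simply returns `None` in those cases.

```python
from collections import deque


def shortest_path(adj, source, targets):
    """BFS from source; return a shortest path to the nearest vertex in targets, or None."""
    parent = {source: None}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if u in targets:
            # Walk parent pointers back to the source, then reverse.
            path = []
            while u is not None:
                path.append(u)
                u = parent[u]
            return path[::-1]
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                queue.append(v)
    return None


def s2_query(adj, labels):
    """Return the midpoint of the shortest path between any pair of
    oppositely labeled vertices (hypothetical helper, labels are 0 or 1).

    Returns None when no such path exists or when the closest opposite
    pair is already adjacent (nothing left to bisect).
    """
    positives = {v for v, y in labels.items() if y == 1}
    negatives = {v for v, y in labels.items() if y == 0}
    best = None
    for p in positives:
        path = shortest_path(adj, p, negatives)
        if path is not None and (best is None or len(path) < len(best)):
            best = path
    if best is None or len(best) < 3:
        return None
    return best[len(best) // 2]


# Path graph 0 - 1 - 2 - 3 - 4 with the two endpoints labeled.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
next_vertex = s2_query(adj, {0: 1, 4: 0})
```

Each query roughly halves the distance between the closest pair of disagreeing vertices, which is why the procedure behaves like a binary search for the boundary along that path.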
Distilled Sensing: Adaptive Sampling for Sparse Detection and Estimation
Adaptive sampling results in dramatic improvements in the recovery of sparse
signals in white Gaussian noise. A sequential adaptive sampling-and-refinement
procedure called Distilled Sensing (DS) is proposed and analyzed. DS is a form
of multi-stage experimental design and testing. Because of the adaptive nature
of the data collection, DS can detect and localize far weaker signals than
possible from non-adaptive measurements. In particular, reliable detection and
localization (support estimation) using non-adaptive samples is possible only
if the signal amplitudes grow logarithmically with the problem dimension. Here
it is shown that using adaptive sampling, reliable detection is possible
provided the amplitude exceeds a constant, and localization is possible when
the amplitude exceeds any arbitrarily slowly growing function of the dimension.
Comment: 23 pages, 2 figures. Revision includes minor clarifications, along
with more illustrative experimental results (cf. Figure 2
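
To make the idea of sequential sampling-and-refinement concrete, here is a small sketch of one possible refinement loop. The abstract does not spell out the procedure, so everything specific here is an assumption: signal entries are taken to be nonnegative, each retained coordinate is re-observed in unit-variance Gaussian noise, and a coordinate is kept only if its new observation is positive. The function name `distilled_sensing` is hypothetical, and the sketch ignores how a sensing budget would be reallocated across stages.

```python
import random


def distilled_sensing(signal, steps, rng):
    """Illustrative refinement loop (assumed rule, not taken from the abstract).

    At every step each retained coordinate is observed once in N(0, 1) noise,
    and only coordinates with a positive observation are kept. Null
    coordinates survive a step with probability 1/2, while strong positive
    coordinates survive with probability close to 1.
    """
    retained = list(range(len(signal)))
    for _ in range(steps):
        retained = [
            i for i in retained
            if signal[i] + rng.gauss(0.0, 1.0) > 0.0
        ]
    return retained


# Sparse example: 3 nonzero entries among 1000 coordinates.
rng = random.Random(0)
signal = [0.0] * 1000
for i in (3, 500, 999):
    signal[i] = 10.0
kept = distilled_sensing(signal, 5, rng)
```

Under these assumptions about half of the null coordinates are discarded per step, so later measurements concentrate on a much smaller candidate set that still contains the true support with high probability.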
Multiscale likelihood analysis and complexity penalized estimation
We describe here a framework for a certain class of multiscale likelihood
factorizations wherein, in analogy to a wavelet decomposition of an L^2
function, a given likelihood function has an alternative representation as a
product of conditional densities reflecting information in both the data and
the parameter vector localized in position and scale. The framework is
developed as a set of sufficient conditions for the existence of such
factorizations, formulated in analogy to those underlying a standard
multiresolution analysis for wavelets, and hence can be viewed as a
multiresolution analysis for likelihoods. We then consider the use of these
factorizations in the task of nonparametric, complexity penalized likelihood
estimation. We study the risk properties of certain thresholding and
partitioning estimators, and demonstrate their adaptivity and near-optimality,
in a minimax sense over a broad range of function spaces, based on squared
Hellinger distance as a loss function. In particular, our results provide an
illustration of how properties of classical wavelet-based estimators can be
obtained in a single, unified framework that includes models for continuous,
count and categorical data types.