3,211 research outputs found
Hierarchical Subquery Evaluation for Active Learning on a Graph
To train good supervised and semi-supervised object classifiers, it is
critical that we not waste the time of the human experts who are providing the
training labels. Existing active learning strategies can have uneven
performance, being efficient on some datasets but wasteful on others, or
inconsistent just between runs on the same dataset. We propose perplexity based
graph construction and a new hierarchical subquery evaluation algorithm to
combat this variability, and to release the potential of Expected Error
Reduction.
Under some specific circumstances, Expected Error Reduction has been one of
the strongest-performing informativeness criteria for active learning. Until
now, it has also been prohibitively costly to compute for sizeable datasets. We
demonstrate our highly practical algorithm, comparing it to other active
learning measures on classification datasets that vary in sparsity,
dimensionality, and size. Our algorithm is consistent over multiple runs and
achieves high accuracy, while querying the human expert for labels at a
frequency that matches their desired time budget.Comment: CVPR 201
A Survey on Metric Learning for Feature Vectors and Structured Data
The need for appropriate ways to measure the distance or similarity between
data is ubiquitous in machine learning, pattern recognition and data mining,
but handcrafting such good metrics for specific problems is generally
difficult. This has led to the emergence of metric learning, which aims at
automatically learning a metric from data and has attracted a lot of interest
in machine learning and related fields for the past ten years. This survey
paper proposes a systematic review of the metric learning literature,
highlighting the pros and cons of each approach. We pay particular attention to
Mahalanobis distance metric learning, a well-studied and successful framework,
but additionally present a wide range of methods that have recently emerged as
powerful alternatives, including nonlinear metric learning, similarity learning
and local metric learning. Recent trends and extensions, such as
semi-supervised metric learning, metric learning for histogram data and the
derivation of generalization guarantees, are also covered. Finally, this survey
addresses metric learning for structured data, in particular edit distance
learning, and attempts to give an overview of the remaining challenges in
metric learning for the years to come.Comment: Technical report, 59 pages. Changes in v2: fixed typos and improved
presentation. Changes in v3: fixed typos. Changes in v4: fixed typos and new
method
Linear Dimensionality Reduction for Margin-Based Classification: High-Dimensional Data and Sensor Networks
Low-dimensional statistics of measurements play an important role in detection problems, including those encountered in sensor networks. In this work, we focus on learning low-dimensional linear statistics of high-dimensional measurement data along with decision rules defined in the low-dimensional space in the case when the probability density of the measurements and class labels is not given, but a training set of samples from this distribution is given. We pose a joint optimization problem for linear dimensionality reduction and margin-based classification, and develop a coordinate descent algorithm on the Stiefel manifold for its solution. Although the coordinate descent is not guaranteed to find the globally optimal solution, crucially, its alternating structure enables us to extend it for sensor networks with a message-passing approach requiring little communication. Linear dimensionality reduction prevents overfitting when learning from finite training data. In the sensor network setting, dimensionality reduction not only prevents overfitting, but also reduces power consumption due to communication. The learned reduced-dimensional space and decision rule is shown to be consistent and its Rademacher complexity is characterized. Experimental results are presented for a variety of datasets, including those from existing sensor networks, demonstrating the potential of our methodology in comparison with other dimensionality reduction approaches.National Science Foundation (U.S.). Graduate Research Fellowship ProgramUnited States. Army Research Office (MURI funded through ARO Grant W911NF-06-1-0076)United States. Air Force Office of Scientific Research (Award FA9550-06-1-0324)Shell International Exploration and Production B.V
- …