65 research outputs found
Positive Semidefinite Metric Learning with Boosting
The learning of appropriate distance metrics is a critical problem in image
classification and retrieval. In this work, we propose a boosting-based
technique, termed \BoostMetric, for learning a Mahalanobis distance metric. One
of the primary difficulties in learning such a metric is to ensure that the
Mahalanobis matrix remains positive semidefinite. Semidefinite programming is
sometimes used to enforce this constraint, but does not scale well.
\BoostMetric is instead based on a key observation that any positive
semidefinite matrix can be decomposed into a linear positive combination of
trace-one rank-one matrices. \BoostMetric thus uses rank-one positive
semidefinite matrices as weak learners within an efficient and scalable
boosting-based learning process. The resulting method is easy to implement,
does not require tuning, and can accommodate various types of constraints.
Experiments on various datasets show that the proposed algorithm compares
favorably to those state-of-the-art methods in terms of classification accuracy
and running time.Comment: 11 pages, Twenty-Third Annual Conference on Neural Information
Processing Systems (NIPS 2009), Vancouver, Canad
Optimizing Ranking Measures for Compact Binary Code Learning
Hashing has proven a valuable tool for large-scale information retrieval.
Despite much success, existing hashing methods optimize over simple objectives
such as the reconstruction error or graph Laplacian related loss functions,
instead of the performance evaluation criteria of interest---multivariate
performance measures such as the AUC and NDCG. Here we present a general
framework (termed StructHash) that allows one to directly optimize multivariate
performance measures. The resulting optimization problem can involve
exponentially or infinitely many variables and constraints, which is more
challenging than standard structured output learning. To solve the StructHash
optimization problem, we use a combination of column generation and
cutting-plane techniques. We demonstrate the generality of StructHash by
applying it to ranking prediction and image retrieval, and show that it
outperforms a few state-of-the-art hashing methods.Comment: Appearing in Proc. European Conference on Computer Vision 201
An Efficient Dual Approach to Distance Metric Learning
Distance metric learning is of fundamental interest in machine learning
because the distance metric employed can significantly affect the performance
of many learning methods. Quadratic Mahalanobis metric learning is a popular
approach to the problem, but typically requires solving a semidefinite
programming (SDP) problem, which is computationally expensive. Standard
interior-point SDP solvers typically have a complexity of (with
the dimension of input data), and can thus only practically solve problems
exhibiting less than a few thousand variables. Since the number of variables is
, this implies a limit upon the size of problem that can
practically be solved of around a few hundred dimensions. The complexity of the
popular quadratic Mahalanobis metric learning approach thus limits the size of
problem to which metric learning can be applied. Here we propose a
significantly more efficient approach to the metric learning problem based on
the Lagrange dual formulation of the problem. The proposed formulation is much
simpler to implement, and therefore allows much larger Mahalanobis metric
learning problems to be solved. The time complexity of the proposed method is
, which is significantly lower than that of the SDP approach.
Experiments on a variety of datasets demonstrate that the proposed method
achieves an accuracy comparable to the state-of-the-art, but is applicable to
significantly larger problems. We also show that the proposed method can be
applied to solve more general Frobenius-norm regularized SDP problems
approximately
Totally Corrective Multiclass Boosting with Binary Weak Learners
In this work, we propose a new optimization framework for multiclass boosting
learning. In the literature, AdaBoost.MO and AdaBoost.ECC are the two
successful multiclass boosting algorithms, which can use binary weak learners.
We explicitly derive these two algorithms' Lagrange dual problems based on
their regularized loss functions. We show that the Lagrange dual formulations
enable us to design totally-corrective multiclass algorithms by using the
primal-dual optimization technique. Experiments on benchmark data sets suggest
that our multiclass boosting can achieve a comparable generalization capability
with state-of-the-art, but the convergence speed is much faster than stage-wise
gradient descent boosting. In other words, the new totally corrective
algorithms can maximize the margin more aggressively.Comment: 11 page
On Similarities between Inference in Game Theory and Machine Learning
In this paper, we elucidate the equivalence between inference in game theory and machine learning. Our aim in so doing is to establish an equivalent vocabulary between the two domains so as to facilitate developments at the intersection of both fields, and as proof of the usefulness of this approach, we use recent developments in each field to make useful improvements to the other. More specifically, we consider the analogies between smooth best responses in fictitious play and Bayesian inference methods. Initially, we use these insights to develop and demonstrate an improved algorithm for learning in games based on probabilistic moderation. That is, by integrating over the distribution of opponent strategies (a Bayesian approach within machine learning) rather than taking a simple empirical average (the approach used in standard fictitious play) we derive a novel moderated fictitious play algorithm and show that it is more likely than standard fictitious play to converge to a payoff-dominant but risk-dominated Nash equilibrium in a simple coordination game. Furthermore we consider the converse case, and show how insights from game theory can be used to derive two improved mean field variational learning algorithms. We first show that the standard update rule of mean field variational learning is analogous to a Cournot adjustment within game theory. By analogy with fictitious play, we then suggest an improved update rule, and show that this results in fictitious variational play, an improved mean field variational learning algorithm that exhibits better convergence in highly or strongly connected graphical models. Second, we use a recent advance in fictitious play, namely dynamic fictitious play, to derive a derivative action variational learning algorithm, that exhibits superior convergence properties on a canonical machine learning problem (clustering a mixture distribution)
- ā¦