361,855 research outputs found
Exact and efficient top-K inference for multi-target prediction by querying separable linear relational models
Many complex multi-target prediction problems that concern large target
spaces are characterised by a need for efficient prediction strategies that
avoid the computation of predictions for all targets explicitly. Examples of
such problems emerge in several subfields of machine learning, such as
collaborative filtering, multi-label classification, dyadic prediction and
biological network inference. In this article we analyse efficient and exact
algorithms for computing the top- predictions in the above problem settings,
using a general class of models that we refer to as separable linear relational
models. We show how to use those inference algorithms, which are modifications
of well-known information retrieval methods, in a variety of machine learning
settings. Furthermore, we study the possibility of scoring items incompletely,
while still retaining an exact top-K retrieval. Experimental results in several
application domains reveal that the so-called threshold algorithm is very
scalable, performing often many orders of magnitude more efficiently than the
naive approach
Information-based complexity, feedback and dynamics in convex programming
We study the intrinsic limitations of sequential convex optimization through
the lens of feedback information theory. In the oracle model of optimization,
an algorithm queries an {\em oracle} for noisy information about the unknown
objective function, and the goal is to (approximately) minimize every function
in a given class using as few queries as possible. We show that, in order for a
function to be optimized, the algorithm must be able to accumulate enough
information about the objective. This, in turn, puts limits on the speed of
optimization under specific assumptions on the oracle and the type of feedback.
Our techniques are akin to the ones used in statistical literature to obtain
minimax lower bounds on the risks of estimation procedures; the notable
difference is that, unlike in the case of i.i.d. data, a sequential
optimization algorithm can gather observations in a {\em controlled} manner, so
that the amount of information at each step is allowed to change in time. In
particular, we show that optimization algorithms often obey the law of
diminishing returns: the signal-to-noise ratio drops as the optimization
algorithm approaches the optimum. To underscore the generality of the tools, we
use our approach to derive fundamental lower bounds for a certain active
learning problem. Overall, the present work connects the intuitive notions of
information in optimization, experimental design, estimation, and active
learning to the quantitative notion of Shannon information.Comment: final version; to appear in IEEE Transactions on Information Theor
RandomBoost: Simplified Multi-class Boosting through Randomization
We propose a novel boosting approach to multi-class classification problems,
in which multiple classes are distinguished by a set of random projection
matrices in essence. The approach uses random projections to alleviate the
proliferation of binary classifiers typically required to perform multi-class
classification. The result is a multi-class classifier with a single
vector-valued parameter, irrespective of the number of classes involved. Two
variants of this approach are proposed. The first method randomly projects the
original data into new spaces, while the second method randomly projects the
outputs of learned weak classifiers. These methods are not only conceptually
simple but also effective and easy to implement. A series of experiments on
synthetic, machine learning and visual recognition data sets demonstrate that
our proposed methods compare favorably to existing multi-class boosting
algorithms in terms of both the convergence rate and classification accuracy.Comment: 15 page
Online Local Learning via Semidefinite Programming
In many online learning problems we are interested in predicting local
information about some universe of items. For example, we may want to know
whether two items are in the same cluster rather than computing an assignment
of items to clusters; we may want to know which of two teams will win a game
rather than computing a ranking of teams. Although finding the optimal
clustering or ranking is typically intractable, it may be possible to predict
the relationships between items as well as if you could solve the global
optimization problem exactly.
Formally, we consider an online learning problem in which a learner
repeatedly guesses a pair of labels (l(x), l(y)) and receives an adversarial
payoff depending on those labels. The learner's goal is to receive a payoff
nearly as good as the best fixed labeling of the items. We show that a simple
algorithm based on semidefinite programming can obtain asymptotically optimal
regret in the case where the number of possible labels is O(1), resolving an
open problem posed by Hazan, Kale, and Shalev-Schwartz. Our main technical
contribution is a novel use and analysis of the log determinant regularizer,
exploiting the observation that log det(A + I) upper bounds the entropy of any
distribution with covariance matrix A.Comment: 10 page
- …