On Correcting Inputs: Inverse Optimization for Online Structured Prediction
Algorithm designers typically assume that the input data is correct, and then
proceed to find "optimal" or "sub-optimal" solutions using this input data.
However, this assumption of correct data does not always hold in practice,
especially in the context of online learning systems where the objective is to
learn appropriate feature weights given some training samples. Such scenarios
necessitate the study of inverse optimization problems where one is given an
input instance as well as a desired output and the task is to adjust the input
data so that the given output is indeed optimal. Motivated by learning
structured prediction models, in this paper we consider inverse optimization
with a margin, i.e., we require the given output to be better than all other
feasible outputs by a desired margin. We consider such inverse optimization
problems for maximum weight matroid basis, matroid intersection, perfect
matchings, minimum cost maximum flows, and shortest paths and derive the first
known results for such problems with a non-zero margin. The effectiveness of
these algorithmic approaches to online learning for structured prediction is
also discussed.
Comment: Conference version to appear in FSTTCS, 201
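To make the margin requirement concrete, here is a minimal sketch that checks it by brute force. It is not one of the paper's algorithms: it assumes the feasible outputs are all k-element subsets of a small ground set (the bases of a uniform matroid) and that the margin is one constant applied to every competing output. The names `margin_violations`, `weights`, and `target` are hypothetical.

```python
from itertools import combinations

def margin_violations(weights, target, k, margin):
    """Return every competing k-subset whose weight comes within `margin` of the target's."""
    target = frozenset(target)
    target_weight = sum(weights[e] for e in target)
    violations = []
    # Assumption: feasible outputs are exactly the k-subsets of the ground set.
    for candidate in combinations(sorted(weights), k):
        candidate = frozenset(candidate)
        if candidate == target:
            continue
        if target_weight - sum(weights[e] for e in candidate) < margin:
            violations.append(candidate)
    return violations

# Hypothetical instance: the desired output {a, b} is optimal, but not by a margin of 1.
weights = {"a": 5.0, "b": 4.0, "c": 3.5, "d": 1.0}
target = {"a", "b"}
violations = margin_violations(weights, target, k=2, margin=1.0)
```

An empty result means the input already satisfies the margin condition; a non-empty one lists the outputs that the weights would have to be adjusted against.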
Combinatorial Network Optimization with Unknown Variables: Multi-Armed Bandits with Linear Rewards
In the classic multi-armed bandits problem, the goal is to have a policy for
dynamically operating arms that each yield stochastic rewards with unknown
means. The key metric of interest is regret, defined as the gap between the
expected total reward accumulated by an omniscient player that knows the reward
means for each arm, and the expected total reward accumulated by the given
policy. The policies presented in prior work have storage, computation and
regret all growing linearly with the number of arms, which is not scalable when
the number of arms is large. We consider in this work a broad class of
multi-armed bandits with dependent arms that yield rewards as a linear
combination of a set of unknown parameters. For this general framework, we
present efficient policies that are shown to achieve regret that grows
logarithmically with time, and polynomially in the number of unknown parameters
(even though the number of dependent arms may grow exponentially). Furthermore,
these policies only require storage that grows linearly in the number of
unknown parameters. We show that this generalization is broadly applicable and
useful for many interesting tasks in networks that can be formulated as
tractable combinatorial optimization problems with linear objective functions,
such as maximum weight matching, shortest path, and minimum spanning tree
computations.
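The regret metric described above can be illustrated with a small sketch. It assumes each arm is a coefficient vector over the unknown parameters, so that an arm's mean reward is a linear combination of them. The parameter values, the arm vectors, and the function names are hypothetical, and no learning policy is implemented here; the code only scores a given sequence of choices.

```python
def expected_rewards(theta, arms):
    """Mean reward of each arm as a linear combination of the parameters."""
    return [sum(a * t for a, t in zip(arm, theta)) for arm in arms]

def pseudo_regret(theta, arms, chosen):
    """Gap between always playing the best arm and the arms actually chosen."""
    means = expected_rewards(theta, arms)
    best = max(means)
    return sum(best - means[i] for i in chosen)

theta = [0.2, 0.5, 0.9]                     # hypothetical parameters, unknown to the policy
arms = [[1, 1, 0], [1, 0, 1], [0, 1, 1]]    # each arm combines two of the parameters
chosen = [0, 1, 2, 2]                       # arm indices a policy played over four rounds
regret = pseudo_regret(theta, arms, chosen)
```

Because the arms share parameters, observing one arm is informative about the others, which is why regret can scale with the number of parameters rather than the number of arms.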
Melding the Data-Decisions Pipeline: Decision-Focused Learning for Combinatorial Optimization
Creating impact in real-world settings requires artificial intelligence
techniques to span the full pipeline from data, to predictive models, to
decisions. These components are typically approached separately: a machine
learning model is first trained via a measure of predictive accuracy, and then
its predictions are used as input into an optimization algorithm which produces
a decision. However, the loss function used to train the model may easily be
misaligned with the end goal, which is to make the best decisions possible.
Hand-tuning the loss function to align with optimization is a difficult and
error-prone process (which is often skipped entirely).
We focus on combinatorial optimization problems and introduce a general
framework for decision-focused learning, where the machine learning model is
directly trained in conjunction with the optimization algorithm to produce
high-quality decisions. Technically, our contribution is a means of integrating
common classes of discrete optimization problems into deep learning or other
predictive models, which are typically trained via gradient descent. The main
idea is to use a continuous relaxation of the discrete problem to propagate
gradients through the optimization procedure. We instantiate this framework for
two broad classes of combinatorial problems: linear programs and submodular
maximization. Experimental results across a variety of domains show that
decision-focused learning often leads to improved optimization performance
compared to traditional methods. We find that standard measures of accuracy are
not a reliable proxy for a predictive model's utility in optimization, and our
method's ability to specify the true goal as the model's training objective
yields substantial dividends across a range of decision problems.
Comment: Full version of paper accepted at AAAI 201
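The idea of propagating gradients through a relaxed decision can be sketched on a toy problem. This is not the paper's construction for linear programs or submodular maximization: it swaps in a softmax (entropy-regularized) relaxation of the simplest discrete decision, picking one item out of n, and differentiates the true value of the relaxed decision with respect to the predicted scores. All names and the `temperature` parameter are assumptions made for illustration.

```python
import math

def softmax(scores, temperature=1.0):
    """Relaxed 'pick the best item' decision: a probability vector over items."""
    top = max(scores)
    exps = [math.exp((s - top) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def decision_quality(predicted, true_values, temperature=1.0):
    """True value obtained by the relaxed decision computed from predicted scores."""
    probs = softmax(predicted, temperature)
    return sum(p * v for p, v in zip(probs, true_values))

def quality_gradient(predicted, true_values, temperature=1.0):
    """Gradient of decision_quality with respect to each predicted score."""
    probs = softmax(predicted, temperature)
    quality = sum(p * v for p, v in zip(probs, true_values))
    return [p * (v - quality) / temperature for p, v in zip(probs, true_values)]

predicted = [1.0, 2.0, 0.5]      # hypothetical model outputs
true_values = [3.0, 1.0, 2.0]    # hypothetical ground-truth item values
grad = quality_gradient(predicted, true_values)
```

In this toy instance the gradient is positive for items whose true value exceeds the current decision quality, so a gradient step moves the predictions toward better decisions rather than toward lower prediction error.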
Curvature and Optimal Algorithms for Learning and Minimizing Submodular Functions
We investigate three related and important problems connected to machine
learning: approximating a submodular function everywhere, learning a submodular
function (in a PAC-like setting [53]), and constrained minimization of
submodular functions. We show that the complexity of all three problems depends
on the 'curvature' of the submodular function, and provide lower and upper
bounds that refine and improve previous results [3, 16, 18, 52]. Our proof
techniques are fairly generic. We either use a black-box transformation of the
function (for approximation and learning), or a transformation of algorithms to
use an appropriate surrogate function (for minimization). Curiously, curvature
has been known to influence approximations for submodular maximization [7, 55],
but its effect on minimization, approximation and learning has hitherto been
open. We complete this picture, and also support our theoretical claims by
empirical results.
Comment: 21 pages. A shorter version appeared in Advances of NIPS-201
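For readers unfamiliar with curvature, the sketch below computes the standard total curvature of a monotone submodular function, 1 - min_j [f(V) - f(V \ {j})] / f({j}), on a small coverage function. The abstract does not state which definition the paper uses, so treating it as this one is an assumption; the coverage instance and function names are hypothetical, and every singleton is assumed to have positive value.

```python
def coverage(sets, chosen):
    """Coverage function: number of distinct elements covered by the chosen sets."""
    covered = set()
    for j in chosen:
        covered |= sets[j]
    return len(covered)

def total_curvature(f, ground):
    """1 minus the smallest ratio of an item's last marginal gain to its singleton value."""
    ground = frozenset(ground)
    full = f(ground)
    # Assumption: f({j}) > 0 for every item j, so the ratios are well defined.
    ratios = [(full - f(ground - {j})) / f(frozenset({j})) for j in ground]
    return 1 - min(ratios)

sets = {0: {1, 2}, 1: {2, 3}, 2: {4}}   # hypothetical instance with one overlap
kappa = total_curvature(lambda chosen: coverage(sets, chosen), sets.keys())
```

A curvature of 0 corresponds to a modular (additive) function, and values closer to 1 indicate stronger diminishing returns.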
Matroid Bandits: Fast Combinatorial Optimization with Learning
A matroid is a notion of independence in combinatorial optimization which is
closely related to computational efficiency. In particular, it is well known
that the maximum of a constrained modular function can be found greedily if and
only if the constraints are associated with a matroid. In this paper, we bring
together the ideas of bandits and matroids, and propose a new class of
combinatorial bandits, matroid bandits. The objective in these problems is to
learn how to maximize a modular function on a matroid. This function is
stochastic and initially unknown. We propose a practical algorithm for solving
our problem, Optimistic Matroid Maximization (OMM); and prove two upper bounds,
gap-dependent and gap-free, on its regret. Both bounds are sublinear in time
and at most linear in all other quantities of interest. The gap-dependent upper
bound is tight and we prove a matching lower bound on a partition matroid
bandit. Finally, we evaluate our method on three real-world problems and show
that it is practical.
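The greedy property mentioned above can be sketched as follows. This is only the offline greedy step with known, non-negative weights, not OMM itself, whose bandit component (learning the weights from stochastic feedback) is not shown. The partition-matroid oracle, the weights, and all names are hypothetical.

```python
def greedy_max_basis(weights, is_independent):
    """Scan items by decreasing weight, keeping each one that preserves independence."""
    basis = []
    for item in sorted(weights, key=weights.get, reverse=True):
        if is_independent(basis + [item]):
            basis.append(item)
    return basis

def partition_oracle(part_of, capacity):
    """Independence oracle for a partition matroid: at most capacity[p] items from part p."""
    def is_independent(items):
        counts = {}
        for item in items:
            counts[part_of[item]] = counts.get(part_of[item], 0) + 1
        return all(counts[p] <= capacity[p] for p in counts)
    return is_independent

weights = {"a": 0.9, "b": 0.7, "c": 0.8, "d": 0.1}    # assumed known here
part_of = {"a": "x", "b": "x", "c": "y", "d": "y"}
basis = greedy_max_basis(weights, partition_oracle(part_of, {"x": 1, "y": 1}))
```

In a bandit setting the weights would be replaced by estimates; how OMM forms those estimates is described in the paper, not here.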