288 research outputs found
Melding the Data-Decisions Pipeline: Decision-Focused Learning for Combinatorial Optimization
Creating impact in real-world settings requires artificial intelligence
techniques to span the full pipeline from data, to predictive models, to
decisions. These components are typically approached separately: a machine
learning model is first trained via a measure of predictive accuracy, and then
its predictions are used as input into an optimization algorithm which produces
a decision. However, the loss function used to train the model may easily be
misaligned with the end goal, which is to make the best decisions possible.
Hand-tuning the loss function to align with optimization is a difficult and
error-prone process (which is often skipped entirely).
We focus on combinatorial optimization problems and introduce a general
framework for decision-focused learning, where the machine learning model is
directly trained in conjunction with the optimization algorithm to produce
high-quality decisions. Technically, our contribution is a means of integrating
common classes of discrete optimization problems into deep learning or other
predictive models, which are typically trained via gradient descent. The main
idea is to use a continuous relaxation of the discrete problem to propagate
gradients through the optimization procedure. We instantiate this framework for
two broad classes of combinatorial problems: linear programs and submodular
maximization. Experimental results across a variety of domains show that
decision-focused learning often leads to improved optimization performance
compared to traditional methods. We find that standard measures of accuracy are
not a reliable proxy for a predictive model's utility in optimization, and our
method's ability to specify the true goal as the model's training objective
yields substantial dividends across a range of decision problems.Comment: Full version of paper accepted at AAAI 201
Tailored Presolve Techniques in Branch-and-Bound Method for Fast Mixed-Integer Optimal Control Applications
Mixed-integer model predictive control (MI-MPC) can be a powerful tool for
modeling hybrid control systems. In case of a linear-quadratic objective in
combination with linear or piecewise-linear system dynamics and inequality
constraints, MI-MPC needs to solve a mixed-integer quadratic program (MIQP) at
each sampling time step. This paper presents a collection of block-sparse
presolve techniques to efficiently remove decision variables, and to remove or
tighten inequality constraints, tailored to mixed-integer optimal control
problems (MIOCP). In addition, we describe a novel heuristic approach based on
an iterative presolve algorithm to compute a feasible but possibly suboptimal
MIQP solution. We present benchmarking results for a C code implementation of
the proposed BB-ASIPM solver, including a branch-and-bound (B&B) method with
the proposed tailored presolve techniques and an active-set based interior
point method (ASIPM), compared against multiple state-of-the-art MIQP solvers
on a case study of motion planning with obstacle avoidance constraints.
Finally, we demonstrate the computational performance of the BB-ASIPM solver on
the dSPACE Scalexio real-time embedded hardware using a second case study of
stabilization for an underactuated cart-pole with soft contacts.Comment: 27 pages, 7 figures, 2 tables, submitted to journal of Optimal
Control Applications and Method
DeepPermNet: Visual Permutation Learning
We present a principled approach to uncover the structure of visual data by
solving a novel deep learning task coined visual permutation learning. The goal
of this task is to find the permutation that recovers the structure of data
from shuffled versions of it. In the case of natural images, this task boils
down to recovering the original image from patches shuffled by an unknown
permutation matrix. Unfortunately, permutation matrices are discrete, thereby
posing difficulties for gradient-based methods. To this end, we resort to a
continuous approximation of these matrices using doubly-stochastic matrices
which we generate from standard CNN predictions using Sinkhorn iterations.
Unrolling these iterations in a Sinkhorn network layer, we propose DeepPermNet,
an end-to-end CNN model for this task. The utility of DeepPermNet is
demonstrated on two challenging computer vision problems, namely, (i) relative
attributes learning and (ii) self-supervised representation learning. Our
results show state-of-the-art performance on the Public Figures and OSR
benchmarks for (i) and on the classification and segmentation tasks on the
PASCAL VOC dataset for (ii).Comment: Accepted in IEEE International Conference on Computer Vision and
Pattern Recognition CVPR 201
Learning efficiently with approximate inference via dual losses
Many structured prediction tasks involve
complex models where inference is computationally intractable, but where it can be well
approximated using a linear programming
relaxation. Previous approaches for learning for structured prediction (e.g., cutting-
plane, subgradient methods, perceptron) repeatedly make predictions for some of the
data points. These approaches are computationally demanding because each prediction
involves solving a linear program to optimality. We present a scalable algorithm for learning for structured prediction. The main idea
is to instead solve the dual of the structured
prediction loss. We formulate the learning
task as a convex minimization over both the
weights and the dual variables corresponding
to each data point. As a result, we can begin to optimize the weights even before completely solving any of the individual prediction problems. We show how the dual variables can be efficiently optimized using coordinate descent. Our algorithm is competitive with state-of-the-art methods such as
stochastic subgradient and cutting-plane
- …