288 research outputs found

Melding the Data-Decisions Pipeline: Decision-Focused Learning for Combinatorial Optimization

Authors: Dilkina Bistra; Tambe Milind; Wilder Bryan
Publication date: 20/11/2018

Creating impact in real-world settings requires artificial intelligence techniques to span the full pipeline from data, to predictive models, to decisions. These components are typically approached separately: a machine learning model is first trained via a measure of predictive accuracy, and then its predictions are used as input into an optimization algorithm which produces a decision. However, the loss function used to train the model may easily be misaligned with the end goal, which is to make the best decisions possible. Hand-tuning the loss function to align with optimization is a difficult and error-prone process (which is often skipped entirely). We focus on combinatorial optimization problems and introduce a general framework for decision-focused learning, where the machine learning model is directly trained in conjunction with the optimization algorithm to produce high-quality decisions. Technically, our contribution is a means of integrating common classes of discrete optimization problems into deep learning or other predictive models, which are typically trained via gradient descent. The main idea is to use a continuous relaxation of the discrete problem to propagate gradients through the optimization procedure. We instantiate this framework for two broad classes of combinatorial problems: linear programs and submodular maximization. Experimental results across a variety of domains show that decision-focused learning often leads to improved optimization performance compared to traditional methods. We find that standard measures of accuracy are not a reliable proxy for a predictive model's utility in optimization, and our method's ability to specify the true goal as the model's training objective yields substantial dividends across a range of decision problems.

Comment: Full version of paper accepted at AAAI 201
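The abstract's claim that predictive accuracy is not a reliable proxy for decision quality can be illustrated with a toy example. The sketch below is not the paper's method: the item values, the two hypothetical predictors (`pred_a`, `pred_b`), and the pick-the-top-k decision rule are all assumptions made up for illustration.

```python
import numpy as np

# Hypothetical ground-truth values of three items; the decision is to pick the best one.
true_values = np.array([1.0, 0.9, 0.1])

# Two hypothetical predictors:
pred_a = np.array([0.9, 1.0, 0.1])  # small errors, but swaps the top two items
pred_b = np.array([2.0, 0.9, 0.1])  # large error on item 0, ranking preserved


def mse(pred, truth):
    """Mean squared error, a standard accuracy measure."""
    return float(np.mean((pred - truth) ** 2))


def decision_value(pred, truth, k=1):
    """True value obtained by choosing the k items with the highest predictions."""
    chosen = np.argsort(pred)[-k:]
    return float(truth[chosen].sum())


for name, pred in [("A", pred_a), ("B", pred_b)]:
    print(name, "MSE:", mse(pred, true_values),
          "decision value:", decision_value(pred, true_values))
```

Predictor A has the lower mean squared error, yet predictor B leads to the better decision, because only the ranking of the items matters to this particular optimization step.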

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Tailored Presolve Techniques in Branch-and-Bound Method for Fast Mixed-Integer Optimal Control Applications

Authors: Di Cairano Stefano; Quirynen Rien
Publication date: 22/11/2022

Mixed-integer model predictive control (MI-MPC) can be a powerful tool for modeling hybrid control systems. In case of a linear-quadratic objective in combination with linear or piecewise-linear system dynamics and inequality constraints, MI-MPC needs to solve a mixed-integer quadratic program (MIQP) at each sampling time step. This paper presents a collection of block-sparse presolve techniques to efficiently remove decision variables, and to remove or tighten inequality constraints, tailored to mixed-integer optimal control problems (MIOCP). In addition, we describe a novel heuristic approach based on an iterative presolve algorithm to compute a feasible but possibly suboptimal MIQP solution. We present benchmarking results for a C code implementation of the proposed BB-ASIPM solver, including a branch-and-bound (B&B) method with the proposed tailored presolve techniques and an active-set based interior point method (ASIPM), compared against multiple state-of-the-art MIQP solvers on a case study of motion planning with obstacle avoidance constraints. Finally, we demonstrate the computational performance of the BB-ASIPM solver on the dSPACE Scalexio real-time embedded hardware using a second case study of stabilization for an underactuated cart-pole with soft contacts.

Comment: 27 pages, 7 figures, 2 tables, submitted to journal of Optimal Control Applications and Method

arXiv.org e-Print Archive

DeepPermNet: Visual Permutation Learning

Authors: Cherian Anoop; Cruz Rodrigo Santa; Fernando Basura; Gould Stephen
Publication date: 10/04/2017

We present a principled approach to uncover the structure of visual data by solving a novel deep learning task coined visual permutation learning. The goal of this task is to find the permutation that recovers the structure of data from shuffled versions of it. In the case of natural images, this task boils down to recovering the original image from patches shuffled by an unknown permutation matrix. Unfortunately, permutation matrices are discrete, thereby posing difficulties for gradient-based methods. To this end, we resort to a continuous approximation of these matrices using doubly-stochastic matrices which we generate from standard CNN predictions using Sinkhorn iterations. Unrolling these iterations in a Sinkhorn network layer, we propose DeepPermNet, an end-to-end CNN model for this task. The utility of DeepPermNet is demonstrated on two challenging computer vision problems, namely, (i) relative attributes learning and (ii) self-supervised representation learning. Our results show state-of-the-art performance on the Public Figures and OSR benchmarks for (i) and on the classification and segmentation tasks on the PASCAL VOC dataset for (ii).

Comment: Accepted in IEEE International Conference on Computer Vision and Pattern Recognition CVPR 201
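The Sinkhorn iterations mentioned in the abstract can be sketched as alternating row and column normalization of a positive matrix. This is a minimal NumPy illustration, not the DeepPermNet layer itself: the function name `sinkhorn`, the exponentiation step, the iteration count, and the example score matrix are all assumptions chosen for the sketch.

```python
import numpy as np


def sinkhorn(scores, n_iters=100):
    """Approximate a doubly-stochastic matrix by alternating normalization.

    `scores` is a square matrix of real-valued scores (standing in for
    network outputs). Exponentiating is one assumed way to make every
    entry strictly positive before normalizing.
    """
    m = np.exp(scores - scores.max())  # shift by the max for numerical stability
    for _ in range(n_iters):
        m = m / m.sum(axis=1, keepdims=True)  # make each row sum to 1
        m = m / m.sum(axis=0, keepdims=True)  # make each column sum to 1
    return m


# Hypothetical scores that favor the identity permutation.
scores = np.array([[2.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0],
                   [0.0, 0.0, 3.0]])
ds = sinkhorn(scores)
print(ds.sum(axis=0), ds.sum(axis=1))
print(ds.argmax(axis=1))  # row-wise argmax as a crude rounding to a permutation
```

Every step is a differentiable division, which is why unrolling a fixed number of iterations lets gradients flow through the relaxation. Taking the row-wise argmax at the end is only a crude rounding; it is not guaranteed to produce a valid permutation in general.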

arXiv.org e-Print Archive

Crossref

Learning efficiently with approximate inference via dual losses

Authors: Globerson Amir; Jaakkola Tommi S.; Meshi Ofer; Sontag David Alexander
Publication venue: International Machine Learning Society
Publication date: 01/01/2010

Many structured prediction tasks involve complex models where inference is computationally intractable, but where it can be well approximated using a linear programming relaxation. Previous approaches for learning for structured prediction (e.g., cutting-plane, subgradient methods, perceptron) repeatedly make predictions for some of the data points. These approaches are computationally demanding because each prediction involves solving a linear program to optimality. We present a scalable algorithm for learning for structured prediction. The main idea is to instead solve the dual of the structured prediction loss. We formulate the learning task as a convex minimization over both the weights and the dual variables corresponding to each data point. As a result, we can begin to optimize the weights even before completely solving any of the individual prediction problems. We show how the dual variables can be efficiently optimized using coordinate descent. Our algorithm is competitive with state-of-the-art methods such as stochastic subgradient and cutting-plane

CiteSeerX

DSpace@MIT