Task-based End-to-end Model Learning in Stochastic Optimization
With the increasing popularity of machine learning techniques, it has become
common to see prediction algorithms operating within some larger process.
However, the criteria by which we train these algorithms often differ from the
ultimate criteria on which we evaluate them. This paper proposes an end-to-end
approach for learning probabilistic machine learning models in a manner that
directly captures the ultimate task-based objective for which they will be
used, within the context of stochastic programming. We present three
experimental evaluations of the proposed approach: a classical inventory stock
problem, a real-world electrical grid scheduling task, and a real-world energy
storage arbitrage task. We show that the proposed approach can outperform both
traditional modeling and purely black-box policy optimization approaches in
these applications.

Comment: In NIPS 2017. Code available at
https://github.com/locuslab/e2e-model-learnin
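The core idea can be illustrated with a toy sketch (my own, not the paper's method or code; the linear model, costs, and data below are all invented): instead of fitting a demand predictor to a prediction loss, train it by descending the downstream newsvendor task cost directly.

```python
import numpy as np

# Toy end-to-end "newsvendor" sketch: a linear model theta predicts demand
# from a feature x, the decision is to stock z = theta * x, and theta is
# trained by descending the task cost directly rather than a prediction loss.
rng = np.random.default_rng(0)
c_under, c_over = 4.0, 1.0          # cost per unit of shortage / excess

def task_cost(z, d):
    """Newsvendor cost of stocking z when demand turns out to be d."""
    return c_under * np.maximum(d - z, 0) + c_over * np.maximum(z - d, 0)

def task_grad(z, d, x):
    """d(cost)/d(theta) via the chain rule; dz/dtheta = x."""
    dcost_dz = np.where(z >= d, c_over, -c_under)
    return dcost_dz * x

# Synthetic data: demand is roughly 2*x with noise.
x = rng.uniform(1.0, 3.0, size=500)
d = 2.0 * x + rng.normal(0.0, 0.3, size=500)

theta, lr = 0.5, 0.01
initial = task_cost(theta * x, d).mean()
for _ in range(200):                         # stochastic task-loss descent
    i = rng.integers(0, len(x))
    theta -= lr * task_grad(theta * x[i], d[i], x[i])
final = task_cost(theta * x, d).mean()
```

Because shortages cost more than excess here, the learned theta settles slightly above the conditional-mean slope, which is exactly the task-aware behavior a pure prediction loss would miss.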
Fast L1-Minimization Algorithm for Sparse Approximation Based on an Improved LPNN-LCA framework
Sparse approximation aims to estimate a sparse signal from a measurement
matrix and an observation vector. It is widely used in data analytics, image
processing, and communication. Much research has been done in this area, and
many off-the-shelf algorithms have been proposed, but most of them cannot
offer a real-time solution, which limits their application prospects. To
address this issue, we devise a novel sparse approximation algorithm based on
the Lagrange programming neural network (LPNN), the locally competitive
algorithm (LCA), and the projection theorem. LPNN and LCA are both analog
neural networks, which enables real-time solutions. The non-differentiable
objective function is handled using the concept of LCA. Utilizing the
projection theorem, we further modify the dynamics and propose a new system
with global asymptotic stability. Simulation results show that the proposed
sparse approximation method yields real-time solutions with satisfactory MSEs.
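For readers unfamiliar with LCA, the classical dynamics (Rozell et al.'s formulation, not the paper's modified LPNN-LCA system) can be simulated by simple forward-Euler integration; the sizes and seed below are arbitrary.

```python
import numpy as np

# Classical Locally Competitive Algorithm (LCA): integrate analog dynamics
# to solve  min_a 0.5*||b - Phi a||^2 + lam*||a||_1.
rng = np.random.default_rng(1)

def soft_threshold(u, lam):
    return np.sign(u) * np.maximum(np.abs(u) - lam, 0.0)

def lca(Phi, b, lam=0.1, dt=0.05, steps=2000):
    n = Phi.shape[1]
    gram = Phi.T @ Phi - np.eye(n)     # lateral inhibition between atoms
    drive = Phi.T @ b                  # constant input drive
    u = np.zeros(n)                    # membrane (internal) states
    for _ in range(steps):             # forward-Euler integration
        a = soft_threshold(u, lam)
        u += dt * (drive - u - gram @ a)
    return soft_threshold(u, lam)

# Sparse ground truth: 3 nonzeros out of 50, measured by a 20x50 matrix.
Phi = rng.normal(size=(20, 50))
Phi /= np.linalg.norm(Phi, axis=0)     # unit-norm dictionary columns
a_true = np.zeros(50)
a_true[[3, 17, 40]] = [1.0, -0.8, 0.6]
b = Phi @ a_true
a_hat = lca(Phi, b, lam=0.05)
```

At a fixed point of the dynamics, the output `a` satisfies the L1-minimization optimality conditions, which is why an analog circuit implementing these equations yields real-time solutions.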
Solving the L1 regularized least square problem via a box-constrained smooth minimization
In this paper, an equivalent smooth minimization for the L1-regularized least
squares problem is proposed. The reformulated problem is a convex
box-constrained smooth minimization, which allows fast optimization methods to
be applied to find its solution. Further, it is shown that the property "the
dual of the dual is the primal" holds for the L1-regularized least squares
problem. A solver for the smooth problem is proposed, and its affinity to the
proximal gradient method is shown. Finally, experiments on L1- and total
variation-regularized problems are performed, and the corresponding results
are reported.

Comment: 5 pages
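One standard way to smooth the L1 term (a sketch of the general idea; the paper's specific box-constrained reformulation may differ) is the split x = u - v with u, v >= 0, so that ||x||_1 = sum(u + v) and projected gradient descent applies directly.

```python
import numpy as np

# Smooth reformulation of  min_x 0.5*||Ax - b||^2 + lam*||x||_1  via the
# split x = u - v, u >= 0, v >= 0, solved by projected gradient descent.
rng = np.random.default_rng(2)

def l1_ls_projected_gradient(A, b, lam, steps=3000):
    n = A.shape[1]
    u = np.zeros(n)
    v = np.zeros(n)
    # The joint Hessian has largest eigenvalue 2*||A||_2^2, so use 1/L.
    step = 0.5 / np.linalg.norm(A, 2) ** 2
    for _ in range(steps):
        r = A @ (u - v) - b                         # shared residual
        g = A.T @ r
        u = np.maximum(u - step * (g + lam), 0.0)   # project onto u >= 0
        v = np.maximum(v - step * (-g + lam), 0.0)  # project onto v >= 0
    return u - v

A = rng.normal(size=(30, 60))
x_true = np.zeros(60)
x_true[[5, 20, 44]] = [1.5, -1.0, 0.7]
b = A @ x_true
x_hat = l1_ls_projected_gradient(A, b, lam=0.1)
```

The projection here is just elementwise clipping at zero, which is what makes box-constrained smooth reformulations attractive: any fast projected or accelerated gradient scheme applies unchanged.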
A Dual Approach to Scalable Verification of Deep Networks
This paper addresses the problem of formally verifying desirable properties
of neural networks, i.e., obtaining provable guarantees that neural networks
satisfy specifications relating their inputs and outputs (robustness to bounded
norm adversarial perturbations, for example). Most previous work on this topic
was limited in its applicability by the size of the network, network
architecture and the complexity of properties to be verified. In contrast, our
framework applies to a general class of activation functions and specifications
on neural network inputs and outputs. We formulate verification as an
optimization problem (seeking to find the largest violation of the
specification) and solve a Lagrangian relaxation of the optimization problem to
obtain an upper bound on the worst case violation of the specification being
verified. Our approach is anytime, i.e., it can be stopped at any time and a
valid bound on the maximum violation can be obtained. We develop specialized
verification algorithms with provable tightness guarantees under special
assumptions and demonstrate the practical significance of our general
verification approach on a variety of verification tasks.
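To make the notion of a sound upper bound concrete, here is a much simpler relaxation than the paper's Lagrangian dual: interval bound propagation (IBP), which pushes an input box through each layer of a small ReLU network. It is looser, but like the dual bound it never underestimates the worst-case output.

```python
import numpy as np

# Interval bound propagation: sound (over-approximate) output bounds for
# a ReLU network over the input box ||x' - x||_inf <= eps.
rng = np.random.default_rng(3)

def interval_affine(lo, hi, W, b):
    """Exact interval image of the affine map x -> W x + b."""
    center, radius = (lo + hi) / 2.0, (hi - lo) / 2.0
    c = W @ center + b
    r = np.abs(W) @ radius
    return c - r, c + r

def ibp_output_bounds(layers, x, eps):
    lo, hi = x - eps, x + eps
    for i, (W, b) in enumerate(layers):
        lo, hi = interval_affine(lo, hi, W, b)
        if i < len(layers) - 1:              # ReLU on hidden layers only
            lo, hi = np.maximum(lo, 0), np.maximum(hi, 0)
    return lo, hi

# Tiny random 2-layer network with a scalar output.
layers = [(rng.normal(size=(8, 4)), rng.normal(size=8)),
          (rng.normal(size=(1, 8)), rng.normal(size=1))]
x0 = rng.normal(size=4)

def net(x):
    h = np.maximum(layers[0][0] @ x + layers[0][1], 0)
    return layers[1][0] @ h + layers[1][1]

lo, hi = ibp_output_bounds(layers, x0, eps=0.1)
```

Every perturbed input within the box produces an output inside [lo, hi]; the dual approach in the paper tightens such bounds by optimizing Lagrange multipliers instead of propagating intervals.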
Optimization under Uncertainty in the Era of Big Data and Deep Learning: When Machine Learning Meets Mathematical Programming
This paper reviews recent advances in the field of optimization under
uncertainty via a modern data lens, highlights key research challenges and
promise of data-driven optimization that organically integrates machine
learning and mathematical programming for decision-making under uncertainty,
and identifies potential research opportunities. A brief review of classical
mathematical programming techniques for hedging against uncertainty is first
presented, along with their wide spectrum of applications in Process Systems
Engineering. A comprehensive review and classification of the relevant
publications on data-driven distributionally robust optimization, data-driven
chance-constrained programs, data-driven robust optimization, and data-driven
scenario-based optimization is then presented. This paper also identifies
fertile avenues for future research that focus on a closed-loop data-driven
optimization framework, which allows feedback from mathematical programming to
machine learning, as well as on scenario-based optimization leveraging the
power of deep learning techniques. Perspectives on online learning-based,
data-driven multistage optimization with a learning-while-optimizing scheme
are presented.
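The scenario-based paradigm the review surveys can be shown in miniature (a deliberately small example of my own, not drawn from the paper): the uncertain objective is replaced by its sampled counterparts, and the decision is optimized against every drawn scenario.

```python
import numpy as np

# Scenario-based (minimax) optimization: choose the decision minimizing the
# worst-case cost over N sampled scenarios, here cost(z, xi) = |z - xi|.
rng = np.random.default_rng(4)

def scenario_minimax(scenarios, grid):
    """Grid-search decision minimizing the worst-case absolute deviation."""
    worst = np.max(np.abs(grid[:, None] - scenarios[None, :]), axis=1)
    return grid[np.argmin(worst)]

xi = rng.uniform(2.0, 8.0, size=200)        # sampled uncertainty
grid = np.linspace(0.0, 10.0, 1001)
z_star = scenario_minimax(xi, grid)

# For |z - xi|, the minimax decision is the midpoint of the scenario range.
midpoint = (xi.min() + xi.max()) / 2.0
```

Real scenario programs replace the grid search with an LP/MILP over the sampled constraints, but the structure is the same: data enters the optimization only through the drawn scenarios.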
Machine learning approach to chance-constrained problems: An algorithm based on the stochastic gradient descent
We consider chance-constrained problems with a discrete random distribution,
aiming at problems with a large number of scenarios. We propose a novel method
based on stochastic gradient descent which updates the decision variable using
only a few scenarios at a time. We modify it to handle a non-separable
objective. A complexity analysis and a comparison with the standard (batch)
gradient descent method are provided. We give three examples with non-convex
data and show that our method quickly finds a good solution even when the
number of scenarios is large.
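The paper's algorithm is more general, but its core idea can be sketched on the simplest chance constraint: find the smallest x with P(xi <= x) >= 1 - eps, using stochastic-gradient-style updates that look at one sampled scenario per step (the classical Robbins-Monro quantile recursion; all parameters below are invented).

```python
import numpy as np

# One-scenario-at-a-time search for the (1 - eps)-quantile of the
# empirical scenario distribution.
rng = np.random.default_rng(5)

def sgd_quantile(scenarios, eps, lr=0.5, passes=30):
    x = float(np.mean(scenarios))          # crude starting point
    n = len(scenarios)
    for t in range(passes * n):
        xi = scenarios[rng.integers(n)]    # one scenario per update
        step = lr / np.sqrt(t + 1)         # diminishing step size
        # Subgradient of the pinball loss at level 1 - eps.
        x += step * ((1 - eps) - (xi <= x))
    return x

xi = rng.normal(10.0, 2.0, size=2000)
eps = 0.1
x_sga = sgd_quantile(xi, eps)
x_ref = np.quantile(xi, 1 - eps)          # batch answer for comparison
```

Each update touches a single scenario, so the per-iteration cost is independent of the scenario count, which is exactly the regime the paper targets.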
Differentiable MPC for End-to-end Planning and Control
We present foundations for using Model Predictive Control (MPC) as a
differentiable policy class for reinforcement learning in continuous state and
action spaces. This provides one way of leveraging and combining the advantages
of model-free and model-based approaches. Specifically, we differentiate
through MPC by using the KKT conditions of the convex approximation at a fixed
point of the controller. Using this strategy, we are able to learn the cost and
dynamics of a controller via end-to-end learning. Our experiments focus on
imitation learning in the pendulum and cartpole domains, where we learn the
cost and dynamics terms of an MPC policy class. We show that our MPC policies
are significantly more data-efficient than a generic neural network and that
our method is superior to traditional system identification in a setting where
the expert is unrealizable.

Comment: NeurIPS 201
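A one-step toy version conveys the idea of differentiating through a controller (the paper differentiates through full MPC via the KKT conditions of a convex approximation; the dynamics, cost, and names here are my own simplification). The controller solves u*(q) = argmin_u q*u^2 + (x + u)^2, whose first-order condition 2qu + 2(x + u) = 0 gives u* = -x / (1 + q); a cost weight q is then learned by imitation through this closed form.

```python
import numpy as np

# Imitation learning of a controller cost weight by gradient descent
# through the controller's optimality condition.
rng = np.random.default_rng(6)

def controller(q, x):
    return -x / (1.0 + q)          # argmin of q*u^2 + (x + u)^2

def dcontroller_dq(q, x):
    return x / (1.0 + q) ** 2      # derivative of u*(q) w.r.t. q

q_true = 2.0
states = rng.normal(size=200)
expert_u = controller(q_true, states)   # expert actions from true cost

q, lr = 0.1, 0.2
for _ in range(2000):                   # stochastic imitation loss descent
    i = rng.integers(len(states))
    err = controller(q, states[i]) - expert_u[i]
    q -= lr * err * dcontroller_dq(q, states[i])
```

The gradient flows through the controller's solution map rather than through sampled rollouts, which is what makes such policies far more data-efficient than a generic network.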
OptNet: Differentiable Optimization as a Layer in Neural Networks
This paper presents OptNet, a network architecture that integrates
optimization problems (here, specifically in the form of quadratic programs) as
individual layers in larger end-to-end trainable deep networks. These layers
encode constraints and complex dependencies between the hidden states that
traditional convolutional and fully-connected layers often cannot capture. In
this paper, we explore the foundations for such an architecture: we show how
techniques from sensitivity analysis, bilevel optimization, and implicit
differentiation can be used to exactly differentiate through these layers and
with respect to layer parameters; we develop a highly efficient solver for
these layers that exploits fast GPU-based batch solves within a primal-dual
interior point method, and which provides backpropagation gradients with
virtually no additional cost on top of the solve; and we highlight the
application of these approaches in several problems. In one notable example, we
show that the method is capable of learning to play mini-Sudoku (4x4) given
just input and output games, with no a priori information about the rules of
the game; this highlights the ability of our architecture to learn hard
constraints better than other neural architectures.

Comment: ICML 201
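The implicit-differentiation idea can be sketched on an equality-constrained QP (a stripped-down stand-in; OptNet itself differentiates inequality-constrained QPs inside a primal-dual interior point solver). For minimize 0.5 z^T Q z + p^T z subject to A z = b, the KKT system is linear, so both the solution and its Jacobian with respect to p come from solves against the same KKT matrix.

```python
import numpy as np

# Implicit differentiation through the KKT system of an equality-
# constrained QP:  K [z; nu] = [-p; b]  with  K = [[Q, A^T], [A, 0]].
rng = np.random.default_rng(7)

n, m = 5, 2
M = rng.normal(size=(n, n))
Q = M @ M.T + np.eye(n)                  # positive definite objective
A = rng.normal(size=(m, n))
b = rng.normal(size=m)
p = rng.normal(size=n)

K = np.block([[Q, A.T], [A, np.zeros((m, m))]])

def qp_solve(p):
    sol = np.linalg.solve(K, np.concatenate([-p, b]))
    return sol[:n]                       # primal part z*

def qp_jacobian_wrt_p():
    # Differentiating K [z; nu] = [-p; b] gives K [dz; dnu] = [-dp; 0].
    rhs = np.vstack([-np.eye(n), np.zeros((m, n))])
    return np.linalg.solve(K, rhs)[:n]   # n x n Jacobian dz*/dp

J = qp_jacobian_wrt_p()

# Finite-difference check of the first Jacobian column.
h = 1e-6
e0 = np.zeros(n); e0[0] = h
fd = (qp_solve(p + e0) - qp_solve(p - e0)) / (2 * h)
```

Because the backward pass reuses the factorization of the same KKT matrix, the gradients come at virtually no additional cost on top of the solve, which is the property the paper exploits at layer scale.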
The Ontology of Knowledge Based Optimization
Optimization has become a central topic of study in mathematics, with many
areas and diverse applications. However, many optimization themes that arose
in different areas lack close ties to their original concepts. This paper
addresses several variants of optimization problems using an ontology, in
order to build a basic body of knowledge about optimization and then use that
knowledge to enhance strategies for knowledge-based optimization.

Comment: 20 pages, Proceedings of International/National Seminar Matematika
dan Terapan (SiManTap), FMIPA Universitas Sumatera Utara, Medan: 11-3
Optimization Methods for Supervised Machine Learning: From Linear Models to Deep Learning
The goal of this tutorial is to introduce key models, algorithms, and open
questions related to the use of optimization methods for solving problems
arising in machine learning. It is written with an INFORMS audience in mind,
specifically those readers who are familiar with the basics of optimization
algorithms, but less familiar with machine learning. We begin by deriving a
formulation of a supervised learning problem and show how it leads to various
optimization problems, depending on the context and underlying assumptions. We
then discuss some of the distinctive features of these optimization problems,
focusing on the examples of logistic regression and the training of deep neural
networks. The latter half of the tutorial focuses on optimization algorithms,
first for convex logistic regression, for which we discuss the use of
first-order methods, the stochastic gradient method, variance-reducing
stochastic methods, and second-order methods. Finally, we discuss how these
approaches can be applied to the training of deep neural networks, emphasizing
the difficulties that arise from the complex, nonconvex structure of these
models.
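In the tutorial's spirit, the two workhorse algorithms it covers can be contrasted on regularized logistic regression (a minimal sketch of my own; the data, step sizes, and names are invented, not taken from the tutorial).

```python
import numpy as np

# Batch gradient descent vs. the stochastic gradient method on
# L2-regularized logistic regression.
rng = np.random.default_rng(8)

def logistic_loss(w, X, y, lam=0.01):
    z = X @ w
    return np.logaddexp(0.0, -y * z).mean() + 0.5 * lam * w @ w

def grad(w, X, y, lam=0.01):
    z = X @ w
    s = -y * np.exp(-np.logaddexp(0.0, y * z))   # -y * sigmoid(-y z), stable
    return X.T @ s / len(y) + lam * w

# Roughly linearly separable synthetic data.
X = rng.normal(size=(400, 5))
w_true = rng.normal(size=5)
y = np.sign(X @ w_true + 0.1 * rng.normal(size=400))

w_batch = np.zeros(5)
for _ in range(200):                       # one full gradient per step
    w_batch -= 0.5 * grad(w_batch, X, y)

w_sgd = np.zeros(5)
for t in range(2000):                      # one example per step
    i = rng.integers(len(y))
    w_sgd -= (0.5 / np.sqrt(t + 1)) * grad(w_sgd, X[i:i+1], y[i:i+1])
```

The batch method pays a full data pass per update while the stochastic method pays one example per update; which wins in wall-clock terms depends on dataset size and accuracy target, the central trade-off the tutorial examines.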