
    When Deep Learning Meets Polyhedral Theory: A Survey

    In the past decade, deep learning became the prevalent methodology for predictive modeling thanks to the remarkable accuracy of deep neural networks in tasks such as computer vision and natural language processing. Meanwhile, the structure of neural networks converged back to simpler representations based on piecewise constant and piecewise linear functions such as the Rectified Linear Unit (ReLU), which became the most commonly used type of activation function in neural networks. That made certain types of network structure, such as the typical fully-connected feedforward neural network, amenable to analysis through polyhedral theory and to the application of methodologies such as Linear Programming (LP) and Mixed-Integer Linear Programming (MILP) for a variety of purposes. In this paper, we survey the main topics emerging from this fast-paced area of work, which bring a fresh perspective to understanding neural networks in more detail as well as to applying linear optimization techniques to train, verify, and reduce the size of such networks.
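
    As a concrete illustration of the connection, the following is a minimal sketch (not from the survey) of the standard big-M MILP encoding of a single ReLU unit y = max(0, w.x + b), assuming known pre-activation bounds l <= w.x + b <= u. The weights, bounds, and the use of the PuLP modeling interface are illustrative assumptions.

        import pulp

        w, b = [1.0, -2.0], 0.5                 # illustrative weights and bias
        l, u = -3.5, 3.5                        # assumed pre-activation bounds (big-M values)

        prob = pulp.LpProblem("relu_big_m", pulp.LpMinimize)
        x = [pulp.LpVariable(f"x{i}", lowBound=-1, upBound=1) for i in range(2)]
        y = pulp.LpVariable("y", lowBound=0)            # post-activation value
        z = pulp.LpVariable("z", cat=pulp.LpBinary)     # 1 iff the unit is active

        pre = pulp.lpSum(wi * xi for wi, xi in zip(w, x)) + b
        prob += y                       # dummy objective; any LP/MILP query works here
        prob += y >= pre                # y >= w.x + b
        prob += y <= pre - l * (1 - z)  # forces y = w.x + b when z = 1
        prob += y <= u * z              # forces y = 0 when z = 0
        prob.solve()
        print(pulp.value(y))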

    Optimization of low-cost integration of wind and solar power in multi-node electricity systems: Mathematical modelling and dual solution approaches

    The global production of electricity contributes significantly to CO2 emissions. Therefore, a transformation of the electricity system is of vital importance in order to restrict global warming. This thesis concerns modelling and methodology of electricity systems which contain a large share of variable renewable electricity generation (i.e., wind and solar power). The two models developed in this thesis concern optimization of long-term investments in the electricity system. They aim at minimizing investment and production costs under electricity production constraints, using different spatial resolutions and technical detail, while meeting the electricity demand. These models are very large due to 1) the high temporal resolution needed to capture the wind and solar variations while maintaining chronology in time, and 2) the need to cover a large geographical scope in order to represent strategies to manage these variations (e.g., electricity trade). Thus, different decomposition methods are applied to reduce computation times. We develop three different decomposition methods: Lagrangian relaxation combined with variable splitting, solved using either i) a subgradient algorithm or ii) an ADMM algorithm, and iii) a heuristic decomposition using a consensus algorithm. In all three cases, the decomposition is done with respect to the temporal resolution by dividing the year into 2-week periods. The decomposition methods are tested and evaluated for cases involving regions with different energy mixes and conditions for wind and solar power. Numerical results show faster computation times compared to the non-decomposed models and capacity investment options similar to the optimal solutions given by the latter models. However, the reduction in computation time may not be sufficient to motivate the increase in complexity and uncertainty of the decomposed models.
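
    The following is a minimal sketch (not the thesis code) of the first decomposition idea: each two-week period p keeps its own copy c_p of a coupling capacity variable, the consensus constraint c_p = c_bar is relaxed with multipliers, and the dual is maximized by subgradient steps. The quadratic per-period cost with a closed-form minimizer is an illustrative stand-in for the actual investment/dispatch subproblems.

        import numpy as np

        rng = np.random.default_rng(0)
        P = 26                                  # 26 two-week periods in a year
        target = rng.uniform(5, 15, size=P)     # illustrative per-period cost data

        def solve_period(p, lam_p):
            # Toy stand-in for the period-p subproblem:
            # min_c (c - target[p])**2 + lam_p * c has this closed-form minimizer.
            return target[p] - lam_p / 2.0

        lam = np.zeros(P)                       # multipliers for the split c_p = c_bar
        for k in range(100):
            c = np.array([solve_period(p, lam[p]) for p in range(P)])
            c_bar = c.mean()                    # consensus capacity across periods
            lam += 1.0 / (k + 1) * (c - c_bar)  # diminishing-step subgradient update
        print(c.round(2))                       # period copies converge toward c_bar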

    Mixed-integer linearity in nonlinear optimization: a trust region approach

    Bringing together nonlinear optimization with mixed-integer linear constraints enables versatile modeling, but poses significant computational challenges. We investigate a method to solve these problems based on sequential mixed-integer linearization with a trust region safeguard, computing feasible iterates via calls to a generic mixed-integer linear solver. Convergence to critical, possibly suboptimal, feasible points is established for arbitrary starting points. Finally, we present numerical applications in nonsmooth optimal control and optimal network design and operation. Comment: 17 pages, 3 figures, 2 tables
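
    A minimal sketch of such a main loop on a toy problem, assuming a smooth objective over integer points in a box: where the paper calls a generic mixed-integer linear solver for the linearized subproblem, brute-force enumeration stands in here, and the acceptance test and radius updates are illustrative choices.

        import itertools
        import numpy as np

        f = lambda x: (x[0] - 1.3) ** 2 + (x[1] - 0.7) ** 2
        grad = lambda x: np.array([2 * (x[0] - 1.3), 2 * (x[1] - 0.7)])
        feasible = list(itertools.product(range(4), repeat=2))  # integer feasible set

        x = np.array([3.0, 3.0])                # arbitrary starting point
        delta = 2.0                             # trust-region radius (inf-norm)
        for _ in range(30):
            g = grad(x)
            # "MILP" subproblem: minimize the linear model within the trust region.
            cands = [np.array(p) for p in feasible
                     if np.max(np.abs(np.array(p) - x)) <= delta]
            x_new = min(cands, key=lambda p: g @ (p - x))
            pred = g @ (x_new - x)              # predicted (linear-model) decrease
            actual = f(x_new) - f(x)
            if pred < 0 and actual <= 0.5 * pred:
                x, delta = x_new, min(2 * delta, 4.0)   # accept step and expand
            else:
                delta *= 0.5                            # reject step and shrink
            if delta < 1e-3:
                break
        print(x, f(x))                          # ends at the integer minimizer (1, 1)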

    A Safe Approximation Based on Mixed-Integer Optimization for Non-Convex Distributional Robustness Governed by Univariate Indicator Functions

    In this work, we present algorithmically tractable safe approximations of distributionally robust optimization (DRO) problems. The considered ambiguity sets can exploit information on moments as well as confidence sets. Typically, reformulation approaches using duality theory need to make strong assumptions on the structure of the underlying constraints, such as convexity in the decisions or concavity in the uncertainty. In contrast, here we present a duality-based reformulation approach for DRO problems in which the objective of the adversary is allowed to depend on univariate indicator functions. This renders the problem nonlinear and nonconvex. In order to be able to reformulate the semi-infinite constraints nevertheless, an exact reformulation is presented that is approximated by a discretized counterpart. The approximation is realized as a mixed-integer linear problem that yields sufficient conditions for distributional robustness of the original problem. Furthermore, it is proven that with increasingly fine discretizations, the discretized reformulation converges to the original distributionally robust problem. The approach is made concrete for a challenging, fundamental task in particle separation that appears in material design. Computational results for realistic settings show that the safe approximation yields high-quality robust solutions and can be computed within a short time. Comment: 28 pages, 7 figures
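
    A minimal sketch of the discretization step: once the support of the uncertainty is restricted to a finite grid, the adversary's inner problem sup_P E_P[1{xi <= t}] under moment information becomes a finite linear program in the probability masses. The grid, moment values, and threshold t below are illustrative, not taken from the paper.

        import numpy as np
        from scipy.optimize import linprog

        xi = np.linspace(0.0, 10.0, 101)          # discretized support of xi
        t, mean_val, second_ub = 3.0, 5.0, 30.0   # illustrative threshold and moments

        c = -(xi <= t).astype(float)              # maximize P(xi <= t) => minimize -P
        res = linprog(c,
                      A_ub=[xi ** 2], b_ub=[second_ub],         # E[xi^2] <= bound
                      A_eq=[np.ones_like(xi), xi], b_eq=[1.0, mean_val],  # mass, mean
                      bounds=(0, None))           # probability masses p_i >= 0
        print(-res.fun)   # worst-case probability of {xi <= t} under the moment info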

    Numerical Methods for Convex Multistage Stochastic Optimization

    Optimization problems involving sequential decisions in a stochastic environment have been studied in Stochastic Programming (SP), Stochastic Optimal Control (SOC), and Markov Decision Processes (MDP). In this paper we mainly concentrate on the SP and SOC modelling approaches. In these frameworks there are natural situations when the considered problems are convex. The classical approach to sequential optimization is based on dynamic programming, which suffers from the so-called "curse of dimensionality": its computational complexity increases exponentially with the dimension of the state variables. Recent progress in solving convex multistage stochastic problems is based on cutting-plane approximations of the cost-to-go (value) functions of the dynamic programming equations. Cutting-plane-type algorithms in dynamical settings are one of the main topics of this paper. We also discuss stochastic approximation type methods applied to multistage stochastic optimization problems. From the computational complexity point of view, these two types of methods seem to be complementary to each other. Cutting-plane-type methods can handle multistage problems with a large number of stages but a relatively small number of state (decision) variables. On the other hand, stochastic approximation type methods can only deal with a small number of stages but a large number of decision variables.
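
    A minimal sketch of the cutting-plane idea these methods build on: a convex cost-to-go function V is under-approximated by the pointwise maximum of tangent cuts V(x_k) + V'(x_k)(x - x_k) collected at trial points. The quadratic V here is an illustrative stand-in for a dynamic-programming value function.

        import numpy as np

        V = lambda x: (x - 2.0) ** 2 + 1.0        # illustrative convex cost-to-go
        dV = lambda x: 2.0 * (x - 2.0)

        cuts = []                                  # list of (intercept, slope) pairs
        for xk in [-1.0, 0.5, 3.0, 4.0]:           # trial points from forward passes
            cuts.append((V(xk) - dV(xk) * xk, dV(xk)))

        V_lower = lambda x: max(a + b * x for a, b in cuts)  # polyhedral lower model
        for x in np.linspace(0.0, 4.0, 5):
            print(f"x={x:.1f}  V={V(x):.2f}  cut model={V_lower(x):.2f}")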

    Log Barriers for Safe Black-box Optimization with Application to Safe Reinforcement Learning

    Optimizing noisy functions online, when evaluating the objective requires experiments on a deployed system, is a crucial task arising in manufacturing, robotics, and many other domains. Often, constraints on safe inputs are unknown ahead of time, and we only obtain noisy information indicating how close we are to violating them. Yet, safety must be guaranteed at all times, not only for the final output of the algorithm. We introduce a general approach for seeking a stationary point in high-dimensional non-linear stochastic optimization problems in which maintaining safety during learning is crucial. Our approach, called LB-SGD, is based on applying stochastic gradient descent (SGD) with a carefully chosen adaptive step size to a logarithmic barrier approximation of the original problem. We provide a complete convergence analysis for non-convex, convex, and strongly convex smooth constrained problems, with first-order and zeroth-order feedback. Our approach yields efficient updates and scales better with dimensionality compared to existing approaches. We empirically compare the sample complexity and the computational cost of our method with those of existing safe learning approaches. Beyond synthetic benchmarks, we demonstrate the effectiveness of our approach on minimizing constraint violation in policy search tasks in safe reinforcement learning (RL). Comment: 36 pages, 9 pages of appendix
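
    A minimal sketch of the log-barrier idea (not the paper's LB-SGD implementation): the constrained problem min f(x) s.t. g(x) <= 0 is replaced by the barrier f(x) - eta * log(-g(x)), and the step size is capped so an iterate cannot cross the constraint boundary. The problem instance, eta, and the specific step cap are illustrative assumptions.

        import numpy as np

        f = lambda x: np.sum((x - 2.0) ** 2)          # objective
        df = lambda x: 2.0 * (x - 2.0)
        g = lambda x: np.sum(x) - 3.0                 # safety constraint g(x) <= 0
        dg = lambda x: np.ones_like(x)

        eta = 0.1
        x = np.zeros(2)                               # strictly feasible start: g = -3
        for _ in range(200):
            grad = df(x) - eta / g(x) * dg(x)         # gradient of the barrier
            # Cap the step so |change in g| <= 0.5 * (-g): the iterate stays safe.
            step = min(0.1, 0.5 * (-g(x)) /
                       (np.linalg.norm(dg(x)) * np.linalg.norm(grad) + 1e-12))
            x -= step * grad
        print(x, g(x))                                # near the boundary, still safe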

    Moving Horizon Estimation for the Two-tank System

    This thesis presents the application and evaluation of Moving Horizon Estimation (MHE) for the nonlinear two-tank system. MHE is an iterative optimization-based approach that continuously updates the state estimates by solving an optimization problem over a fixed-size, receding horizon. Linear and nonlinear MHE-based estimators are designed and implemented in Matlab for evaluation in a simulation environment and in Simulink for on-line realization and validation. The linear and nonlinear MHE are compared with the Kalman filter and the Extended Kalman filter through extensive simulations and experimental validation, assessing their accuracy, efficiency, and overall performance. The results of the two-tank state and unmeasured disturbance estimation show the benefits of MHE.
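
    A minimal sketch of the MHE recursion for a simplified, Euler-discretized two-tank model (parameters, noise levels, and weights are illustrative, not the thesis implementation): at each time the estimator solves a least-squares problem over the initial state of a receding window, combining an arrival-cost prior with the window's measurement residuals.

        import numpy as np
        from scipy.optimize import least_squares

        dt, a1, a2, u = 0.1, 0.5, 0.4, 0.3
        def step(h):                              # one-step two-tank dynamics
            h1, h2 = np.maximum(h, 0.0)
            return np.array([h1 + dt * (u - a1 * np.sqrt(h1)),
                             h2 + dt * (a1 * np.sqrt(h1) - a2 * np.sqrt(h2))])

        def simulate(h0, n):
            hs = [h0]
            for _ in range(n - 1):
                hs.append(step(hs[-1]))
            return np.array(hs)

        rng = np.random.default_rng(1)
        T, N = 60, 10                             # run length, horizon length
        truth = simulate(np.array([1.0, 0.5]), T)
        y = truth[:, 1] + 0.02 * rng.standard_normal(T)   # noisy level of tank 2

        def residuals(h0, y_win, h_prior):
            hs = simulate(h0, len(y_win))
            meas = (hs[:, 1] - y_win) / 0.02      # weighted measurement residuals
            arr = (h0 - h_prior) / 0.5            # arrival-cost (prior) residual
            return np.concatenate([meas, arr])

        h_prior = np.array([0.8, 0.8])            # deliberately wrong initial guess
        for t in range(N, T + 1):
            sol = least_squares(residuals, h_prior, args=(y[t - N:t], h_prior),
                                bounds=(0.0, np.inf))
            h_prior = step(sol.x)                 # propagate window start one step
        est = simulate(sol.x, N)[-1]              # current-state estimate at t = T-1
        print("estimate:", est.round(3), "truth:", truth[T - 1].round(3))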

    Multi-Fidelity Bayesian Optimization for Efficient Materials Design

    Materials design is a process of identifying compositions and structures to achieve desirable properties. Usually, costly experiments or simulations are required to evaluate the objective function for a design solution. Therefore, one of the major challenges is how to reduce the cost associated with sampling and evaluating the objective. Bayesian optimization is a global optimization method which can increase sampling efficiency with the guidance of a surrogate of the objective. In this work, a new acquisition function, called consequential improvement, is proposed for simultaneous selection of the solution and the fidelity level of sampling. With the new acquisition function, the subsequent iteration is considered for potential selections at low-fidelity levels, because evaluations at the highest fidelity level are usually required to provide reliable objective values. To reduce the number of samples required to train the surrogate for molecular design, a new recursive hierarchical similarity metric is proposed. The new similarity metric quantifies the differences between molecules at multiple levels of hierarchy simultaneously, based on the connections between multiscale descriptions of the structures. The new methodologies are demonstrated with simulation-based design of materials and structures based on fully atomistic and coarse-grained molecular dynamics simulations and finite-element analysis. The new similarity metric is demonstrated in the design of tactile sensors and biodegradable oligomers. The multi-fidelity Bayesian optimization method is also illustrated with the multiscale design of a piezoelectric transducer, by concurrently optimizing the atomic composition of the aluminum titanium nitride ceramic and the device's porous microstructure at the micrometer scale.
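
    A minimal sketch of cost-aware fidelity selection: the thesis's consequential-improvement acquisition is not reproduced here, so a standard expected-improvement-per-cost rule stands in for it, with a single Gaussian process over (design, fidelity) as the surrogate. The objective, the low-fidelity bias, and the costs are all illustrative assumptions.

        import numpy as np
        from scipy.stats import norm
        from sklearn.gaussian_process import GaussianProcessRegressor
        from sklearn.gaussian_process.kernels import RBF

        f_hi = lambda x: np.sin(3 * x) + 0.5 * x        # expensive true objective
        f_lo = lambda x: f_hi(x) + 0.3 * np.cos(5 * x)  # cheap, biased low fidelity
        cost = {0: 1.0, 1: 10.0}                        # fidelity -> evaluation cost

        rng = np.random.default_rng(2)
        X = rng.uniform(0, 3, 8)
        fid = np.array([0] * 6 + [1] * 2)               # mostly cheap samples so far
        y = np.where(fid == 1, f_hi(X), f_lo(X))

        # One GP over (x, fidelity) as a simple multi-fidelity surrogate.
        gp = GaussianProcessRegressor(RBF([1.0, 1.0]), alpha=1e-4)
        gp.fit(np.column_stack([X, fid]), y)

        grid = np.linspace(0, 3, 200)
        best = y[fid == 1].min()                        # incumbent at high fidelity
        for level in (0, 1):
            mu, sd = gp.predict(np.column_stack([grid, np.full_like(grid, level)]),
                                return_std=True)
            z = (best - mu) / np.maximum(sd, 1e-9)
            ei = (best - mu) * norm.cdf(z) + sd * norm.pdf(z)  # expected improvement
            i = np.argmax(ei / cost[level])
            print(f"fidelity {level}: query x={grid[i]:.2f}, "
                  f"EI/cost={ei[i] / cost[level]:.4f}")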

    Structural optimization in steel structures, algorithms and applications

    The abstract is in the attachment.

    Entropic Gromov-Wasserstein Distances: Stability, Algorithms, and Distributional Limits

    The Gromov-Wasserstein (GW) distance quantifies the discrepancy between metric measure spaces, but suffers from computational hardness. The entropic Gromov-Wasserstein (EGW) distance serves as a computationally efficient proxy for the GW distance. Recently, it was shown that the quadratic GW and EGW distances admit variational forms that tie them to the well-understood optimal transport (OT) and entropic OT (EOT) problems. By leveraging this connection, we derive two notions of stability for the EGW problem with the quadratic or inner product cost. The first stability notion enables us to establish convexity and smoothness of the objective in this variational problem. This results in the first efficient algorithms for solving the EGW problem that are subject to formal guarantees in both the convex and non-convex regimes. The second stability notion is used to derive a comprehensive limit distribution theory for the empirical EGW distance and, under additional conditions, asymptotic normality, bootstrap consistency, and semiparametric efficiency thereof. Comment: 66 pages, 3 figures
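
    A minimal sketch of the Sinkhorn iterations for entropic OT, the well-understood subroutine that the variational forms tie the EGW problem to (this is the generic textbook scheme, not the paper's EGW algorithm; the cost matrix and regularization eps are illustrative).

        import numpy as np

        rng = np.random.default_rng(3)
        x, y = rng.standard_normal((5, 2)), rng.standard_normal((7, 2))
        C = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)   # squared-distance cost
        mu, nu = np.full(5, 1 / 5), np.full(7, 1 / 7)        # uniform marginals
        eps = 0.1                                            # entropic regularization

        K = np.exp(-C / eps)                                 # Gibbs kernel
        u = np.ones(5)
        for _ in range(500):                                 # Sinkhorn fixed point
            v = nu / (K.T @ u)
            u = mu / (K @ v)
        plan = u[:, None] * K * v[None, :]                   # entropic OT plan
        print(plan.sum(axis=1) - mu)                         # marginal check ~ 0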