When Deep Learning Meets Polyhedral Theory: A Survey
In the past decade, deep learning became the prevalent methodology for
predictive modeling thanks to the remarkable accuracy of deep neural networks
in tasks such as computer vision and natural language processing. Meanwhile,
the structure of neural networks converged back to simpler representations
based on piecewise constant and piecewise linear functions such as the
Rectified Linear Unit (ReLU), which became the most commonly used type of
activation function in neural networks. That made certain types of network
structure, such as the typical fully-connected feedforward
neural network, amenable to analysis through polyhedral theory
and to the application of methodologies such as Linear Programming (LP) and
Mixed-Integer Linear Programming (MILP) for a variety of purposes. In this
paper, we survey the main topics emerging from this fast-paced area of work,
which bring a fresh perspective to understanding neural networks in more detail
as well as to applying linear optimization techniques to train, verify, and
reduce the size of such networks.
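The polyhedral view rests on the fact that a ReLU neuron is exactly representable with mixed-integer linear constraints. A minimal sketch (my illustration, not code from the survey) of the standard big-M encoding of y = max(0, x) on a bounded domain, verified by brute force:

```python
# Big-M MILP encoding of a single ReLU neuron y = max(0, x) with x in [L, U].
# With a binary indicator z, the constraints
#   y >= x,  y >= 0,  y <= x - L*(1 - z),  y <= U*z
# are satisfied exactly by (y, z) = (max(0, x), 1 if x > 0 else 0).

def relu_big_m_feasible(x, y, z, L=-1.0, U=1.0, tol=1e-9):
    """Check the four big-M constraints for a candidate (x, y, z)."""
    return (y >= x - tol and y >= -tol
            and y <= x - L * (1 - z) + tol
            and y <= U * z + tol)

def check_encoding(L=-1.0, U=1.0, steps=201):
    """Verify the true ReLU output is feasible across the whole domain."""
    for i in range(steps):
        x = L + (U - L) * i / (steps - 1)
        y = max(0.0, x)          # the true ReLU output
        z = 1 if x > 0 else 0    # active/inactive indicator
        assert relu_big_m_feasible(x, y, z, L, U)
    return True
```

Stacking one such block per neuron is what makes whole feedforward ReLU networks amenable to LP/MILP-based verification and compression.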
Optimization of low-cost integration of wind and solar power in multi-node electricity systems: Mathematical modelling and dual solution approaches
The global production of electricity contributes significantly to the release of CO2 emissions. Therefore, a transformation of the electricity system is of vital importance in order to restrict global warming. This thesis concerns the modelling and methodology of electricity systems that contain a large share of variable renewable electricity generation (i.e. wind and solar power).

The two models developed in this thesis concern optimization of long-term investments in the electricity system. They aim at minimizing investment and production costs under electricity production constraints, using different spatial resolutions and levels of technical detail, while meeting the electricity demand. These models are very large due to 1) the high temporal resolution needed to capture wind and solar variations while maintaining chronology in time, and 2) the need to cover a large geographical scope in order to represent strategies to manage these variations (e.g. electricity trade). Thus, different decomposition methods are applied to reduce computation times. We develop three decomposition methods: Lagrangian relaxation combined with variable splitting, solved using either i) a subgradient algorithm or ii) an ADMM algorithm, and iii) a heuristic decomposition using a consensus algorithm. In all three cases, the decomposition is done with respect to the temporal resolution by dividing the year into 2-week periods.

The decomposition methods are tested and evaluated on cases involving regions with different energy mixes and conditions for wind and solar power. Numerical results show faster computation times compared to the non-decomposed models, and capacity investment options similar to the optimal solutions given by the latter models. However, the reduction in computation time may not be sufficient to motivate the increase in complexity and uncertainty of the decomposed models.
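The consensus-style temporal splitting can be sketched on a toy problem (my illustration, not the thesis code): a single shared capacity decision z must agree with a local copy x_t in each period t, each period contributing its own quadratic cost, and consensus ADMM reconciles them.

```python
# Consensus ADMM for temporal decomposition (toy): periods t = 1..T each hold
# a local copy x_t of the shared capacity z, with per-period cost (x_t - d_t)^2.
# The consensus optimum of sum_t (z - d_t)^2 is the mean of the d_t.

def consensus_admm(d, rho=1.0, iters=200):
    T = len(d)
    z = 0.0
    x = [0.0] * T
    u = [0.0] * T            # scaled dual variables, one per period
    for _ in range(iters):
        # local updates: min_x (x - d_t)^2 + (rho/2)(x - z + u_t)^2, closed form
        x = [(2 * d[t] + rho * (z - u[t])) / (2 + rho) for t in range(T)]
        # consensus update: average of the local copies plus duals
        z = sum(x[t] + u[t] for t in range(T)) / T
        # dual updates penalize disagreement x_t != z
        u = [u[t] + x[t] - z for t in range(T)]
    return z

demand = [3.0, 5.0, 4.0, 8.0]   # hypothetical per-period targets
capacity = consensus_admm(demand)
```

Each local update only sees its own period's data, which is what allows the 2-week subproblems to be solved in parallel in the thesis's setting.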
Mixed-integer linearity in nonlinear optimization: a trust region approach
Bringing together nonlinear optimization with mixed-integer linear
constraints enables versatile modeling, but poses significant computational
challenges. We investigate a method to solve these problems based on sequential
mixed-integer linearization with trust region safeguard, computing feasible
iterates via calls to a generic mixed-integer linear solver. Convergence to
critical, possibly suboptimal, feasible points is established for arbitrary
starting points. Finally, we present numerical applications in nonsmooth
optimal control and optimal network design and operation. Comment: 17 pages, 3 figures, 2 tables.
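The main loop can be sketched on a one-dimensional toy problem (my much-simplified stand-in for the paper's method): minimize a smooth nonlinear objective over the integers by repeatedly minimizing its first-order model inside a trust region (here by enumeration instead of a generic MILP solver), accepting a step only if the true objective decreases and shrinking the radius otherwise.

```python
# Sequential linearization with a trust region, on min f(x) over integer x.

def f(x):
    return (x - 2.3) ** 2      # toy nonlinear objective; integer optimum is 2

def df(x):
    return 2.0 * (x - 2.3)

def sequential_linearization(x0=-8, delta=4.0, min_delta=0.25, max_iter=100):
    x = x0
    for _ in range(max_iter):
        g = df(x)
        # integer candidates inside the trust region around the iterate
        lo, hi = int(x - delta), int(x + delta)
        cands = [c for c in range(lo, hi + 1) if abs(c - x) <= delta]
        # minimize the linear model f(x) + g*(c - x) over the candidates
        c = min(cands, key=lambda c: g * (c - x))
        if f(c) < f(x):          # actual decrease: accept the step
            x = c
        else:                    # reject the step and shrink the trust region
            delta /= 2.0
            if delta < min_delta:
                break
    return x
```

As in the paper's framework, the iterates stay feasible (integer) throughout, and the method stops at a critical, possibly suboptimal, point rather than certifying global optimality.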
A Safe Approximation Based on Mixed-Integer Optimization for Non-Convex Distributional Robustness Governed by Univariate Indicator Functions
In this work, we present algorithmically tractable safe approximations of
distributionally robust optimization (DRO) problems. The considered ambiguity
sets can exploit information on moments as well as confidence sets. Typically,
reformulation approaches using duality theory need to make strong assumptions
on the structure of the underlying constraints, such as convexity in the
decisions or concavity in the uncertainty. In contrast, here we present a
duality-based reformulation approach for DRO problems in which the objective of
the adversary is allowed to depend on univariate indicator functions. This
renders the problem nonlinear and nonconvex. To nevertheless reformulate
the semi-infinite constraints, an exact reformulation is presented
that is approximated by a discretized counterpart. The approximation is
realized as a mixed-integer linear problem that yields sufficient conditions
for distributional robustness of the original problem. Furthermore, it is
proven that with increasingly fine discretizations, the discretized
reformulation converges to the original distributionally robust problem. The
approach is made concrete for a challenging, fundamental task in particle
separation that appears in material design. Computational results for realistic
settings show that the safe approximation yields high-quality robust solutions
that can be computed within a short time. Comment: 28 pages, 7 figures.
Numerical Methods for Convex Multistage Stochastic Optimization
Optimization problems involving sequential decisions in a stochastic
environment have been studied in Stochastic Programming (SP), Stochastic Optimal
Control (SOC), and Markov Decision Processes (MDP). In this paper we mainly
concentrate on SP and SOC modelling approaches. In these frameworks there are
natural situations when the considered problems are convex. The classical
approach to sequential optimization is based on dynamic programming, which
suffers from the so-called "curse of dimensionality": its computational
complexity increases exponentially with the dimension of the state
variables. Recent progress in solving convex multistage stochastic problems is
based on cutting-plane approximations of the cost-to-go (value) functions of
dynamic programming equations. Cutting-plane type algorithms in dynamical
settings are one of the main topics of this paper. We also discuss Stochastic
Approximation type methods applied to multistage stochastic optimization
problems. From the computational complexity point of view, these two types of
methods seem to be complementary to each other. Cutting-plane type methods can
handle multistage problems with a large number of stages but a relatively
small number of state (decision) variables. On the other hand, stochastic
approximation type methods can only deal with a small number of stages but a
large number of decision variables.
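The cutting-plane idea can be sketched in one dimension (my illustration in the spirit of Kelley's method, not code from the paper): build an outer, piecewise-linear lower approximation of a convex value function from subgradient cuts, and repeatedly minimize the current model to decide where to add the next cut.

```python
# Kelley-style cutting planes for a convex stand-in cost-to-go function.

def V(x):                       # stand-in convex value function
    return (x - 1.0) ** 2

def dV(x):                      # its (sub)gradient
    return 2.0 * (x - 1.0)

def kelley(lo=0.0, hi=2.0, iters=30, points=401):
    grid = [lo + (hi - lo) * i / (points - 1) for i in range(points)]
    cuts = []                   # each cut: V(y) >= a + b*y
    x = lo                      # first trial point
    for _ in range(iters):
        cuts.append((V(x) - dV(x) * x, dV(x)))      # add cut at current x
        approx = lambda y: max(a + b * y for a, b in cuts)
        x = min(grid, key=approx)                   # minimize the lower model
    return x, approx(x)         # candidate minimizer and its lower bound
```

The model value at the returned point is a valid lower bound on the true minimum, which is exactly how cut collections certify progress in multistage (SDDP-type) schemes.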
Log Barriers for Safe Black-box Optimization with Application to Safe Reinforcement Learning
Optimizing noisy functions online, when evaluating the objective requires
experiments on a deployed system, is a crucial task arising in manufacturing,
robotics and many others. Often, constraints on safe inputs are unknown ahead
of time, and we only obtain noisy information, indicating how close we are to
violating the constraints. Yet, safety must be guaranteed at all times, not
only for the final output of the algorithm.
We introduce a general approach for seeking a stationary point in high
dimensional non-linear stochastic optimization problems in which maintaining
safety during learning is crucial. Our approach called LB-SGD is based on
applying stochastic gradient descent (SGD) with a carefully chosen adaptive
step size to a logarithmic barrier approximation of the original problem. We
provide a complete convergence analysis of non-convex, convex, and
strongly-convex smooth constrained problems, with first-order and zeroth-order
feedback. Our approach yields efficient updates and scales better with
dimensionality compared to existing approaches.
We empirically compare the sample complexity and the computational cost of
our method with existing safe learning approaches. Beyond synthetic benchmarks,
we demonstrate the effectiveness of our approach on minimizing constraint
violation in policy search tasks in safe reinforcement learning (RL). Comment: 36 pages, 9 pages of appendix.
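The barrier mechanism can be sketched deterministically (my simplification: exact gradients instead of noisy feedback, and a cruder step rule than LB-SGD's): minimize f(x) = (x - 2)^2 subject to g(x) = x - 1 <= 0 by descending the log-barrier surrogate f(x) - eta*log(-g(x)), with the step size capped so the iterate can never cross the constraint boundary.

```python
# Log-barrier gradient descent on min (x-2)^2 s.t. x <= 1; all iterates
# stay strictly feasible because each step moves at most half the slack.

def barrier_descent(eta=0.01, lr=0.1, iters=2000, x0=0.0):
    x = x0
    for _ in range(iters):
        slack = 1.0 - x                      # -g(x), must stay > 0
        grad = 2.0 * (x - 2.0) + eta / slack # gradient of the barrier surrogate
        curv = 2.0 + eta / slack ** 2        # its exact second derivative
        # step capped by curvature and by half the remaining slack
        step = min(lr, 1.0 / curv, 0.5 * slack / (abs(grad) + 1e-12))
        x = x - step * grad
    return x

x_star = barrier_descent()
```

The barrier pushes the solution to roughly 1 - eta/2 inside the feasible set; as eta shrinks, it approaches the constrained optimum x = 1 while never violating the constraint, mirroring the "safe at all times" requirement.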
Moving Horizon Estimation for the Two-tank System
This thesis presents the application and evaluation of Moving Horizon Estimation (MHE) for the nonlinear two-tank system. MHE is an iterative optimization-based approach that continuously updates the estimates of the states by solving an optimization problem over a fixed-size, receding horizon. Linear and nonlinear MHE-based estimators are designed and implemented in Matlab for evaluation in a simulation environment, and in Simulink for on-line realization and validation. The linear and nonlinear MHE are evaluated in comparison with the Kalman and Extended Kalman filters through extensive simulations and experimental validation, assessing their accuracy, efficiency, and overall performance. The results of the two-tank state and unmeasured disturbance estimation show the benefit of the MHE.
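The receding-horizon idea can be sketched on a scalar linear system (my toy illustration, far simpler than the two-tank model): at each time, estimate the state by solving a least-squares problem over only the last N measurements of x[k+1] = a*x[k], y[k] = x[k] + noise.

```python
# Scalar moving horizon estimation: fit the window's initial state x0 by
# least squares against y[j] ~ a**j * x0, then propagate to the current time.

def mhe_estimate(ys, a, N):
    """Estimate the current state from the last N measurements."""
    window = ys[-N:]
    num = sum((a ** j) * y for j, y in enumerate(window))
    den = sum((a ** j) ** 2 for j in range(len(window)))
    x0_hat = num / den                          # closed-form least squares
    return (a ** (len(window) - 1)) * x0_hat    # propagate to current time

# simulate the true system with small fixed "noise" for reproducibility
a, x = 0.9, 10.0
noise = [0.05, -0.03, 0.04, -0.02, 0.01, 0.03, -0.04, 0.02]
ys, xs = [], []
for v in noise:
    ys.append(x + v)       # noisy measurement
    xs.append(x)           # true state (kept only for comparison)
    x = a * x

est = mhe_estimate(ys, a=a, N=5)
```

Sliding this window forward at every sample, and adding state constraints and an arrival cost, yields the full nonlinear MHE the thesis implements.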
Multi-Fidelity Bayesian Optimization for Efficient Materials Design
Materials design is a process of identifying compositions and structures to achieve
desirable properties. Usually, costly experiments or simulations are required to evaluate
the objective function for a design solution. Therefore, one of the major challenges is how
to reduce the cost associated with sampling and evaluating the objective. Bayesian
optimization is a new global optimization method that can increase sampling
efficiency with the guidance of a surrogate of the objective. In this work, a new
acquisition function, called consequential improvement, is proposed for simultaneous
selection of the solution and fidelity level of sampling. With the new acquisition function,
the subsequent iteration is considered for potential selections at low-fidelity levels, because
evaluations at the highest fidelity level are usually required to provide reliable objective
values. To reduce the number of samples required to train the surrogate for molecular
design, a new recursive hierarchical similarity metric is proposed. The new similarity
metric quantifies the differences between molecules at multiple levels of hierarchy
simultaneously based on the connections between multiscale descriptions of the structures.
The new methodologies are demonstrated with simulation-based design of materials and
structures based on fully atomistic and coarse-grained molecular dynamics simulations,
and finite-element analysis. The new similarity metric is demonstrated in the design of
tactile sensors and biodegradable oligomers. The multi-fidelity Bayesian optimization
method is also illustrated with the multiscale design of a piezoelectric transducer by
concurrently optimizing the atomic composition of the aluminum titanium nitride ceramic
and the device’s porous microstructure at the micrometer scale.
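The joint choice of candidate and fidelity can be sketched with a generic acquisition rule (standard expected improvement weighted by sampling cost — a common heuristic, not the "consequential improvement" function proposed in this thesis; all names and numbers below are hypothetical):

```python
# Pick both a design candidate and a fidelity level by maximizing expected
# improvement (for minimization) per unit sampling cost, given surrogate
# predictions (mu, sigma) at each candidate.

import math

def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def expected_improvement(mu, sigma, best):
    """EI for minimization: E[max(best - Y, 0)] with Y ~ N(mu, sigma^2)."""
    if sigma <= 0.0:
        return max(best - mu, 0.0)
    z = (best - mu) / sigma
    return (best - mu) * normal_cdf(z) + sigma * normal_pdf(z)

def select(candidates, best, costs):
    """candidates: {fidelity: [(x, mu, sigma), ...]}; maximize EI / cost."""
    score, choice = -1.0, None
    for fid, preds in candidates.items():
        for x, mu, sigma in preds:
            s = expected_improvement(mu, sigma, best) / costs[fid]
            if s > score:
                score, choice = s, (fid, x)
    return choice

best_so_far = 1.2
candidates = {"low": [(0.2, 1.0, 0.5), (0.8, 0.6, 0.8)],
              "high": [(0.5, 0.9, 0.3)]}
costs = {"low": 1.0, "high": 10.0}   # low-fidelity samples are 10x cheaper
choice = select(candidates, best_so_far, costs)   # cheap candidate at x=0.8 wins
```

The thesis's acquisition goes further by also valuing what a cheap sample enables on the *subsequent* iteration, since reliable objective values ultimately require the highest fidelity.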
Structural optimization in steel structures, algorithms and applications
The abstract is in the attachment.
Entropic Gromov-Wasserstein Distances: Stability, Algorithms, and Distributional Limits
The Gromov-Wasserstein (GW) distance quantifies discrepancy between metric
measure spaces, but suffers from computational hardness. The entropic
Gromov-Wasserstein (EGW) distance serves as a computationally efficient proxy
for the GW distance. Recently, it was shown that the quadratic GW and EGW
distances admit variational forms that tie them to the well-understood optimal
transport (OT) and entropic OT (EOT) problems. By leveraging this connection,
we derive two notions of stability for the EGW problem with the quadratic or
inner product cost. The first stability notion enables us to establish
convexity and smoothness of the objective in this variational problem. This
results in the first efficient algorithms for solving the EGW problem that are
subject to formal guarantees in both the convex and non-convex regimes. The
second stability notion is used to derive a comprehensive limit distribution
theory for the empirical EGW distance and, under additional conditions,
asymptotic normality, bootstrap consistency, and semiparametric efficiency
thereof. Comment: 66 pages, 3 figures.
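The entropic OT problems that the EGW variational forms reduce to are solved by Sinkhorn iterations, which can be sketched numerically (a standard textbook implementation, not the authors' EGW algorithm itself):

```python
# Sinkhorn iterations for entropic OT between two discrete measures mu, nu
# with cost matrix `cost` and regularization eps; returns the transport plan.

import math

def sinkhorn(cost, mu, nu, eps=0.1, iters=500):
    n, m = len(mu), len(nu)
    K = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(iters):
        # alternately rescale to match the row and column marginals
        u = [mu[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [nu[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    # transport plan P[i][j] = u[i] * K[i][j] * v[j]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]

points_x = [0.0, 1.0]
points_y = [0.0, 1.0]
cost = [[(a - b) ** 2 for b in points_y] for a in points_x]
P = sinkhorn(cost, [0.5, 0.5], [0.5, 0.5])
```

In the EGW setting the cost matrix itself depends on the current coupling, so such an EOT solve appears as the inner step of an outer iteration over couplings.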