Learning Sparse Classifiers: Continuous and Mixed Integer Optimization Perspectives
We consider a discrete optimization formulation for learning sparse
classifiers, where the outcome depends upon a linear combination of a small
subset of features. Recent work has shown that mixed integer programming (MIP)
can be used to solve (to optimality) $\ell_0$-regularized regression problems
at scales much larger than what was conventionally considered possible. Despite
their usefulness, MIP-based global optimization approaches are significantly
slower than the relatively mature algorithms for $\ell_1$-regularization
and heuristics for nonconvex regularized problems. We aim to bridge this gap in
computation times by developing new MIP-based algorithms for
$\ell_0$-regularized classification. We propose two classes of scalable
algorithms: an exact algorithm that can handle features in a
few minutes, and approximate algorithms that can address instances with
in times comparable to the fast $\ell_1$-based algorithms. Our
exact algorithm is based on the novel idea of integrality generation,
which solves the original problem (with binary variables) via a sequence of
mixed integer programs that involve a small number of binary variables. Our
approximate algorithms are based on coordinate descent and local combinatorial
search. In addition, we present new estimation error bounds for a class of
$\ell_0$-regularized estimators. Experiments on real and synthetic data
demonstrate that our approach leads to models with considerably improved
statistical performance (especially, variable selection) when compared to
competing methods. Comment: To appear in JML
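The coordinate descent idea mentioned in this abstract can be illustrated with a minimal sketch. It makes several assumptions that are not taken from the paper: the squared loss is swapped in for a classification loss to keep the update in closed form, the columns of `X` are assumed to have unit Euclidean norm, and the names `l0_coordinate_descent` and `demo` are hypothetical. Under those assumptions, each coordinate update reduces to hard thresholding at `sqrt(2 * lam)`.

```python
import numpy as np


def l0_coordinate_descent(X, y, lam, n_sweeps=50):
    """Cyclic coordinate descent for 0.5 * ||y - X b||^2 + lam * ||b||_0.

    Assumption (not from the source): every column of X has unit norm,
    so the one-dimensional subproblem has a closed-form solution.
    """
    n, p = X.shape
    beta = np.zeros(p)
    residual = np.asarray(y, dtype=float).copy()
    threshold = np.sqrt(2.0 * lam)
    for _ in range(n_sweeps):
        changed = False
        for j in range(p):
            # Least-squares optimum for coordinate j with the others fixed.
            candidate = X[:, j] @ residual + beta[j]
            # Keeping a nonzero value costs lam, dropping it costs
            # 0.5 * candidate**2, hence the hard threshold.
            new_value = candidate if abs(candidate) > threshold else 0.0
            if new_value != beta[j]:
                residual -= X[:, j] * (new_value - beta[j])
                beta[j] = new_value
                changed = True
        if not changed:
            break
    return beta


def demo():
    """Recover a 3-sparse vector on a synthetic orthonormal design."""
    rng = np.random.default_rng(0)
    # Reduced QR gives a 200 x 20 matrix with orthonormal columns.
    X, _ = np.linalg.qr(rng.standard_normal((200, 20)))
    true_beta = np.zeros(20)
    true_beta[[2, 7, 11]] = [5.0, -4.0, 3.0]
    y = X @ true_beta + 0.01 * rng.standard_normal(200)
    return l0_coordinate_descent(X, y, lam=0.5)
```

Calling `demo()` returns the estimated coefficient vector. On correlated designs a scheme of this kind only reaches a coordinate-wise stationary point, which is presumably why the abstract pairs it with local combinatorial search and an exact MIP method.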
Optimization with mixed-integer, complementarity and bilevel constraints with applications to energy and food markets
In this dissertation, we discuss three classes of nonconvex optimization problems, namely, mixed-integer programming, nonlinear complementarity problems, and mixed-integer bilevel programming.
For mixed-integer programming, we identify a class of cutting planes, namely the class of cutting planes derived from lattice-free cross-polytopes, which are proven to provide good approximations to the problem while being efficient to compute. We show that the closure of these cuts gives an approximation that depends only on the ambient dimension and that the cuts can be computed efficiently by explicitly providing an algorithm to compute the cut coefficients in time, as opposed to solving a nearest lattice-vector problem, which could be much harder.
For complementarity problems, we develop a first-order approximation algorithm to efficiently approximate the covariance of the decision in a stochastic complementarity problem. The method can be used to approximate the covariance for large-scale problems by solving a system of linear equations. We also provide bounds to the error incurred in this technique. We then use the technique to analyze policies related to the North American natural gas market.
Further, we use this branch of nonconvex problems in the Ethiopian food market to analyze the regional effects of exogenous shocks on the market. We develop a detailed model of the food production, transportation, trade, storage, and consumption in Ethiopia, and test it against exogenous shocks. These shocks are motivated by the prediction that teff, a food grain whose export is banned now, could become a super grain. We present the regional effects of different government policies in response to this shock.
For mixed-integer bilevel programming, we develop algorithms that run in polynomial time, provided a subset of the input parameters are fixed. Besides the $\Sigma^p_2$-hardness of the general version of the problem, we show polynomial solvability and NP-completeness of certain restricted versions of this problem.
Finally, we completely characterize the feasible regions represented by each of these different types of nonconvex optimization problems. We show that the representability of linear complementarity problems, continuous bilevel programs, and polyhedral reverse-convex programs is the same, and that it coincides with that of mixed-integer programs if the feasible region is bounded. We also show that the feasible region of any mixed-integer bilevel program is a union of the feasible regions of finitely many mixed-integer programs, up to projections and closures.
Mathematical optimization in deep learning
Mathematical optimization plays a central role in Machine Learning (ML), and Neural Networks (NN), among the most popular and effective ML architectures, are the subject of intense investigation. They have also proven immensely powerful at solving prediction tasks in areas such as speech recognition, image classification, robotics and quantum physics. In this work we present the problem of training a Deep Neural Network (DNN), specifically the continuous optimization problem arising in Feed-Forward Networks with Rectified Linear Unit (ReLU) activation. We then discuss the inverse problem, presenting a model for a trained DNN as a 0-1 Mixed Integer Linear Program (MILP). Some applications, such as feature visualization and the construction of adversarial examples, are outlined. Computational experiments are reported for both the direct and the inverse problem. The remainder of the text contains the AMPL codes used for solving the posed problems.
Universidad de Sevilla. Doble Grado en Física y Matemática
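To make the 0-1 MILP model of a trained network more concrete, here is one standard big-M encoding of a single ReLU unit. It is a sketch, not necessarily the formulation used in the thesis: the pre-activation $a = w^\top x + b$ is assumed to satisfy known bounds $L \le a \le U$ with $L < 0 < U$, and the symbols $a$, $L$, $U$ and $z$ are introduced here for illustration only.

```latex
% Output y = max(0, a) of one ReLU unit with pre-activation a = w^T x + b,
% assuming bounds L <= a <= U with L < 0 < U (an assumption of this sketch).
\begin{align*}
  a &= w^\top x + b, \\
  y &\ge 0, \qquad y \ge a, \\
  y &\le U z, \qquad y \le a - L\,(1 - z), \\
  z &\in \{0, 1\}.
\end{align*}
% z = 1 forces y = a (unit active, requires a >= 0);
% z = 0 forces y = 0 (unit inactive, requires a <= 0).
```

With the weights $w$ and bias $b$ fixed by training, every constraint is linear in $(x, y, z)$, so stacking one such block per unit gives a mixed integer linear model of the whole network.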
On the complexity of nonlinear mixed-integer optimization
This is a survey on the computational complexity of nonlinear mixed-integer
optimization. It highlights a selection of important topics, ranging from
incomputability results that arise from number theory and logic, to recently
obtained fully polynomial time approximation schemes in fixed dimension, and to
strongly polynomial-time algorithms for special cases. Comment: 26 pages, 5 figures; to appear in: Mixed-Integer Nonlinear
Optimization, IMA Volumes, Springer-Verla
The Discrete Dantzig Selector: Estimating Sparse Linear Models via Mixed Integer Linear Optimization
We propose a novel high-dimensional linear regression estimator: the Discrete
Dantzig Selector, which minimizes the number of nonzero regression coefficients
subject to a budget on the maximal absolute correlation between the features
and residuals. Motivated by the significant advances in integer optimization
over the past 10-15 years, we present a Mixed Integer Linear Optimization
(MILO) approach to obtain certifiably optimal global solutions to this
nonconvex optimization problem. The current state of algorithmics in integer
optimization makes our proposal substantially more computationally attractive
than the least squares subset selection framework based on integer quadratic
optimization, recently proposed in [8] and the continuous nonconvex quadratic
optimization framework of [33]. We propose new discrete first-order methods,
which when paired with state-of-the-art MILO solvers, lead to good solutions
for the Discrete Dantzig Selector problem for a given computational budget. We
illustrate that our integrated approach provides globally optimal solutions in
significantly shorter computation times, when compared to off-the-shelf MILO
solvers. We demonstrate both theoretically and empirically that in a wide range
of regimes the statistical properties of the Discrete Dantzig Selector are
superior to those of popular $\ell_1$-based approaches. We illustrate that
our approach can handle problem instances with p = 10,000 features with
certifiable optimality, making it a highly scalable combinatorial variable
selection approach in sparse linear modeling.
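The estimator described in this abstract can be written out as follows. The first display restates the verbal definition (fewest nonzero coefficients subject to a budget $\delta$ on the maximal absolute correlation between features and residuals). The second is a common big-M way to pose such a problem as a mixed integer linear program; the bound $M$ on the coefficient magnitudes and the binary variables $z_j$ are assumptions of this sketch, and the paper's exact formulation may differ.

```latex
% Discrete Dantzig Selector, with x_j the j-th column of X.
\begin{align*}
  \min_{\beta \in \mathbb{R}^p} \ \|\beta\|_0
  \quad \text{s.t.} \quad \|X^\top (y - X\beta)\|_\infty \le \delta .
\end{align*}

% Big-M mixed integer linear reformulation (M is an assumed bound on |beta_j|).
\begin{align*}
  \min_{\beta,\, z} \quad & \sum_{j=1}^{p} z_j \\
  \text{s.t.} \quad & -\delta \le x_j^\top (y - X\beta) \le \delta,
      && j = 1, \dots, p, \\
  & -M z_j \le \beta_j \le M z_j, && j = 1, \dots, p, \\
  & z_j \in \{0, 1\}, && j = 1, \dots, p .
\end{align*}
```

Both the objective and the constraints are linear in $(\beta, z)$, which is what distinguishes this approach from the integer quadratic formulation of least squares subset selection mentioned in the abstract.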
Mixed-Integer Convex Nonlinear Optimization with Gradient-Boosted Trees Embedded
Decision trees usefully represent sparse, high dimensional and noisy data.
Having learned a function from this data, we may want to thereafter integrate
the function into a larger decision-making problem, e.g., for picking the best
chemical process catalyst. We study a large-scale, industrially-relevant
mixed-integer nonlinear nonconvex optimization problem involving both
gradient-boosted trees and penalty functions mitigating risk. This
mixed-integer optimization problem with convex penalty terms broadly applies to
optimizing pre-trained regression tree models. Decision makers may wish to
optimize discrete models to repurpose legacy predictive models, or they may
wish to optimize a discrete model that particularly well-represents a data set.
We develop several heuristic methods to find feasible solutions, and an exact,
branch-and-bound algorithm leveraging structural properties of the
gradient-boosted trees and penalty functions. We computationally test our
methods on a concrete mixture design instance and a chemical catalysis
industrial instance.