28,323 research outputs found

    Learning Sparse Classifiers: Continuous and Mixed Integer Optimization Perspectives

    We consider a discrete optimization formulation for learning sparse classifiers, where the outcome depends upon a linear combination of a small subset of features. Recent work has shown that mixed integer programming (MIP) can be used to solve (to optimality) ℓ0-regularized regression problems at scales much larger than what was conventionally considered possible. Despite their usefulness, MIP-based global optimization approaches are significantly slower than the relatively mature algorithms for ℓ1-regularization and heuristics for nonconvex regularized problems. We aim to bridge this gap in computation times by developing new MIP-based algorithms for ℓ0-regularized classification. We propose two classes of scalable algorithms: an exact algorithm that can handle p ≈ 50,000 features in a few minutes, and approximate algorithms that can address instances with p ≈ 10^6 in times comparable to the fast ℓ1-based algorithms. Our exact algorithm is based on the novel idea of integrality generation, which solves the original problem (with p binary variables) via a sequence of mixed integer programs that involve a small number of binary variables. Our approximate algorithms are based on coordinate descent and local combinatorial search. In addition, we present new estimation error bounds for a class of ℓ0-regularized estimators. Experiments on real and synthetic data demonstrate that our approach leads to models with considerably improved statistical performance (especially variable selection) when compared to competing methods.
    Comment: To appear in JMLR
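    The paper's approximate algorithms are built around coordinate descent and thresholding-type updates for the ℓ0-penalized objective. As a rough illustration of that algorithmic family only (not the authors' solver, and without their local combinatorial search or integrality generation), here is a minimal proximal-gradient/hard-thresholding sketch for ℓ0-regularized logistic regression; the loss, step size, and iteration count are all assumptions.

```python
import numpy as np

def l0_logistic_iht(X, y, lam, n_iter=500):
    """Hard-thresholding sketch for
        min_beta  sum_i log(1 + exp(-y_i * x_i' beta)) + lam * ||beta||_0,
    with labels y in {-1, +1}. Illustrative only."""
    n, p = X.shape
    # Step size from the logistic-loss Lipschitz constant ||X||_2^2 / 4.
    step = 4.0 / (np.linalg.norm(X, 2) ** 2)
    # Keeping a coordinate is worthwhile only if |z_j| > sqrt(2 * lam * step);
    # this is exactly the proximal operator of step * lam * ||.||_0.
    thresh = np.sqrt(2.0 * lam * step)
    beta = np.zeros(p)
    for _ in range(n_iter):
        margins = y * (X @ beta)
        grad = -X.T @ (y / (1.0 + np.exp(margins)))   # gradient of logistic loss
        z = beta - step * grad                        # gradient step
        beta = np.where(np.abs(z) > thresh, z, 0.0)   # hard thresholding
    return beta
```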

    Optimization with mixed-integer, complementarity and bilevel constraints with applications to energy and food markets

    In this dissertation, we discuss three classes of nonconvex optimization problems, namely, mixed-integer programming, nonlinear complementarity problems, and mixed-integer bilevel programming. For mixed-integer programming, we identify a class of cutting planes, namely cuts derived from lattice-free cross-polytopes, which are proven to provide good approximations to the problem while being efficient to compute. We show that the closure of these cuts gives an approximation that depends only on the ambient dimension, and that the cuts can be computed efficiently by explicitly providing an algorithm to compute the cut coefficients in O(n 2^n) time, as opposed to solving a nearest-lattice-vector problem, which could be much harder. For complementarity problems, we develop a first-order approximation algorithm to efficiently approximate the covariance of the decision in a stochastic complementarity problem. The method can be used to approximate the covariance for large-scale problems by solving a system of linear equations, and we provide bounds on the error incurred by this technique. We then use the technique to analyze policies related to the North American natural gas market. Further, we use this class of nonconvex problems to analyze the regional effects of exogenous shocks on the Ethiopian food market. We develop a detailed model of food production, transportation, trade, storage, and consumption in Ethiopia, and test it against exogenous shocks. These shocks are motivated by the prediction that teff, a food grain whose export is currently banned, could become a super grain. We present the regional effects of different government policies in response to this shock. For mixed-integer bilevel programming, we develop algorithms that run in polynomial time provided a subset of the input parameters is fixed. Besides the Σ^p_2-hardness of the general version of the problem, we show polynomial solvability and NP-completeness of certain restricted versions. Finally, we completely characterize the feasible regions represented by each of these types of nonconvex optimization problems. We show that the representability of linear complementarity problems, continuous bilevel programs, and polyhedral reverse-convex programs is the same, and that it coincides with that of mixed-integer programs if the feasible region is bounded. We also show that the feasible region of any mixed-integer bilevel program is a union of the feasible regions of finitely many mixed-integer programs, up to projections and closures.
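    The covariance approximation described above reduces, per the abstract, to solving a system of linear equations. Below is a generic delta-method sketch of that idea under assumed smoothness; the interface and names are hypothetical and this is not the dissertation's algorithm or its error bounds. If the equilibrium x(θ) solves a smooth system F(x, θ) = 0 (e.g., the equations of the locally active pieces of the complementarity system), then the sensitivity J = dx/dθ solves (∂F/∂x) J = −∂F/∂θ, and Cov(x) ≈ J Cov(θ) Jᵀ.

```python
import numpy as np

def first_order_covariance(dF_dx, dF_dtheta, cov_theta):
    """First-order (delta-method) covariance of an implicitly defined
    solution x(theta) of F(x, theta) = 0. Hypothetical interface:
    dF_dx is the (square) Jacobian in x, dF_dtheta the Jacobian in theta,
    cov_theta the covariance matrix of the random parameters theta."""
    # Sensitivity from one linear solve: (dF/dx) J = -(dF/dtheta).
    J = np.linalg.solve(dF_dx, -dF_dtheta)
    # Propagate parameter uncertainty: Cov(x) ~= J Cov(theta) J'.
    return J @ cov_theta @ J.T
```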

    Mathematical optimization in deep learning

    Mathematical optimization plays a central role in Machine Learning (ML), and Neural Networks (NNs) are among the most popular and effective ML architectures, making them the subject of very intense investigation. They have also proven immensely powerful at solving prediction tasks in areas such as speech recognition, image classification, robotics, and quantum physics. In this work we present the problem of training a Deep Neural Network (DNN), specifically the continuous optimization problem arising in feed-forward networks with Rectified Linear Unit (ReLU) activation. We then discuss the inverse problem, presenting a model of a trained DNN as a 0-1 Mixed Integer Linear Program (MILP). Applications such as feature visualization and the construction of adversarial examples are outlined. Computational experiments are reported for both the direct and the inverse problem. The remainder of the text contains the AMPL code used to solve the posed problems. (Universidad de Sevilla, Doble Grado en Física y Matemática.)
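    The thesis works in AMPL; as an illustration of the standard 0-1 MILP building block it describes, the sketch below encodes a single trained ReLU unit y = max(0, w·x + b) with a binary activation indicator and big-M constraints, here in Python/PuLP with made-up weights, a loose bound M, and a toy objective (in practice M is tightened per neuron via bound propagation).

```python
from pulp import LpProblem, LpVariable, LpMaximize, lpSum, LpBinary, PULP_CBC_CMD

# Big-M encoding of one ReLU unit y = max(0, w.x + b); weights, bound M,
# and the objective below are assumptions for illustration.
w, b, M = [1.0, -2.0], 0.5, 100.0
prob = LpProblem("relu_unit", LpMaximize)
x = [LpVariable(f"x{i}", lowBound=-1, upBound=1) for i in range(2)]
y = LpVariable("y", lowBound=0)              # post-activation output
z = LpVariable("z", cat=LpBinary)            # z = 1 iff the unit is active
pre = lpSum(w[i] * x[i] for i in range(2)) + b
prob += y >= pre                             # y dominates the pre-activation
prob += y <= pre + M * (1 - z)               # tight when z = 1 (active)
prob += y <= M * z                           # y forced to 0 when z = 0
prob += y                                    # toy objective: maximize activation
prob.solve(PULP_CBC_CMD(msg=0))
print(y.value(), z.value())
```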

    On the complexity of nonlinear mixed-integer optimization

    This is a survey on the computational complexity of nonlinear mixed-integer optimization. It highlights a selection of important topics, ranging from incomputability results that arise from number theory and logic, to recently obtained fully polynomial-time approximation schemes in fixed dimension, and to strongly polynomial-time algorithms for special cases.
    Comment: 26 pages, 5 figures; to appear in: Mixed-Integer Nonlinear Optimization, IMA Volumes, Springer-Verlag
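    A concrete instance of the incomputability such surveys start from (a standard fact, not a claim specific to this paper): by the Matiyasevich/MRDP resolution of Hilbert's tenth problem, feasibility of a polynomial constraint over the integers is undecidable, so even the following innocuous-looking program admits no general algorithm.

```latex
% Deciding whether this problem is feasible is exactly Hilbert's tenth
% problem, which is undecidable (Matiyasevich, 1970); hence general
% nonlinear integer optimization is incomputable.
\min_{x \in \mathbb{Z}^n} \; 0
\quad \text{s.t.} \quad p(x_1, \dots, x_n) = 0,
\qquad p \in \mathbb{Z}[x_1, \dots, x_n]
```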

    The Discrete Dantzig Selector: Estimating Sparse Linear Models via Mixed Integer Linear Optimization

    We propose a novel high-dimensional linear regression estimator: the Discrete Dantzig Selector, which minimizes the number of nonzero regression coefficients subject to a budget on the maximal absolute correlation between the features and residuals. Motivated by the significant advances in integer optimization over the past 10-15 years, we present a Mixed Integer Linear Optimization (MILO) approach to obtain certifiably optimal global solutions to this nonconvex optimization problem. The current state of algorithmics in integer optimization makes our proposal substantially more computationally attractive than the least squares subset selection framework based on integer quadratic optimization recently proposed in [8], and the continuous nonconvex quadratic optimization framework of [33]. We propose new discrete first-order methods which, when paired with state-of-the-art MILO solvers, lead to good solutions for the Discrete Dantzig Selector problem for a given computational budget. We illustrate that our integrated approach provides globally optimal solutions in significantly shorter computation times when compared to off-the-shelf MILO solvers. We demonstrate both theoretically and empirically that, in a wide range of regimes, the statistical properties of the Discrete Dantzig Selector are superior to those of popular ℓ1-based approaches. We illustrate that our approach can handle problem instances with p = 10,000 features with certifiable optimality, making it a highly scalable combinatorial variable selection approach for sparse linear modeling.
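    The formulation in the abstract translates directly into a small MILO: minimize the count of nonzero coefficients subject to ‖Xᵀ(y − Xβ)‖∞ ≤ δ. The sketch below expresses it with generic big-M indicator constraints in Python/PuLP; the bound M, solver choice, and helper name are assumptions, and the authors' discrete first-order methods and tuned formulation are not reproduced here.

```python
import numpy as np
from pulp import LpProblem, LpVariable, LpMinimize, lpSum, LpBinary, PULP_CBC_CMD

def discrete_dantzig(X, y, delta, M=10.0):
    """Sketch of the Discrete Dantzig Selector as a MILO:
        minimize ||beta||_0  s.t.  ||X'(y - X beta)||_inf <= delta.
    M is an assumed big-M bound on |beta_j|; too small an M cuts off optima."""
    n, p = X.shape
    G, c = X.T @ X, X.T @ y                      # Gram matrix and correlations
    prob = LpProblem("discrete_dantzig", LpMinimize)
    beta = [LpVariable(f"beta{j}", -M, M) for j in range(p)]
    z = [LpVariable(f"z{j}", cat=LpBinary) for j in range(p)]
    prob += lpSum(z)                             # objective: number of nonzeros
    for j in range(p):
        prob += beta[j] <= M * z[j]              # beta_j != 0 forces z_j = 1
        prob += beta[j] >= -M * z[j]
        corr_j = float(c[j]) - lpSum(float(G[j, k]) * beta[k] for k in range(p))
        prob += corr_j <= delta                  # |x_j'(y - X beta)| <= delta
        prob += corr_j >= -delta
    prob.solve(PULP_CBC_CMD(msg=0))
    return np.array([b.value() for b in beta])
```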

    Mixed-Integer Convex Nonlinear Optimization with Gradient-Boosted Trees Embedded

    Decision trees usefully represent sparse, high-dimensional, and noisy data. Having learned a function from this data, we may want to thereafter integrate the function into a larger decision-making problem, e.g., for picking the best chemical process catalyst. We study a large-scale, industrially relevant mixed-integer nonlinear nonconvex optimization problem involving both gradient-boosted trees and penalty functions mitigating risk. This mixed-integer optimization problem with convex penalty terms broadly applies to optimizing pre-trained regression tree models. Decision makers may wish to optimize discrete models to repurpose legacy predictive models, or they may wish to optimize a discrete model that particularly well represents a data set. We develop several heuristic methods to find feasible solutions, and an exact branch-and-bound algorithm that leverages structural properties of the gradient-boosted trees and penalty functions. We computationally test our methods on a concrete mixture design instance and an industrial chemical catalysis instance.
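    The basic ingredient when optimizing over a trained tree ensemble is a MILP encoding of each tree's leaf choice. Here is a minimal sketch for a depth-1 regression tree "if x ≤ s then vL else vR" in Python/PuLP, with hypothetical split, leaf values, bound M, and strictness tolerance eps; a full gradient-boosted ensemble repeats this per tree and sums the leaf outputs.

```python
from pulp import LpProblem, LpVariable, LpMinimize, LpBinary, PULP_CBC_CMD

# Leaf-indicator encoding of a depth-1 regression tree; all constants
# below are made-up numbers for illustration.
s, vL, vR, M, eps = 0.3, 1.5, -0.7, 10.0, 1e-4
prob = LpProblem("tree_milp", LpMinimize)
x = LpVariable("x", lowBound=-1, upBound=1)
z = LpVariable("z", cat=LpBinary)          # z = 1 iff the left leaf is chosen
t = LpVariable("t")                        # tree output
prob += x <= s + M * (1 - z)               # left leaf requires x <= s
prob += x >= s + eps - M * z               # right leaf requires x > s
prob += t == vL * z + vR * (1 - z)         # output equals the chosen leaf value
prob += t                                  # toy objective: minimize prediction
prob.solve(PULP_CBC_CMD(msg=0))
print(x.value(), t.value())
```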