    Mixed-Integer Convex Nonlinear Optimization with Gradient-Boosted Trees Embedded

    Decision trees usefully represent sparse, high-dimensional, and noisy data. Having learned a function from this data, we may want to thereafter integrate the function into a larger decision-making problem, e.g., for picking the best chemical process catalyst. We study a large-scale, industrially relevant mixed-integer nonlinear nonconvex optimization problem involving both gradient-boosted trees and penalty functions mitigating risk. This mixed-integer optimization problem with convex penalty terms broadly applies to optimizing pre-trained regression tree models. Decision makers may wish to optimize discrete models to repurpose legacy predictive models, or they may wish to optimize a discrete model that represents a data set particularly well. We develop several heuristic methods to find feasible solutions, and an exact branch-and-bound algorithm leveraging structural properties of the gradient-boosted trees and penalty functions. We computationally test our methods on a concrete mixture design instance and a chemical catalysis industrial instance.

    Stochastic Training of Neural Networks via Successive Convex Approximations

    This paper proposes a new family of algorithms for training neural networks (NNs). These are based on recent developments in the field of non-convex optimization, going under the general name of successive convex approximation (SCA) techniques. The basic idea is to iteratively replace the original (non-convex, high-dimensional) learning problem with a sequence of (strongly convex) approximations, which are both accurate and simple to optimize. Unlike similar ideas (e.g., quasi-Newton algorithms), the approximations can be constructed using only first-order information of the neural network function, in a stochastic fashion, while exploiting the overall structure of the learning problem for faster convergence. We discuss several use cases, based on different choices for the loss function (e.g., squared loss and cross-entropy loss) and for the regularization of the NN's weights. We experiment on several medium-sized benchmark problems and on a large-scale dataset involving simulated physical data. The results show that the algorithm outperforms state-of-the-art techniques, providing faster convergence to a better minimum. Additionally, we show how the algorithm can be easily parallelized over multiple computational units without hindering its performance. In particular, each computational unit can optimize a tailored surrogate function defined on a randomly assigned subset of the input variables, whose dimension can be selected depending entirely on the available computational power.
    Comment: Preprint submitted to IEEE Transactions on Neural Networks and Learning Systems
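The SCA idea summarized in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the one-parameter model `tanh(w * x)`, the surrogate (squared loss with the model linearized around the current weight, plus a proximal term weighted by `tau`), the constant step size `gamma`, and all function names are assumptions chosen for illustration. The surrogate is strongly convex, uses only first-order information about the model, and is built from a random mini-batch at each step.

```python
import math
import random

def model(w, x):
    # Toy one-parameter "network" (an assumption for illustration).
    return math.tanh(w * x)

def model_grad(w, x):
    # First-order information: derivative of the model output w.r.t. w.
    return x * (1.0 - math.tanh(w * x) ** 2)

def full_loss(w, xs, ys):
    return sum((model(w, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def sca_step(w, batch, tau, gamma):
    # Surrogate in d = w_new - w:  mean_i (r_i + J_i d)^2 + (tau / 2) d^2,
    # where r_i is the residual and J_i the model derivative at w.
    # It is a strongly convex quadratic, so its minimizer is closed-form.
    jr = sum(model_grad(w, x) * (model(w, x) - y) for x, y in batch) / len(batch)
    jj = sum(model_grad(w, x) ** 2 for x, _ in batch) / len(batch)
    d = -2.0 * jr / (2.0 * jj + tau)
    # Move only part of the way toward the surrogate minimizer.
    return w + gamma * d

def train(xs, ys, w0=0.1, tau=0.05, gamma=0.5, batch_size=16, steps=200, seed=0):
    rng = random.Random(seed)
    data = list(zip(xs, ys))
    w = w0
    for _ in range(steps):
        w = sca_step(w, rng.sample(data, batch_size), tau, gamma)
    return w

# Noise-free synthetic data generated with a "true" weight of 1.5.
xs = [-2.0 + 4.0 * i / 99 for i in range(100)]
ys = [math.tanh(1.5 * x) for x in xs]
w_hat = train(xs, ys)
```

Keeping the convex squared loss intact and linearizing only the model is one way to exploit problem structure; a surrogate that linearizes the whole loss would reduce to a plain stochastic gradient step.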

    A Unified Framework for the Study of Anti-Windup Designs

    We present a unified framework for the study of linear time-invariant (LTI) systems subject to control input nonlinearities. The framework is based on the following two-step design paradigm: "Design the linear controller ignoring control input nonlinearities and then add anti-windup bumpless transfer (AWBT) compensation to minimize the adverse effects of any control input nonlinearities on closed loop performance". The resulting AWBT compensation is applicable to multivariable controllers of arbitrary structure and order. All known LTI anti-windup and/or bumpless transfer compensation schemes are shown to be special cases of this framework. It is shown how this framework can handle standard issues such as the analysis of stability and performance with or without uncertainties in the plant model. The actual analysis of stability and performance, and robustness issues, are problems in their own right and hence not detailed here. The main result is the unification of existing schemes for AWBT compensation under a general framework.
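The two-step paradigm quoted in this abstract can be illustrated with a small simulation. This is a sketch of the classical back-calculation anti-windup scheme applied to a PI controller, not the unified framework itself; the first-order plant, the gains, the saturation limit, and the function name `simulate` are all assumptions made for the example. Setting `kt=0` gives the linear controller designed without regard to saturation; a positive `kt` adds the anti-windup correction.

```python
def simulate(kt, kp=2.0, ki=1.0, setpoint=2.0, u_max=1.0, dt=0.01, steps=2000):
    """PI control of the plant x' = -x + sat(u), forward-Euler integration.

    kt is the back-calculation gain; kt = 0 disables anti-windup.
    Returns the final plant state and the final integrator state.
    """
    x = 0.0
    integ = 0.0
    for _ in range(steps):
        e = setpoint - x
        u = kp * e + integ                      # unconstrained controller output
        u_sat = max(-u_max, min(u_max, u))      # actuator saturation
        # Step 2 of the paradigm: feed the saturation mismatch back
        # into the integrator so it cannot grow without bound.
        integ += dt * (ki * e + kt * (u_sat - u))
        x += dt * (-x + u_sat)
    return x, integ

# The setpoint (2.0) is unreachable with |u| <= 1, so the error persists.
x_plain, integ_plain = simulate(kt=0.0)   # integrator winds up
x_aw, integ_aw = simulate(kt=5.0)         # integrator stays bounded
```

With these assumed values the uncompensated integrator keeps growing for as long as the actuator is saturated, while the compensated one settles to a bounded value.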

    Development of the adjoint of GEOS-Chem

    We present the adjoint of the global chemical transport model GEOS-Chem, focusing on the chemical and thermodynamic relationships between sulfate-ammonium-nitrate aerosols and their gas-phase precursors. The adjoint model is constructed from a combination of manually and automatically derived discrete adjoint algorithms and numerical solutions to continuous adjoint equations. Explicit inclusion of the processes that govern secondary formation of inorganic aerosol is shown to afford efficient calculation of model sensitivities such as the dependence of sulfate and nitrate aerosol concentrations on emissions of SOx, NOx, and NH3. The adjoint model is extensively validated by comparing adjoint to finite difference sensitivities, which are shown to agree within acceptable tolerances; most sets of comparisons have a nearly 1:1 correlation and R² > 0.9. We explore the robustness of these results, noting how insufficient observations or nonlinearities in the advection routine can degrade the adjoint model performance. The potential for inverse modeling using the adjoint of GEOS-Chem is assessed in a data assimilation framework through a series of tests using simulated observations, demonstrating the feasibility of exploiting gas- and aerosol-phase measurements for optimizing emission inventories of aerosol precursors.
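The validation strategy mentioned in this abstract, comparing adjoint sensitivities with finite-difference sensitivities, can be sketched on a toy problem. Nothing here comes from GEOS-Chem: the scalar box model (constant emission `e`, quadratic loss rate `k`), the cost function (the final concentration), and all function names are assumptions for illustration only.

```python
def forward(e, x0=1.0, k=0.3, dt=0.1, steps=50):
    # Toy box model: x_{n+1} = x_n + dt * (e - k * x_n^2).
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        xs.append(x + dt * (e - k * x * x))
    return xs

def adjoint_sensitivity(e, x0=1.0, k=0.3, dt=0.1, steps=50):
    # Discrete adjoint of the forward loop for the cost J = x_N.
    xs = forward(e, x0, k, dt, steps)
    lam = 1.0    # dJ/dx_N
    grad = 0.0   # accumulates dJ/de
    for n in range(steps - 1, -1, -1):
        grad += lam * dt                    # d x_{n+1} / d e = dt
        lam *= 1.0 - 2.0 * dt * k * xs[n]   # d x_{n+1} / d x_n
    return grad

def finite_difference_sensitivity(e, h=1e-6, **kwargs):
    # Central difference: two extra forward runs per parameter.
    up = forward(e + h, **kwargs)[-1]
    down = forward(e - h, **kwargs)[-1]
    return (up - down) / (2.0 * h)

adj = adjoint_sensitivity(0.5)
fd = finite_difference_sensitivity(0.5)
rel_diff = abs(adj - fd) / abs(fd)
```

The adjoint needs one backward sweep regardless of how many parameters there are, whereas finite differences need extra forward runs for each parameter, which is why the latter is typically used only as a check.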