    Global Optimization of Gaussian processes

    Gaussian processes (Kriging) are interpolating data-driven models that are frequently applied in various disciplines. Often, Gaussian processes are trained on datasets and are subsequently embedded as surrogate models in optimization problems. These optimization problems are nonconvex and global optimization is desired. However, previous literature observed computational burdens limiting deterministic global optimization to Gaussian processes trained on few data points. We propose a reduced-space formulation for deterministic global optimization with trained Gaussian processes embedded. For optimization, the branch-and-bound solver branches only on the degrees of freedom and McCormick relaxations are propagated through explicit Gaussian process models. The approach also leads to significantly smaller and computationally cheaper subproblems for lower and upper bounding. To further accelerate convergence, we derive envelopes of common covariance functions for Gaussian processes and tight relaxations of acquisition functions used in Bayesian optimization, including expected improvement, probability of improvement, and lower confidence bound. In total, we reduce computational time by orders of magnitude compared to state-of-the-art methods, thus overcoming previous computational burdens. We demonstrate the performance and scaling of the proposed method and apply it to Bayesian optimization with global optimization of the acquisition function and chance-constrained programming. The Gaussian process models, acquisition functions, and training scripts are available open-source within the "MeLOn - Machine Learning Models for Optimization" toolbox (https://git.rwth-aachen.de/avt.svt/public/MeLOn).
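
    The three acquisition functions named in the abstract can be written down in a few lines. The following is a minimal sketch, not code from the MeLOn toolbox: it assumes a minimization setting, assumes the Gaussian process posterior mean `mu` and standard deviation `sigma` at a candidate point are already available, and uses hypothetical function names and a hypothetical default `kappa` chosen only for illustration.

```python
import math


def normal_pdf(z):
    """Standard normal probability density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal cumulative distribution, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_improvement(mu, sigma, f_best):
    """Expected improvement over the best observed value f_best (minimization)."""
    if sigma <= 0.0:
        # No predictive uncertainty: the improvement is deterministic.
        return max(f_best - mu, 0.0)
    z = (f_best - mu) / sigma
    return (f_best - mu) * normal_cdf(z) + sigma * normal_pdf(z)


def probability_of_improvement(mu, sigma, f_best):
    """Probability that the prediction falls below f_best (minimization)."""
    if sigma <= 0.0:
        return 1.0 if mu < f_best else 0.0
    return normal_cdf((f_best - mu) / sigma)


def lower_confidence_bound(mu, sigma, kappa=2.0):
    """Lower confidence bound; kappa (assumed value) trades off exploration."""
    return mu - kappa * sigma


# Posterior at a hypothetical candidate point: mean 0, standard deviation 1.
ei = expected_improvement(mu=0.0, sigma=1.0, f_best=0.0)
pi = probability_of_improvement(mu=0.0, sigma=1.0, f_best=0.0)
lcb = lower_confidence_bound(mu=1.0, sigma=0.5)
```

    In Bayesian optimization, expected improvement and probability of improvement are maximized over the candidate points, while the lower confidence bound is minimized. All three are nonconvex in the inputs of the Gaussian process, which is why the abstract treats tight relaxations of them as a separate contribution.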

    Optimization of Stochastic Discrete Event Simulation Models

    Many systems in logistics can be adequately modeled using stochastic discrete event simulation models. Often these models are used to find a good or optimal configuration of the system, which means that optimization algorithms have to be coupled with the models. Optimization of stochastic simulation models is a challenging research topic: the approaches should be efficient and reliable, and they should provide some guarantee of finding the optimal solution, at least in the limiting case where the probability of finding it converges to 1 as the runtime goes to infinity. The talk gives an overview of the state of the art in simulation optimization. It shows that hybrid algorithms combining global and local optimization methods are currently the best class of optimization approaches in the area, and it outlines the need for the development of software tools including available algorithms.

    Fast global convergence of gradient methods for high-dimensional statistical recovery

    Many statistical $M$-estimators are based on convex optimization problems formed by the combination of a data-dependent loss function with a norm-based regularizer. We analyze the convergence rates of projected gradient and composite gradient methods for solving such problems, working within a high-dimensional framework that allows the data dimension $p$ to grow with (and possibly exceed) the sample size $n$. This high-dimensional structure precludes the usual global assumptions (namely, strong convexity and smoothness conditions) that underlie much of classical optimization analysis. We define appropriately restricted versions of these conditions, and show that they are satisfied with high probability for various statistical models. Under these conditions, our theory guarantees that projected gradient descent has a globally geometric rate of convergence up to the statistical precision of the model, meaning the typical distance between the true unknown parameter $\theta^*$ and an optimal solution $\hat{\theta}$. This result is substantially sharper than previous convergence results, which yielded sublinear convergence, or linear convergence only up to the noise level. Our analysis applies to a wide range of $M$-estimators and statistical models, including sparse linear regression using Lasso ($\ell_1$-regularized regression); group Lasso for block sparsity; log-linear models with regularization; low-rank matrix recovery using nuclear norm regularization; and matrix decomposition. Overall, our analysis reveals interesting connections between statistical precision and computational efficiency in high-dimensional estimation.
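
    To make the composite gradient method concrete for the Lasso case, here is a minimal sketch. It is the textbook proximal-gradient (ISTA) iteration on the unconstrained objective $\frac{1}{2n}\|y - X\theta\|_2^2 + \lambda\|\theta\|_1$, not necessarily the exact variant analyzed in the paper. The function names, the step-size rule (a Frobenius-norm upper bound on the smoothness constant), and the toy data are all assumptions made for illustration.

```python
import math


def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1: shrink each entry toward zero by t."""
    return [math.copysign(max(abs(x) - t, 0.0), x) for x in v]


def composite_gradient_lasso(X, y, lam, n_iter=500):
    """Composite (proximal) gradient iterations for the Lasso objective."""
    n, p = len(X), len(X[0])
    # ||X||_F^2 / n upper-bounds the largest eigenvalue of X^T X / n,
    # so 1 / L is a safe (conservative) step size.
    L = sum(x * x for row in X for x in row) / n
    theta = [0.0] * p
    for _ in range(n_iter):
        # Residual X theta - y and gradient of the least-squares loss.
        resid = [sum(X[i][j] * theta[j] for j in range(p)) - y[i] for i in range(n)]
        grad = [sum(X[i][j] * resid[i] for i in range(n)) / n for j in range(p)]
        # Gradient step on the loss, then the proximal step for the l1 penalty.
        theta = soft_threshold([theta[j] - grad[j] / L for j in range(p)], lam / L)
    return theta


# Toy problem with an identity design, where the minimizer is known in closed form:
# theta_j = soft_threshold(y_j, 2 * lam), because n = 2 scales the loss by 1/2.
X = [[1.0, 0.0], [0.0, 1.0]]
y = [2.0, 0.1]
theta_hat = composite_gradient_lasso(X, y, lam=0.1)
```

    On this toy problem the error in the first coordinate halves at every iteration, a small instance of the geometric convergence the abstract refers to. The second coordinate is set exactly to zero by the soft-thresholding step.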
