176,182 research outputs found

Scalable Bayesian optimization with high-dimensional outputs using randomized prior networks

Author: Bhouri Mohamed Aziz
Joly Michael
Perdikaris Paris
Sarkar Soumalya
Yu Robert
Publication date: 14/02/2023

Several fundamental problems in science and engineering consist of global optimization tasks involving unknown high-dimensional (black-box) functions that map a set of controllable variables to the outcomes of an expensive experiment. Bayesian Optimization (BO) techniques are known to be effective in tackling global optimization problems using a relatively small number of objective function evaluations, but their performance suffers when dealing with high-dimensional outputs. To overcome the major challenge of dimensionality, here we propose a deep learning framework for BO and sequential decision making based on bootstrapped ensembles of neural architectures with randomized priors. Using appropriate architecture choices, we show that the proposed framework can approximate functional relationships between design variables and quantities of interest, even in cases where the latter take values in high-dimensional vector spaces or even infinite-dimensional function spaces. In the context of BO, we augment the proposed probabilistic surrogates with re-parameterized Monte Carlo approximations of multiple-point (parallel) acquisition functions, as well as methodological extensions for accommodating black-box constraints and multi-fidelity information sources. We test the proposed framework against state-of-the-art methods for BO and demonstrate superior performance across several challenging tasks with high-dimensional outputs, including a constrained optimization task involving shape optimization of rotor blades in turbo-machinery.

Comment: 18 pages, 8 figures

arXiv.org e-Print Archive

Discrete and Continuous Optimization Based on Hierarchical Artificial Bee Colony Optimizer

Author: Ben Niu
Hanning Chen
Kunyuan Hu
Lianbo Ma
Maowei He
Yunlong Zhu
Publication venue: 'Hindawi Limited'
Publication date: 01/01/2014

This paper presents a novel optimization algorithm, namely, hierarchical artificial bee colony optimization (HABC), to tackle complex high-dimensional problems. In the proposed multilevel model, the higher-level species can be aggregated from the subpopulations of the lower level. At the bottom level, each subpopulation employs the canonical ABC method to search for the part-dimensional optimum in parallel, and these partial optima can be assembled into a complete solution for the upper level. At the same time, the comprehensive learning method with crossover and mutation operators is applied to enhance the global search ability between species. Experiments are conducted on a set of 20 continuous and discrete benchmark problems. The experimental results demonstrate remarkable performance of the HABC algorithm when compared with six other evolutionary algorithms.

Crossref

Directory of Open Access Journals

Sequential and adaptive Bayesian computation for inference and optimization

Author: Akyildiz Omer Deniz
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 11/03/2019

With the advent of cheap and ubiquitous measurement devices, today more data is measured, recorded, and archived in a relatively short span of time than all data recorded throughout history. Moreover, advances in computation have made it possible to model much more complicated phenomena and to use the vast amounts of data to calibrate the resulting high-dimensional models. In this thesis, we are interested in two fundamental problems which are repeatedly being faced in practice as the dimensions of the models and datasets grow steadily: the problem of inference in high-dimensional models and the problem of optimization when the number of data points is very large. The inference problem gets difficult when the model one wants to calibrate and estimate is defined in a high-dimensional space. The behavior of computational algorithms in high-dimensional spaces is complicated and defies intuition. Computational methods which work accurately for inferring low-dimensional models, for example, may fail to deliver the same performance on high-dimensional models. In recent years, due to the significant interest in high-dimensional models, there has been a plethora of work in signal processing and machine learning to develop computational methods which are robust in high-dimensional spaces. In particular, the high-dimensional stochastic filtering problem has attracted significant attention as it arises in multiple fields which are of crucial importance, such as geophysics, aerospace, and control. A class of algorithms called particle filters has received attention and become a fruitful field of research because of their accuracy and robustness in low-dimensional systems. In short, these methods keep a cloud of particles (samples in a state space), which describe the empirical probability distribution over the state variable of interest.
Particle filters use a model of the phenomenon of interest to propagate and predict the future states and use an observation model to assimilate the observations to correct the state estimates. The most common particle filter, called the bootstrap particle filter (BPF), consists of an iterative sampling-weighting-resampling scheme. However, BPFs also largely fail at inferring high-dimensional dynamical systems for a number of reasons. In this work, we propose a novel particle filter, named the nudged particle filter (NuPF), which specifically aims at improving the performance of particle filters in high-dimensional systems. The algorithm relies on the idea of nudging, which has been widely used in the geophysics literature to tackle high-dimensional inference problems. In particular, in addition to the standard sampling-weighting-resampling steps of the particle filter, we define a general nudging step based on the gradient of the likelihoods, which generalizes some of the nudging schemes proposed in the literature. This step is based on modifying the particles, generated in the sampling step, using the gradients of the likelihoods. In particular, the nudging step moves a fraction of the particles to regions where they have high likelihood. This scheme results in significantly improved behavior in high-dimensional models. The resulting NuPF is able to track high-dimensional systems successfully. Unlike the nudging schemes proposed in the literature, the NuPF does not rely on Gaussianity assumptions and can be defined for a general likelihood. We analytically prove that, because we only move a fraction of the particles and not all of them, the algorithm has a convergence rate that matches standard Monte Carlo algorithms. More precisely, the NuPF has the same asymptotic convergence guarantees as the bootstrap particle filter. As a byproduct, we also show that the nudging step improves the robustness of the particle filter against model misspecification.
In particular, model misspecification occurs when the true data-generating system and the model posed by the user of the algorithm differ significantly. In this case, a majority of computational inference methods fail due to the discrepancy between the modeling assumptions and the observed data. We show that the nudging step increases the robustness of particle filters against model misspecification. Specifically, we prove that the NuPF generates particle systems which have provably higher marginal likelihoods compared to the standard bootstrap particle filter. This theoretical result is attained by showing that the NuPF can be interpreted as a bootstrap particle filter for a modified state-space model. Finally, we demonstrate the empirical behavior of the NuPF with several examples. In particular, we show results on high-dimensional linear state-space models, a misspecified Lorenz 63 model, a high-dimensional Lorenz 96 model, and a misspecified object tracking model. In all examples, the NuPF infers the states successfully. The second problem, the so-called scalability problem in optimization, occurs because of the large number of data points in modern datasets. With the increasing abundance of data, many problems in signal processing, statistical inference, and machine learning turn into large-scale optimization problems. For example, in signal processing, one might be interested in estimating a sparse signal given a large number of corrupted observations. Similarly, maximum-likelihood inference problems in statistics result in large-scale optimization problems. Another significant application domain is machine learning, where all important training methods are defined as optimization problems. To tackle these problems, computational optimization methods developed over the past decades are inefficient since they need to compute function evaluations or gradients over all the data for a single iteration.
For this reason, a class of optimization methods, termed stochastic optimization methods, has emerged. The algorithms of this class are designed to tackle problems which are defined over a large number of data points. In short, these methods utilize a subsample of the dataset in order to update the parameter estimate and do so iteratively until some convergence criterion is met. However, there is a major difficulty that has to be addressed: although the convergence theory for these algorithms is understood, they can have unstable behavior in practice. In particular, the most commonly used stochastic optimization method, namely stochastic gradient descent, can diverge easily if its step-size is poorly set. Over the years, practitioners have developed a number of rules of thumb to alleviate stability issues. We argue in this thesis that one way to develop robust stochastic optimization methods is to frame them as inference methods. In particular, we show that stochastic optimization schemes can be recast and understood as inference algorithms. Framing the problem as an inference problem opens the way to compare these methods to the optimal inference algorithms and understand why they might be failing or producing unstable behavior. In this vein, we show that there is an intrinsic relationship between a class of stochastic optimization methods, called incremental proximal methods, and Kalman (and extended Kalman) filters. The filtering approach to stochastic optimization results in an automatic calibration of the step-size, which removes the instability problems that depend on the step-sizes. The probabilistic interpretation of stochastic optimization problems also paves the way to develop new optimization methods based on strategies which are popular in the inference literature. In particular, one can use a set of sampling methods in order to solve the inference problem and hence obtain the global minimum.
In this manner, we propose a parallel sequential Monte Carlo optimizer (PSMCO), which aims to solve stochastic optimization problems. The PSMCO is designed as a zeroth-order method which does not use gradients. It only uses subsets of the data points in order to move at each iteration. The PSMCO obtains an estimate of a global minimum at each iteration by utilizing a cheap kernel density estimator. We prove that the resulting estimator converges to a global minimum almost surely as the number of Monte Carlo samples tends to infinity. We also empirically demonstrate that the algorithm is able to reconstruct multiple global minima and solve difficult global optimization problems. By further exploiting the relationship between inference and optimization, we also propose a probabilistic and online matrix factorization method, termed the dictionary filter, to solve large-scale matrix factorization problems. Matrix factorization methods have received significant interest from the machine learning community due to their expressive representations of high-dimensional data and the interpretability of their estimates. As the majority of the matrix factorization methods are defined as optimization problems, they suffer from the same issues as stochastic optimization methods. In particular, when using stochastic gradient descent, one might need to go through much trial and error before settling on a step-size. To alleviate these problems, we introduce a matrix-variate probabilistic model for which inference results in a matrix factorization scheme. The scheme is online, in the sense that it only uses a single data point at a time to update the factors. The algorithm is related to optimization schemes, namely to the incremental proximal method defined over a matrix-variate cost function.
Using the intuition we developed for the optimization-inference relationship, we devise a model which results in similar update rules for matrix factorization as for the incremental proximal method. However, the probabilistic updates are more stable and efficient. Moreover, the algorithm does not have a step-size parameter to tune, as its role is played by the posterior covariance matrix. We demonstrate the utility of the algorithm on a missing data problem and a video processing problem. We show that the algorithm can be successfully used in machine learning problems and that several promising extensions of the method can be constructed easily.

Programa Oficial de Doctorado en Multimedia y Comunicaciones. Committee chair: Ricardo Cao Abad. Secretary: Michael Peter Wiper. Member: Nicholas Paul Whitele
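
The sampling-weighting-resampling loop of the bootstrap particle filter described in this abstract can be sketched as follows. This is a minimal illustration, not the thesis's NuPF: the one-dimensional linear-Gaussian model, the parameter names (`a`, `q`, `r`, `nudge_fraction`, `nudge_step`), and the plain gradient-ascent form of the nudging step are all assumptions chosen for brevity. With `nudge_fraction=0.0` it reduces to a standard bootstrap filter.

```python
import math
import random


def gaussian_logpdf(x, mean, std):
    """Log-density of a univariate Gaussian."""
    return -0.5 * math.log(2 * math.pi * std * std) - 0.5 * ((x - mean) / std) ** 2


def simulate(n_steps, a=0.9, q=1.0, r=0.5, seed=1):
    """Assumed toy model: x_t = a*x_{t-1} + N(0, q^2), y_t = x_t + N(0, r^2)."""
    rng = random.Random(seed)
    x, states, observations = 0.0, [], []
    for _ in range(n_steps):
        x = a * x + rng.gauss(0.0, q)
        states.append(x)
        observations.append(x + rng.gauss(0.0, r))
    return states, observations


def particle_filter(observations, n_particles=200, a=0.9, q=1.0, r=0.5,
                    nudge_fraction=0.0, nudge_step=0.1, seed=0):
    """Return the filtered posterior-mean estimate of the state at each step."""
    rng = random.Random(seed)
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    estimates = []
    for y in observations:
        # 1. Sampling: propagate every particle through the state model.
        particles = [a * x + rng.gauss(0.0, q) for x in particles]

        # 2. Nudging (illustrative assumption): move a random fraction of the
        #    particles one gradient-ascent step on the log-likelihood of y.
        n_nudged = int(nudge_fraction * n_particles)
        for i in rng.sample(range(n_particles), n_nudged):
            grad = (y - particles[i]) / (r * r)  # d/dx log N(y; x, r^2)
            particles[i] += nudge_step * grad

        # 3. Weighting: normalized likelihood weights, computed in log space.
        log_w = [gaussian_logpdf(y, x, r) for x in particles]
        max_lw = max(log_w)
        weights = [math.exp(lw - max_lw) for lw in log_w]
        total = sum(weights)
        weights = [w / total for w in weights]
        estimates.append(sum(w * x for w, x in zip(weights, particles)))

        # 4. Resampling: multinomial resampling proportional to the weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return estimates


if __name__ == "__main__":
    truth, obs = simulate(50)
    for frac in (0.0, 0.2):
        est = particle_filter(obs, nudge_fraction=frac)
        rmse = math.sqrt(sum((e - t) ** 2 for e, t in zip(est, truth)) / len(truth))
        print(f"nudge_fraction={frac}: RMSE={rmse:.3f}")
```

In this sketch the weights are computed from the likelihood alone, whether or not a particle was nudged; whether that matches the thesis's scheme in every detail should be checked against the thesis itself.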

Universidad Carlos III de Madrid e-Archivo

Learning the optimum as a Nash equilibrium

Author: Alemdar N. M.
Ozyildirim S.
Publication venue: 'Elsevier BV'
Publication date: 01/01/2000

This paper shows the computational benefits of a game theoretic approach to optimization of high dimensional control problems. A dynamic noncooperative game framework is adopted to partition the control space and to search for the optimum as the equilibrium of a k-person dynamic game played by k parallel genetic algorithms (GAs). When there are multiple inputs, we delegate control authority over a set of control variables exclusively to one player so that k artificially intelligent players explore and communicate to learn the global optimum as the Nash equilibrium. In the case of a single input, each player's decision authority becomes active on exclusive sets of dates, so that k GAs construct the optimal control trajectory as the equilibrium of evolving best-to-date responses. Sample problems are provided to demonstrate the gains in computational speed and accuracy. (C) 2000 Elsevier Science B.V. All rights reserved.

Bilkent University Institutional Repository

The reparameterization trick for acquisition functions

Author: Deisenroth MP
Hutter F
Moriconi R
Wilson JT
Publication date: 01/12/2017

Bayesian optimization is a sample-efficient approach to solving global optimization problems. Along with a surrogate model, this approach relies on theoretically motivated value heuristics (acquisition functions) to guide the search process. Maximizing acquisition functions yields the best performance; unfortunately, this ideal is difficult to achieve since optimizing acquisition functions per se is frequently non-trivial. This statement is especially true in the parallel setting, where acquisition functions are routinely non-convex, high-dimensional, and intractable. Here, we demonstrate how many popular acquisition functions can be formulated as Gaussian integrals amenable to the reparameterization trick and, ensuingly, gradient-based optimization. Further, we use this reparameterized representation to derive an efficient Monte Carlo estimator for the upper confidence bound acquisition function in the context of parallel selection.
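
To make the reparameterization idea in this abstract concrete, here is a minimal Monte Carlo sketch of a parallel upper confidence bound. It assumes the commonly used form E[max_i(mu_i + sqrt(beta*pi/2) * |gamma_i|)] with gamma ~ N(0, Sigma); whether this matches the paper's estimator term for term should be checked against the paper. Posterior samples are written as gamma = L * eps, where L is the Cholesky factor of Sigma and eps is standard normal. The function names are hypothetical, and the posterior mean and covariance are supplied by hand rather than by a surrogate model.

```python
import math
import random


def cholesky(matrix):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower


def draw_base_samples(n_samples, q, seed=0):
    """Standard-normal base samples eps, drawn once and then held fixed."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(q)] for _ in range(n_samples)]


def parallel_ucb(mean, cov, base_samples, beta=4.0):
    """Monte Carlo estimate of the assumed parallel UCB for q candidate points.

    mean: posterior means at the q points; cov: their q x q posterior covariance.
    """
    lower = cholesky(cov)
    scale = math.sqrt(beta * math.pi / 2.0)
    total = 0.0
    for eps in base_samples:
        # Reparameterization: gamma = L @ eps is distributed as N(0, cov).
        gamma = [sum(lower[i][k] * eps[k] for k in range(i + 1))
                 for i in range(len(mean))]
        total += max(m + scale * abs(g) for m, g in zip(mean, gamma))
    return total / len(base_samples)


if __name__ == "__main__":
    # Sanity check for q = 1: the estimate should be close to mu + sqrt(beta) * sigma.
    base = draw_base_samples(20000, 1)
    print(parallel_ucb([1.0], [[0.25]], base, beta=4.0), "vs", 1.0 + 2.0 * 0.5)
```

Because the base samples are held fixed, the estimate is a deterministic function of the posterior mean and Cholesky factor, which is the property that makes gradient-based optimization possible. The sketch computes only the value; in practice an automatic differentiation library would supply the gradients.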

arXiv.org e-Print Archive

Spiral - Imperial College Digital Repository

Linear Phase FIR Digital Filter Design Using Differential Evolution Algorithms

Author: Zhong Wei
Publication venue: 'University of Windsor Leddy Library'
Publication date: 13/04/2017

Digital filters play a vital part in digital signal processing. They have been used in control systems, aerospace, telecommunications, medical applications, speech processing, and so on. Digital filters can be divided into infinite impulse response (IIR) filters and finite impulse response (FIR) filters. The advantage of an FIR filter is that it can have linear phase by using symmetric or anti-symmetric coefficients. Besides traditional methods like windowing and frequency sampling, optimization methods can be used to design FIR filters. A common method for FIR filter design is the Parks-McClellan algorithm. Meanwhile, evolutionary algorithms such as the Genetic Algorithm (GA), Particle Swarm Optimization (PSO) [2], and Differential Evolution (DE) have shown success in solving multi-parameter optimization problems. This thesis reports a comparison of PSO, DE, and two modified DE algorithms from [18] and [19] for designing six types of linear phase FIR filters: type 1 lowpass, highpass, bandpass, and bandstop filters, and type 2 lowpass and bandpass filters. Although PSO has been applied in this field for some years, the results of some of the designs, especially for high-dimensional filters, are not good enough when compared with those of the Parks-McClellan algorithm. DE algorithms use parallel search techniques to explore optimal solutions in a global range. Moreover, when facing higher-dimensional filter design problems, by combining the knowledge acquired during the search process, the DE algorithm shows an obvious advantage in both frequency response and computational time.
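
As a rough illustration of the approach this abstract describes, the sketch below uses a textbook DE/rand/1/bin loop to fit a type 1 (odd length, symmetric) linear phase lowpass FIR filter. It is not the thesis's method or either of its modified DE variants. The band edges, grid size, least-squares cost, and DE settings (`f`, `cr`, population size) are assumptions chosen only to keep the example small. Symmetry is enforced by optimizing just half of the taps and mirroring them.

```python
import math
import random


def amplitude_response(half, omega):
    """Zero-phase response of a type 1 filter: A(w) = c0 + 2 * sum_k c_k cos(k w).

    half[0] is the center tap; half[k] is the tap k positions from the center.
    """
    return half[0] + 2.0 * sum(half[k] * math.cos(k * omega)
                               for k in range(1, len(half)))


def lowpass_error(half, passband_edge=0.4 * math.pi,
                  stopband_edge=0.6 * math.pi, n_grid=40):
    """Sum of squared deviations from an ideal lowpass, ignoring the transition band."""
    error = 0.0
    for i in range(n_grid + 1):
        omega = math.pi * i / n_grid
        if omega <= passband_edge:
            desired = 1.0
        elif omega >= stopband_edge:
            desired = 0.0
        else:
            continue
        error += (amplitude_response(half, omega) - desired) ** 2
    return error


def differential_evolution(cost, dim, pop_size=30, generations=150,
                           f=0.6, cr=0.9, seed=0):
    """Textbook DE/rand/1/bin; returns (best vector, best cost, best-cost history)."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(pop_size)]
    costs = [cost(ind) for ind in pop]
    history = [min(costs)]
    for _ in range(generations):
        for i in range(pop_size):
            # Mutation: combine three distinct individuals other than i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)  # at least one gene comes from the mutant
            # Binomial crossover between the mutant and the current individual.
            trial = [
                pop[a][d] + f * (pop[b][d] - pop[c][d])
                if (rng.random() < cr or d == forced) else pop[i][d]
                for d in range(dim)
            ]
            # Greedy selection: keep the trial only if it is no worse.
            trial_cost = cost(trial)
            if trial_cost <= costs[i]:
                pop[i], costs[i] = trial, trial_cost
        history.append(min(costs))
    best = min(range(pop_size), key=costs.__getitem__)
    return pop[best], costs[best], history


def full_taps(half):
    """Mirror the half-vector into the symmetric impulse response h[n] = h[N-1-n]."""
    return half[:0:-1] + half


if __name__ == "__main__":
    half, cost, history = differential_evolution(lowpass_error, dim=6)
    print("taps:", [round(t, 4) for t in full_taps(half)])
    print("initial best error:", history[0], "final error:", cost)
```

Optimizing only the `dim` free coefficients guarantees linear phase by construction, since any candidate the optimizer visits is mirrored into a symmetric filter of length `2 * dim - 1`.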

Scholarship at UWindsor

A Parallel Divide-and-Conquer based Evolutionary Algorithm for Large-scale Optimization

Author: Tang Ke
Yang Peng
Yao Xin
Publication date: 06/12/2018

Large-scale optimization problems that involve thousands of decision variables have arisen extensively in various industrial areas. Although they are a powerful optimization tool for many real-world applications, evolutionary algorithms (EAs) fail to solve the emerging large-scale problems both effectively and efficiently. In this paper, we propose a novel Divide-and-Conquer (DC) based EA that not only produces high-quality solutions by solving sub-problems separately, but also makes full use of parallel computing by solving the sub-problems simultaneously. Existing DC-based EAs that were deemed to enjoy the same advantages as the proposed algorithm are shown to be practically incompatible with the parallel computing scheme, unless some trade-offs are made by compromising the solution quality.

Comment: 12 pages, 0 figures

arXiv.org e-Print Archive