Mixed-Variable Bayesian Optimization
The optimization of expensive-to-evaluate, black-box, mixed-variable
functions, i.e. functions that have continuous and discrete inputs, is a
difficult and yet pervasive problem in science and engineering. In Bayesian
optimization (BO), special cases of this problem that consider fully continuous
or fully discrete domains have been widely studied. However, few methods exist
for mixed-variable domains and none of them can handle discrete constraints
that arise in many real-world applications. In this paper, we introduce MiVaBo,
a novel BO algorithm for the efficient optimization of mixed-variable functions
combining a linear surrogate model based on expressive feature representations
with Thompson sampling. We propose an effective method to optimize its
acquisition function, a challenging problem for mixed-variable domains, making
MiVaBo the first BO method that can handle complex constraints over the
discrete variables. Moreover, we provide the first convergence analysis of a
mixed-variable BO algorithm. Finally, we show that MiVaBo is significantly more
sample efficient than state-of-the-art mixed-variable BO algorithms on several
hyperparameter tuning tasks, including the tuning of deep generative models.
Comment: IJCAI 2020 camera-ready; 17 pages, extended version with supplementary material
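The core loop this abstract describes, a Bayesian linear surrogate over feature representations of the mixed inputs plus Thompson sampling, can be sketched as follows. This is a minimal illustration, not the authors' MiVaBo code: the quadratic feature map and the finite candidate set are stand-ins for the paper's expressive features and its specialized, constraint-aware acquisition optimizer.

```python
import numpy as np

def features(x_cont, x_disc):
    """Toy feature map for a mixed input: bias, linear, and squared
    terms (a stand-in for MiVaBo's expressive feature representations)."""
    x = np.concatenate([x_cont, x_disc])
    return np.concatenate([[1.0], x, x ** 2])

def thompson_step(Phi, y, cand_feats, noise=0.1, prior=1.0):
    """One Thompson-sampling step with a Bayesian linear surrogate.

    With y = Phi w + eps, the Gaussian posterior over w is available in
    closed form; we draw one w and maximize the sampled linear model over
    a finite candidate set (a placeholder for the paper's specialized
    acquisition optimizer, which is what lets MiVaBo respect discrete
    constraints)."""
    d = Phi.shape[1]
    Sigma = np.linalg.inv(np.eye(d) / prior ** 2 + Phi.T @ Phi / noise ** 2)
    mu = Sigma @ Phi.T @ y / noise ** 2
    w = np.random.multivariate_normal(mu, Sigma)  # one posterior draw
    return cand_feats[np.argmax(cand_feats @ w)]  # best candidate's features
```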
Bayesian optimization under mixed constraints with a slack-variable augmented Lagrangian
An augmented Lagrangian (AL) can convert a constrained optimization problem
into a sequence of simpler (e.g., unconstrained) problems, which are then
usually solved with local solvers. Recently, surrogate-based Bayesian
optimization (BO) sub-solvers have been successfully deployed in the AL
framework for a more global search in the presence of inequality constraints;
however, a drawback was that expected improvement (EI) evaluations relied on
Monte Carlo. Here we introduce an alternative slack variable AL, and show that
in this formulation the EI may be evaluated with library routines. The slack
variables furthermore facilitate equality as well as inequality constraints,
and mixtures thereof. We show how our new slack "ALBO" compares favorably to
the original. Its superiority over conventional alternatives is reinforced on
several mixed-constraint examples.
Comment: 24 pages, 5 figures
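A minimal sketch of the slack-variable reformulation the abstract builds on, under one common sign convention (the BO sub-solver and EI evaluation are omitted): each inequality g_j(x) <= 0 gains a slack s_j >= 0, turning it into an equality, and for fixed x the optimal slack has a closed form.

```python
import numpy as np

def slack_al(x, f, gs, lam, rho):
    """Slack-variable augmented Lagrangian for min f(x) s.t. g_j(x) <= 0.

    Each inequality gains a slack s_j >= 0, giving the equality
    g_j(x) + s_j = 0.  For fixed x, minimizing the AL over s >= 0 has the
    closed-form solution s_j = max(0, -rho*lam_j - g_j(x)), so no inner
    optimization over the slacks is needed."""
    g = np.array([gj(x) for gj in gs])
    s = np.maximum(0.0, -rho * lam - g)   # optimal slacks, closed form
    r = g + s                             # equality residuals
    return f(x) + lam @ r + (r @ r) / (2.0 * rho)
```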
Hybrid Models for Mixed Variables in Bayesian Optimization
This paper presents a new class of hybrid models for Bayesian optimization
(BO) adept at managing mixed variables, encompassing both quantitative
(continuous and integer) and qualitative (categorical) types. Our proposed
hybrid models merge a Monte Carlo Tree Search (MCTS) structure for categorical
variables with Gaussian processes (GPs) for continuous ones. To address
efficiency in the search phase, we compare the original (frequentist) upper
confidence bound tree search (UCTS) with a Bayesian Dirichlet search strategy,
showcasing the tree architecture's integration into Bayesian optimization.
Central to our innovation in the surrogate modeling phase is online kernel
selection for mixed-variable BO. Our innovations, including dynamic
kernel selection, unique UCTS (hybridM) and Bayesian update strategies
(hybridD), position our hybrid models as an advancement in mixed-variable
surrogate models. Numerical experiments underscore the hybrid models'
superiority, highlighting their potential in Bayesian optimization.
Comment: 32 pages, 8 figures
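A rough sketch of the hybrid idea, assuming scikit-learn for the GP part (the names HybridNode and ucts_select are illustrative, not the paper's API): categorical choices act as arms of a UCB tree search, and each arm carries its own GP over the continuous variables.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

class HybridNode:
    """One categorical choice, holding its own GP over the continuous
    variables (an illustrative stand-in for the paper's tree leaves)."""
    def __init__(self):
        self.X, self.y = [], []
        self.gp = GaussianProcessRegressor(normalize_y=True)

    def update(self, x_cont, y):
        self.X.append(x_cont)
        self.y.append(y)
        self.gp.fit(np.array(self.X), np.array(self.y))

def ucts_select(nodes, c=1.0):
    """Frequentist UCB over the categorical arms: empirical mean plus an
    exploration bonus, mirroring the UCTS strategy (the Bayesian
    Dirichlet variant would replace this rule with posterior sampling)."""
    N = sum(len(n.y) for n in nodes) + 1
    def ucb(n):
        if not n.y:
            return np.inf  # visit every category at least once
        return np.mean(n.y) + c * np.sqrt(np.log(N) / len(n.y))
    return max(range(len(nodes)), key=lambda k: ucb(nodes[k]))
```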
A general mathematical framework for constrained mixed-variable blackbox optimization problems with meta and categorical variables
A mathematical framework for modelling constrained mixed-variable
optimization problems is presented in a blackbox optimization context. The
framework introduces a new notation and enables a range of solution
strategies. The notation allows meta and categorical variables to be modelled
explicitly and efficiently, which facilitates the solution of such problems.
The new term "meta variables" describes variables that influence which other
variables are acting or nonacting: meta variables may affect the number of
variables and constraints. The flexibility of the solution strategies supports
the main blackbox mixed-variable optimization approaches: direct search methods
and surrogate-based methods (Bayesian optimization). The notation system and
solution strategies are illustrated through an example of a hyperparameter
optimization problem from the machine learning community.
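A toy illustration of the meta-variable notion in the paper's hyperparameter-tuning setting (the class and field names are hypothetical): the meta variable decides which other variables are acting, and therefore how many variables the problem has.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class MixedPoint:
    """Hypothetical hyperparameter point: the meta variable `optimizer`
    decides which of the remaining variables are acting."""
    optimizer: str                    # meta variable: "sgd" or "adam"
    learning_rate: float              # always acting
    momentum: Optional[float] = None  # acting only when optimizer == "sgd"
    beta2: Optional[float] = None     # acting only when optimizer == "adam"

    def acting(self):
        out = {"learning_rate": self.learning_rate}
        if self.optimizer == "sgd":
            out["momentum"] = self.momentum
        else:
            out["beta2"] = self.beta2
        return out

# The meta variable changes the dimension of the acting search space:
print(MixedPoint("sgd", 0.1, momentum=0.9).acting())
print(MixedPoint("adam", 0.001, beta2=0.999).acting())
```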
Uncertainty-Aware Mixed-Variable Machine Learning for Materials Design
Data-driven design shows the promise of accelerating materials discovery but
is challenging due to the prohibitive cost of searching the vast design space
of chemistry, structure, and synthesis methods. Bayesian Optimization (BO)
employs uncertainty-aware machine learning models to select promising designs
to evaluate, hence reducing the cost. However, BO with mixed numerical and
categorical variables, which is of particular interest in materials design, has
not been well studied. In this work, we survey frequentist and Bayesian
approaches to uncertainty quantification of machine learning with mixed
variables. We then conduct a systematic comparative study of their performances
in BO using a popular representative model from each group, the random
forest-based Lolo model (frequentist) and the latent variable Gaussian process
model (Bayesian). We examine the efficacy of the two models in the optimization
of mathematical functions, as well as properties of structural and functional
materials, where we observe performance differences as related to problem
dimensionality and complexity. By investigating the machine learning models'
predictive and uncertainty estimation capabilities, we provide interpretations
of the observed performance differences. Our results provide practical guidance
on choosing between frequentist and Bayesian uncertainty-aware machine learning
models for mixed-variable BO in materials design.
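As a sketch of the frequentist side of this comparison (a simplification: Lolo's actual estimator is jackknife-based, and the latent variable GP is not shown), a random forest's per-tree spread already gives an uncertainty estimate:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def rf_mean_std(forest, X):
    """Frequentist uncertainty from a random forest: the spread of the
    per-tree predictions.  (Lolo's estimator is jackknife-based; this is
    the simplest version of the same idea.)"""
    preds = np.stack([tree.predict(X) for tree in forest.estimators_])
    return preds.mean(axis=0), preds.std(axis=0)

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(50, 3))     # categorical inputs would enter
y = np.sin(3 * X[:, 0]) + 0.1 * rng.standard_normal(50)  # one-hot encoded
forest = RandomForestRegressor(n_estimators=200).fit(X, y)
mu, sd = rf_mean_std(forest, X[:5])      # mean and uncertainty per point
```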
Bayesian Optimization for Materials Design with Mixed Quantitative and Qualitative Variables
Although Bayesian Optimization (BO) has been employed for accelerating
materials design in computational materials engineering, existing works are
restricted to problems with quantitative variables. However, real designs of
materials systems involve both qualitative and quantitative design variables
representing material compositions, microstructure morphology, and processing
conditions. For mixed-variable problems, existing BO approaches first
represent qualitative factors by dummy variables and then fit a standard
Gaussian process (GP) model with numerical variables as the surrogate model.
This approach is theoretically restrictive and fails to capture complex
correlations between qualitative levels. We present in this
paper the integration of a novel latent-variable (LV) approach for
mixed-variable GP modeling with the BO framework for materials design. LVGP is
a fundamentally different approach that maps qualitative design variables to
underlying numerical latent variables (LVs) in the GP, which has strong
physical justification. It
provides flexible parameterization and representation of qualitative factors
and shows superior modeling accuracy compared to the existing methods. We
demonstrate our approach through testing with numerical examples and materials
design examples. It is found that in all test examples the mapped LVs provide
intuitive visualization and substantial insight into the nature and effects of
the qualitative factors. Though materials designs are used as examples, the
method presented is generic and can be utilized for other mixed-variable
design optimization problems that involve expensive physics-based simulations.
Comment: 29 pages, 9 figures, 3 tables
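The LV mapping can be sketched in a few lines (an illustration, assuming 2-D latent vectors; the real method estimates the latents by maximum likelihood rather than taking them as given):

```python
import numpy as np

def lvgp_kernel(x1, c1, x2, c2, latents, ls=1.0):
    """LVGP-style kernel sketch: each level of a qualitative variable is
    mapped to a 2-D latent vector, and a standard RBF kernel is applied
    to the concatenated (continuous, latent) coordinates.  `latents` is
    an (n_levels, 2) array, here simply supplied."""
    z1 = np.concatenate([x1, latents[c1]])
    z2 = np.concatenate([x2, latents[c2]])
    return np.exp(-np.sum((z1 - z2) ** 2) / (2 * ls ** 2))

# Levels that land close together in latent space behave similarly:
latents = np.array([[0.0, 0.0], [0.1, 0.0], [2.0, 1.5]])
x = np.array([0.5])
print(lvgp_kernel(x, 0, x, 1, latents))  # levels 0 and 1: high correlation
print(lvgp_kernel(x, 0, x, 2, latents))  # levels 0 and 2: low correlation
```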
Bayesian Deep Net GLM and GLMM
Deep feedforward neural networks (DFNNs) are a powerful tool for functional
approximation. We describe flexible versions of generalized linear and
generalized linear mixed models incorporating basis functions formed by a DFNN.
The consideration of neural networks with random effects is not widely used in
the literature, perhaps because of the computational challenges of
incorporating subject-specific parameters into already complex models.
Efficient computational methods for high-dimensional Bayesian inference are
developed using Gaussian variational approximation, with a parsimonious but
flexible factor parametrization of the covariance matrix. We implement natural
gradient methods for the optimization, exploiting the factor structure of the
variational covariance matrix in computation of the natural gradient. Our
flexible DFNN models and Bayesian inference approach lead to a regression and
classification method that has a high prediction accuracy, and is able to
quantify the prediction uncertainty in a principled and convenient way. We also
describe how to perform variable selection in our deep learning method. The
proposed methods are illustrated in a wide range of simulated and real-data
examples, and the results compare favourably to a state-of-the-art flexible
regression and classification method in the statistical literature, the
Bayesian additive regression trees (BART) method. User-friendly software
packages in Matlab, R and Python implementing the proposed methods are
available at https://github.com/VBayesLab
Comment: 35 pages, 7 figures, 10 tables
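The factor parametrization of the variational covariance can be sketched as follows (illustrative, not the authors' package): with Sigma = B B^T + diag(d)^2, a reparameterized draw never forms the full covariance matrix.

```python
import numpy as np

def sample_factor_gaussian(mu, B, d, rng):
    """Reparameterized draw from the factor Gaussian variational family
    N(mu, B @ B.T + diag(d)**2): theta = mu + B @ eps1 + d * eps2.
    Storage and sampling cost O(p*k) rather than O(p**2) for a full
    covariance, which is what makes the approximation parsimonious."""
    p, k = B.shape
    eps1 = rng.standard_normal(k)
    eps2 = rng.standard_normal(p)
    return mu + B @ eps1 + d * eps2

rng = np.random.default_rng(0)
p, k = 1000, 4                                   # 1000 parameters, 4 factors
theta = sample_factor_gaussian(np.zeros(p),
                               0.01 * rng.standard_normal((p, k)),
                               np.full(p, 0.1), rng)
```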
Parallel Mixed Bayesian Optimization Algorithm: A Scaleup Analysis
Estimation of Distribution Algorithms have been proposed as a new paradigm
for evolutionary optimization. This paper focuses on the parallelization of
Estimation of Distribution Algorithms. More specifically, the paper discusses
how to predict the performance of the parallel Mixed Bayesian Optimization Algorithm
(MBOA) that is based on parallel construction of Bayesian networks with
decision trees. We determine the time complexity of the parallel MBOA and
compare it with experimental results obtained by solving the spin glass
optimization problem. The empirical results fit the theoretical time
complexity well, so the scalability and efficiency of the parallel MBOA on
unseen instances of spin glass benchmarks can be predicted. Furthermore, we
derive guidelines that
can be used to design effective parallel Estimation of Distribution Algorithms
with the speedup proportional to the number of variables in the problem.
Comment: Optimization by Building and Using Probabilistic Models OBUPM-200
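For readers unfamiliar with the EDA paradigm, the loop can be sketched with the simplest possible probabilistic model (a univariate one; MBOA instead builds a Bayesian network with decision trees, and the parallelization analyzed above is omitted here):

```python
import numpy as np

def umda(fitness, n_vars, pop=100, top=50, iters=30, rng=None):
    """Minimal univariate EDA (UMDA) for binary strings.  MBOA replaces
    the independent Bernoulli model below with a Bayesian network with
    decision trees, but the loop is the same: sample from the model,
    select the best individuals, re-estimate the model from them."""
    if rng is None:
        rng = np.random.default_rng(0)
    p = np.full(n_vars, 0.5)                   # independent Bernoulli model
    for _ in range(iters):
        X = (rng.random((pop, n_vars)) < p).astype(int)
        best = X[np.argsort([fitness(x) for x in X])[-top:]]
        p = best.mean(axis=0).clip(0.05, 0.95) # re-estimate, keep diversity
    return p

p = umda(lambda x: x.sum(), n_vars=20)         # OneMax toy problem
```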
Inference in Hybrid Bayesian Networks Using Mixtures of Gaussians
The main goal of this paper is to describe a method for exact inference in
general hybrid Bayesian networks (BNs) (with a mixture of discrete and
continuous chance variables). Our method consists of approximating general
hybrid Bayesian networks by a mixture of Gaussians (MoG) BNs. There exists a
fast algorithm by Lauritzen-Jensen (LJ) for making exact inferences in MoG
Bayesian networks, and there exists a commercial implementation of this
algorithm. However, this algorithm can only be used for MoG BNs. Some
limitations of such networks are as follows. All continuous chance variables
must have conditional linear Gaussian distributions, and discrete chance nodes
cannot have continuous parents. The methods described in this paper will enable
us to use the LJ algorithm for a bigger class of hybrid Bayesian networks. This
includes networks with continuous chance nodes with non-Gaussian distributions,
networks with no restrictions on the topology of discrete and continuous
variables, networks with conditionally deterministic variables that are a
nonlinear function of their continuous parents, and networks with continuous
chance variables whose variances are functions of their parents.
Comment: Appears in Proceedings of the Twenty-Second Conference on Uncertainty in Artificial Intelligence (UAI 2006)
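The MoG approximation step can be illustrated by fitting a small Gaussian mixture to a non-Gaussian variable, here using scikit-learn (fitting to samples is a stand-in for the paper's approximation procedure):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Approximate a non-Gaussian chance variable (here exponential) by a
# mixture of Gaussians, after which MoG-only machinery such as the
# Lauritzen-Jensen algorithm applies.
rng = np.random.default_rng(0)
samples = rng.exponential(scale=1.0, size=5000).reshape(-1, 1)
mog = GaussianMixture(n_components=3, random_state=0).fit(samples)
for w, m, v in zip(mog.weights_, mog.means_.ravel(),
                   mog.covariances_.ravel()):
    print(f"weight={w:.2f}  mean={m:.2f}  var={v:.2f}")
```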
Consistent Second-Order Conic Integer Programming for Learning Bayesian Networks
Bayesian Networks (BNs) represent conditional probability relations among a
set of random variables (nodes) in the form of a directed acyclic graph (DAG),
and have found diverse applications in knowledge discovery. We study the
problem of learning the sparse DAG structure of a BN from continuous
observational data. The central problem can be modeled as a mixed-integer
program with an objective function composed of a convex quadratic loss function
and a regularization penalty subject to linear constraints. The optimal
solution to this mathematical program is known to have desirable statistical
properties under certain conditions. However, the state-of-the-art optimization
solvers are not able to obtain provably optimal solutions to the existing
mathematical formulations for medium-size problems within reasonable
computational times. To address this difficulty, we tackle the problem from
both computational and statistical perspectives. On the one hand, we propose a
concrete early stopping criterion to terminate the branch-and-bound process in
order to obtain a near-optimal solution to the mixed-integer program, and
establish the consistency of this approximate solution. On the other hand, we
improve the existing formulations by replacing the linear "big-M" constraints
that represent the relationship between the continuous and binary indicator
variables with second-order conic constraints. Our numerical results
demonstrate the effectiveness of the proposed approaches.
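One standard shape for this replacement (an illustrative sketch; the paper's exact constraints may differ) is the perspective-style strengthening of the big-M link between a continuous coefficient beta and its binary arc indicator g:

```latex
\begin{align*}
  \text{big-}M\text{ (linear):}\quad
    & -Mg \le \beta \le Mg,
    && g \in \{0,1\},\\
  \text{conic (perspective):}\quad
    & \beta^{2} \le s\,g,\quad 0 \le s \le M^{2},
    && g \in \{0,1\},
\end{align*}
% the auxiliary variable s also upper-bounds the quadratic loss term
% beta^2 in the objective, giving a tighter continuous relaxation.
```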