Truncated Variance Reduction: A Unified Approach to Bayesian Optimization and Level-Set Estimation
We present a new algorithm, truncated variance reduction (TruVaR), that
treats Bayesian optimization (BO) and level-set estimation (LSE) with Gaussian
processes in a unified fashion. The algorithm greedily shrinks a sum of
truncated variances within a set of potential maximizers (BO) or unclassified
points (LSE), which is updated based on confidence bounds. TruVaR is effective
in several important settings that are typically non-trivial to incorporate
into myopic algorithms, including pointwise costs and heteroscedastic noise. We
provide a general theoretical guarantee for TruVaR covering these aspects, and
use it to recover and strengthen existing results on BO and LSE. Moreover, we
provide a new result for a setting where one can select from a number of noise
levels having associated costs. We demonstrate the effectiveness of the
algorithm on both synthetic and real-world data sets.
Comment: Accepted to NIPS 201
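The greedy step described in this abstract can be illustrated on a small finite domain. The following is only a simplified sketch under stated assumptions, not the authors' implementation: the posterior mean `mu` and posterior covariance `cov` are taken as given, `beta` and `eta` are fixed illustrative constants (the schedule for shrinking the truncation level is omitted), and all function names are hypothetical. It keeps the points whose upper confidence bound reaches the best lower confidence bound, then picks the query with the largest reduction in the sum of truncated variances per unit cost, using the standard rank-one Gaussian variance update.

```python
import math

def potential_maximizers(mu, var, beta, active):
    """Keep points whose upper confidence bound reaches the best lower bound."""
    half = {i: math.sqrt(beta * var[i]) for i in active}
    best_lower = max(mu[i] - half[i] for i in active)
    return [i for i in active if mu[i] + half[i] >= best_lower]

def truncated_sum(var, beta, eta, active):
    """Sum of scaled variances over the active set, truncated below at eta^2."""
    return sum(max(beta * var[i], eta ** 2) for i in active)

def variance_after_query(cov, j, noise_var_j):
    """Posterior variances if point j were observed with the given noise variance."""
    denom = cov[j][j] + noise_var_j
    return [cov[i][i] - cov[i][j] ** 2 / denom for i in range(len(cov))]

def select_point(cov, beta, eta, active, noise_var, cost):
    """Pick the query with the largest truncated-variance reduction per unit cost."""
    var = [cov[i][i] for i in range(len(cov))]
    current = truncated_sum(var, beta, eta, active)

    def gain_per_cost(j):
        new_var = variance_after_query(cov, j, noise_var[j])
        return (current - truncated_sum(new_var, beta, eta, active)) / cost[j]

    return max(range(len(cov)), key=gain_per_cost)

# Illustrative three-point domain (all numbers are made up for the example).
mu = [-1.0, 1.0, 0.9]
cov = [[0.09, 0.00, 0.00],
       [0.00, 0.04, 0.03],
       [0.00, 0.03, 0.04]]
var = [cov[i][i] for i in range(3)]
beta, eta = 4.0, 0.1
noise_var = [0.01, 0.04, 0.01]  # heteroscedastic: point 1 is noisier

active = potential_maximizers(mu, var, beta, [0, 1, 2])
choice_unit_cost = select_point(cov, beta, eta, active, noise_var, [1.0, 1.0, 1.0])
choice_costly = select_point(cov, beta, eta, active, noise_var, [1.0, 1.0, 4.0])
```

In this toy setup point 0 is ruled out as a maximizer, the less noisy point 2 is chosen when all queries cost the same, and the choice moves to point 1 once point 2 becomes four times as expensive.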
Domain-Agnostic Batch Bayesian Optimization with Diverse Constraints via Bayesian Quadrature
Real-world optimisation problems often combine (1) diverse constraints and
(2) discrete and mixed spaces, and are (3) highly parallelisable. (4) In some
cases the objective function cannot be queried unless unknown constraints are
satisfied: in drug discovery, for example, safety in animal experiments
(unknown constraints) must be established before human clinical trials
(querying the objective function) may proceed. However, most existing works
target each of the first three problems in isolation and do not consider
(4) unknown constraints with query rejection. For problems with
diverse constraints and/or unconventional input spaces, it is difficult to
apply these techniques as they are often mutually incompatible. We propose
cSOBER, a domain-agnostic prudent parallel active sampler for Bayesian
optimisation, based on SOBER of Adachi et al. (2023). We consider infeasibility
under unknown constraints as a type of integration error that we can estimate.
We propose a theoretically-driven approach that propagates such error as a
tolerance in the quadrature precision that automatically balances exploitation
and exploration with the expected rejection rate. Moreover, our method flexibly
accommodates diverse constraints and/or discrete and mixed spaces via adaptive
tolerance, including conventional zero-risk cases. We show that cSOBER
outperforms competitive baselines on diverse real-world blackbox-constrained
problems, including safety-constrained drug discovery, and
human-relationship-aware team optimisation over graph-structured space.
Comment: 24 pages, 5 figures
Practical Bayesian Optimization of Machine Learning Algorithms
Machine learning algorithms frequently require careful tuning of model
hyperparameters, regularization terms, and optimization parameters.
Unfortunately, this tuning is often a "black art" that requires expert
experience, unwritten rules of thumb, or sometimes brute-force search. Much
more appealing is the idea of developing automatic approaches which can
optimize the performance of a given learning algorithm to the task at hand. In
this work, we consider the automatic tuning problem within the framework of
Bayesian optimization, in which a learning algorithm's generalization
performance is modeled as a sample from a Gaussian process (GP). The tractable
posterior distribution induced by the GP leads to efficient use of the
information gathered by previous experiments, enabling optimal choices about
what parameters to try next. Here we show how the effects of the Gaussian
process prior and the associated inference procedure can have a large impact on
the success or failure of Bayesian optimization. We show that thoughtful
choices can lead to results that exceed expert-level performance in tuning
machine learning algorithms. We also describe new algorithms that take into
account the variable cost (duration) of learning experiments and that can
leverage the presence of multiple cores for parallel experimentation. We show
that these proposed algorithms improve on previous automatic procedures and can
reach or surpass human expert-level optimization on a diverse set of
contemporary algorithms including latent Dirichlet allocation, structured SVMs
and convolutional neural networks.
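The idea of using the GP posterior to choose the next parameters, and of accounting for the variable cost of experiments, can be sketched with an expected-improvement acquisition function. This is a minimal illustration under assumptions, not the paper's code: the posterior mean and standard deviation at each candidate are taken as given, the objective is assumed to be minimized, and the candidate values and the `predicted_seconds` figures are invented for the example.

```python
import math

def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def expected_improvement(mu, sigma, best):
    """Expected improvement over `best` (minimization) under N(mu, sigma^2)."""
    if sigma <= 0.0:
        return max(best - mu, 0.0)
    gamma = (best - mu) / sigma
    return sigma * (gamma * normal_cdf(gamma) + normal_pdf(gamma))

def ei_per_second(mu, sigma, best, predicted_seconds):
    """Cost-aware variant: expected improvement per unit of predicted run time."""
    return expected_improvement(mu, sigma, best) / predicted_seconds

# Hypothetical candidates: (posterior mean, posterior std, predicted seconds).
candidates = {
    "A": (0.20, 0.05, 10.0),
    "B": (0.25, 0.10, 1.0),
}
best_so_far = 0.22  # lowest validation error observed so far (assumed)

best_by_ei = max(
    candidates,
    key=lambda k: expected_improvement(candidates[k][0], candidates[k][1], best_so_far),
)
best_by_rate = max(
    candidates,
    key=lambda k: ei_per_second(*candidates[k][:2], best_so_far, candidates[k][2]),
)
```

With these made-up numbers, plain expected improvement prefers candidate A, while the per-second variant prefers the much cheaper candidate B.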