Theoretical Analysis of Bayesian Optimisation with Unknown Gaussian Process Hyper-Parameters
Bayesian optimisation has gained great popularity as a tool for optimising
the parameters of machine learning algorithms and models. Somewhat ironically,
setting up the hyper-parameters of Bayesian optimisation methods is notoriously
hard. While reasonable practical solutions have been advanced, they can often
fail to find the best optima. Surprisingly, there is little theoretical
analysis of this crucial problem in the literature. To address this, we derive
a cumulative regret bound for Bayesian optimisation with Gaussian processes and
unknown kernel hyper-parameters in the stochastic setting. The bound, which
applies to the expected improvement acquisition function and sub-Gaussian
observation noise, provides us with guidelines on how to design hyper-parameter
estimation methods. A simple simulation demonstrates the importance of
following these guidelines.
Comment: 16 pages, 1 figure
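A minimal sketch of the expected-improvement acquisition function named in the abstract, assuming a maximisation convention; the names `mu`, `sigma`, `f_best` and the jitter `xi` are illustrative, not from the paper:

```python
import numpy as np
from scipy.stats import norm

def expected_improvement(mu, sigma, f_best, xi=0.01):
    # EI for maximisation: E[max(f(x) - f_best - xi, 0)] under the GP
    # posterior N(mu, sigma^2) at a candidate point x.
    sigma = np.maximum(sigma, 1e-12)          # guard against zero variance
    z = (mu - f_best - xi) / sigma
    return (mu - f_best - xi) * norm.cdf(z) + sigma * norm.pdf(z)

# e.g. expected_improvement(mu=1.2, sigma=0.5, f_best=1.0) -> scalar EI value
```

The kernel hyper-parameters enter through `mu` and `sigma`, which is why mis-estimating them distorts the acquisition surface the bound above is concerned with.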
Linear and Parallel Learning of Markov Random Fields
We introduce a new embarrassingly parallel parameter learning algorithm for
Markov random fields with untied parameters which is efficient for a large
class of practical models. Our algorithm parallelizes naturally over cliques
and, for graphs of bounded degree, its complexity is linear in the number of
cliques. Unlike its competitors, our algorithm is fully parallel and for
log-linear models it is also data efficient, requiring only the local
sufficient statistics of the data to estimate parameters.
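To make the "parallel over cliques" idea concrete, a schematic sketch (not the paper's algorithm): each clique's untied parameters are fit independently from that clique's local sufficient statistics, here its configuration counts. The local estimator and the example data are hypothetical.

```python
import numpy as np
from concurrent.futures import ProcessPoolExecutor

def fit_clique(counts):
    # Illustrative local estimator for a fully parameterised (untied)
    # log-linear clique potential: log empirical frequency of each
    # clique configuration. Not the paper's exact estimator.
    freq = counts / counts.sum()
    return np.log(np.maximum(freq, 1e-12))

def parallel_fit(clique_counts, workers=4):
    # Embarrassingly parallel over cliques: each worker needs only that
    # clique's local sufficient statistics, so no cross-clique messages
    # are exchanged and cost grows linearly with the number of cliques.
    with ProcessPoolExecutor(max_workers=workers) as ex:
        return list(ex.map(fit_clique, clique_counts))

if __name__ == "__main__":
    # Two binary pairwise cliques with hypothetical co-occurrence counts.
    stats = [np.array([[10., 2.], [3., 5.]]),
             np.array([[7., 7.], [1., 9.]])]
    print(parallel_fit(stats))
```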
Portfolio Allocation for Bayesian Optimization
Bayesian optimization with Gaussian processes has become an increasingly
popular tool in the machine learning community. It is efficient and can be used
when very little is known about the objective function, making it popular in
expensive black-box optimization scenarios. It uses Bayesian methods to sample
the objective efficiently, choosing each query point by maximizing an
acquisition function that incorporates the model's estimate of the objective
and the uncertainty at any given point.
However, there are several different parameterized acquisition functions in the
literature, and it is often unclear which one to use. Instead of using a single
acquisition function, we adopt a portfolio of acquisition functions governed by
an online multi-armed bandit strategy. We propose several portfolio strategies,
the best of which we call GP-Hedge, and show that this method outperforms the
best individual acquisition function. We also provide a theoretical bound on
the algorithm's performance.
Comment: This revision contains an updated performance bound and other minor text changes.
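A minimal sketch of the Hedge-style exponential-weights selection that the portfolio strategy builds on; the loop structure and the placeholder rewards are illustrative, not the authors' exact procedure:

```python
import numpy as np

rng = np.random.default_rng(0)

def hedge_pick(gains, eta=1.0):
    # Hedge / exponential weights over K acquisition functions: arm k is
    # chosen with probability proportional to exp(eta * gain_k).
    w = np.exp(eta * (gains - gains.max()))   # shift by max for stability
    return rng.choice(len(gains), p=w / w.sum())

# Skeleton of the bandit loop. In a GP-Hedge-style scheme each arm's reward
# would be derived from the GP posterior at the point that arm nominated;
# here the rewards are placeholders so the sketch runs on its own.
gains = np.zeros(3)                 # e.g. three arms: EI, PI, UCB
for t in range(10):
    k = hedge_pick(gains)
    # ... evaluate the objective at arm k's proposal, refit the GP ...
    gains += rng.normal(size=3)     # hypothetical per-arm rewards
```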
Distributed Parameter Estimation in Probabilistic Graphical Models
This paper presents foundational theoretical results on distributed parameter
estimation for undirected probabilistic graphical models. It introduces a
general condition on composite likelihood decompositions of these models which
guarantees the global consistency of distributed estimators, provided the local
estimators are consistent.
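As background, a composite-likelihood objective commonly takes the form (notation ours; the paper's exact decomposition and condition may differ):

$$\ell_{\mathrm{CL}}(\theta) \;=\; \sum_{i=1}^{m} w_i \,\log f\!\left(x_{A_i} \mid x_{B_i};\, \theta\right),$$

where each term is the likelihood of a local block of variables $A_i$ conditioned on a neighbouring block $B_i$, and $w_i \ge 0$ are weights. The paper's condition concerns when consistent estimators of these local pieces can be combined into a globally consistent estimator of $\theta$.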