Asymptotically Exact, Embarrassingly Parallel MCMC
Communication costs, resulting from synchronization requirements during
learning, can greatly slow down many parallel machine learning algorithms. In
this paper, we present a parallel Markov chain Monte Carlo (MCMC) algorithm in
which subsets of data are processed independently, with very little
communication. First, we arbitrarily partition data onto multiple machines.
Then, on each machine, any classical MCMC method (e.g., Gibbs sampling) may be
used to draw samples from a posterior distribution given the data subset.
Finally, the samples from each machine are combined to form samples from the
full posterior. This embarrassingly parallel algorithm allows each machine to
act independently on a subset of the data (without communication) until the
final combination stage. We prove that our algorithm generates asymptotically
exact samples and empirically demonstrate its ability to parallelize burn-in
and sampling in several models.
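For intuition, a minimal sketch of this workflow might look like the following, assuming a toy Gaussian-mean posterior with a flat prior and the simple parametric (Gaussian) product rule for the combination step; the sampler and all function names are illustrative, not the authors' implementation:

```python
import numpy as np

def log_subposterior(theta, shard, sigma=1.0):
    # Subposterior for one data shard. With a flat prior, the underweighted
    # prior term p(theta)^(1/M) is constant and can be dropped.
    return -0.5 * np.sum((shard - theta) ** 2) / sigma ** 2

def metropolis(logp, x0, n_samples, step, rng, burn=1000):
    # Stand-in for any classical MCMC method run on one machine, with no
    # communication to the other machines. Burn-in happens in parallel too.
    x, lp, out = x0, logp(x0), []
    for i in range(burn + n_samples):
        prop = x + step * rng.standard_normal()
        lp_prop = logp(prop)
        if np.log(rng.uniform()) < lp_prop - lp:
            x, lp = prop, lp_prop
        if i >= burn:
            out.append(x)
    return np.array(out)

rng = np.random.default_rng(0)
data = rng.normal(2.0, 1.0, size=9000)
shards = np.array_split(data, 3)  # arbitrary partition onto 3 "machines"

# Each machine draws subposterior samples independently.
sub_samples = [
    metropolis(lambda t, s=s: log_subposterior(t, s), x0=0.0,
               n_samples=5000, step=0.05, rng=np.random.default_rng(i + 1))
    for i, s in enumerate(shards)
]

# Combination stage: fit a Gaussian to each subposterior; the product of
# Gaussians has a closed-form mean and variance.
means = np.array([s.mean() for s in sub_samples])
precs = np.array([1.0 / s.var() for s in sub_samples])
post_var = 1.0 / precs.sum()
post_mean = post_var * (precs * means).sum()
print(f"combined posterior: mean={post_mean:.3f}, sd={np.sqrt(post_var):.3f}")
```

The Gaussian combination is exact only when each subposterior is Gaussian; the paper's asymptotic-exactness guarantee covers more general combination estimators.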
BANANAS: Bayesian Optimization with Neural Architectures for Neural Architecture Search
Over the past half-decade, many methods have been considered for neural
architecture search (NAS). Bayesian optimization (BO), which has long had
success in hyperparameter optimization, has recently emerged as a very
promising strategy for NAS when it is coupled with a neural predictor. Recent
work has proposed different instantiations of this framework, for example,
using Bayesian neural networks or graph convolutional networks as the
predictive model within BO. However, the analyses in these papers often focus
on the full-fledged NAS algorithm, so it is difficult to tell which individual
components of the framework lead to the best performance.
In this work, we give a thorough analysis of the "BO + neural predictor"
framework by identifying five main components: the architecture encoding,
neural predictor, uncertainty calibration method, acquisition function, and
acquisition optimization strategy. We test several different methods for each
component and also develop a novel path-based encoding scheme for neural
architectures, which we show, both theoretically and empirically, scales better than
other encodings. Using all of our analyses, we develop a final algorithm called
BANANAS, which achieves state-of-the-art performance on NAS search spaces. We
adhere to the NAS research checklist (Lindauer and Hutter 2019) to facilitate
best practices, and our code is available at
https://github.com/naszilla/naszilla
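To make the path-based encoding concrete, here is a minimal sketch for a small cell DAG, in the spirit of the description above rather than the repository's exact code: each feature is one bit recording whether a particular input-to-output sequence of operations occurs in the cell. The op vocabulary, adjacency format, and path-length truncation are illustrative assumptions:

```python
from itertools import product

OPS = ("conv3x3", "conv1x1", "maxpool3x3")  # assumed op vocabulary

def enumerate_paths(adj, node_ops):
    """All input->output op sequences in a cell DAG.

    adj: upper-triangular 0/1 adjacency matrix; node 0 is the input,
    node n-1 the output; node_ops[i] labels intermediate node i.
    """
    n = len(adj)
    paths = []

    def dfs(node, path):
        if node == n - 1:
            paths.append(tuple(path))
            return
        for nxt in range(node + 1, n):
            if adj[node][nxt]:
                step = [] if nxt == n - 1 else [node_ops[nxt]]
                dfs(nxt, path + step)

    dfs(0, [])
    return paths

def path_encoding(adj, node_ops, max_len=3):
    """Binary feature vector: one bit per candidate op sequence."""
    universe = [()]
    for length in range(1, max_len + 1):
        universe.extend(product(OPS, repeat=length))
    present = set(enumerate_paths(adj, node_ops))
    return [int(p in present) for p in universe]

# Example: a 5-node cell (input, three intermediate ops, output) with a
# skip connection from input to output plus one 3-op chain.
adj = [
    [0, 1, 0, 0, 1],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0],
]
node_ops = [None, "conv3x3", "maxpool3x3", "conv1x1", None]
enc = path_encoding(adj, node_ops)
print(f"{sum(enc)} of {len(enc)} candidate paths present")  # 2 of 40
```

Note that the number of candidate paths grows as |OPS|^length, so truncating to short paths, as in the sketch's `max_len`, is what keeps the encoding compact; see the repository above for the actual implementation.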
- …