SHADHO: Massively Scalable Hardware-Aware Distributed Hyperparameter Optimization
Computer vision is experiencing an AI renaissance, in which machine learning
models are expediting important breakthroughs in academic research and
commercial applications. Effectively training these models, however, is not
trivial due in part to hyperparameters: user-configured values that control a
model's ability to learn from data. Existing hyperparameter optimization
methods are highly parallel but make no effort to balance the search across
heterogeneous hardware or to prioritize searching high-impact spaces. In this
paper, we introduce a framework for massively Scalable Hardware-Aware
Distributed Hyperparameter Optimization (SHADHO). Our framework calculates the
relative complexity of each search space and monitors performance on the
learning task over all trials. These metrics are then used as heuristics to
assign hyperparameters to distributed workers based on their hardware. We first
demonstrate that our framework achieves double the throughput of a standard
distributed hyperparameter optimization framework by optimizing SVM for MNIST
using 150 distributed workers. We then conduct model search with SHADHO over
the course of one week using 74 GPUs across two compute clusters to optimize
U-Net for a cell segmentation task, discovering 515 models that achieve a lower
validation loss than standard U-Net.
Comment: 10 pages, 6 figures
Denoising Autoencoders for fast Combinatorial Black Box Optimization
Estimation of Distribution Algorithms (EDAs) require flexible probability
models that can be efficiently learned and sampled. Autoencoders (AE) are
generative stochastic networks with these desired properties. We integrate a
special type of AE, the Denoising Autoencoder (DAE), into an EDA and evaluate
the performance of DAE-EDA on several combinatorial optimization problems with
a single objective. We assess the number of fitness evaluations as well as the
required CPU times. We compare the results to the performance of the Bayesian
Optimization Algorithm (BOA) and RBM-EDA, another EDA based on a generative
neural network, which has proven competitive with BOA. For the
considered problem instances, DAE-EDA is considerably faster than BOA and
RBM-EDA, sometimes by orders of magnitude. The number of fitness evaluations is
higher than for BOA, but competitive with RBM-EDA. These results show that DAEs
can be useful tools for problems with low but non-negligible fitness evaluation
costs.
Comment: corrected typos and small inconsistencies
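The EDA loop this abstract builds on (learn a probability model from the selected solutions, then sample new candidates from it) can be sketched as follows. This is only an illustration under stated assumptions: it replaces the denoising autoencoder with the simplest possible model, independent per-bit probabilities (the univariate scheme often called UMDA), and uses OneMax as a stand-in fitness function. The names `onemax` and `umda` and all parameter values are hypothetical and do not come from the paper.

```python
import random


def onemax(bits):
    """Toy fitness function (assumption): the number of ones in the bit string."""
    return sum(bits)


def umda(n_bits=20, pop_size=100, n_select=50, generations=50, seed=0):
    """Univariate EDA sketch: per-bit Bernoulli model instead of a DAE."""
    rng = random.Random(seed)
    probs = [0.5] * n_bits  # initial model: every bit is 1 with probability 0.5
    best = None
    for _ in range(generations):
        # Sample a population from the current probability model.
        pop = [[1 if rng.random() < p else 0 for p in probs]
               for _ in range(pop_size)]
        # Truncation selection: keep the fittest n_select individuals.
        pop.sort(key=onemax, reverse=True)
        if best is None or onemax(pop[0]) > onemax(best):
            best = pop[0]
        selected = pop[:n_select]
        # Re-estimate the model from the selected individuals; clamp the
        # probabilities so that no bit value becomes impossible to sample.
        for i in range(n_bits):
            freq = sum(ind[i] for ind in selected) / n_select
            probs[i] = min(max(freq, 1.0 / n_bits), 1.0 - 1.0 / n_bits)
    return best, probs
```

Calling `umda()` returns the best bit string seen and the final per-bit probabilities. A DAE-based variant would swap the per-bit frequency update for training an autoencoder on `selected` and sampling from it, which is the part the paper contributes and this sketch does not attempt.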
Getting Started with Particle Metropolis-Hastings for Inference in Nonlinear Dynamical Models
This tutorial provides a gentle introduction to the particle
Metropolis-Hastings (PMH) algorithm for parameter inference in nonlinear
state-space models together with a software implementation in the statistical
programming language R. We employ a step-by-step approach to develop an
implementation of the PMH algorithm (and the particle filter within) together
with the reader. This final implementation is also available as the package
pmhtutorial in the CRAN repository. Throughout the tutorial, we provide some
intuition as to how the algorithm operates and discuss some solutions to
problems that might occur in practice. To illustrate the use of PMH, we
consider parameter inference in a linear Gaussian state-space model with
synthetic data and a nonlinear stochastic volatility model with real-world
data.
Comment: 41 pages, 7 figures. In press for Journal of Statistical Software.
Source code for R, Python and MATLAB available at:
https://github.com/compops/pmh-tutoria
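To give a feel for the algorithm this tutorial covers, here is a minimal sketch of particle Metropolis-Hastings in Python. It is not the code from the pmhtutorial package. It assumes a linear Gaussian state-space model of the form x_t = phi * x_{t-1} + sigma_v * v_t, y_t = x_t + sigma_e * e_t with x_0 = 0, a uniform prior on phi over (-1, 1), and a Gaussian random-walk proposal. The function names (`simulate`, `pf_loglik`, `pmh`) and all parameter choices are hypothetical.

```python
import math
import random


def simulate(phi, sigma_v, sigma_e, T, rng):
    """Generate T observations from the assumed linear Gaussian model."""
    x, ys = 0.0, []
    for _ in range(T):
        x = phi * x + sigma_v * rng.gauss(0.0, 1.0)
        ys.append(x + sigma_e * rng.gauss(0.0, 1.0))
    return ys


def pf_loglik(ys, phi, sigma_v, sigma_e, n_part, rng):
    """Bootstrap particle filter estimate of the log-likelihood of ys."""
    parts = [0.0] * n_part  # all particles start at x_0 = 0
    ll = 0.0
    log_norm = -0.5 * math.log(2.0 * math.pi * sigma_e ** 2)
    for y in ys:
        # Propagate each particle through the state dynamics.
        parts = [phi * x + sigma_v * rng.gauss(0.0, 1.0) for x in parts]
        # Weight by the observation density, shifted by the max for stability.
        logw = [log_norm - 0.5 * ((y - x) / sigma_e) ** 2 for x in parts]
        m = max(logw)
        w = [math.exp(lw - m) for lw in logw]
        # Log of the average unnormalized weight is this step's contribution.
        ll += m + math.log(sum(w) / n_part)
        # Multinomial resampling.
        parts = rng.choices(parts, weights=w, k=n_part)
    return ll


def pmh(ys, n_iter, step, sigma_v, sigma_e, n_part, rng, phi0=0.5):
    """Random-walk PMH for phi with a uniform prior on (-1, 1) (assumption)."""
    phi = phi0
    ll = pf_loglik(ys, phi, sigma_v, sigma_e, n_part, rng)
    chain = []
    for _ in range(n_iter):
        prop = phi + step * rng.gauss(0.0, 1.0)
        if abs(prop) < 1.0:  # proposals outside the prior support are rejected
            ll_prop = pf_loglik(ys, prop, sigma_v, sigma_e, n_part, rng)
            # Accept with probability min(1, exp(ll_prop - ll)); the prior and
            # the symmetric proposal cancel in the acceptance ratio.
            if math.log(1.0 - rng.random()) < ll_prop - ll:
                phi, ll = prop, ll_prop
        chain.append(phi)
    return chain
```

A typical use would be `rng = random.Random(1)`, `ys = simulate(0.75, 1.0, 1.0, 60, rng)`, then `chain = pmh(ys, 150, 0.1, 1.0, 1.0, 64, rng)`, after which the later part of `chain` approximates the posterior of phi. The key design point is that the noisy likelihood estimate for the current state is stored and reused rather than recomputed, which is what keeps the chain targeting the correct posterior.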