Scalable Meta-Learning for Bayesian Optimization
Bayesian optimization has become a standard technique for hyperparameter
optimization, including data-intensive models such as deep neural networks that
may take days or weeks to train. We consider the setting where previous
optimization runs are available, and we wish to use their results to warm-start
a new optimization run. We develop an ensemble model that can incorporate the
results of past optimization runs, while avoiding the poor scaling that comes
with putting all results into a single Gaussian process model. The ensemble
combines models from past runs according to estimates of their generalization
performance on the current optimization. Results from a large collection of
hyperparameter optimization benchmark problems and from optimization of a
production computer vision platform at Facebook show that the ensemble can
substantially reduce the time it takes to obtain near-optimal configurations,
and is useful for warm-starting expensive searches or running quick
re-optimizations.
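The ensemble idea above can be sketched in a few lines: weight each past-run surrogate by how well it ranks the current run's observations, then combine predictions. The base models, the pairwise ranking score, and the toy data below are illustrative assumptions, not the paper's exact method.

```python
def pairwise_ranking_agreement(model, xs, ys):
    """Fraction of observation pairs whose ordering the model predicts correctly."""
    correct, total = 0, 0
    for i in range(len(xs)):
        for j in range(i + 1, len(xs)):
            if ys[i] == ys[j]:
                continue  # ties carry no ranking information
            total += 1
            if (model(xs[i]) < model(xs[j])) == (ys[i] < ys[j]):
                correct += 1
    return correct / total if total else 0.0

def ensemble_predict(models, weights, x):
    """Agreement-weighted combination of the base models' predictions at x."""
    return sum(w * m(x) for m, w in zip(models, weights)) / sum(weights)

# Two toy "past-run" surrogates: one aligned with the current task, one not.
past_models = [lambda x: (x - 2.0) ** 2,  # resembles the current objective
               lambda x: -x]              # unrelated past task

# A few observations from the current run (true objective: (x - 2)^2).
obs_x = [0.0, 1.0, 3.0, 4.0]
obs_y = [(x - 2.0) ** 2 for x in obs_x]

weights = [pairwise_ranking_agreement(m, obs_x, obs_y) for m in past_models]
print(weights)  # → [1.0, 0.5]: the aligned past model dominates the ensemble
```

Because the weights depend only on rankings of the current observations, a past model that generalizes poorly to the new task is down-weighted without ever entering a single joint GP.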
PFNs Are Flexible Models for Real-World Bayesian Optimization
In this paper, we use Prior-data Fitted Networks (PFNs) as a flexible
surrogate for Bayesian Optimization (BO). PFNs are neural processes that are
trained to approximate the posterior predictive distribution (PPD) for any
prior distribution that can be efficiently sampled from. We describe how this
flexibility can be exploited for surrogate modeling in BO. We use PFNs to mimic
a naive Gaussian process (GP), an advanced GP, and a Bayesian Neural Network
(BNN). In addition, we show how to incorporate further information into the
prior, such as allowing hints about the position of optima (user priors),
ignoring irrelevant dimensions, and performing non-myopic BO by learning the
acquisition function. The flexibility underlying these extensions opens up vast
possibilities for using PFNs for BO. We demonstrate the usefulness of PFNs for
BO in a large-scale evaluation on artificial GP samples and three different
hyperparameter optimization testbeds: HPO-B, Bayesmark, and PD1. We publish
code alongside trained models at http://github.com/automl/PFNs4BO.
Comment: Accepted at ICML 202
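The surrogate role of a PFN can be illustrated with a stub: unlike a GP, a trained PFN produces the posterior predictive for a new point by conditioning on the observed (x, y) pairs in a single forward pass, with no refitting between BO iterations. `pfn_predict` below is a hypothetical stand-in for that forward pass (a toy inverse-distance model), not the PFNs4BO API.

```python
import math
import random

def pfn_predict(context_x, context_y, x):
    """Stand-in posterior predictive: (mean, std) of f(x) given the context set.
    A real PFN computes this with one transformer forward pass over the context."""
    ws = [1.0 / (abs(x - cx) + 1e-3) for cx in context_x]  # toy similarity weights
    mean = sum(w * cy for w, cy in zip(ws, context_y)) / sum(ws)
    std = 1.0 / (max(ws) + 1.0)                            # wider far from data
    return mean, std

def expected_improvement(mean, std, best):
    """EI acquisition for minimization under a Gaussian predictive."""
    if std < 1e-12:
        return 0.0
    z = (best - mean) / std
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return (best - mean) * cdf + std * pdf

def objective(x):                       # the function being optimized
    return (x - 0.3) ** 2

random.seed(0)
X, Y = [0.0, 1.0], [objective(0.0), objective(1.0)]
for _ in range(10):                     # BO loop: no model refitting between steps
    cands = [random.random() for _ in range(100)]
    x_next = max(cands, key=lambda c: expected_improvement(*pfn_predict(X, Y, c), min(Y)))
    X.append(x_next)
    Y.append(objective(x_next))
```

Swapping in a different prior (naive GP, advanced GP, BNN) changes only the network behind `pfn_predict`; the surrounding loop is untouched.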
Improving generalisation of AutoML systems with dynamic fitness evaluations
A common problem machine learning developers face is overfitting, that is,
fitting a pipeline so closely to the training data that its performance
degrades on unseen data. Automated machine learning aims to free
(or at least ease) the developer from the burden of pipeline creation, but this
overfitting problem can persist. In fact, it can worsen as we iteratively
optimise performance measured by an internal cross-validation (most often
k-fold). While this internal cross-validation is intended to reduce
overfitting, we show that the search can still overfit to the particular folds
used. In this work, we aim to remedy this problem by
introducing dynamic fitness evaluations which approximate repeated k-fold
cross-validation, at little extra cost over a single k-fold, and far lower cost
than typical repeated k-fold. The
results show that, when time-equated, the proposed fitness function yields a
significant improvement over the current state-of-the-art baseline method,
which uses an internal single k-fold. Furthermore, the proposed extension is
very simple to implement on top of existing evolutionary computation methods,
and provides an essentially free boost in generalisation/testing performance.
Comment: 19 pages, 4 figures
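The dynamic evaluation idea can be sketched as follows: re-draw the k-fold assignment each generation, so that across generations the search effectively sees repeated k-fold at roughly the per-evaluation cost of a single k-fold. The helper names and the toy pipelines are illustrative, not the paper's implementation.

```python
import random

def k_fold_indices(n, k, seed):
    """Random assignment of n sample indices to k folds, driven by a seed."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cv_score(pipeline, data, folds):
    """Mean validation score of `pipeline` across the given folds."""
    scores = []
    for fold in folds:
        held_out = set(fold)
        train = [d for i, d in enumerate(data) if i not in held_out]
        valid = [data[i] for i in fold]
        scores.append(pipeline(train, valid))
    return sum(scores) / len(scores)

def evolve(population, data, k=5, generations=3):
    """Each generation scores candidates on a freshly drawn split, so the
    search cannot latch onto one particular fold assignment."""
    for gen in range(generations):
        folds = k_fold_indices(len(data), k, seed=gen)  # the dynamic part
        population.sort(key=lambda p: cv_score(p, data, folds), reverse=True)
        # ... selection / crossover / mutation of pipelines would go here ...
    return population[0]

# Toy pipelines that just report a fixed validation score.
best = evolve([lambda tr, va: 0.7, lambda tr, va: 0.8], list(range(30)))
print(best([], [0]))  # → 0.8
```

A pipeline that is merely lucky on one fold assignment loses its advantage as soon as the next generation draws a different split.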
Auto-Sklearn 2.0: Hands-free AutoML via Meta-Learning
Automated Machine Learning (AutoML) supports practitioners and researchers with the tedious task of designing machine learning pipelines and has recently achieved substantial success. In this paper, we introduce new AutoML approaches motivated by our winning submission to the second ChaLearn AutoML challenge. We develop PoSH Auto-sklearn, which enables AutoML systems to work well on large datasets under rigid time limits by using a new, simple and meta-feature-free meta-learning technique and by employing a successful bandit strategy for budget allocation. However, PoSH Auto-sklearn introduces even more ways of running AutoML and might make it harder for users to set it up correctly. Therefore, we also go one step further and study the design space of AutoML itself, proposing a solution towards truly hands-free AutoML. Together, these changes give rise to the next generation of our AutoML system, Auto-sklearn 2.0. We verify the improvements by these additions in an extensive experimental study on 39 AutoML benchmark datasets. We conclude the paper by comparing to other popular AutoML frameworks and Auto-sklearn 1.0, reducing the relative error by up to a factor of 4.5, and yielding a performance in 10 minutes that is substantially better than what Auto-sklearn 1.0 achieves within an hour.
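The bandit strategy for budget allocation mentioned above is based on successive halving (the "SH" in PoSH): run all candidate configurations at a small budget, keep the best fraction, and repeat at a larger budget. A minimal sketch, with a made-up learning curve standing in for real pipeline training:

```python
def successive_halving(configs, evaluate, min_budget=1, eta=2, rounds=3):
    """Evaluate all surviving configs at an increasing budget and keep only
    the best 1/eta fraction after each round."""
    budget, survivors = min_budget, list(configs)
    for _ in range(rounds):
        scores = {c: evaluate(c, budget) for c in survivors}
        survivors.sort(key=lambda c: scores[c], reverse=True)
        survivors = survivors[:max(1, len(survivors) // eta)]
        budget *= eta
    return survivors[0]

# Toy learning curve: validation score improves with budget at a
# configuration-specific rate (here the "config" is just that rate).
def evaluate(rate, budget):
    return 1.0 - 0.5 / (rate * budget)

best = successive_halving([0.5, 1.0, 2.0, 4.0], evaluate)
print(best)  # → 4.0: the fastest-improving configuration survives every cut
```

Most of the total budget is spent on configurations that already look promising at low budget, which is what makes rigid time limits workable on large datasets.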
Auto-Sklearn 2.0: The Next Generation
Automated Machine Learning, which supports practitioners and researchers with
the tedious task of manually designing machine learning pipelines, has recently
achieved substantial success. In this paper we introduce new Automated Machine
Learning (AutoML) techniques motivated by PoSH Auto-sklearn, our winning
submission to the second ChaLearn AutoML challenge. For this, we extend Auto-sklearn
with a new, simpler meta-learning technique, improve its way of handling
iterative algorithms and enhance it with a successful bandit strategy for
budget allocation. Furthermore, we go one step further and study the design
space of AutoML itself and propose a solution towards truly hands-free AutoML.
Together, these changes give rise to the next generation of our AutoML system,
Auto-sklearn (2.0). We verify the improvement by these additions in a large
experimental study on 39 AutoML benchmark datasets and conclude the paper by
comparing to Auto-sklearn (1.0), reducing the regret by up to a factor of five.
OpenML Benchmarking Suites
Machine learning research depends on objectively interpretable, comparable,
and reproducible algorithm benchmarks. Therefore, we advocate the use of
curated, comprehensive suites of machine learning tasks to standardize the
setup, execution, and reporting of benchmarks. We enable this through software
tools that help to create and leverage these benchmarking suites. These are
seamlessly integrated into the OpenML platform, and accessible through
interfaces in Python, Java, and R. OpenML benchmarking suites are (a) easy to
use through standardized data formats, APIs, and client libraries; (b)
machine-readable, with extensive meta-information on the included datasets; and
(c) allow benchmarks to be shared and reused in future studies. We also present
a first, carefully curated and practical benchmarking suite for classification:
the OpenML Curated Classification benchmarking suite 2018 (OpenML-CC18).
Mind the Gap: Measuring Generalization Performance Across Multiple Objectives
Modern machine learning models are often constructed taking into account
multiple objectives, e.g., minimizing inference time while also maximizing
accuracy. Multi-objective hyperparameter optimization (MHPO) algorithms return
such candidate models, and the approximation of the Pareto front is used to
assess their performance. In practice, we also want to measure generalization
when moving from the validation to the test set. However, some of the models
might no longer be Pareto-optimal, which makes it unclear how to quantify the
performance of the MHPO method when evaluated on the test set. To resolve this,
we provide a novel evaluation protocol that allows measuring the generalization
performance of MHPO methods and studying its capabilities for comparing two
optimization experiments.
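The gap the abstract describes can be made concrete: select the configurations that are Pareto-optimal on validation, re-evaluate exactly those on test, and compare the hypervolume they achieve against the hypervolume of the full test-set Pareto front. The two-objective toy data and reference point below are invented for illustration; this is a sketch of the general idea, not the paper's protocol.

```python
def pareto_front(points):
    """Non-dominated points under two minimized objectives."""
    return [p for p in points
            if not any(q[0] <= p[0] and q[1] <= p[1] and q != p for q in points)]

def hypervolume_2d(front, ref):
    """Area dominated by a (minimization) front up to a reference point."""
    hv, prev_y = 0.0, ref[1]
    for x, y in sorted(front):
        hv += (ref[0] - x) * (prev_y - y)
        prev_y = y
    return hv

ref = (1.0, 1.0)
# Hypothetical (error, inference-time) pairs for three configurations.
val  = {"A": (0.10, 0.20), "B": (0.20, 0.10), "C": (0.15, 0.25)}
test = {"A": (0.12, 0.22), "B": (0.25, 0.12), "C": (0.14, 0.18)}

# Select on validation, then re-evaluate the selected configs on test.
selected = [c for c in val if val[c] in pareto_front(list(val.values()))]
hv_selected = hypervolume_2d(pareto_front([test[c] for c in selected]), ref)
# Oracle: the hypervolume an ideal selection would have achieved on test.
hv_oracle = hypervolume_2d(pareto_front(list(test.values())), ref)
print(selected, round(hv_oracle - hv_selected, 4))  # the generalization gap
```

Here configuration C is dominated on validation and thus never selected, yet it is Pareto-optimal on test, so the selected set's test hypervolume falls short of the oracle's.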
Aryltriazene photopolymer thin films as sacrificial release layers for laser-assisted forward transfer systems: study of photoablative decomposition and transfer behavior
Thin films of a tailor-made photodecomposable aryltriazene polymer were applied in a modified laser-induced forward transfer (LIFT) process as sacrificial release layers. The photopolymer film acts as an intermediate energy-absorbing dynamic release layer (DRL) that decomposes efficiently into small volatile fragments upon UV laser irradiation. A fast-expanding pressure jet is generated, which is used to propel an overlying transfer material from the source target onto a receiver. This DRL-assisted laser direct-write process allows the precise deposition of intact material pixels with micrometer resolution and by single laser pulses. Triazene-based photopolymer DRL donor systems were studied to derive optimum conditions for film thickness and laser fluences necessary for a defined transfer process at the emission wavelength of a XeCl excimer laser (308 nm). Photoablation, surface detachment, delamination and transfer behavior of aryltriazene polymer films with a thickness from 25 nm to ∼400 nm were investigated in order to improve the process control parameters for the fabrication of functional thin-film devices of microdeposited heat- and UV-sensitive material.