Bayesian optimisation for automated machine learning
In this thesis, we develop a rich family of efficient and performant Bayesian optimisation (BO) methods to tackle various AutoML tasks. We first introduce a fast information-theoretic BO method, FITBO, that overcomes the computational bottleneck of information-theoretic acquisition functions while remaining competitive on the noisy optimisation problems frequently encountered in AutoML. We then improve on the idea of local penalisation and develop an asynchronous batch BO solution, PLAyBOOK, to enable more efficient use of parallel computing resources when evaluation runtime varies across configurations. Because many practical AutoML problems involve a mixture of multiple continuous and multiple categorical variables, we propose a new framework, named Continuous and Categorical BO (CoCaBO), to handle such mixed-type input spaces. CoCaBO merges the strengths of multi-armed bandits on categorical inputs with those of BO on continuous spaces, and uses a tailored kernel to permit information sharing across different categorical variables. We also extend CoCaBO by harnessing the concept of a local trust region to achieve competitive performance on high-dimensional optimisation problems with mixed input types.
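The abstract does not spell out the tailored kernel, so the following is only a minimal sketch of one way a kernel over mixed inputs could share information between categorical and continuous parts. It assumes an overlap kernel on the categorical variables, an RBF kernel on the continuous ones, and a weighted blend of their sum and product. The function names and the `mix` and `lengthscale` parameters are hypothetical and chosen for illustration.

```python
import numpy as np

def rbf_kernel(x, y, lengthscale=1.0):
    """Squared-exponential kernel on continuous inputs."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.exp(-0.5 * np.dot(diff, diff) / lengthscale**2))

def overlap_kernel(h, g):
    """Fraction of categorical variables on which two configurations agree."""
    h, g = np.asarray(h), np.asarray(g)
    return float(np.mean(h == g))

def mixed_kernel(h, x, g, y, mix=0.5, lengthscale=1.0):
    """Blend of the sum and the product of the two kernels (assumed form).

    mix = 0 gives a purely additive kernel, mix = 1 a purely
    multiplicative one; both extremes are valid kernels, so is the blend.
    """
    k_cat = overlap_kernel(h, g)
    k_cont = rbf_kernel(x, y, lengthscale)
    return (1.0 - mix) * (k_cat + k_cont) + mix * k_cat * k_cont

# Two configurations: (categorical choices, continuous values)
k = mixed_kernel(["adam", "relu"], [0.1, 0.3], ["sgd", "relu"], [0.1, 0.3])
```

In this sketch the additive term lets two configurations stay correlated even when they disagree on every categorical choice, while the product term captures interactions between the two input types.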
Beyond hyper-parameter tuning, we also investigate the novel use of BO on two important AutoML applications: black-box adversarial attack and neural architecture search. For the former (adversarial attack), we introduce the first BO-based attacks on image and graph classifiers; by actively querying the unknown victim classifier, our BO attacks can successfully find adversarial perturbations with many fewer attempts than competing baselines. They can thus serve as efficient tools for assessing the robustness of models suggested by AutoML. For the latter (neural architecture search), we leverage the Weisfeiler-Lehman graph kernel to empower our BO search strategy, NAS-BOWL, to naturally handle the directed acyclic graph representation of architectures. Besides achieving superior query efficiency, NAS-BOWL also returns interpretable sub-features that help explain the architecture performance, thus marking the first step towards interpretable neural architecture search. Finally, we examine the most computation-intensive step in the AutoML pipeline: evaluating the generalisation performance of a new configuration. We propose a cheap yet reliable test performance estimator based on a simple measure of training speed. It consistently outperforms various existing estimators on a wide range of architecture search spaces and can be easily incorporated into different search strategies, including BO, to improve cost efficiency.
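The abstract does not define the measure of training speed. As an illustration only, the sketch below assumes one plausible choice: the sum of training losses over the first few epochs, where a smaller sum means the loss fell faster. The function names and the `burn_in` parameter are hypothetical.

```python
def training_speed_score(train_losses, burn_in=0):
    """Sum of per-epoch training losses after an optional burn-in.

    Lower is better: a configuration whose loss drops quickly
    accumulates a smaller total (assumed proxy, not the thesis's definition).
    """
    if burn_in < 0 or burn_in >= len(train_losses):
        raise ValueError("burn_in must leave at least one epoch")
    return float(sum(train_losses[burn_in:]))

def rank_configurations(loss_curves, burn_in=0):
    """Return configuration names ordered from fastest to slowest training."""
    scores = {name: training_speed_score(curve, burn_in)
              for name, curve in loss_curves.items()}
    return sorted(scores, key=scores.get)

# Hypothetical loss curves from a few cheap training epochs
curves = {"arch_a": [2.0, 1.0, 0.5], "arch_b": [2.0, 1.5, 1.2]}
ranking = rank_configurations(curves)
```

A score of this kind needs only the losses already logged during training, which is why it could slot into a search strategy without extra validation passes.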
Bayesian Optimization over High-Dimensional Continuous and Categorical Mixture Variables
ISM Online Open House, 2021.6.18. The Institute of Statistical Mathematics Open House (held online), 18 June 2021; poster presentation
Kernels over Sets of Finite Sets using RKHS Embeddings, with Application to Bayesian (Combinatorial) Optimization
We focus on kernel methods for set-valued inputs and their application to
Bayesian set optimization, notably combinatorial optimization. We investigate
two classes of set kernels that both rely on Reproducing Kernel Hilbert Space
embeddings, namely the "Double Sum" (DS) kernels recently considered in
Bayesian set optimization, and a class introduced here called "Deep
Embedding" (DE) kernels that essentially consists in applying a radial kernel
on Hilbert space on top of the canonical distance induced by another kernel
such as a DS kernel. We establish in particular that while DS kernels typically
suffer from a lack of strict positive definiteness, vast subclasses of DE
kernels built upon DS kernels do possess this property, enabling in turn
combinatorial optimization without requiring the introduction of a jitter parameter.
Proofs of theoretical results about considered kernels are complemented by a
few practicalities regarding hyperparameter fitting. We furthermore demonstrate
the applicability of our approach in prediction and optimization tasks, relying
both on toy examples and on two test cases from mechanical engineering and
hydrogeology, respectively. Experimental results highlight the applicability
and comparative merits of the considered approaches while opening new perspectives
in prediction and sequential design with set inputs.
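The two kernel classes described in this abstract can be illustrated with a minimal sketch. It assumes the averaged (normalised) form of the double sum, a Gaussian base kernel on the set elements, and a Gaussian radial kernel applied to the distance the DS kernel induces. The function names and the `theta` parameter are hypothetical, and the paper may use different normalisations or radial functions.

```python
import numpy as np

def rbf(x, y, lengthscale=1.0):
    """Gaussian base kernel between two set elements."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.exp(-0.5 * np.dot(diff, diff) / lengthscale**2))

def double_sum_kernel(A, B, base=rbf):
    """DS kernel: average of the base kernel over all pairs (a, b)."""
    return float(np.mean([[base(a, b) for b in B] for a in A]))

def deep_embedding_kernel(A, B, base=rbf, theta=1.0):
    """DE kernel: radial kernel on the canonical distance induced by DS.

    d^2(A, B) = k_DS(A, A) + k_DS(B, B) - 2 k_DS(A, B)
    """
    d2 = (double_sum_kernel(A, A, base) + double_sum_kernel(B, B, base)
          - 2.0 * double_sum_kernel(A, B, base))
    d2 = max(d2, 0.0)  # guard against tiny negative rounding errors
    return float(np.exp(-d2 / (2.0 * theta**2)))

# Sets of different sizes are handled naturally
A = [[0.0], [1.0]]
C = [[3.0]]
k_ds = double_sum_kernel(A, C)
k_de = deep_embedding_kernel(A, C)
```

Here each set is embedded as the mean of its elements' feature maps, so the induced distance is the distance between those means in the RKHS.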
Black-box Mixed-Variable Optimisation using a Surrogate Model that Satisfies Integer Constraints
A challenging problem in both engineering and computer science is that of
minimising a function for which we have no mathematical formulation available,
that is expensive to evaluate, and that contains continuous and integer
variables, for example in automatic algorithm configuration. Surrogate-based
algorithms are very suitable for this type of problem, but most existing
techniques are designed with only continuous or only discrete variables in
mind. Mixed-Variable ReLU-based Surrogate Modelling (MVRSM) is a
surrogate-based algorithm that uses a linear combination of rectified linear
units, defined in such a way that (local) optima satisfy the integer
constraints. This method outperforms the state of the art on several synthetic
benchmarks with up to 238 continuous and integer variables, and achieves
competitive performance on two real-life benchmarks: XGBoost hyperparameter
tuning and Electrostatic Precipitator optimisation.
Comment: Ann Math Artif Intell (2020)
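To make "a linear combination of rectified linear units" concrete, here is a minimal sketch of such a surrogate fitted by ordinary least squares. It assumes the ReLU directions `W` and offsets `b` are fixed in advance and only the combination weights `c` are learned. The particular construction that makes local optima satisfy the integer constraints is not reproduced, and all names are hypothetical.

```python
import numpy as np

def relu_features(X, W, b):
    """Evaluate each ReLU basis function max(0, w_k . x + b_k) on every row of X."""
    return np.maximum(0.0, X @ W.T + b)

def fit_relu_surrogate(X, y, W, b):
    """Least-squares fit of the linear combination weights c."""
    Phi = relu_features(X, W, b)
    c, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return c

def predict(X, W, b, c):
    """Surrogate value g(x) = sum_k c_k * max(0, w_k . x + b_k)."""
    return relu_features(X, W, b) @ c

# Toy 1-D example: |x| is exactly max(0, x) + max(0, -x)
W = np.array([[1.0], [-1.0]])
b = np.array([0.0, 0.0])
X = np.array([[-2.0], [-1.0], [1.0], [2.0]])
y = np.abs(X).ravel()
c = fit_relu_surrogate(X, y, W, b)
```

Because the surrogate is piecewise linear, its minimum lies at kinks of the basis functions, which hints at why the placement of those kinks matters for integer variables.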
Robot Learning with Crash Constraints
In the past decade, numerous machine learning algorithms have been shown to
successfully learn optimal policies to control real robotic systems. However,
it is common to encounter failing behaviors as the learning loop progresses.
Specifically, in robot applications where failing is undesired but not
catastrophic, many algorithms struggle with leveraging data obtained from
failures. This is usually caused by (i) the failed experiment ending
prematurely, or (ii) the acquired data being scarce or corrupted. Both
complicate the design of proper reward functions to penalize failures. In this
paper, we propose a framework that addresses those issues. We consider failing
behaviors as those that violate a constraint and address the problem of
learning with crash constraints, where no data is obtained upon constraint
violation. The no-data case is addressed by a novel GP model (GPCR) for the
constraint that combines discrete events (failure/success) with continuous
observations (only obtained upon success). We demonstrate the effectiveness of
our framework on simulated benchmarks and on a real jumping quadruped, where
the constraint threshold is unknown a priori. Experimental data is collected,
by means of constrained Bayesian optimization, directly on the real robot. Our
results outperform manual tuning, and GPCR proves useful for estimating the
constraint threshold.
Comment: 8 pages, 4 figures, 1 table, 1 algorithm. Accepted for publication in
IEEE Robotics and Automation Letters (RA-L). Video demonstration of the
experiments available at https://youtu.be/RAiIo0l6_rE . Algorithm
implementation available at
https://github.com/alonrot/classified_regression.gi
An Artificial Intelligence (AI) workflow for catalyst design and optimization
In the pursuit of novel catalyst development to address pressing
environmental concerns and energy demand, conventional design and optimization
methods often fall short due to the complexity and vastness of the catalyst
parameter space. The advent of Machine Learning (ML) has ushered in a new era
in the field of catalyst optimization, offering potential solutions to the
shortcomings of traditional techniques. However, existing methods fail to
effectively harness the wealth of information contained within the burgeoning
body of scientific literature on catalyst synthesis. To address this gap, this
study proposes an innovative Artificial Intelligence (AI) workflow that
integrates Large Language Models (LLMs), Bayesian optimization, and an active
learning loop to expedite and enhance catalyst optimization. Our methodology
combines advanced language understanding with robust optimization strategies,
effectively translating knowledge extracted from diverse literature into
actionable parameters for practical experimentation and optimization. In this
article, we demonstrate the application of this AI workflow in the optimization
of catalyst synthesis for ammonia production. The results underscore the
workflow's ability to streamline the catalyst development process, offering a
swift, resource-efficient, and high-precision alternative to conventional
methods.
Comment: 31 pages, 7 figures