15 research outputs found

    Bayesian optimisation for automated machine learning

    In this thesis, we develop a rich family of efficient and performant Bayesian optimisation (BO) methods to tackle various AutoML tasks. We first introduce a fast information-theoretic BO method, FITBO, that overcomes the computational bottleneck of information-theoretic acquisition functions while maintaining their competitiveness on the noisy optimisation problems frequently encountered in AutoML. We then improve on the idea of local penalisation and develop an asynchronous batch BO solution, PLAyBOOK, to enable more efficient use of parallel computing resources when evaluation runtime varies across configurations. Because many practical AutoML problems involve a mixture of multiple continuous and multiple categorical variables, we propose a new framework, named Continuous and Categorical BO (CoCaBO), to handle such mixed-type input spaces. CoCaBO combines the strengths of multi-armed bandits on categorical inputs with those of BO on continuous spaces, and uses a tailored kernel to permit information sharing across different categorical variables. We also extend CoCaBO by harnessing the concept of local trust regions to achieve competitive performance on high-dimensional optimisation problems with mixed input types. Beyond hyper-parameter tuning, we also investigate the novel use of BO on two important AutoML applications: black-box adversarial attacks and neural architecture search. For the former (adversarial attacks), we introduce the first BO-based attacks on image and graph classifiers; by actively querying the unknown victim classifier, our BO attacks can successfully find adversarial perturbations with many fewer attempts than competing baselines. They can thus serve as efficient tools for assessing the robustness of models suggested by AutoML.
For the latter (neural architecture search), we leverage the Weisfeiler-Lehman graph kernel to empower our BO search strategy, NAS-BOWL, to naturally handle the directed acyclic graph representation of architectures. Besides achieving superior query efficiency, NAS-BOWL also returns interpretable sub-features that help explain architecture performance, thus marking a first step towards interpretable neural architecture search. Finally, we examine the most computationally intensive step in the AutoML pipeline: evaluating the generalisation performance of a new configuration. We propose a cheap yet reliable test performance estimator based on a simple measure of training speed. It consistently outperforms various existing estimators on a wide range of architecture search spaces and can be easily incorporated into different search strategies, including BO, to improve their cost efficiency.
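
The abstract does not say which measure of training speed the estimator uses. As an illustration only, the sketch below assumes one plausible choice: the sum of training losses over the first few epochs, where a smaller sum means the model trained faster and is predicted to generalise better. The function names, the `curves` data, and the choice of measure are all hypothetical and are not taken from the thesis.

```python
from typing import Dict, List


def training_speed_score(train_losses: List[float], n_epochs: int) -> float:
    """Sum of the first n_epochs training losses (assumed proxy; lower is better)."""
    if n_epochs <= 0 or n_epochs > len(train_losses):
        raise ValueError("n_epochs must be between 1 and the number of recorded epochs")
    return sum(train_losses[:n_epochs])


def rank_by_training_speed(curves: Dict[str, List[float]], n_epochs: int) -> List[str]:
    """Order configurations from most to least promising under the assumed proxy."""
    return sorted(curves, key=lambda name: training_speed_score(curves[name], n_epochs))


# Hypothetical per-epoch training losses for three candidate architectures.
curves = {
    "arch_a": [2.0, 1.2, 0.8, 0.6],
    "arch_b": [2.1, 1.8, 1.5, 1.3],
    "arch_c": [1.9, 1.0, 0.5, 0.3],
}
ranking = rank_by_training_speed(curves, n_epochs=3)
print(ranking)
```

A search strategy such as BO could use a score of this kind in place of a full training run when comparing candidates, which is where the cost saving would come from.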

    Bayesian Optimization over High-Dimensional Continuous and Categorical Mixture Variables

    ISM Online Open House, 18 June 2021 (Institute of Statistical Mathematics Open House, held online). Poster presentation.

    Kernels over Sets of Finite Sets using RKHS Embeddings, with Application to Bayesian (Combinatorial) Optimization

    We focus on kernel methods for set-valued inputs and their application to Bayesian set optimization, notably combinatorial optimization. We investigate two classes of set kernels that both rely on Reproducing Kernel Hilbert Space embeddings: the "Double Sum" (DS) kernels recently considered in Bayesian set optimization, and a class introduced here called "Deep Embedding" (DE) kernels, which essentially consist in applying a radial kernel on a Hilbert space on top of the canonical distance induced by another kernel such as a DS kernel. We establish in particular that, while DS kernels typically suffer from a lack of strict positive definiteness, vast subclasses of DE kernels built upon DS kernels do possess this property, which in turn enables combinatorial optimization without introducing a jitter parameter. Proofs of the theoretical results about the considered kernels are complemented by a few practicalities regarding hyperparameter fitting. We furthermore demonstrate the applicability of our approach in prediction and optimization tasks, relying both on toy examples and on two test cases from mechanical engineering and hydrogeology, respectively. Experimental results highlight the applicability and comparative merits of the considered approaches while opening new perspectives in prediction and sequential design with set inputs.
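
To make the two kernel classes concrete, here is a minimal sketch under stated assumptions. The DS kernel is taken to be the average of a base kernel over all pairs of elements, and the DE kernel applies a Gaussian radial function to the distance that the DS kernel induces between sets. The Gaussian base kernel, the averaging normalisation, and the parameter names `lengthscale` and `theta` are illustrative choices, not the paper's definitions.

```python
import numpy as np


def rbf(x, y, lengthscale=1.0):
    """Gaussian base kernel between two points (an assumed choice)."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.exp(-0.5 * np.dot(diff, diff) / lengthscale**2))


def double_sum_kernel(A, B, base=rbf):
    """Average of the base kernel over all pairs (a, b) with a in A and b in B."""
    total = sum(base(a, b) for a in A for b in B)
    return total / (len(A) * len(B))


def deep_embedding_kernel(A, B, theta=1.0, base=rbf):
    """Gaussian radial kernel applied to the distance induced by the DS kernel."""
    sq_dist = (
        double_sum_kernel(A, A, base)
        + double_sum_kernel(B, B, base)
        - 2.0 * double_sum_kernel(A, B, base)
    )
    sq_dist = max(sq_dist, 0.0)  # guard against tiny negative round-off
    return float(np.exp(-0.5 * sq_dist / theta**2))


# Two small finite sets of points in the plane.
A = [(0.0, 0.0), (1.0, 0.0)]
B = [(0.0, 1.0)]
k_ds = double_sum_kernel(A, B)
k_de = deep_embedding_kernel(A, B)
print(k_ds, k_de)
```

The squared distance used here is the usual RKHS identity d²(A, B) = k(A, A) + k(B, B) − 2 k(A, B), so the DE kernel equals 1 exactly when the two sets have the same embedding.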

    Black-box Mixed-Variable Optimisation using a Surrogate Model that Satisfies Integer Constraints

    A challenging problem in both engineering and computer science is that of minimising a function for which no mathematical formulation is available, that is expensive to evaluate, and that contains both continuous and integer variables, as in automatic algorithm configuration. Surrogate-based algorithms are well suited to this type of problem, but most existing techniques are designed with only continuous or only discrete variables in mind. Mixed-Variable ReLU-based Surrogate Modelling (MVRSM) is a surrogate-based algorithm that uses a linear combination of rectified linear units, defined in such a way that (local) optima satisfy the integer constraints. This method outperforms the state of the art on several synthetic benchmarks with up to 238 continuous and integer variables, and achieves competitive performance on two real-life benchmarks: XGBoost hyperparameter tuning and Electrostatic Precipitator optimisation. Comment: Ann Math Artif Intell (2020)
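
The general shape of a surrogate built as a linear combination of rectified linear units can be sketched as follows. This is a simplified illustration, not MVRSM itself: the hidden weights are drawn at random and the output coefficients are fitted by ordinary least squares, both of which are assumptions made here. It does not reproduce the paper's specific construction that makes local optima satisfy the integer constraints.

```python
import numpy as np


def relu_features(X, W, b):
    """Constant column followed by max(0, X W^T + b) for each hidden unit."""
    Z = np.maximum(0.0, X @ W.T + b)
    return np.hstack([np.ones((X.shape[0], 1)), Z])


def fit_relu_surrogate(X, y, n_units=50, seed=0):
    """Fix random hidden weights (assumption) and fit output weights by least squares."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(n_units, X.shape[1]))
    b = rng.normal(size=n_units)
    Phi = relu_features(X, W, b)
    coef, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return W, b, coef


def predict(X, W, b, coef):
    """Evaluate the fitted surrogate at the rows of X."""
    return relu_features(X, W, b) @ coef


# Toy mixed data: column 0 is continuous, column 1 takes integer values.
rng = np.random.default_rng(1)
X = np.column_stack([rng.uniform(-2, 2, 40), rng.integers(-2, 3, 40)]).astype(float)
y = (X[:, 0] - 0.5) ** 2 + (X[:, 1] - 1.0) ** 2

W, b, coef = fit_relu_surrogate(X, y)
pred = predict(X, W, b, coef)
print(float(np.mean((pred - y) ** 2)))
```

Because the model is linear in its output coefficients, refitting after each new evaluation is cheap, which is one reason this model family suits expensive black-box objectives.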

    Robot Learning with Crash Constraints

    In the past decade, numerous machine learning algorithms have been shown to successfully learn optimal policies to control real robotic systems. However, it is common to encounter failing behaviors as the learning loop progresses. Specifically, in robot applications where failing is undesired but not catastrophic, many algorithms struggle to leverage data obtained from failures. This is usually caused by (i) the failed experiment ending prematurely, or (ii) the acquired data being scarce or corrupted. Both complicate the design of proper reward functions to penalize failures. In this paper, we propose a framework that addresses these issues. We consider failing behaviors as those that violate a constraint and address the problem of learning with crash constraints, where no data is obtained upon constraint violation. The no-data case is addressed by a novel GP model (GPCR) for the constraint that combines discrete events (failure/success) with continuous observations (only obtained upon success). We demonstrate the effectiveness of our framework on simulated benchmarks and on a real jumping quadruped, where the constraint threshold is unknown a priori. Experimental data is collected directly on the real robot by means of constrained Bayesian optimization. Our results outperform manual tuning, and GPCR proves useful for estimating the constraint threshold. Comment: 8 pages, 4 figures, 1 table, 1 algorithm. Accepted for publication in IEEE Robotics and Automation Letters (RA-L). Video demonstration of the experiments available at https://youtu.be/RAiIo0l6_rE . Algorithm implementation available at https://github.com/alonrot/classified_regression.gi

    An Artificial Intelligence (AI) workflow for catalyst design and optimization

    In the pursuit of novel catalyst development to address pressing environmental concerns and energy demand, conventional design and optimization methods often fall short due to the complexity and vastness of the catalyst parameter space. The advent of Machine Learning (ML) has ushered in a new era in the field of catalyst optimization, offering potential solutions to the shortcomings of traditional techniques. However, existing methods fail to effectively harness the wealth of information contained within the burgeoning body of scientific literature on catalyst synthesis. To address this gap, this study proposes an Artificial Intelligence (AI) workflow that integrates Large Language Models (LLMs), Bayesian optimization, and an active learning loop to expedite and enhance catalyst optimization. Our methodology combines advanced language understanding with robust optimization strategies, translating knowledge extracted from diverse literature into actionable parameters for practical experimentation and optimization. In this article, we demonstrate the application of this AI workflow to the optimization of catalyst synthesis for ammonia production. The results underscore the workflow's ability to streamline the catalyst development process, offering a swift, resource-efficient, and high-precision alternative to conventional methods. Comment: 31 pages, 7 figures