8 research outputs found

    Automating Active Learning for Gaussian Processes

    In many problems in science, technology, and engineering, unlabeled data is abundant but acquiring labeled observations is expensive: it requires a human annotator, a costly laboratory experiment, or a time-consuming computer simulation. Active learning is a machine learning paradigm designed to minimize the cost of obtaining labeled data by carefully selecting which new data should be gathered next. However, considerable machine learning expertise is often required to apply these techniques effectively in their current form. In this dissertation, we propose solutions that further automate active learning. Our core contributions are active learning algorithms that are easy for non-experts to use yet deliver results competitive with or better than human-expert solutions. We begin by introducing a novel active search algorithm that automatically and dynamically balances exploration against exploitation, without relying on a parameter to control this tradeoff. We also provide a theoretical investigation of the hardness of this problem, proving that no polynomial-time policy can achieve a constant-factor approximation of the expected utility of the optimal policy. Next, we introduce a novel information-theoretic approach to active model selection, based on maximizing the mutual information between the output variable and the model class. This is the first active-model-selection approach that does not require updating each model for every candidate point. Using it, we developed an automated audiometry test for rapid screening of noise-induced hearing loss, a widespread disability that is preventable if diagnosed early. We then introduce a novel model selection algorithm for fixed-size datasets, called Bayesian optimization for model selection (BOMS). The proposed method performs Bayesian optimization in model space, treating the model evidence as a function to be maximized. BOMS is capable of finding a model that explains the dataset well without any human assistance. Finally, we extend BOMS to active learning, creating a fully automatic active learning framework. We apply this framework to Bayesian optimization, producing a sample-efficient automated system for black-box optimization. Crucially, we account for uncertainty in the choice of model: our method uses multiple, carefully selected models to represent its current belief about the latent objective function. Our algorithms are completely general and can be extended to any class of probabilistic models; in this dissertation, however, we mainly use the powerful class of Gaussian process models to perform inference. Extensive experimental evidence demonstrates that all proposed algorithms outperform previously developed solutions to these problems.
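    As a rough illustration of the kind of loop this dissertation aims to automate (and not a reproduction of its algorithms), the sketch below uses scikit-learn to select a Gaussian process model from a small candidate set by log marginal likelihood, a stand-in for reasoning about model evidence, and to choose the next label by plain uncertainty sampling. The objective `f`, the candidate pool, and the kernel set are toy assumptions.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, Matern, WhiteKernel

def f(x):
    """Hypothetical expensive labeling function (stand-in for an experiment)."""
    return np.sin(3 * x) + 0.1 * np.random.randn(*x.shape)

rng = np.random.default_rng(0)
pool = np.linspace(0, 2, 200).reshape(-1, 1)            # unlabeled candidate pool
idx = list(rng.choice(len(pool), size=3, replace=False))
X, y = pool[idx], f(pool[idx]).ravel()

candidate_kernels = [RBF() + WhiteKernel(), Matern(nu=2.5) + WhiteKernel()]

for step in range(10):
    # Model selection: fit each candidate kernel and keep the one with the
    # highest log marginal likelihood (an evidence-style criterion).
    fits = [GaussianProcessRegressor(kernel=k, normalize_y=True).fit(X, y)
            for k in candidate_kernels]
    gp = max(fits, key=lambda m: m.log_marginal_likelihood_value_)

    # Active learning: query the pool point with the largest predictive
    # uncertainty (a simple exploration-only policy, for illustration).
    _, std = gp.predict(pool, return_std=True)
    std[idx] = -np.inf                                   # never re-query a label
    nxt = int(np.argmax(std))
    idx.append(nxt)
    X = np.vstack([X, pool[nxt:nxt + 1]])
    y = np.append(y, f(pool[nxt:nxt + 1]).ravel())

print("selected kernel:", gp.kernel_)
```

    Unlike the dissertation's contributions, this loop uses a fixed kernel set and a purely exploratory policy; it is only meant to make the automated model-selection-plus-acquisition pattern concrete.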

    Bayesian Quadrature with Prior Information: Modeling and Policies

    Quadrature is the problem of estimating intractable integrals. Such integrals regularly arise in engineering and the natural sciences, especially when Bayesian methods are applied; examples include model evidences, normalizing constants, and marginal distributions. This dissertation explores Bayesian quadrature, a probabilistic, model-based quadrature method. Specifically, we study different ways in which Bayesian quadrature can be adapted to account for different kinds of prior information one may have about the task. We demonstrate that by taking prior knowledge into account, Bayesian quadrature can outperform commonly used numerical methods that are agnostic to prior knowledge, such as Monte Carlo based integration. We focus on two types of information that are (a) frequently available when faced with an intractable integral and (b) can be (approximately) incorporated into Bayesian quadrature:
    • Natural bounds on the possible values that the integrand can take, e.g., when the integrand is a probability density function, it must be nonnegative everywhere.
    • Knowledge about how the integral estimate will be used, i.e., for settings where quadrature is a subroutine, different downstream inference tasks can result in different priorities or desiderata for the estimate.
    These types of prior information are used to inform two aspects of the Bayesian quadrature inference routine:
    • Modeling: how the belief on the integrand can be tailored to account for the additional information.
    • Policies: where the integrand will be observed given a constrained budget of observations.
    This second aspect of Bayesian quadrature, policies for deciding where to observe the integrand, can be framed as an experimental design problem, where an agent must choose locations at which to evaluate a function of interest so as to maximize some notion of value. We study the broader area of sequential experimental design, applying ideas from Bayesian decision theory to develop an efficient and nonmyopic policy for general sequential experimental design problems. We also consider other sequential experimental design tasks such as Bayesian optimization and active search; in the latter, we focus on facilitating human–computer partnerships, aiding human agents engaged in data foraging through active-search-based suggestions and an interactive visual interface. Finally, this dissertation returns to Bayesian quadrature and discusses the batch setting for experimental design, where multiple observations of the function in question are made simultaneously.
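    To make the basic Bayesian quadrature setup concrete, here is a minimal sketch under simplifying assumptions: a toy integrand, a dense-grid approximation of the integral of the posterior mean (closed forms exist only for specific kernel/measure pairs), and a plain uncertainty-sampling policy. It deliberately omits the prior information discussed above, e.g. it does not enforce nonnegativity of the integrand (approaches that model a transformation such as the square root of the integrand address that).

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def integrand(x):
    """Toy integrand standing in for an intractable likelihood term."""
    return np.exp(-x ** 2) * np.sin(2 * x) ** 2

grid = np.linspace(0, 3, 300).reshape(-1, 1)            # integration domain
dx = grid[1, 0] - grid[0, 0]
X = np.array([[0.5], [1.5], [2.5]])                     # initial design
y = integrand(X).ravel()

for step in range(8):
    gp = GaussianProcessRegressor(kernel=RBF(), normalize_y=True).fit(X, y)

    # Posterior over the integral: its mean is the integral of the posterior
    # mean; its spread is approximated from integrals of posterior samples.
    mean, std = gp.predict(grid, return_std=True)
    Z_mean = (mean * dx).sum()
    samples = gp.sample_y(grid, n_samples=50, random_state=step)
    Z_std = (samples * dx).sum(axis=0).std()

    # Policy: evaluate the integrand where the model is most uncertain.
    nxt = grid[np.argmax(std)].reshape(1, -1)
    X = np.vstack([X, nxt])
    y = np.append(y, integrand(nxt).ravel())

print(f"integral estimate: {Z_mean:.4f} +/- {Z_std:.4f}")
```

    The dissertation's contributions replace both pieces of this sketch: the model is tailored to the available prior information, and the policy is designed around how the integral estimate will ultimately be used.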

    Query-driven adaptive sampling

    Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy at the Massachusetts Institute of Technology and the Woods Hole Oceanographic Institution, September 2022.
    Automated information gathering allows exploration of environments where data is limited and gathering observations introduces risk, such as underwater and planetary exploration. Typically, exploration has been performed in service of a query, with a unique algorithm developed for each mission. Yet this approach does not allow scientists to respond to novel questions as they are raised. In this thesis, we develop a single approach for a broad range of adaptive sampling missions with risk and limited prior knowledge. To achieve this, we present contributions in planning adaptive missions in service of queries and in modeling multi-attribute environments. First, we define a query language suitable for specifying diverse goals in adaptive sampling. The language fully encompasses objectives from previous adaptive sampling approaches and significantly extends the possible range of objectives. We prove that queries expressible in this language are not biased in a way that avoids information. We then describe a Monte Carlo tree search approach to plan for all queries in our language, using sample-based objective estimators embedded within tree search. This approach outperforms methods that maximize information about all variables in hydrocarbon seep search and fire escape scenarios. Next, we show how to plan when the policy must bound risk as a function of reward. By solving approximating problems, we guarantee risk bounds on policies with large numbers of actions and continuous observations, ensuring that risks are only taken when justified by reward. Exploration is limited by the quality of the environment model, so we introduce Gaussian process models with directed acyclic structure to improve model accuracy under limited data. The addition of interpretable structure allows qualitative expert knowledge of the environment to be encoded through structure and parameter constraints. Since expert knowledge may be incomplete, we introduce efficient structure learning over structural models using A* search with bounding conflicts. By placing bounds on the likelihood of substructures, we limit the number of structures that are trained, significantly accelerating search. Experiments modeling geographic data show that our model produces more accurate predictions than existing Gaussian process methods, and that using bounds allows structure to be learned in 50% of the time.
    The work in this thesis was supported by the Exxon Mobil Corporation as part of the MIT Energy Initiative under the project ‘Autonomous System for Deep Sea Hydrocarbon Detection and Monitoring’, NASA’s PSTAR program under the project ‘Cooperative Exploration with Under-actuated Autonomous Vehicles in Hazardous Environments’, and the Vulcan Machine Learning Center for Impact under the project ‘Machine Learning Based Persistent Autonomous Underwater Scientific Studies’.
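    The sketch below is a much-simplified illustration of query-driven sampling, not the thesis's query language, risk-bounded planner, or Monte Carlo tree search: a Gaussian process models a toy 2-D field, a query ("what fraction of the region exceeds a threshold?") is estimated with a sample-based objective estimator, and the next survey site is chosen by a myopic, query-aware heuristic. The field, threshold, and grid are illustrative assumptions.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def field(xy):
    """Hypothetical environment field (e.g., a concentration map)."""
    return np.exp(-((xy[:, 0] - 0.6) ** 2 + (xy[:, 1] - 0.4) ** 2) / 0.05)

rng = np.random.default_rng(1)
gx, gy = np.meshgrid(np.linspace(0, 1, 25), np.linspace(0, 1, 25))
grid = np.column_stack([gx.ravel(), gy.ravel()])         # candidate survey sites
X = rng.uniform(0, 1, size=(5, 2))                       # initial surveys
y = field(X)
threshold = 0.5

for step in range(10):
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True).fit(X, y)

    # Sample-based query estimator: fraction of the region above the threshold,
    # with uncertainty, computed from posterior samples of the whole field.
    samples = gp.sample_y(grid, n_samples=100, random_state=step)  # (625, 100)
    exceed = samples > threshold
    frac = exceed.mean(axis=0)                            # query value per sample
    print(f"step {step}: exceedance fraction = {frac.mean():.3f} +/- {frac.std():.3f}")

    # Myopic, query-aware acquisition: measure where posterior samples disagree
    # most about exceedance (highest indicator variance).
    p = exceed.mean(axis=1)
    nxt = grid[np.argmax(p * (1 - p))].reshape(1, -1)
    X = np.vstack([X, nxt])
    y = np.append(y, field(nxt))
```

    The thesis replaces this greedy step with nonmyopic tree-search planning over the full query language and with structured Gaussian process models; the sketch only shows how a query can drive where observations are taken.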