8 research outputs found

    Decreasing the Computing Time of Bayesian Optimization using Generalizable Memory Pruning

    Full text link
    Bayesian optimization (BO) suffers from long computing times when processing high-dimensional or large data sets. These long computing times result from the Gaussian process surrogate model having a polynomial time complexity in the number of experiments. Running BO on high-dimensional or massive data sets becomes intractable due to this time complexity scaling, in turn hindering experimentation. Alternative surrogate models have been developed to reduce the computing utilization of the BO procedure; however, these methods require mathematical alteration of the inherent surrogate function, pigeonholing use into only that function. In this paper, we demonstrate a generalizable BO wrapper of memory pruning and bounded optimization, capable of being used with any surrogate model and acquisition function. Using this memory pruning approach, we show a decrease in wall-clock computing times per experiment of BO from a polynomially increasing pattern to a sawtooth pattern that has a non-increasing trend, without sacrificing convergence performance. Furthermore, we illustrate the generalizability of the approach across two unique data sets, two unique surrogate models, and four unique acquisition functions. All model implementations are run on the MIT Supercloud state-of-the-art computing hardware.
    Comment: Accepted as a paper in IEEE HPEC 202
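
    As a rough illustration of the memory-pruning idea described above (not the authors' implementation), the sketch below wraps a standard GP-based BO loop so that, once the memory reaches a cap, only the best-performing observations are retained before the surrogate is refit, which is what produces a sawtooth in per-experiment compute time. The scikit-learn surrogate, expected-improvement acquisition, and all parameter names are placeholder choices.

        # Illustrative sketch (not the paper's code) of memory-pruned Bayesian optimization:
        # once the memory reaches prune_at points, keep only the best keep_best, so the
        # GP fitting cost (cubic in memory size) follows a sawtooth instead of growing.
        import numpy as np
        from scipy.stats import norm
        from sklearn.gaussian_process import GaussianProcessRegressor

        def expected_improvement(gp, X_cand, y_best):
            mu, sigma = gp.predict(X_cand, return_std=True)
            sigma = np.maximum(sigma, 1e-9)
            z = (y_best - mu) / sigma                       # minimization convention
            return (y_best - mu) * norm.cdf(z) + sigma * norm.pdf(z)

        def memory_pruned_bo(f, bounds, n_iter=100, prune_at=40, keep_best=20, seed=0):
            rng = np.random.default_rng(seed)
            dim = bounds.shape[0]
            X = rng.uniform(bounds[:, 0], bounds[:, 1], size=(5, dim))
            y = np.array([f(x) for x in X])
            best_x, best_y = X[np.argmin(y)], y.min()
            for _ in range(n_iter):
                if len(y) >= prune_at:                      # prune memory of worst points
                    keep = np.argsort(y)[:keep_best]
                    X, y = X[keep], y[keep]
                gp = GaussianProcessRegressor(normalize_y=True).fit(X, y)
                cand = rng.uniform(bounds[:, 0], bounds[:, 1], size=(512, dim))
                x_next = cand[np.argmax(expected_improvement(gp, cand, y.min()))]
                y_next = f(x_next)
                X, y = np.vstack([X, x_next]), np.append(y, y_next)
                if y_next < best_y:                         # track global best across pruning
                    best_x, best_y = x_next, y_next
            return best_x, best_y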

    Fast Bayesian Optimization of Needle-in-a-Haystack Problems using Zooming Memory-Based Initialization (ZoMBI)

    Full text link
    Needle-in-a-Haystack problems exist across a wide range of applications including rare disease prediction, ecological resource management, fraud detection, and material property optimization. A Needle-in-a-Haystack problem arises when there is an extreme imbalance of optimum conditions relative to the size of the dataset. For example, only 0.82% out of 146k total materials in the open-access Materials Project database have a negative Poisson's ratio. However, current state-of-the-art optimization algorithms are not designed with the capabilities to find solutions to these challenging multidimensional Needle-in-a-Haystack problems, resulting in slow convergence to a global optimum or pigeonholing into a local minimum. In this paper, we present a Zooming Memory-Based Initialization algorithm, entitled ZoMBI. ZoMBI actively extracts knowledge from the previously best-performing evaluated experiments to iteratively zoom in the sampling search bounds towards the global optimum "needle" and then prunes the memory of low-performing historical experiments to accelerate compute times by reducing the algorithm time complexity from O(n^3) to O(φ^3) for φ forward experiments per activation, which trends to a constant O(1) over several activations. Additionally, ZoMBI implements two custom adaptive acquisition functions to further guide the sampling of new experiments toward the global optimum. We validate the algorithm's optimization performance on three real-world datasets exhibiting Needle-in-a-Haystack and further stress-test the algorithm's performance on an additional 174 analytical datasets. The ZoMBI algorithm demonstrates compute time speed-ups of 400x compared to traditional Bayesian optimization as well as efficiently discovering optima in under 100 experiments that are up to 3x more highly optimized than those discovered by the similar methods MiP-EGO, TuRBO, and HEBO.
    Comment: Paper 16 pages; SI 6 pages
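
    A minimal sketch of the zooming and memory-pruning mechanics described above, assuming a scikit-learn GP surrogate and a lower-confidence-bound acquisition as a stand-in for ZoMBI's custom adaptive acquisition functions; function and parameter names are illustrative, not the released ZoMBI code.

        # Illustrative sketch of the zooming + memory-pruning idea (not the ZoMBI code).
        # Each "activation" shrinks the search box around the best points seen so far and
        # keeps only the top-phi observations, so the GP fit cost stays near O(phi^3).
        import numpy as np
        from sklearn.gaussian_process import GaussianProcessRegressor

        def zombi_like(f, bounds, activations=5, phi=10, n_top=3, seed=0):
            rng = np.random.default_rng(seed)
            dim = bounds.shape[0]
            X_all, y_all = np.empty((0, dim)), np.empty(0)
            for _ in range(activations):
                # Zoom: bound the search box to the spread of the best n_top points.
                if len(y_all) >= n_top:
                    top = X_all[np.argsort(y_all)[:n_top]]
                    lo, hi = top.min(axis=0), top.max(axis=0)
                else:
                    lo, hi = bounds[:, 0], bounds[:, 1]
                # Memory pruning: start the activation from the best phi points only.
                if len(y_all) > phi:
                    keep = np.argsort(y_all)[:phi]
                    X_all, y_all = X_all[keep], y_all[keep]
                for _ in range(phi):                        # phi forward experiments per activation
                    if len(y_all) >= 2:
                        gp = GaussianProcessRegressor(normalize_y=True).fit(X_all, y_all)
                        cand = rng.uniform(lo, hi, size=(256, dim))
                        mu, sd = gp.predict(cand, return_std=True)
                        x_next = cand[np.argmin(mu - 2.0 * sd)]   # LCB stand-in acquisition
                    else:
                        x_next = rng.uniform(lo, hi, size=dim)
                    X_all = np.vstack([X_all, x_next])
                    y_all = np.append(y_all, f(x_next))
            return X_all[np.argmin(y_all)], y_all.min()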

    Vision-driven Autocharacterization of Perovskite Semiconductors

    Full text link
    In materials research, the task of characterizing hundreds of different materials traditionally requires equally many human hours spent measuring samples one by one. We demonstrate that with the integration of computer vision into this material research workflow, many of these tasks can be automated, significantly accelerating the throughput of the workflow for scientists. We present a framework that uses vision to address specific pain points in the characterization of perovskite semiconductors, a group of materials with the potential to form new types of solar cells. With this approach, we automate the measurement and computation of chemical and optoelectronic properties of perovskites. Our framework proposes the following four key contributions: (i) a computer vision tool for scalable segmentation to arbitrarily many material samples, (ii) a tool to extract the chemical composition of all material samples, (iii) an algorithm capable of automatically computing band gap across arbitrarily many unique samples using vision-segmented hyperspectral reflectance data, and (iv) automating the stability measurement of multi-hour perovskite degradation experiments with vision for spatially non-uniform samples. We demonstrate the key contributions of the proposed framework on eighty samples of unique composition from the formamidinium-methylammonium lead tri-iodide perovskite system and validate the accuracy of each method using human evaluation and X-ray diffraction.
    Comment: Manuscript 8 pages; Supplemental 7 pages
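
    The band-gap step (iii) could, for instance, be approached as sketched below: a Kubelka-Munk transform of a segmented region's reflectance spectrum followed by a Tauc-style linear fit. This is an assumed, generic procedure for illustration only; the paper's exact algorithm may differ.

        # Illustrative sketch (assumed method, not necessarily the paper's): estimate a
        # direct band gap from a diffuse-reflectance spectrum via Kubelka-Munk + Tauc fit.
        import numpy as np

        def estimate_band_gap(wavelength_nm, reflectance, fit_window=0.3):
            """wavelength_nm, reflectance: 1-D arrays for one segmented sample region."""
            energy_ev = 1239.84 / np.asarray(wavelength_nm)       # photon energy E = hc / lambda
            R = np.clip(np.asarray(reflectance), 1e-3, 1.0)
            f_km = (1.0 - R) ** 2 / (2.0 * R)                     # Kubelka-Munk F(R)
            tauc = (f_km * energy_ev) ** 2                        # direct-gap Tauc quantity
            # Fit a line to the steep absorption edge and extrapolate to tauc = 0.
            edge = np.argmax(np.gradient(tauc, energy_ev))        # steepest rise vs. energy
            mask = np.abs(energy_ev - energy_ev[edge]) < fit_window / 2
            slope, intercept = np.polyfit(energy_ev[mask], tauc[mask], 1)
            return -intercept / slope                             # band gap estimate in eV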

    Human Evaluation of Text-to-Image Models on a Multi-Task Benchmark

    Full text link
    We provide a new multi-task benchmark for evaluating text-to-image models. We perform a human evaluation comparing the most common open-source (Stable Diffusion) and commercial (DALL-E 2) models. Twenty computer science AI graduate students evaluated the two models, on three tasks, at three difficulty levels, across ten prompts each, providing 3,600 ratings. Text-to-image generation has seen rapid progress to the point that many recent models have demonstrated their ability to create realistic high-resolution images for various prompts. However, current text-to-image methods and the broader body of research in vision-language understanding still struggle with intricate text prompts that contain many objects with multiple attributes and relationships. We introduce a new text-to-image benchmark that contains a suite of thirty-two tasks over multiple applications that capture a model's ability to handle different features of a text prompt. For example, one task asks a model to generate a varying number of the same object to measure its ability to count, while another provides a text prompt with several objects that each have a different attribute to test its ability to match objects and attributes correctly. Rather than subjectively evaluating text-to-image results on a set of prompts, our new multi-task benchmark consists of challenge tasks at three difficulty levels (easy, medium, and hard) and human ratings for each generated image.
    Comment: NeurIPS 2022 Workshop on Human Evaluation of Generative Models (HEGM)
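
    For a sense of how such ratings can be summarized, the sketch below aggregates a flat table of per-image scores by model, task, and difficulty; the column names, task labels, and rating scale are assumptions, not taken from the benchmark.

        # Illustrative aggregation of human ratings (column names and scale are assumed).
        # Each row is one rating of one generated image; in the study the full table has
        # 20 raters x 2 models x 3 tasks x 3 difficulty levels x 10 prompts = 3,600 rows.
        import pandas as pd

        ratings = pd.DataFrame(
            [
                {"rater": "r01", "model": "Stable Diffusion", "task": "counting", "difficulty": "easy", "score": 4},
                {"rater": "r01", "model": "DALL-E 2", "task": "counting", "difficulty": "easy", "score": 5},
                {"rater": "r02", "model": "Stable Diffusion", "task": "attributes", "difficulty": "hard", "score": 2},
            ]
        )

        summary = (
            ratings.groupby(["model", "task", "difficulty"])["score"]
            .agg(["mean", "std", "count"])
            .reset_index()
        )
        print(summary)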

    A System for High-Throughput Materials Exploration Driven by Machine Learning

    No full text
    Functional materials have vast, high-dimensional composition spaces which make discovering optimized compositions intractable with conventional synthesis tools. Conventional experimental methods of exploring material composition spaces are slow and resource intensive because they are manual processes that require trial-and-error experimentation. Thus, the question is posed: how can we design an optimized functional material from this vast, high-dimensional composition space such that it has high performance for a given application? In this thesis, machine learning algorithms are integrated into novel, high-throughput synthesis hardware to accelerate the rate of material composition exploration by 10,000x relative to these conventional methods. First, a novel inkjet droplet deposition hardware system is constructed to generate arrays of unique functional material compositions in the form of droplets using fluid mechanics and motor control theory. Second, computer vision and Bayesian optimization machine learning algorithms are integrated into the droplet synthesis loop to autonomously discover synthesis conditions that generate optimized droplets without any intervention by a domain expert. Third, multiphysics models are developed to simulate the performance of functional material devices across the gamut of environmental conditions without having to run expensive laboratory experiments. The culmination of these three processes developed in this master's thesis provides validated methods for driving high-throughput, low-cost materials exploration and optimization to be further explored in my doctoral thesis.

    Autonomous Optimization of Fluid Systems at Varying Length Scales

    Full text link
    Autonomous optimization is a process by which hardware conditions are discovered that generate an optimized experimental product without the guidance of a domain expert. We design an autonomous optimization framework to discover the experimental conditions within fluid systems that generate discrete and uniform droplet patterns. Generating discrete and uniform droplets requires high-precision control over the experimental conditions of a fluid system. Fluid stream instabilities, such as Rayleigh-Plateau instability and capillary instability, drive the separation of a flow into individual droplets. However, because this phenomenon leverages an instability, by nature the hardware must be precisely tuned to achieve uniform, repeatable droplets. Typically, this requires a domain expert in the loop and constant re-tuning depending on the hardware configuration and liquid precursor selection. Herein, we propose a computer vision-driven Bayesian optimization framework to discover the precise hardware conditions that generate uniform, reproducible droplets with the desired features, leveraging flow instability without a domain expert in the loop. This framework is validated on two fluid systems, at the micrometer and millimeter length scales, using microfluidic and inkjet systems, respectively, indicating the application breadth of this approach.
    Comment: 8 pages
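
    One way such a vision-derived objective could look is sketched below: detected droplets are scored for size uniformity, and that score becomes the quantity a Bayesian optimizer maximizes over the hardware parameters. The OpenCV-based detection and the scoring formula are stand-ins for illustration, not the authors' pipeline.

        # Illustrative sketch: score droplet uniformity from a camera frame so a Bayesian
        # optimizer can tune hardware parameters against it. Detection and scoring are stand-ins.
        import cv2
        import numpy as np

        def droplet_uniformity_score(frame_gray):
            """frame_gray: 8-bit grayscale image. Higher score = more uniform droplets."""
            _, binary = cv2.threshold(frame_gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
            contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
            areas = np.array([cv2.contourArea(c) for c in contours if cv2.contourArea(c) > 10])
            if len(areas) < 2:
                return 0.0                                   # no discrete droplets detected
            return 1.0 / (1.0 + areas.std() / areas.mean())  # 1.0 when all droplet areas are equal

        # This score would be maximized by a BO loop over hardware settings such as flow
        # rate or nozzle drive voltage (parameter names here are placeholders).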

    Vision-driven Autocharacterization of Perovskite Semiconductors

    No full text
    In materials research, the task of characterizing hundreds of different materials traditionally requires equally many human hours spent measuring samples one by one. We demonstrate that with the integration of computer vision into this material research workflow, many of these tasks can be automated, significantly accelerating the throughput of the workflow for scientists. We present a framework that uses vision to address specific pain points in the characterization of perovskite semiconductors, a group of materials with the potential to form new types of solar cells. With this approach, we automate the measurement and computation of chemical and optoelectronic properties of perovskites. Our framework proposes the following four key contributions: (i) a computer vision tool for scalable segmentation to arbitrarily many material samples, (ii) a tool to extract the chemical composition of all material samples, (iii) an algorithm capable of automatically computing band gap across arbitrarily many unique samples using vision-segmented hyperspectral reflectance data, and (iv) automating the stability measurement of multi-hour perovskite degradation experiments with vision for spatially non-uniform samples. We demonstrate the key contributions of the proposed framework on eighty samples of unique composition from the formamidinium-methylammonium lead tri-iodide perovskite system and validate the accuracy of each method using human evaluation and X-ray diffraction.

    An Open-Source Environmental Chamber for Materials-Stability Testing Using an Optical Proxy

    No full text
    This study is motivated by the desire to disseminate a low-cost, high-precision, high-throughput environmental chamber to test materials and devices under elevated humidity, temperature, and light. This paper documents the creation of an open-source tool with a bill of materials as low as US$2,000, and the subsequent evolution of three second-generation tools installed at three different universities spanning thin films, bulk crystals, and thin-film solar-cell devices. We introduce an optical proxy measurement to detect real-time phase changes in materials. We present correlations between this optical proxy and standard X-ray diffraction measurements, describe some edge cases where the proxy measurement fails, and report key learnings from the technology-translation process. By sharing lessons learned, we hope that future open-hardware development and translation efforts can proceed with reduced friction. Throughout the paper, we provide examples of scientific impact, wherein participating laboratories used their environmental chambers to study and improve the stabilities of halide-perovskite materials. All generations of hardware bills of materials, assembly instructions, and operating codes are available in open-source repositories.
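
    A hypothetical example of an optical proxy of this kind is sketched below: the mean color of a segmented sample region is tracked over a time series of chamber images, and a sudden shift flags a possible phase change. The metric and names are illustrative; the chamber's actual proxy may be defined differently.

        # Illustrative optical-proxy sketch (the paper's exact metric may differ):
        # track the mean color shift of a sample region across a time series of images.
        import numpy as np

        def color_change_proxy(image_stack, mask):
            """image_stack: (T, H, W, 3) float RGB images; mask: (H, W) bool sample region.
            Returns a length-T curve of mean color distance from the first frame."""
            region = image_stack[:, mask, :]                  # (T, n_pixels, 3)
            reference = region[0].mean(axis=0)                # initial mean color of the region
            return np.linalg.norm(region.mean(axis=1) - reference, axis=1)

        # A sudden rise in this curve flags a possible phase change; the study correlates
        # such optical signals with X-ray diffraction measurements to confirm them.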