Adaptive Informative Path Planning with Multimodal Sensing
Adaptive Informative Path Planning (AIPP) problems model an agent tasked with
obtaining information subject to resource constraints in unknown, partially
observable environments. Existing work on AIPP has focused on representing
observations about the world as a result of agent movement. We formulate the
more general setting where the agent may choose between different sensors at
the cost of some energy, in addition to traversing the environment to gather
information. We call this problem AIPPMS (MS for Multimodal Sensing). AIPPMS
requires reasoning jointly about the effects of sensing and movement in terms
of both energy expended and information gained. We frame AIPPMS as a Partially
Observable Markov Decision Process (POMDP) and solve it with online planning.
Our approach is based on the Partially Observable Monte Carlo Planning
framework with modifications to ensure constraint feasibility and a heuristic
rollout policy tailored for AIPPMS. We evaluate our method on two domains: a
simulated search-and-rescue scenario and a challenging extension to the classic
RockSample problem. We find that our approach outperforms a classic AIPP
algorithm that is modified for AIPPMS, as well as online planning using a
random rollout policy.
Comment: First two authors contributed equally; International Conference on Automated Planning and Scheduling (ICAPS) 2020
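The key modeling choice above is that sensing and movement draw on a shared energy budget, so the planner must prune infeasible actions and estimate leaf values with a budget-aware rollout. Below is a minimal Python sketch of that idea; the names (Action, feasible_actions, heuristic_rollout) and the greedy information-per-energy heuristic are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of the AIPPMS action model: the agent chooses between
# movement and sensor actions, each with an energy cost, and a heuristic
# rollout policy stays within the remaining budget. All names here are
# illustrative, not the paper's code.
import random
from dataclasses import dataclass

@dataclass(frozen=True)
class Action:
    kind: str             # "move" or "sense"
    target: str           # neighbor cell or sensor name
    energy_cost: float
    info_estimate: float  # heuristic expected information gain

def feasible_actions(actions, remaining_energy):
    # Constraint feasibility: never consider an action the budget cannot afford.
    return [a for a in actions if a.energy_cost <= remaining_energy]

def heuristic_rollout(actions, remaining_energy, horizon=20, seed=0):
    # Mostly-greedy information-per-energy rollout used to estimate leaf
    # values, in the spirit of a POMCP rollout policy tailored to the problem.
    rng = random.Random(seed)
    total_info = 0.0
    for _ in range(horizon):
        candidates = feasible_actions(actions, remaining_energy)
        if not candidates:
            break
        if rng.random() < 0.8:
            a = max(candidates,
                    key=lambda a: a.info_estimate / max(a.energy_cost, 1e-6))
        else:
            a = rng.choice(candidates)  # occasional exploration
        total_info += a.info_estimate
        remaining_energy -= a.energy_cost
    return total_info

actions = [
    Action("move", "north", energy_cost=1.0, info_estimate=0.2),
    Action("sense", "cheap_camera", energy_cost=0.5, info_estimate=0.3),
    Action("sense", "expensive_lidar", energy_cost=3.0, info_estimate=1.5),
]
print(heuristic_rollout(actions, remaining_energy=10.0))
```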
Fine-Tuned Language Models Generate Stable Inorganic Materials as Text
We propose fine-tuning large language models for generation of stable
materials. While unorthodox, fine-tuning large language models on text-encoded
atomistic data is simple to implement yet reliable, with around 90% of sampled
structures obeying physical constraints on atom positions and charges. Using
energy above hull calculations from both learned ML potentials and
gold-standard DFT calculations, we show that our strongest model (fine-tuned
LLaMA-2 70B) can generate materials predicted to be metastable at about twice
the rate (49% vs 28%) of CDVAE, a competing diffusion model. Because of text
prompting's inherent flexibility, our models can simultaneously be used for
unconditional generation of stable materials, infilling of partial structures,
and text-conditional generation. Finally, we show that language models' ability
to capture key symmetries of crystal structures improves with model scale,
suggesting that the biases of pretrained LLMs are surprisingly well-suited for
atomistic data.
Comment: ICLR 2024. Code available at: https://github.com/facebookresearch/crystal-ll
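The approach above hinges on serializing a crystal structure as plain text so a language model can be fine-tuned on it. Here is a minimal Python sketch of one such encoding; the exact string format is an assumption for illustration, not necessarily the format used in the paper.

```python
# Sketch of text-encoding atomistic data for LLM fine-tuning: serialize
# lattice parameters and fractional coordinates as a plain string. The
# format below is an illustrative assumption.
def encode_crystal(lattice_abc, lattice_angles, sites):
    """sites: list of (element, (x, y, z)) with fractional coordinates."""
    lines = [
        " ".join(f"{v:.2f}" for v in lattice_abc),     # a b c
        " ".join(f"{v:.1f}" for v in lattice_angles),  # alpha beta gamma
    ]
    for element, (x, y, z) in sites:
        lines.append(element)
        lines.append(f"{x:.3f} {y:.3f} {z:.3f}")
    return "\n".join(lines)

# Toy example loosely based on rock-salt NaCl.
text = encode_crystal(
    lattice_abc=(5.64, 5.64, 5.64),
    lattice_angles=(90.0, 90.0, 90.0),
    sites=[("Na", (0.0, 0.0, 0.0)), ("Cl", (0.5, 0.5, 0.5))],
)
print(text)  # this string becomes one fine-tuning example for the LLM
```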
Protein Design with Guided Discrete Diffusion
A popular approach to protein design is to combine a generative model with a
discriminative model for conditional sampling. The generative model samples
plausible sequences while the discriminative model guides a search for
sequences with high fitness. Given its broad success in conditional sampling,
classifier-guided diffusion modeling is a promising foundation for protein
design, leading many to develop guided diffusion models for structure with
inverse folding to recover sequences. In this work, we propose diffusioN
Optimized Sampling (NOS), a guidance method for discrete diffusion models that
follows gradients in the hidden states of the denoising network. NOS makes it
possible to perform design directly in sequence space, circumventing
significant limitations of structure-based methods, including scarce data and
challenging inverse design. Moreover, we use NOS to generalize LaMBO, a
Bayesian optimization procedure for sequence design that facilitates multiple
objectives and edit-based constraints. The resulting method, LaMBO-2, enables
discrete diffusions and stronger performance with limited edits through a novel
application of saliency maps. We apply LaMBO-2 to a real-world protein design
task, optimizing antibodies for higher expression yield and binding affinity to
several therapeutic targets under locality and developability constraints,
attaining a 99% expression rate and 40% binding rate in exploratory in vitro
experiments.
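The central mechanism in NOS, as described above, is to take gradient steps on the denoising network's hidden states so that a learned value head scores the sequence higher before token logits are decoded. The following Python sketch illustrates that loop with toy PyTorch modules; the architecture, step sizes, and update rule are illustrative assumptions rather than the paper's code.

```python
# Sketch of guidance via gradients in a denoiser's hidden states, the core
# idea behind NOS. Module names and the update loop are illustrative.
import torch
import torch.nn as nn

hidden_dim, vocab_size, seq_len = 64, 20, 8
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=hidden_dim, nhead=4, batch_first=True),
    num_layers=2,
)
lm_head = nn.Linear(hidden_dim, vocab_size)  # hidden states -> token logits
value_head = nn.Linear(hidden_dim, 1)        # discriminator predicting fitness

def guided_denoise_step(x_emb, step_size=0.1, n_steps=5):
    # Run the denoiser once, then take gradient steps on the hidden states
    # to increase predicted fitness before decoding logits.
    h = encoder(x_emb).detach().requires_grad_(True)
    for _ in range(n_steps):
        fitness = value_head(h).mean()
        (grad,) = torch.autograd.grad(fitness, h)
        h = (h + step_size * grad).detach().requires_grad_(True)
    return lm_head(h)  # guided logits over the sequence vocabulary

x_emb = torch.randn(1, seq_len, hidden_dim)  # stand-in noisy sequence embedding
logits = guided_denoise_step(x_emb)
print(logits.shape)  # torch.Size([1, 8, 20])
```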
Accelerating Bayesian Optimization for Biological Sequence Design with Denoising Autoencoders
Bayesian optimization (BayesOpt) is a gold standard for query-efficient
continuous optimization. However, its adoption for drug design has been
hindered by the discrete, high-dimensional nature of the decision variables. We
develop a new approach (LaMBO) which jointly trains a denoising autoencoder
with a discriminative multi-task Gaussian process head, allowing gradient-based
optimization of multi-objective acquisition functions in the latent space of
the autoencoder. These acquisition functions allow LaMBO to balance the
explore-exploit tradeoff over multiple design rounds, and to balance objective
tradeoffs by optimizing sequences at many different points on the Pareto
frontier. We evaluate LaMBO on two small-molecule design tasks, and introduce
new tasks optimizing in silico and in vitro properties of
large-molecule fluorescent proteins. In our experiments LaMBO outperforms
genetic optimizers and does not require a large pretraining corpus,
demonstrating that BayesOpt is practical and effective for biological sequence
design.
Comment: ICML 2022. Code available at https://github.com/samuelstanton/lamb
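The core loop described above encodes sequences into an autoencoder's latent space, optimizes a multi-objective acquisition function there by gradient ascent, and decodes the result. The Python sketch below illustrates that pattern; the toy decoder and the simple differentiable surrogate standing in for the multi-task Gaussian process head are assumptions made for brevity, not the authors' implementation.

```python
# Sketch of LaMBO-style latent-space acquisition optimization. The GP head
# is replaced here by a simple differentiable surrogate for brevity.
import torch
import torch.nn as nn

latent_dim, vocab_size, seq_len = 16, 20, 10
decoder = nn.Linear(latent_dim, seq_len * vocab_size)  # latent -> token logits
surrogate = nn.Linear(latent_dim, 2)                   # predicts (mean, log_std)

def acquisition(z, beta=1.0):
    # Upper confidence bound: mean + beta * std, differentiable w.r.t. z.
    mean, log_std = surrogate(z).unbind(-1)
    return mean + beta * log_std.exp()

def optimize_in_latent_space(z0, steps=50, lr=0.05):
    z = z0.clone().requires_grad_(True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        loss = -acquisition(z).sum()  # ascend the acquisition value
        opt.zero_grad()
        loss.backward()
        opt.step()
    logits = decoder(z.detach()).view(-1, seq_len, vocab_size)
    return logits.argmax(-1)  # decoded candidate sequences

z0 = torch.randn(4, latent_dim)  # e.g., encodings of current best sequences
candidates = optimize_in_latent_space(z0)
print(candidates.shape)  # torch.Size([4, 10])
```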