Fast Kinetic Monte Carlo Simulations: Implementation, Application, and Analysis.
This work presents a multi-component kinetic Monte Carlo (KMC) model and its
applications to three example systems: Ga droplet epitaxy, nanowires grown by
the Vapor-Liquid-Solid (VLS) method, and sintering of porous granular material.
The first two systems are examples of liquid mediated growth. We detail how the
liquid phase is modeled. A caching technique is proposed to eliminate redundant
calculations, leading to performance gains. Underlying the cache is a hash
table, indexed by neighborhood patterns of an atom configuration. We present
numerical evidence that such neighborhood patterns are redundant within and
between configurations, justifying the caching procedure. A simulated annealing
search for optimal, system-specific hash functions is performed.
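To make the caching idea concrete, here is a minimal Python sketch (not the thesis code) in which a hash table is keyed by the packed occupancy pattern of a site's neighborhood, so the rate calculation runs only on a cache miss. The toy lattice, bond energy, and Arrhenius-style rate formula are illustrative assumptions, and Python's built-in dict hashing stands in for the system-specific hash functions that the simulated annealing search would optimize.

```python
# Minimal sketch of a neighborhood-pattern cache for KMC rates.
# Lattice, energies, and the rate formula are illustrative assumptions.
import numpy as np

RNG = np.random.default_rng(0)
LATTICE = RNG.integers(0, 2, size=(64, 64))      # toy occupancy lattice
E_BOND, KT = 0.3, 0.025                          # hypothetical energies (eV)

def pattern_key(lattice, i, j, radius=1):
    """Pack the local occupancy pattern around (i, j) into a hashable key."""
    n = lattice.shape[0]
    window = [lattice[(i + di) % n, (j + dj) % n]
              for di in range(-radius, radius + 1)
              for dj in range(-radius, radius + 1)]
    key = 0
    for bit in window:                           # pack bits into one integer
        key = (key << 1) | int(bit)
    return key

def rate_from_pattern(key):
    """Stand-in for the expensive rate evaluation (Arrhenius-like form)."""
    n_neighbors = bin(key).count("1")
    return np.exp(-n_neighbors * E_BOND / KT)

rate_cache = {}                                  # hash table keyed by pattern

def cached_rate(lattice, i, j):
    key = pattern_key(lattice, i, j)
    if key not in rate_cache:                    # compute only on a cache miss
        rate_cache[key] = rate_from_pattern(key)
    return rate_cache[key]

# Repeated patterns within and between configurations hit the cache:
rates = [cached_rate(LATTICE, i, j) for i in range(64) for j in range(64)]
print(f"{len(rates)} sites evaluated, {len(rate_cache)} distinct patterns cached")
```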
Simulation results and analysis of droplet epitaxy are then described. We detail
the calibration of model parameters, showing good agreement with
homoepitaxial thin-film experiments. Droplet epitaxy simulations capture a
variety of nanostructures seen in experiments, ranging from compact dots to
nanorings. The correct trends in growth conditions are also captured, resulting
in a phase diagram consistent with what is seen experimentally. Core-shell
structures are also simulated. We present simulations to suggest the existence
of two mechanisms behind their formation: nucleation at the vapor-liquid
interface and an instability at the vapor-solid interface. An analytical model
is developed that isolates the relevant processes behind the phenomena seen
throughout the simulations and in experiments.
In the VLS nanowire simulations, we present how the catalytic role of the liquid
phase is incorporated into the model and perform an energy parameter study. We
exhibit the role of the catalyzed reaction rate and its contribution to growth,
leading to features such as tapering. The mobility along the liquid-solid
interface is also studied. We show how this affects nanowire growth direction
and kinking. In the sintering simulations, we present the KMC model in contrast
with previous simulation work. A similar parameter study is then performed,
examining the effect of model parameters on coarsening statistics. Grain
statistics are measured as a function of time and capture power-law behavior of
the grain radius. Critical behavior with respect to certain parameters is also
presented.
PhD, Applied and Interdisciplinary Mathematics, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/99949/1/kgre_1.pd
A Knowledge Gradient Policy for Sequencing Experiments to Identify the Structure of RNA Molecules Using a Sparse Additive Belief Model
We present a sparse knowledge gradient (SpKG) algorithm for adaptively
selecting the targeted regions within a large RNA molecule to identify which
regions are most amenable to interactions with other molecules. Experimentally,
such regions can be inferred from fluorescence measurements obtained by binding
a complementary probe with fluorescence markers to the targeted regions. We use
a biophysical model which shows that the fluorescence ratio, on a log scale,
has a sparse linear relationship with the coefficients describing the
accessibility of each nucleotide, since not all sites are accessible (due to
the folding of the molecule). The SpKG algorithm uniquely combines the Bayesian
ranking and selection problem with the frequentist regularized
regression approach Lasso. We use this algorithm to identify the sparsity
pattern of the linear model as well as to sequentially decide the best regions to
test before the experimental budget is exhausted. We also develop two other new
algorithms: a batch SpKG algorithm, which sequentially generates several
suggestions for running parallel experiments, and batch SpKG with a procedure we
call length mutagenesis, which dynamically adds new alternatives, in the form of
new probe types, created by inserting, deleting, or mutating nucleotides within
existing probes. In simulation, we demonstrate these
algorithms on the Group I intron (a mid-size RNA molecule), showing that they
efficiently learn the correct sparsity pattern, identify the most accessible
region, and outperform several other policies.
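For orientation, the simplified Python sketch below shows the loop structure of combining Lasso-identified sparsity with a Bayesian selection step. It is not the SpKG policy itself (a posterior-variance rule stands in for the knowledge-gradient calculation), and the problem dimensions, noise level, probe designs, and data-generating model are all assumptions.

```python
# Simplified sketch of the loop structure only: Lasso recovers the sparsity
# pattern, and a Bayesian linear model on that support scores candidate probes.
# A variance-based rule stands in for the knowledge-gradient calculation.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n_sites, noise, tau = 30, 0.1, 1.0
true_coef = np.zeros(n_sites)
true_coef[[3, 11, 17]] = [1.2, -0.8, 0.5]         # assumed sparse accessibility

def run_experiment(x):
    """Stand-in for a fluorescence measurement (assumed log-ratio model)."""
    return x @ true_coef + noise * rng.standard_normal()

candidates = rng.integers(0, 2, size=(200, n_sites)).astype(float)  # probe designs
X = [row for row in candidates[:5]]               # small initial batch
y = [run_experiment(x) for x in X]

for step in range(20):
    lasso = Lasso(alpha=0.05).fit(np.array(X), np.array(y))
    support = np.flatnonzero(lasso.coef_)         # current sparsity estimate
    if support.size == 0:
        support = np.arange(n_sites)
    Xs = np.array(X)[:, support]
    # Bayesian linear regression posterior on the identified support
    precision = Xs.T @ Xs / noise**2 + np.eye(support.size) / tau**2
    cov = np.linalg.inv(precision)
    # Select the candidate whose outcome is currently most uncertain
    var = np.einsum("ij,jk,ik->i", candidates[:, support], cov, candidates[:, support])
    x_next = candidates[int(np.argmax(var))]
    X.append(x_next)
    y.append(run_experiment(x_next))

final = Lasso(alpha=0.05).fit(np.array(X), np.array(y))
print("recovered support:", np.flatnonzero(final.coef_))
```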
A Rigorous Uncertainty-Aware Quantification Framework Is Essential for Reproducible and Replicable Machine Learning Workflows
The ability to replicate predictions by machine learning (ML) or artificial
intelligence (AI) models, and results in scientific workflows that incorporate
such ML/AI predictions, is driven by numerous factors. An uncertainty-aware
metric that can quantitatively assess the reproducibility of quantities of
interest (QoI) would contribute to the trustworthiness of results obtained from
scientific workflows involving ML/AI models. In this article, we discuss how
uncertainty quantification (UQ) in a Bayesian paradigm can provide a general
and rigorous framework for quantifying reproducibility for complex scientific
workflows. Such a framework has the potential to fill a critical gap that
currently exists in ML/AI for scientific workflows, as it will enable
researchers to determine the impact of ML/AI model prediction variability on
the predictive outcomes of ML/AI-powered workflows. We expect that the
envisioned framework will contribute to the design of more reproducible and
trustworthy workflows for diverse scientific applications, and ultimately,
accelerate scientific discoveries.
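A minimal illustration of the underlying idea, not the framework envisioned in the article, is sketched below: the variability of repeated model fits is propagated into a downstream quantity of interest and summarized with an interval. The toy polynomial "model", the bootstrap refits, and the chosen QoI are assumptions made for the example.

```python
# Toy sketch: propagate the variability of repeated model fits into a
# quantity of interest (QoI) and summarize it with an interval-style spread.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 40)
y = np.sin(2 * np.pi * x) + 0.2 * rng.standard_normal(x.size)

qoi_samples = []
for seed in range(200):
    boot = np.random.default_rng(seed).integers(0, x.size, x.size)  # bootstrap refit
    coeffs = np.polyfit(x[boot], y[boot], deg=5)                    # stand-in "ML model"
    qoi_samples.append(np.polyval(coeffs, 0.75))                    # QoI: prediction at x = 0.75

lo, hi = np.percentile(qoi_samples, [2.5, 97.5])
print(f"QoI 95% interval: [{lo:.3f}, {hi:.3f}]  (width {hi - lo:.3f})")
```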
Identifying Bayesian Optimal Experiments for Uncertain Biochemical Pathway Models
Pharmacodynamic (PD) models are mathematical models of cellular reaction
networks that include drug mechanisms of action. These models are useful for
studying predictive therapeutic outcomes of novel drug therapies in silico.
However, PD models are known to possess significant uncertainty with respect to
constituent parameter data, leading to uncertainty in the model predictions.
Furthermore, experimental data to calibrate these models is often limited or
unavailable for novel pathways. In this study, we present a Bayesian optimal
experimental design approach for improving PD model prediction accuracy. We
then apply our method using simulated experimental data to account for
uncertainty in hypothetical laboratory measurements. This leads to a
probabilistic prediction of drug performance and a quantitative measure of
which prospective laboratory experiment will optimally reduce prediction
uncertainty in the PD model. The methods proposed here provide a way forward
for uncertainty quantification and guided experimental design for models of
novel biological pathways.
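As a hedged sketch of what such a Bayesian optimal experimental design step can look like computationally, the snippet below scores candidate measurement times by a nested Monte Carlo estimate of expected information gain for a one-parameter exponential-decay model. The model, prior, and noise level are assumptions, not the pharmacodynamic model used in the study.

```python
# Hedged sketch of Bayesian optimal experimental design via a nested Monte
# Carlo estimate of expected information gain (EIG). All quantities are toys.
import numpy as np

rng = np.random.default_rng(0)
sigma = 0.05                                     # assumed measurement noise

def simulate(theta, t):
    """Toy pharmacodynamic response: exponential decay with rate theta."""
    return np.exp(-theta * t)

def log_lik(y, theta, t):
    return -0.5 * ((y - simulate(theta, t)) / sigma) ** 2 - np.log(sigma * np.sqrt(2 * np.pi))

def eig(t, n_outer=500, n_inner=500):
    """EIG(t) = E_y[ log p(y|theta,t) - log p(y|t) ] under the prior."""
    theta_outer = rng.gamma(2.0, 0.5, n_outer)   # assumed prior on the rate
    y = simulate(theta_outer, t) + sigma * rng.standard_normal(n_outer)
    theta_inner = rng.gamma(2.0, 0.5, n_inner)
    # log p(y|t) by marginalizing the likelihood over inner prior samples
    ll_inner = log_lik(y[:, None], theta_inner[None, :], t)
    log_evidence = np.logaddexp.reduce(ll_inner, axis=1) - np.log(n_inner)
    return np.mean(log_lik(y, theta_outer, t) - log_evidence)

designs = np.linspace(0.1, 5.0, 15)              # candidate measurement times
scores = [eig(t) for t in designs]
print("best measurement time:", designs[int(np.argmax(scores))])
```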
A Bayesian experimental autonomous researcher for mechanical design
While additive manufacturing (AM) has facilitated the production of complex structures, it has also highlighted the immense challenge inherent in identifying the optimum AM structure for a given application. Numerical methods are important tools for optimization, but experiment remains the gold standard for studying nonlinear, but critical, mechanical properties such as toughness. To address the vastness of AM design space and the need for experiment, we develop a Bayesian experimental autonomous researcher (BEAR) that combines Bayesian optimization and high-throughput automated experimentation. In addition to rapidly performing experiments, the BEAR leverages iterative experimentation by selecting experiments based on all available results. Using the BEAR, we explore the toughness of a parametric family of structures and observe an almost 60-fold reduction in the number of experiments needed to identify high-performing structures relative to a grid-based search. These results show the value of machine learning in experimental fields where data are sparse.
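A minimal sketch of the kind of closed loop described above is given below: a Gaussian-process surrogate and an expected-improvement rule pick the next design to test. The one-dimensional design parameter, the stand-in "toughness" measurement, and the specific acquisition function are assumptions for illustration, not the BEAR's actual hardware or policy.

```python
# Hedged sketch of a Bayesian-optimization experiment loop: GP surrogate plus
# expected improvement. The design space and "measurement" are stand-ins.
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)

def measure_toughness(x):
    """Stand-in for an automated toughness test of a printed structure."""
    return float(np.sin(3 * x) * (1 - x) + 0.02 * rng.standard_normal())

candidates = np.linspace(0, 1, 200).reshape(-1, 1)   # parametric design family
X = list(candidates[rng.integers(0, 200, 3)])        # small initial batch
y = [measure_toughness(x[0]) for x in X]

for _ in range(15):
    gp = GaussianProcessRegressor(kernel=RBF(0.1), alpha=1e-3).fit(np.array(X), y)
    mu, std = gp.predict(candidates, return_std=True)
    best = max(y)
    z = (mu - best) / np.maximum(std, 1e-9)
    ei = (mu - best) * norm.cdf(z) + std * norm.pdf(z)  # expected improvement
    x_next = candidates[int(np.argmax(ei))]
    X.append(x_next)
    y.append(measure_toughness(x_next[0]))

print(f"best toughness found: {max(y):.3f} at x = {X[int(np.argmax(y))][0]:.3f}")
```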
Mathematical nuances of Gaussian process-driven autonomous experimentation
The fields of machine learning (ML) and artificial intelligence (AI) have transformed almost every aspect of science and engineering. The excitement for AI/ML methods is in large part due to their perceived novelty, as compared to traditional methods of statistics, computation, and applied mathematics. But clearly, all methods in ML have their foundations in mathematical theories, such as function approximation, uncertainty quantification, and function optimization. Autonomous experimentation is no exception; it is often formulated as a chain of off-the-shelf tools, organized in a closed loop, without emphasis on the intricacies of each algorithm involved. The uncomfortable truth is that the success of any ML endeavor, and this includes autonomous experimentation, strongly depends on the sophistication of the underlying mathematical methods and software that have to allow for enough flexibility to consider functions that are in agreement with particular physical theories. We have observed that standard off-the-shelf tools, used by many in the applied ML community, often hide the underlying complexities and therefore perform poorly. In this paper, we want to give a perspective on the intricate connections between mathematics and ML, with a focus on Gaussian process-driven autonomous experimentation. Although the Gaussian process is a powerful mathematical concept, it has to be implemented and customized correctly for optimal performance. We present several simple toy problems to explore these nuances and highlight the importance of mathematical and statistical rigor in autonomous experimentation and ML. One key takeaway is that ML is not, as many had hoped, a set of agnostic plug-and-play solvers for everyday scientific problems, but instead needs expertise and mastery to be applied successfully.
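To illustrate the point about kernel customization, the toy snippet below fits the same data with an off-the-shelf RBF kernel and with a periodic kernel matched to the structure of the data, then compares the resulting model evidence. The data and both kernels are assumptions chosen only for this example, not the paper's benchmarks.

```python
# Toy illustration: the same GP machinery, different kernels, different
# model evidence. Data and kernels are assumptions made for this example.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ExpSineSquared

rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 6, 25)).reshape(-1, 1)
y = np.sin(2 * np.pi * X[:, 0]) + 0.1 * rng.standard_normal(25)   # periodic signal

for name, kernel in [("off-the-shelf RBF", RBF(1.0)),
                     ("periodic kernel", ExpSineSquared(length_scale=1.0, periodicity=1.0))]:
    gp = GaussianProcessRegressor(kernel=kernel, alpha=0.01).fit(X, y)
    print(f"{name:20s} log marginal likelihood = {gp.log_marginal_likelihood_value_:.2f}")
```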
Exact Gaussian processes for massive datasets via non-stationary sparsity-discovering kernels
A Gaussian Process (GP) is a prominent mathematical framework for stochastic function approximation in science and engineering applications. Its success is largely attributed to the GP's analytical tractability, robustness, and natural inclusion of uncertainty quantification. Unfortunately, the use of exact GPs is prohibitively expensive for large datasets due to their unfavorable numerical complexity of O(N^3) in computation and O(N^2) in storage. All existing methods addressing this issue utilize some form of approximation, usually considering subsets of the full dataset or finding representative pseudo-points that render the covariance matrix well-structured and sparse. These approximate methods can lead to inaccuracies in function approximations and often limit the user's flexibility in designing expressive kernels. Instead of inducing sparsity via data-point geometry and structure, we propose to take advantage of naturally occurring sparsity by allowing the kernel to discover, rather than induce, sparse structure. The premise of this paper is that the data sets and physical processes modeled by GPs often exhibit natural or implicit sparsities, but commonly used kernels do not allow us to exploit such sparsity. The core concept of exact, and at the same time sparse, GPs relies on kernel definitions that provide enough flexibility to learn and encode not only non-zero but also zero covariances. This principle of ultra-flexible, compactly supported, and non-stationary kernels, combined with HPC and constrained optimization, lets us scale exact GPs well beyond 5 million data points.
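A minimal sketch of the core principle is shown below, assuming a simple Wendland-type compactly supported kernel rather than the paper's learned non-stationary kernels: because the kernel is exactly zero beyond a finite support radius, the covariance matrix can be assembled and stored as a sparse matrix rather than a dense one.

```python
# Minimal sketch: a compactly-supported kernel encodes exact zero covariances,
# so the covariance matrix of a large dataset can be built and stored sparsely.
# The Wendland-type kernel and synthetic 1-D inputs are illustrative only.
import numpy as np
from scipy import sparse

rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0, 100, 5000))           # 1-D inputs for simplicity

def compact_kernel(r, support=2.0):
    """Wendland-style kernel: exactly zero beyond the support radius."""
    u = np.clip(r / support, 0.0, 1.0)
    return (1 - u) ** 4 * (4 * u + 1)

# Build only the non-zero entries instead of the dense 5000 x 5000 matrix.
rows, cols, vals = [], [], []
for i, xi in enumerate(x):
    j = np.searchsorted(x, [xi - 2.0, xi + 2.0])  # neighbors within the support
    idx = np.arange(j[0], j[1])
    rows.extend([i] * idx.size)
    cols.extend(idx)
    vals.extend(compact_kernel(np.abs(x[idx] - xi)))
K = sparse.csr_matrix((vals, (rows, cols)), shape=(x.size, x.size))
print(f"non-zero fraction of the covariance matrix: {K.nnz / x.size**2:.4%}")
```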
Optimal Learning in Experimental Design Using the Knowledge Gradient Policy with Application to Characterizing Nanoemulsion Stability
We present a technique for adaptively choosing a sequence of experiments for materials design and optimization. Specifically, we consider the problem of identifying the choice of experimental control variables that optimize the kinetic stability of a nanoemulsion, which we formulate as a ranking and selection problem. We introduce an optimization algorithm called the knowledge gradient with discrete priors (KGDP) that sequentially and adaptively selects experiments and that maximizes the rate of learning the optimal control variables. This is done through a combination of a physical, kinetic model of nanoemulsion stability, Bayesian inference, and a decision policy. Prior knowledge from domain experts is incorporated into the algorithm as well. Through numerical experiments, we show that the KGDP algorithm outperforms the policies of both random exploration (in which an experiment is selected uniformly at random among all potential experiments) and exploitation (which selects the experiment that appears to be the best, given the current state of Bayesian knowledge).
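For intuition, the hedged sketch below implements a knowledge-gradient-style policy over a discrete set of candidate models ("discrete priors"): each prospective experiment is scored by the expected improvement in the value of the apparent best design after a Bayesian update. The candidate stability curves, noise level, and design grid are toy assumptions, not the nanoemulsion model or expert priors from the paper.

```python
# Hedged sketch of a knowledge-gradient policy over discrete candidate models.
# Candidate curves, noise, and designs are toy assumptions for illustration.
import numpy as np

rng = np.random.default_rng(0)
designs = np.linspace(0, 1, 25)                   # candidate control variables
thetas = [(a, b) for a in (2.0, 4.0, 6.0) for b in (0.3, 0.5, 0.7)]
models = np.array([np.exp(-a * (designs - b) ** 2) for a, b in thetas])  # candidate curves
p = np.full(len(thetas), 1 / len(thetas))         # discrete prior over models
sigma = 0.05
truth = models[4]                                 # ground truth, unknown to the policy

def posterior(p, x_idx, y):
    """Bayes update of the discrete prior after observing y at designs[x_idx]."""
    lik = np.exp(-0.5 * ((y - models[:, x_idx]) / sigma) ** 2)
    post = p * lik
    return post / post.sum()

def knowledge_gradient(p, x_idx, n_mc=300):
    """KG(x) = E_y[ max_d mu_new(d) ] - max_d mu_now(d), estimated by Monte Carlo."""
    mu_now = p @ models
    k = rng.choice(len(thetas), size=n_mc, p=p)   # sample which model is true
    y = models[k, x_idx] + sigma * rng.standard_normal(n_mc)
    new_best = [np.max(posterior(p, x_idx, yi) @ models) for yi in y]
    return np.mean(new_best) - np.max(mu_now)

for step in range(10):
    kg = [knowledge_gradient(p, i) for i in range(designs.size)]
    x_idx = int(np.argmax(kg))                    # run the most informative experiment
    y_obs = truth[x_idx] + sigma * rng.standard_normal()
    p = posterior(p, x_idx, y_obs)

print("estimated best design:", designs[int(np.argmax(p @ models))])
```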