1,648 research outputs found
Using metadynamics to explore complex free-energy landscapes
Metadynamics is an atomistic simulation technique that allows, within the same framework, acceleration of rare events and estimation of the free energy of complex molecular systems. It is based on iteratively ‘filling’ the potential energy of the system by a sum of Gaussians centred along the trajectory followed by a suitably chosen set of collective variables (CVs), thereby forcing the system to migrate from one minimum to the next. The power of metadynamics is demonstrated by the large number of extensions and variants that have been developed. The first scope of this Technical Review is to present a critical comparison of these variants, discussing their advantages and disadvantages. The effectiveness of metadynamics, and that of the numerous alternative methods, is strongly influenced by the choice of the CVs. If an important variable is neglected, the resulting estimate of the free energy is unreliable, and predicted transition mechanisms may be qualitatively wrong. The second scope of this Technical Review is to discuss how the CVs should be selected, how to verify whether the chosen CVs are sufficient or redundant, and how to iteratively improve the CVs using machine learning approaches.
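The ‘filling’ procedure described in this abstract can be illustrated with a minimal sketch. It is not taken from the review: it assumes a hypothetical one-dimensional double-well potential V(x) = (x² − 1)², overdamped Langevin dynamics, the position x itself as the only collective variable, and Gaussians of fixed height and width (plain, not well-tempered, metadynamics). All function names and parameter values are illustrative assumptions.

```python
import numpy as np

def potential_force(x):
    # Assumed toy potential: double well V(x) = (x^2 - 1)^2; force = -dV/dx.
    return -4.0 * x * (x**2 - 1.0)

def bias_energy(x, centers, height, width):
    # Bias potential: sum of Gaussians deposited at previously visited CV values.
    if len(centers) == 0:
        return 0.0
    c = np.asarray(centers)
    return float(np.sum(height * np.exp(-(x - c) ** 2 / (2.0 * width**2))))

def bias_force(x, centers, height, width):
    # Force from the bias, -dV_bias/dx; it pushes the system away from the centers.
    if len(centers) == 0:
        return 0.0
    c = np.asarray(centers)
    g = height * np.exp(-(x - c) ** 2 / (2.0 * width**2))
    return float(np.sum(g * (x - c) / width**2))

def run_metadynamics(n_steps=20000, dt=1e-3, kT=0.1,
                     height=0.05, width=0.2, stride=100, seed=0):
    # Overdamped Langevin dynamics (unit friction) with periodic Gaussian deposition.
    rng = np.random.default_rng(seed)
    x = -1.0                      # start in the left minimum
    centers = []                  # CV values where Gaussians have been deposited
    traj = np.empty(n_steps)
    for step in range(n_steps):
        force = potential_force(x) + bias_force(x, centers, height, width)
        x += force * dt + np.sqrt(2.0 * kT * dt) * rng.normal()
        if step % stride == 0:
            centers.append(x)     # 'fill' the landscape at the current CV value
        traj[step] = x
    return traj, centers

if __name__ == "__main__":
    traj, centers = run_metadynamics()
    grid = np.linspace(-1.5, 1.5, 7)
    # In plain metadynamics, the negative of the accumulated bias approximates
    # the free energy along the CV, up to an additive constant.
    estimate = [-bias_energy(g, centers, 0.05, 0.2) for g in grid]
    print(np.round(estimate, 3))
```

In this sketch the accumulated bias gradually raises the floor of the visited minimum, so the walker becomes increasingly likely to cross the barrier. The quality of the resulting free-energy estimate depends on the deposition height, width, and stride, which here are arbitrary choices.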
Machine learning in the analysis of biomolecular simulations
Machine learning has rapidly become a key method for the analysis and organization of large-scale data in all scientific disciplines. In life sciences, the use of machine learning techniques is a particularly appealing idea since the enormous capacity of computational infrastructures generates terabytes of data through millisecond simulations of atomistic and molecular-scale biomolecular systems. Due to this explosion of data, the automation, reproducibility, and objectivity provided by machine learning methods are highly desirable features in the analysis of complex systems. In this review, we focus on the use of machine learning in biomolecular simulations. We discuss the main categories of machine learning tasks, such as dimensionality reduction, clustering, regression, and classification used in the analysis of simulation data. We then introduce the most popular classes of techniques involved in these tasks for the purpose of enhanced sampling, coordinate discovery, and structure prediction. Whenever possible, we explain the scope and limitations of machine learning approaches, and we discuss examples of applications of these techniques.
Big-Data Science in Porous Materials: Materials Genomics and Machine Learning
By combining metal nodes with organic linkers we can potentially synthesize
millions of possible metal organic frameworks (MOFs). At present, we have
libraries of over ten thousand synthesized materials and millions of in-silico
predicted materials. The fact that we have so many materials opens many
exciting avenues to tailor-make a material that is optimal for a given
application. However, from an experimental and computational point of view we
simply have too many materials to screen using brute-force techniques. In this
review, we show that having so many materials allows us to use big-data methods
as a powerful technique to study these materials and to discover complex
correlations. The first part of the review gives an introduction to the
principles of big-data science. We emphasize the importance of data collection,
methods to augment small data sets, and how to select appropriate training sets. An
important part of this review is the different approaches that are used to
represent these materials in feature space. The review also includes a general
overview of the different ML techniques, but as most applications in porous
materials use supervised ML our review is focused on the different approaches
for supervised ML. In particular, we review the different methods to optimize
the ML process and how to quantify the performance of the different methods. In
the second part, we review how the different approaches of ML have been applied
to porous materials. In particular, we discuss applications in the field of gas
storage and separation, the stability of these materials, their electronic
properties, and their synthesis. The range of topics illustrates the large
variety of topics that can be studied with big-data science. Given the
increasing interest of the scientific community in ML, we expect this list to
rapidly expand in the coming years.
Integrating Machine Learning and Multiscale Modeling: Perspectives, Challenges, and Opportunities in the Biological, Biomedical, and Behavioral Sciences
Fueled by breakthrough technology developments, the biological, biomedical,
and behavioral sciences are now collecting more data than ever before. There is
a critical need for time- and cost-efficient strategies to analyze and
interpret these data to advance human health. The recent rise of machine
learning as a powerful technique to integrate multimodality, multifidelity
data, and reveal correlations between intertwined phenomena presents a special
opportunity in this regard. However, classical machine learning techniques
often ignore the fundamental laws of physics and result in ill-posed problems
or non-physical solutions. Multiscale modeling is a successful strategy to
integrate multiscale, multiphysics data and uncover mechanisms that explain the
emergence of function. However, multiscale modeling alone often fails to
efficiently combine large data sets from different sources and different levels
of resolution. We show how machine learning and multiscale modeling can
complement each other to create robust predictive models that integrate the
underlying physics to manage ill-posed problems and explore massive design
spaces. We critically review the current literature, highlight applications and
opportunities, address open questions, and discuss potential challenges and
limitations in four overarching topical areas: ordinary differential equations,
partial differential equations, data-driven approaches, and theory-driven
approaches. Towards these goals, we leverage expertise in applied mathematics,
computer science, computational biology, biophysics, biomechanics, engineering
mechanics, experimentation, and medicine. Our multidisciplinary perspective
suggests that integrating machine learning and multiscale modeling can provide
new insights into disease mechanisms, help identify new targets and treatment
strategies, and inform decision making for the benefit of human health.
ivis Dimensionality Reduction Framework for Biomacromolecular Simulations
Molecular dynamics (MD) simulations have been widely applied to study
macromolecules including proteins. However, the high dimensionality of the datasets
produced by simulations makes thorough analysis difficult and further
hinders a deeper understanding of biomacromolecules. To gain more insights into
the protein structure-function relations, appropriate dimensionality reduction
methods are needed to project simulations onto low-dimensional spaces. Linear
dimensionality reduction methods, such as principal component analysis (PCA)
and time-structure based independent component analysis (t-ICA), could not
preserve sufficient structural information. Though better than linear methods,
nonlinear methods, such as t-distributed stochastic neighbor embedding (t-SNE),
still suffer from limitations in filtering out system noise and preserving
inter-cluster relations. ivis is a novel deep learning-based dimensionality
reduction method originally developed for single-cell datasets. Here we applied
this framework to the study of the light, oxygen and voltage (LOV) domain of
the diatom Phaeodactylum tricornutum aureochrome 1a (PtAu1a). Compared with other
methods, ivis is shown to be superior in constructing a Markov state model (MSM),
preserving information on both local and global distances, and maintaining
similarity between the high-dimensional and low-dimensional spaces with the least
information loss. Moreover, the ivis framework can provide a new perspective for
deciphering residue-level protein allostery through the feature weights in the
neural network. Overall, ivis is a promising addition to the analysis toolbox for
proteins.
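As a point of reference for the linear baseline named in this abstract, the following is a minimal sketch of PCA projection of simulation frames onto a low-dimensional space. It does not implement ivis and is not taken from the paper. The input array is synthetic random data standing in for per-frame structural features, and the function name `pca_project` is a hypothetical helper.

```python
import numpy as np

def pca_project(X, n_components=2):
    # Centre the features, then use the SVD to obtain the principal axes.
    Xc = X - X.mean(axis=0)
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]           # principal axes (rows)
    projected = Xc @ components.T            # frame coordinates in the reduced space
    explained = S**2 / np.sum(S**2)          # fraction of variance per component
    return projected, explained[:n_components]

# Synthetic stand-in for MD features: 500 frames, 30 features (an assumption).
rng = np.random.default_rng(0)
frames = rng.normal(size=(500, 30))
frames[:, 0] *= 10.0                         # one dominant direction of motion

proj, ratio = pca_project(frames, n_components=2)
print(proj.shape, np.round(ratio, 3))
```

Because PCA is a linear map that ranks directions by variance alone, it can miss slow or nonlinear motions, which is the limitation the abstract attributes to linear methods.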
Simulation Intelligence: Towards a New Generation of Scientific Methods
The original "Seven Motifs" set forth a roadmap of essential methods for the
field of scientific computing, where a motif is an algorithmic method that
captures a pattern of computation and data movement. We present the "Nine
Motifs of Simulation Intelligence", a roadmap for the development and
integration of the essential algorithms necessary for a merger of scientific
computing, scientific simulation, and artificial intelligence. We call this
merger simulation intelligence (SI), for short. We argue the motifs of
simulation intelligence are interconnected and interdependent, much like the
components within the layers of an operating system. Using this metaphor, we
explore the nature of each layer of the simulation intelligence operating
system stack (SI-stack) and the motifs therein: (1) Multi-physics and
multi-scale modeling; (2) Surrogate modeling and emulation; (3)
Simulation-based inference; (4) Causal modeling and inference; (5) Agent-based
modeling; (6) Probabilistic programming; (7) Differentiable programming; (8)
Open-ended optimization; (9) Machine programming. We believe coordinated
efforts between motifs offer immense opportunity to accelerate scientific
discovery, from solving inverse problems in synthetic biology and climate
science, to directing nuclear energy experiments and predicting emergent
behavior in socioeconomic settings. We elaborate on each layer of the SI-stack,
detailing the state-of-the-art methods, presenting examples to highlight challenges
and opportunities, and advocating for specific ways to advance the motifs and
the synergies from their combinations. Advancing and integrating these
technologies can enable a robust and efficient hypothesis-simulation-analysis
type of scientific method, which we introduce with several use-cases for
human-machine teaming and automated science.