80 research outputs found
Data-driven discovery of coordinates and governing equations
The discovery of governing equations from scientific data has the potential
to transform data-rich fields that lack well-characterized quantitative
descriptions. Advances in sparse regression are currently enabling the
tractable identification of both the structure and parameters of a nonlinear
dynamical system from data. The resulting models have the fewest terms
necessary to describe the dynamics, balancing model complexity with descriptive
ability, and thus promoting interpretability and generalizability. This
provides an algorithmic approach to Occam's razor for model discovery. However,
this approach fundamentally relies on an effective coordinate system in which
the dynamics have a simple representation. In this work, we design a custom
autoencoder to discover a coordinate transformation into a reduced space where
the dynamics may be sparsely represented. Thus, we simultaneously learn the
governing equations and the associated coordinate system. We demonstrate this
approach on several example high-dimensional dynamical systems with
low-dimensional behavior. The resulting modeling framework combines the
strengths of deep neural networks for flexible representation and sparse
identification of nonlinear dynamics (SINDy) for parsimonious models. It is the
first method of its kind to place the discovery of coordinates and models on an
equal footing.
Comment: 25 pages, 6 figures; added acknowledgment
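The sparse-regression step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the coordinates are already known, so it does not include the autoencoder, and it uses sequentially thresholded least squares on a hypothetical toy system. The function name `stlsq`, the candidate library, and the threshold are illustrative choices, not taken from the paper.

```python
import numpy as np

def stlsq(theta, dxdt, threshold=0.1, n_iter=10):
    """Sequentially thresholded least squares.

    Finds a sparse coefficient matrix xi such that theta @ xi ~ dxdt.
    """
    xi = np.linalg.lstsq(theta, dxdt, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(xi) < threshold      # terms to prune
        xi[small] = 0.0
        for k in range(dxdt.shape[1]):      # refit each state separately
            big = ~small[:, k]
            if big.any():
                xi[big, k] = np.linalg.lstsq(
                    theta[:, big], dxdt[:, k], rcond=None
                )[0]
    return xi

# Hypothetical toy system (an assumption for illustration):
#   x' = -0.1 x + 2 y,   y' = -2 x - 0.1 y
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 2))
A = np.array([[-0.1, 2.0], [-2.0, -0.1]])
dX = X @ A.T                                # exact, noise-free derivatives

# Candidate library: [1, x, y, x^2, xy, y^2]
x, y = X[:, 0], X[:, 1]
theta = np.column_stack([np.ones_like(x), x, y, x * x, x * y, y * y])

xi = stlsq(theta, dX, threshold=0.05)
print(xi)  # rows follow the library order, columns are x' and y'
```

On this noise-free data only the linear terms should survive the thresholding. With measured data the derivatives would have to be estimated and the threshold tuned.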
Machine Learning for Fluid Mechanics
The field of fluid mechanics is rapidly advancing, driven by unprecedented
volumes of data from field measurements, experiments and large-scale
simulations at multiple spatiotemporal scales. Machine learning offers a wealth
of techniques to extract information from data that could be translated into
knowledge about the underlying fluid mechanics. Moreover, machine learning
algorithms can augment domain knowledge and automate tasks related to flow
control and optimization. This article presents an overview of past history,
current developments, and emerging opportunities of machine learning for fluid
mechanics. It outlines fundamental machine learning methodologies and discusses
their uses for understanding, modeling, optimizing, and controlling fluid
flows. The strengths and limitations of these methods are addressed from the
perspective of scientific inquiry that considers data as an inherent part of
modeling, experimentation, and simulation. Machine learning provides a powerful
information processing framework that can enrich, and possibly even transform,
current lines of fluid mechanics research and industrial applications.
Comment: To appear in the Annual Reviews of Fluid Mechanics, 202
Permutationally Invariant Networks for Enhanced Sampling (PINES): Discovery of Multi-Molecular and Solvent-Inclusive Collective Variables
The typically rugged nature of molecular free energy landscapes can frustrate
efficient sampling of the thermodynamically relevant phase space due to the
presence of high free energy barriers. Enhanced sampling techniques can improve
phase space exploration by accelerating sampling along particular collective
variables (CVs). A number of techniques exist for data-driven discovery of CVs
parameterizing the important large scale motions of the system. A challenge to
CV discovery is learning CVs invariant to symmetries of the molecular system,
frequently rigid translation, rigid rotation, and permutational relabeling of
identical particles. Of these, permutational invariance has proved a
persistent challenge, frustrating the data-driven discovery of
multi-molecular CVs in systems of self-assembling particles and
solvent-inclusive CVs for solvated systems. In this work, we integrate
Permutation Invariant Vector (PIV) featurizations with autoencoding neural
networks to learn nonlinear CVs invariant to translation, rotation, and
permutation, and perform interleaved rounds of CV discovery and enhanced
sampling to iteratively expand sampling of configurational phase space and
obtain converged CVs and free energy landscapes. We demonstrate the
Permutationally Invariant Network for Enhanced Sampling (PINES) approach in
applications to the self-assembly of a 13-atom Argon cluster,
association/dissociation of a NaCl ion pair in water, and hydrophobic collapse
of a C45H92 n-pentatetracontane polymer chain. We make the approach freely
available as a new module within the PLUMED2 enhanced sampling libraries.
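The invariance requirement described in the abstract can be sketched with a simplified featurization: the sorted vector of pairwise distances for a single particle species. This is an assumption-laden illustration rather than the PIV or PINES implementation. It uses raw distances, applies no switching function, and does not group distances by atom type. The function name `sorted_distance_vector` is hypothetical.

```python
import numpy as np

def sorted_distance_vector(positions):
    """Sorted pairwise distances of an (N, 3) array of coordinates.

    Distances remove the dependence on rigid translation and rotation;
    sorting removes the dependence on particle labeling.
    """
    diffs = positions[:, None, :] - positions[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    upper = np.triu_indices(len(positions), k=1)   # each pair once
    return np.sort(dists[upper])

rng = np.random.default_rng(1)
pos = rng.normal(size=(5, 3))

# Relabel the particles, apply a random orthogonal transform
# (rotation or reflection), and translate the whole configuration.
perm = rng.permutation(5)
q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
moved = pos[perm] @ q.T + np.array([1.0, -2.0, 0.5])

v0 = sorted_distance_vector(pos)
v1 = sorted_distance_vector(moved)
print(np.allclose(v0, v1))
```

A feature vector of this kind could then serve as the input to an autoencoder, so that any learned CV inherits the same invariances.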
Discovering Causal Relations and Equations from Data
Physics is a field of science that has traditionally used the scientific
method to answer questions about why natural phenomena occur and to make
testable models that explain the phenomena. Discovering equations, laws and
principles that are invariant, robust and causal explanations of the world has
been fundamental in physical sciences throughout the centuries. Discoveries
emerge from observing the world and, when possible, performing interventional
studies in the system under study. With the advent of big data and the use of
data-driven methods, causal and equation discovery fields have grown and made
progress in computer science, physics, statistics, philosophy, and many applied
fields. All these domains are intertwined and can be used to discover causal
relations, physical laws, and equations from observational data. This paper
reviews the concepts, methods, and relevant works on causal and equation
discovery in the broad field of Physics and outlines the most important
challenges and promising future lines of research. We also provide a taxonomy
for observational causal and equation discovery, point out connections, and
showcase a complete set of case studies in Earth and climate sciences, fluid
dynamics and mechanics, and the neurosciences. This review demonstrates that
discovering fundamental laws and causal relations by observing natural
phenomena is being revolutionised with the efficient exploitation of
observational data, modern machine learning algorithms and the interaction with
domain knowledge. Exciting times are ahead with many challenges and
opportunities to improve our understanding of complex systems.
Comment: 137 page
PySAGES: flexible, advanced sampling methods accelerated with GPUs
Molecular simulations are an important tool for research in physics,
chemistry, and biology. The capabilities of simulations can be greatly expanded
by providing access to advanced sampling methods and techniques that permit
calculation of the relevant underlying free energy landscapes. In this sense,
software that can be seamlessly adapted to a broad range of complex systems is
essential. Building on past efforts to provide open-source community supported
software for advanced sampling, we introduce PySAGES, a Python implementation
of the Software Suite for Advanced General Ensemble Simulations (SSAGES) that
provides full GPU support for massively parallel applications of enhanced
sampling methods such as adaptive biasing forces, harmonic bias, or forward
flux sampling in the context of molecular dynamics simulations. By providing an
intuitive interface that facilitates the management of a system's
configuration, the inclusion of new collective variables, and the
implementation of sophisticated free energy-based sampling methods, the PySAGES
library serves as a general platform for the development and implementation of
emerging simulation techniques. The capabilities, core features, and
computational performance of this new tool are demonstrated with clear and
concise examples pertaining to different classes of molecular systems. We
anticipate that PySAGES will provide the scientific community with a robust and
easily accessible platform to accelerate simulations, improve sampling, and
enable facile estimation of free energies for a wide range of materials and
processes.
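Of the methods the abstract lists, the harmonic bias is the simplest to write down: a restraint U = (k/2)(xi - xi0)^2 on a collective variable xi. The sketch below uses plain NumPy and an interatomic distance as the CV. It does not use the PySAGES API, and the function name `harmonic_bias` and its parameters are hypothetical.

```python
import numpy as np

def harmonic_bias(positions, i, j, center, k_spring):
    """Harmonic restraint on the distance between atoms i and j.

    Returns the bias energy and the per-atom bias forces.
    """
    r_vec = positions[i] - positions[j]
    xi = np.linalg.norm(r_vec)               # collective variable: distance
    energy = 0.5 * k_spring * (xi - center) ** 2
    grad_xi_i = r_vec / xi                    # d(xi)/d(position of atom i)
    forces = np.zeros_like(positions)
    forces[i] = -k_spring * (xi - center) * grad_xi_i
    forces[j] = -forces[i]                    # equal and opposite
    return energy, forces

# Two atoms 3 units apart, restrained toward a distance of 2.
pos = np.array([[0.0, 0.0, 0.0], [3.0, 0.0, 0.0]])
energy, forces = harmonic_bias(pos, 0, 1, center=2.0, k_spring=10.0)
print(energy, forces)
```

Because the distance exceeds the restraint center here, the bias forces pull the two atoms toward each other.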