807 research outputs found
Learning and Control of Dynamical Systems
Despite the remarkable success of machine learning in various domains in recent years, our understanding of its fundamental limitations remains incomplete. This knowledge gap poses a grand challenge when deploying machine learning methods in critical decision-making tasks, where incorrect decisions can have catastrophic consequences. To effectively utilize these learning-based methods in such contexts, it is crucial to explicitly characterize their performance. Over the years, significant research efforts have been dedicated to learning and control of dynamical systems where the underlying dynamics are unknown or only partially known a priori, and must be inferred from collected data. However, many of these classical results have focused on asymptotic guarantees, providing limited insight into the amount of data required to achieve desired control performance while satisfying operational constraints such as safety and stability, especially in the presence of statistical noise.
In this thesis, we study the statistical complexity of learning and control of unknown dynamical systems. By utilizing recent advances in statistical learning theory, high-dimensional statistics, and control theoretic tools, we aim to establish a fundamental understanding of the number of samples required to achieve desired (i) accuracy in learning the unknown dynamics, (ii) performance in the control of the underlying system, and (iii) satisfaction of the operational constraints such as safety and stability. We provide finite-sample guarantees for these objectives and propose efficient learning and control algorithms that achieve the desired performance at these statistical limits in various dynamical systems. Our investigation covers a broad range of dynamical systems, starting from fully observable linear dynamical systems to partially observable linear dynamical systems, and ultimately, nonlinear systems.
We deploy our learning and control algorithms in various adaptive control tasks in real-world control systems and demonstrate their strong empirical performance along with their learning, robustness, and stability guarantees. In particular, we implement one of our proposed methods, Fourier Adaptive Learning and Control (FALCON), on an experimental aerodynamic testbed under extreme turbulent flow dynamics in a wind tunnel. The results show that FALCON achieves state-of-the-art stabilization performance and consistently outperforms conventional and other learning-based methods by at least 37%, despite using 8 times less data. The superior performance of FALCON arises from its physically and theoretically accurate modeling of the underlying nonlinear turbulent dynamics, which yields rigorous finite-sample learning and performance guarantees. These findings underscore the importance of characterizing the statistical complexity of learning and control of unknown dynamical systems.
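As a rough illustration of what a finite-sample view of learning unknown dynamics means, the sketch below estimates the transition matrix of a fully observable linear system from a single noisy trajectory by ordinary least squares and prints the estimation error for several trajectory lengths. It is not the thesis's algorithm and not FALCON. The system matrix, noise level, and the function names `simulate` and `least_squares_estimate` are assumptions chosen for illustration.

```python
import numpy as np


def simulate(A, T, noise_std, rng):
    """Roll out x_{t+1} = A x_t + w_t from x_0 = 0 with Gaussian noise w_t."""
    n = A.shape[0]
    X = np.zeros((T + 1, n))
    for t in range(T):
        X[t + 1] = A @ X[t] + noise_std * rng.standard_normal(n)
    return X


def least_squares_estimate(X):
    """Estimate A by regressing each next state on the current state."""
    past, future = X[:-1], X[1:]
    # Solves min_B ||past @ B - future||_F, where B = A^T.
    B, *_ = np.linalg.lstsq(past, future, rcond=None)
    return B.T


# Hypothetical stable 2x2 system, used only for this illustration.
A_true = np.array([[0.9, 0.2],
                   [0.0, 0.7]])
rng = np.random.default_rng(0)

for T in (50, 500, 5000):
    X = simulate(A_true, T, noise_std=1.0, rng=rng)
    error = np.linalg.norm(least_squares_estimate(X) - A_true, ord=2)
    print(f"T = {T:5d}   spectral-norm error = {error:.4f}")
```

The error typically shrinks as the trajectory grows. Finite-sample guarantees of the kind the abstract describes bound how fast this happens.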
Collective variables between large-scale states in turbulent convection
The dynamics in a confined turbulent convection flow is dominated by multiple
long-lived macroscopic circulation states, which are visited subsequently by
the system in a Markov-type hopping process. In the present work, we analyze
the short transition paths between these subsequent macroscopic system states
by a data-driven learning algorithm that extracts the low-dimensional
transition manifold and the related new coordinates, which we term collective
variables, in the state space of the complex turbulent flow. We therefore
transfer and extend concepts for conformation transitions in stochastic
microscopic systems, such as in the dynamics of macromolecules, to a
deterministic macroscopic flow. Our analysis is based on long-term direct
numerical simulation trajectories of turbulent convection in a closed cubic
cell at a Prandtl number and Rayleigh numbers and
for a time lag of convective free-fall time units. The simulations
resolve vortices and plumes of all physically relevant scales resulting in a
state space spanned by more than 3.5 million degrees of freedom. The transition
dynamics between the large-scale circulation states can be captured by the
transition manifold analysis with only two collective variables which implies a
reduction of the data dimension by a factor of more than a million. Our method
demonstrates that cessations and subsequent reversals of the large-scale flow
are unlikely in the present setup and thus paves the way to the development of
efficient reduced-order models of the macroscopic complex nonlinear dynamical
system.
Comment: 24 pages, 12 figures, 1 table
Limit Theory under Network Dependence and Nonstationarity
These lecture notes represent supplementary material for a short course on
time series econometrics and network econometrics. We emphasize limit
theory for time series regression models as well as the use of the
local-to-unity parametrization when modeling time series nonstationarity.
Moreover, we present various non-asymptotic theory results for moderate
deviation principles when considering the eigenvalues of covariance matrices as
well as asymptotics for unit root moderate deviations in nonstationary
autoregressive processes. Although not all applications from the literature are
covered, we also discuss some open problems in the time series and network
econometrics literature.
Comment: arXiv admin note: text overlap with arXiv:1705.08413 by other authors
Dissecting regional heterogeneity and modeling transcriptional cascades in brain organoids
Over the past decade, there has been a rapid expansion in the development and utilization of brain organoid models, enabling three-dimensional in vivo-like views of fundamental neurodevelopmental features of corticogenesis in health and disease. Nonetheless, the methods used for generating cortical organoid fates exhibit widespread heterogeneity across different cell lines. Here, we show that a combination of dual SMAD and WNT inhibition (Triple-i protocol) establishes a robust cortical identity in brain organoids, while other widely used derivation protocols are inconsistent with respect to regional specification. To measure this heterogeneity, we employ single-cell RNA-sequencing (scRNA-Seq), which samples the gene expression profiles of thousands of cells in an individual sample. However, to draw meaningful conclusions from scRNA-Seq data, technical artifacts must be identified and removed. In this thesis, we present a method to detect one such artifact: empty droplets that do not contain a cell and consist mainly of free-floating mRNA in the sample. Furthermore, from their expression profiles, cells can be ordered along a developmental trajectory that recapitulates the progression of cells as they differentiate. Based on this ordering, we model gene expression using a Bayesian inference approach in order to measure transcriptional dynamics within differentiating cells. This enables the ordering of genes along transcriptional cascades, statistical testing for differences in gene expression changes, and measuring potential regulatory gene interactions. We apply this approach to the differentiation of cortical neural stem cells into cortical neurons via an intermediate progenitor cell type in brain organoids to provide a detailed characterization of the endogenous molecular processes underlying neurogenesis.
Markov field models of molecular kinetics
Computer simulations such as molecular dynamics (MD) provide a possible means to understand protein dynamics and mechanisms on an atomistic scale. The resulting simulation data can be analyzed with Markov state models (MSMs), yielding a quantitative kinetic model that, e.g., encodes state populations and transition rates. However, the larger an investigated system, the more data is required to estimate a valid kinetic model. In this work, we show that this scaling problem can be escaped when decomposing a system into smaller ones, leveraging weak couplings between local domains. Our approach, termed independent Markov decomposition (IMD), is a first-order approximation neglecting couplings, i.e., it represents a decomposition of the underlying global dynamics into a set of independent local ones. We demonstrate that for truly independent systems, IMD can reduce the sampling by three orders of magnitude. IMD is applied to two biomolecular systems. First, synaptotagmin-1 is analyzed, a rapid calcium switch from the neurotransmitter release machinery. Within its C2A domain, local conformational switches are identified and modeled with independent MSMs, shedding light on the mechanism of its calcium-mediated activation. Second, the catalytic site of the serine protease TMPRSS2 is analyzed with a local drug-binding model. Equilibrium populations of different drug-binding modes are derived for three inhibitors, mirroring experimentally determined drug efficiencies. IMD is subsequently extended to an end-to-end deep learning framework called iVAMPnets, which learns a domain decomposition from simulation data and simultaneously models the kinetics in the local domains. We finally classify IMD and iVAMPnets as Markov field models (MFM), which we define as a class of models that describe dynamics by decomposing systems into local domains. 
Overall, this thesis introduces a local approach to Markov modeling that enables quantitative assessment of the kinetics of large macromolecular complexes, opening up possibilities to tackle current and future questions in computational molecular biology.
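The independence idea behind such a decomposition can be illustrated with a small sketch. For two subsystems that evolve independently, the transition matrix of the joint system is the Kronecker product of the two local transition matrices, so estimating two small models can stand in for one large one. The code below is not the thesis's IMD or iVAMPnets implementation. The two-state chains, the trajectory length, and the helper names `sample_chain` and `estimate_msm` are assumptions for illustration.

```python
import numpy as np


def sample_chain(T_matrix, n_steps, rng):
    """Sample a discrete trajectory from a row-stochastic transition matrix."""
    state = 0
    traj = np.empty(n_steps, dtype=int)
    for t in range(n_steps):
        traj[t] = state
        state = rng.choice(len(T_matrix), p=T_matrix[state])
    return traj


def estimate_msm(dtraj, n_states, lag=1):
    """Count transitions at the given lag and row-normalize the counts."""
    counts = np.zeros((n_states, n_states))
    for i, j in zip(dtraj[:-lag], dtraj[lag:]):
        counts[i, j] += 1
    return counts / counts.sum(axis=1, keepdims=True)


rng = np.random.default_rng(0)
# Two hypothetical, truly independent two-state subsystems.
T_a = np.array([[0.9, 0.1], [0.2, 0.8]])
T_b = np.array([[0.7, 0.3], [0.4, 0.6]])
traj_a = sample_chain(T_a, 20000, rng)
traj_b = sample_chain(T_b, 20000, rng)

# Local route: two 2x2 models combined with a Kronecker product.
global_from_local = np.kron(estimate_msm(traj_a, 2), estimate_msm(traj_b, 2))

# Global route: one 4x4 model on the joint state index 2*a + b.
global_direct = estimate_msm(2 * traj_a + traj_b, 4)

print("max abs difference:", np.abs(global_from_local - global_direct).max())
```

The joint state space grows multiplicatively with the number of subsystems while the local models grow only additively, which is the intuition behind the sampling savings the abstract reports for independent systems.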
Discovering Causal Relations and Equations from Data
Physics is a field of science that has traditionally used the scientific
method to answer questions about why natural phenomena occur and to make
testable models that explain the phenomena. Discovering equations, laws and
principles that are invariant, robust and causal explanations of the world has
been fundamental in physical sciences throughout the centuries. Discoveries
emerge from observing the world and, when possible, performing interventional
studies in the system under study. With the advent of big data and the use of
data-driven methods, causal and equation discovery fields have grown and made
progress in computer science, physics, statistics, philosophy, and many applied
fields. All these domains are intertwined and can be used to discover causal
relations, physical laws, and equations from observational data. This paper
reviews the concepts, methods, and relevant works on causal and equation
discovery in the broad field of Physics and outlines the most important
challenges and promising future lines of research. We also provide a taxonomy
for observational causal and equation discovery, point out connections, and
showcase a complete set of case studies in Earth and climate sciences, fluid
dynamics and mechanics, and the neurosciences. This review demonstrates that
discovering fundamental laws and causal relations by observing natural
phenomena is being revolutionised with the efficient exploitation of
observational data, modern machine learning algorithms and the interaction with
domain knowledge. Exciting times are ahead with many challenges and
opportunities to improve our understanding of complex systems.
Comment: 137 pages
LieDetect: Detection of representation orbits of compact Lie groups from point clouds
We suggest a new algorithm to estimate representations of compact Lie groups
from finite samples of their orbits. Different from other reported techniques,
our method allows the retrieval of the precise representation type as a direct
sum of irreducible representations. Moreover, the knowledge of the
representation type permits the reconstruction of its orbit, which is useful to
identify the Lie group that generates the action. Our algorithm is general for
any compact Lie group, but only instantiations for SO(2), T^d, SU(2) and SO(3)
are considered. Theoretical guarantees of robustness in terms of Hausdorff and
Wasserstein distances are derived. Our tools are drawn from geometric measure
theory, computational geometry, and optimization on matrix manifolds. The
algorithm is tested for synthetic data up to dimension 16, as well as real-life
applications in image analysis, harmonic analysis, and classical mechanics
systems, achieving very accurate results.
Comment: 84 pages, 16 figures
Euler Characteristic Tools For Topological Data Analysis
In this article, we study Euler characteristic techniques in topological data
analysis. Pointwise computing the Euler characteristic of a family of
simplicial complexes built from data gives rise to the so-called Euler
characteristic profile. We show that this simple descriptor achieves
state-of-the-art performance in supervised tasks at a very low computational
cost. Inspired by signal analysis, we compute hybrid transforms of Euler
characteristic profiles. These integral transforms mix Euler characteristic
techniques with Lebesgue integration to provide highly efficient compressors of
topological signals. As a consequence, they show remarkable performance in
unsupervised settings. On the qualitative side, we provide numerous heuristics
on the topological and geometric information captured by Euler profiles and
their hybrid transforms. Finally, we prove stability results for these
descriptors as well as asymptotic guarantees in random settings.
Comment: 39 pages
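A minimal sketch of an Euler characteristic profile, assuming a Vietoris-Rips filtration in which a simplex enters at the largest pairwise distance among its vertices. The article's own construction and software may differ. The function name `euler_characteristic_profile` and the `max_dim` truncation are assumptions. With truncation, the result is the Euler characteristic of the `max_dim`-skeleton rather than of the full complex.

```python
from itertools import combinations

import numpy as np


def euler_characteristic_profile(points, scales, max_dim=2):
    """Euler characteristic of the Rips complex (up to max_dim) at each scale."""
    points = np.asarray(points, dtype=float)
    n = len(points)
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)

    values, signs = [], []
    for k in range(1, max_dim + 2):  # k vertices span a (k - 1)-simplex
        for simplex in combinations(range(n), k):
            if k == 1:
                value = 0.0  # vertices are present from scale 0
            else:
                # A simplex appears once all of its edges are present.
                value = max(dist[i, j] for i, j in combinations(simplex, 2))
            values.append(value)
            signs.append((-1) ** (k - 1))

    values, signs = np.array(values), np.array(signs)
    # Alternating count of the simplices present at each scale.
    return np.array([signs[values <= s].sum() for s in scales])


# Corners of the unit square: 4 points, then a 4-cycle, then a filled complex.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
print(euler_characteristic_profile(square, scales=[0.0, 1.0, 1.5]))
```

The brute-force enumeration over all vertex subsets is only practical for small point clouds. It is meant to show the definition, not an efficient implementation.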
Complexity Science in Human Change
This reprint encompasses fourteen contributions that offer avenues towards a better understanding of complex systems in human behavior. The phenomena studied here are generally pattern formation processes that originate in social interaction and psychotherapy. Several accounts are also given of the coordination in body movements and in physiological, neuronal and linguistic processes. A common denominator of such pattern formation is that complexity and entropy of the respective systems become reduced spontaneously, which is the hallmark of self-organization. The various methodological approaches to modeling such processes are presented in some detail. Results from the various methods are systematically compared and discussed. Among these approaches are algorithms for the quantification of synchrony by cross-correlational statistics, surrogate control procedures, recurrence mapping and network models. This volume offers an informative and sophisticated resource for scholars of human change, as well as for students at advanced levels, from graduate to post-doctoral. The reprint is multidisciplinary in nature, binding together the fields of medicine, psychology, physics, and neuroscience.