807 research outputs found

    Learning and Control of Dynamical Systems

    Despite the remarkable success of machine learning in various domains in recent years, our understanding of its fundamental limitations remains incomplete. This knowledge gap poses a grand challenge when deploying machine learning methods in critical decision-making tasks, where incorrect decisions can have catastrophic consequences. To effectively utilize these learning-based methods in such contexts, it is crucial to explicitly characterize their performance. Over the years, significant research efforts have been dedicated to learning and control of dynamical systems where the underlying dynamics are unknown or only partially known a priori, and must be inferred from collected data. However, many of these classical results have focused on asymptotic guarantees, providing limited insights into the amount of data required to achieve desired control performance while satisfying operational constraints such as safety and stability, especially in the presence of statistical noise. In this thesis, we study the statistical complexity of learning and control of unknown dynamical systems. By utilizing recent advances in statistical learning theory, high-dimensional statistics, and control theoretic tools, we aim to establish a fundamental understanding of the number of samples required to achieve desired (i) accuracy in learning the unknown dynamics, (ii) performance in the control of the underlying system, and (iii) satisfaction of the operational constraints such as safety and stability. We provide finite-sample guarantees for these objectives and propose efficient learning and control algorithms that achieve the desired performance at these statistical limits in various dynamical systems. Our investigation covers a broad range of dynamical systems, starting from fully observable linear dynamical systems to partially observable linear dynamical systems, and ultimately, nonlinear systems.
We deploy our learning and control algorithms in various adaptive control tasks in real-world control systems and demonstrate their strong empirical performance along with their learning, robustness, and stability guarantees. In particular, we implement one of our proposed methods, Fourier Adaptive Learning and Control (FALCON), on an experimental aerodynamic testbed under extreme turbulent flow dynamics in a wind tunnel. The results show that FALCON achieves state-of-the-art stabilization performance and consistently outperforms conventional and other learning-based methods by at least 37%, despite using 8 times less data. The superior performance of FALCON arises from its physically and theoretically accurate modeling of the underlying nonlinear turbulent dynamics, which yields rigorous finite-sample learning and performance guarantees. These findings underscore the importance of characterizing the statistical complexity of learning and control of unknown dynamical systems.

    Collective variables between large-scale states in turbulent convection

    The dynamics in a confined turbulent convection flow is dominated by multiple long-lived macroscopic circulation states, which are visited subsequently by the system in a Markov-type hopping process. In the present work, we analyze the short transition paths between these subsequent macroscopic system states by a data-driven learning algorithm that extracts the low-dimensional transition manifold and the related new coordinates, which we term collective variables, in the state space of the complex turbulent flow. We therefore transfer and extend concepts for conformation transitions in stochastic microscopic systems, such as in the dynamics of macromolecules, to a deterministic macroscopic flow. Our analysis is based on long-term direct numerical simulation trajectories of turbulent convection in a closed cubic cell at a Prandtl number Pr = 0.7 and Rayleigh numbers Ra = 10^6 and 10^7 for a time lag of 10^5 convective free-fall time units. The simulations resolve vortices and plumes of all physically relevant scales, resulting in a state space spanned by more than 3.5 million degrees of freedom. The transition dynamics between the large-scale circulation states can be captured by the transition manifold analysis with only two collective variables, which implies a reduction of the data dimension by a factor of more than a million. Our method demonstrates that cessations and subsequent reversals of the large-scale flow are unlikely in the present setup and thus paves the way to the development of efficient reduced-order models of the macroscopic complex nonlinear dynamical system. Comment: 24 pages, 12 Figures, 1 table

    Limit Theory under Network Dependence and Nonstationarity

    These lecture notes are supplementary material for a short course on time series econometrics and network econometrics. We emphasize limit theory for time series regression models as well as the use of the local-to-unity parametrization when modeling time series nonstationarity. Moreover, we present various non-asymptotic theory results for moderate deviation principles when considering the eigenvalues of covariance matrices, as well as asymptotics for unit root moderate deviations in nonstationary autoregressive processes. Although not all applications from the literature are covered, we also discuss some open problems in the time series and network econometrics literature. Comment: arXiv admin note: text overlap with arXiv:1705.08413 by other authors
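
For orientation, the local-to-unity parametrization mentioned above is usually written for a first-order autoregression as follows. This is the standard textbook form, not a formula taken from the notes themselves; the symbols c (localizing constant), n (sample size), k_n (moderate deviation rate) and u_t (innovation) are assumptions of this sketch.

```latex
\begin{align}
  y_t    &= \rho_n\, y_{t-1} + u_t, \qquad t = 1, \dots, n, \\
  \rho_n &= 1 + \frac{c}{n} && \text{(local to unity)}, \\
  \rho_n &= 1 + \frac{c}{k_n}, \quad k_n \to \infty,\ \frac{k_n}{n} \to 0 && \text{(moderate deviations from unity)}.
\end{align}
```

Under this convention c = 0 gives an exact unit root, c < 0 a near-stationary process and c > 0 a mildly explosive one.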

    Dissecting regional heterogeneity and modeling transcriptional cascades in brain organoids

    Over the past decade, there has been a rapid expansion in the development and utilization of brain organoid models, enabling three-dimensional in vivo-like views of fundamental neurodevelopmental features of corticogenesis in health and disease. Nonetheless, the methods used for generating cortical organoid fates exhibit widespread heterogeneity across different cell lines. Here, we show that a combination of dual SMAD and WNT inhibition (Triple-i protocol) establishes a robust cortical identity in brain organoids, while other widely used derivation protocols are inconsistent with respect to regional specification. In order to measure this heterogeneity, we employ single-cell RNA-sequencing (scRNA-Seq), enabling the sampling of the gene expression profiles of thousands of cells in an individual sample. However, in order to draw meaningful conclusions from scRNA-Seq data, technical artifacts must be identified and removed. In this thesis, we present a method to detect one such artifact, empty droplets that do not contain a cell and consist mainly of free-floating mRNA in the sample. Furthermore, from their expression profiles, cells can be ordered along a developmental trajectory which recapitulates the progression of cells as they differentiate. Based on this ordering, we model gene expression using a Bayesian inference approach in order to measure transcriptional dynamics within differentiating cells. This enables the ordering of genes along transcriptional cascades, statistical testing for differences in gene expression changes, and measuring potential regulatory gene interactions. We apply this approach to differentiating cortical neural stem cells into cortical neurons via an intermediate progenitor cell type in brain organoids to provide a detailed characterization of the endogenous molecular processes underlying neurogenesis.

    Markov field models of molecular kinetics

    Computer simulations such as molecular dynamics (MD) provide a possible means to understand protein dynamics and mechanisms on an atomistic scale. The resulting simulation data can be analyzed with Markov state models (MSMs), yielding a quantitative kinetic model that, e.g., encodes state populations and transition rates. However, the larger an investigated system, the more data is required to estimate a valid kinetic model. In this work, we show that this scaling problem can be escaped when decomposing a system into smaller ones, leveraging weak couplings between local domains. Our approach, termed independent Markov decomposition (IMD), is a first-order approximation neglecting couplings, i.e., it represents a decomposition of the underlying global dynamics into a set of independent local ones. We demonstrate that for truly independent systems, IMD can reduce the sampling by three orders of magnitude. IMD is applied to two biomolecular systems. First, synaptotagmin-1 is analyzed, a rapid calcium switch from the neurotransmitter release machinery. Within its C2A domain, local conformational switches are identified and modeled with independent MSMs, shedding light on the mechanism of its calcium-mediated activation. Second, the catalytic site of the serine protease TMPRSS2 is analyzed with a local drug-binding model. Equilibrium populations of different drug-binding modes are derived for three inhibitors, mirroring experimentally determined drug efficiencies. IMD is subsequently extended to an end-to-end deep learning framework called iVAMPnets, which learns a domain decomposition from simulation data and simultaneously models the kinetics in the local domains. We finally classify IMD and iVAMPnets as Markov field models (MFM), which we define as a class of models that describe dynamics by decomposing systems into local domains. 
Overall, this thesis introduces a local approach to Markov modeling that enables quantitative assessment of the kinetics of large macromolecular complexes, opening up possibilities to tackle current and future questions in computational molecular biology.
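
The independence assumption behind IMD can be illustrated with a small sketch. It is not the estimation procedure of the thesis; it only uses the standard fact that the joint transition matrix of independent Markov chains is the Kronecker product of the local transition matrices. The function names and the two 2-state matrices are hypothetical.

```python
import numpy as np


def joint_transition_matrix(local_matrices):
    """Combine independent local chains into one global transition matrix.

    For independent chains the joint transition matrix is the Kronecker
    product of the row-stochastic local matrices.
    """
    joint = np.array([[1.0]])
    for T in local_matrices:
        joint = np.kron(joint, np.asarray(T, dtype=float))
    return joint


def stationary_distribution(T):
    """Left eigenvector of T for the eigenvalue closest to 1, normalized."""
    evals, evecs = np.linalg.eig(T.T)
    idx = np.argmin(np.abs(evals - 1.0))
    pi = np.real(evecs[:, idx])
    return pi / pi.sum()


# Two hypothetical local 2-state switches (rows sum to 1).
T_a = np.array([[0.9, 0.1],
                [0.2, 0.8]])
T_b = np.array([[0.7, 0.3],
                [0.4, 0.6]])

T_joint = joint_transition_matrix([T_a, T_b])

# The global stationary distribution factorizes into the local ones.
pi_joint = stationary_distribution(T_joint)
pi_product = np.kron(stationary_distribution(T_a),
                     stationary_distribution(T_b))
```

With N independent 2-state domains, the global model has 2^N states, whereas the decomposed description needs only N matrices of size 2 x 2, which is where the saving in required sampling comes from.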

    Discovering Causal Relations and Equations from Data

    Physics is a field of science that has traditionally used the scientific method to answer questions about why natural phenomena occur and to make testable models that explain the phenomena. Discovering equations, laws and principles that are invariant, robust and causal explanations of the world has been fundamental in physical sciences throughout the centuries. Discoveries emerge from observing the world and, when possible, performing interventional studies in the system under study. With the advent of big data and the use of data-driven methods, causal and equation discovery fields have grown and made progress in computer science, physics, statistics, philosophy, and many applied fields. All these domains are intertwined and can be used to discover causal relations, physical laws, and equations from observational data. This paper reviews the concepts, methods, and relevant works on causal and equation discovery in the broad field of Physics and outlines the most important challenges and promising future lines of research. We also provide a taxonomy for observational causal and equation discovery, point out connections, and showcase a complete set of case studies in Earth and climate sciences, fluid dynamics and mechanics, and the neurosciences. This review demonstrates that discovering fundamental laws and causal relations by observing natural phenomena is being revolutionised with the efficient exploitation of observational data, modern machine learning algorithms and the interaction with domain knowledge. Exciting times are ahead with many challenges and opportunities to improve our understanding of complex systems. Comment: 137 pages

    LieDetect: Detection of representation orbits of compact Lie groups from point clouds

    We suggest a new algorithm to estimate representations of compact Lie groups from finite samples of their orbits. Unlike other reported techniques, our method allows the retrieval of the precise representation type as a direct sum of irreducible representations. Moreover, the knowledge of the representation type permits the reconstruction of its orbit, which is useful to identify the Lie group that generates the action. Our algorithm is general for any compact Lie group, but only instantiations for SO(2), T^d, SU(2) and SO(3) are considered. Theoretical guarantees of robustness in terms of Hausdorff and Wasserstein distances are derived. Our tools are drawn from geometric measure theory, computational geometry, and optimization on matrix manifolds. The algorithm is tested for synthetic data up to dimension 16, as well as real-life applications in image analysis, harmonic analysis, and classical mechanics systems, achieving very accurate results. Comment: 84 pages, 16 figures

    Euler Characteristic Tools For Topological Data Analysis

    In this article, we study Euler characteristic techniques in topological data analysis. Pointwise computing the Euler characteristic of a family of simplicial complexes built from data gives rise to the so-called Euler characteristic profile. We show that this simple descriptor achieves state-of-the-art performance in supervised tasks at a very low computational cost. Inspired by signal analysis, we compute hybrid transforms of Euler characteristic profiles. These integral transforms mix Euler characteristic techniques with Lebesgue integration to provide highly efficient compressors of topological signals. As a consequence, they show remarkable performance in unsupervised settings. On the qualitative side, we provide numerous heuristics on the topological and geometric information captured by Euler profiles and their hybrid transforms. Finally, we prove stability results for these descriptors as well as asymptotic guarantees in random settings. Comment: 39 pages
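
A minimal sketch of a one-parameter Euler characteristic profile, to make the construction concrete. It assumes a Vietoris-Rips filtration truncated at a chosen simplex dimension, which may differ from the complexes used in the article. The function name, the brute-force enumeration of simplices and the toy point cloud are illustrative choices only, suitable for small inputs.

```python
import itertools

import numpy as np


def euler_characteristic_profile(points, scales, max_dim=2):
    """Euler characteristic of a truncated Vietoris-Rips complex per scale.

    A simplex enters the complex at the length of its longest edge
    (vertices enter at scale 0). For each scale r the function returns
    sum over dimensions d of (-1)^d * (number of d-simplices present at r).
    """
    points = np.asarray(points, dtype=float)
    n = len(points)
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)

    # Entry scale of every simplex, grouped by dimension.
    births = {0: np.zeros(n)}
    for dim in range(1, max_dim + 1):
        births[dim] = np.array([
            max(dist[i, j] for i, j in itertools.combinations(simplex, 2))
            for simplex in itertools.combinations(range(n), dim + 1)
        ])

    profile = []
    for r in scales:
        chi = sum((-1) ** dim * int(np.sum(b <= r))
                  for dim, b in births.items())
        profile.append(chi)
    return np.array(profile)


# Toy data: the four corners of the unit square.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
profile = euler_characteristic_profile(square, scales=[0.5, 1.0, 1.5])
# profile is [4, 0, 2]: four isolated points, then a 4-cycle,
# then all edges and triangles (the tetrahedron is cut off by max_dim=2).
```

Raising `max_dim` to 3 adds the tetrahedron at the largest scale, so the last value becomes 1, the Euler characteristic of a contractible complex.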

    Complexity Science in Human Change

    This reprint encompasses fourteen contributions that offer avenues towards a better understanding of complex systems in human behavior. The phenomena studied here are generally pattern formation processes that originate in social interaction and psychotherapy. Several accounts are also given of the coordination in body movements and in physiological, neuronal and linguistic processes. A common denominator of such pattern formation is that complexity and entropy of the respective systems become reduced spontaneously, which is the hallmark of self-organization. The various methodological approaches of how to model such processes are presented in some detail. Results from the various methods are systematically compared and discussed. Among these approaches are algorithms for the quantification of synchrony by cross-correlational statistics, surrogate control procedures, recurrence mapping and network models. This volume offers an informative and sophisticated resource for scholars of human change, as well as for students at advanced levels, from graduate to post-doctoral. The reprint is multidisciplinary in nature, binding together the fields of medicine, psychology, physics, and neuroscience.