Simulation Intelligence: Towards a New Generation of Scientific Methods
The original "Seven Motifs" set forth a roadmap of essential methods for the
field of scientific computing, where a motif is an algorithmic method that
captures a pattern of computation and data movement. We present the "Nine
Motifs of Simulation Intelligence", a roadmap for the development and
integration of the essential algorithms necessary for a merger of scientific
computing, scientific simulation, and artificial intelligence. We call this
merger simulation intelligence (SI), for short. We argue the motifs of
simulation intelligence are interconnected and interdependent, much like the
components within the layers of an operating system. Using this metaphor, we
explore the nature of each layer of the simulation intelligence operating
system stack (SI-stack) and the motifs therein: (1) Multi-physics and
multi-scale modeling; (2) Surrogate modeling and emulation; (3)
Simulation-based inference; (4) Causal modeling and inference; (5) Agent-based
modeling; (6) Probabilistic programming; (7) Differentiable programming; (8)
Open-ended optimization; (9) Machine programming. We believe coordinated
efforts between motifs offer immense opportunity to accelerate scientific
discovery, from solving inverse problems in synthetic biology and climate
science, to directing nuclear energy experiments and predicting emergent
behavior in socioeconomic settings. We elaborate on each layer of the SI-stack,
detailing the state-of-the-art methods, presenting examples to highlight challenges
and opportunities, and advocating for specific ways to advance the motifs and
the synergies from their combinations. Advancing and integrating these
technologies can enable a robust and efficient hypothesis-simulation-analysis
type of scientific method, which we introduce with several use-cases for
human-machine teaming and automated science.
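The surrogate modeling and emulation motif (2) can be illustrated with a minimal sketch: a cheap regressor is fit to input-output pairs from an expensive simulator and then queried in its place. The simulator below is a toy stand-in of our own devising, not a model from the text, and polynomial ridge regression stands in for the richer emulators the motif envisions.

```python
# Minimal surrogate-modeling sketch: emulate an "expensive" simulator
# with a cheap polynomial ridge regression (toy stand-in, not a method
# from the text).
import numpy as np

rng = np.random.default_rng(0)

def expensive_simulator(x):
    """Toy stand-in for a costly physics simulation."""
    return np.sin(x) + 0.5 * x**2

# 1) Run the simulator on a small design of inputs.
X_train = rng.uniform(-2, 2, size=40)
y_train = expensive_simulator(X_train)

# 2) Fit a cheap surrogate (polynomial features + ridge regression).
degree, lam = 8, 1e-3
Phi = np.vander(X_train, degree + 1)
w = np.linalg.solve(Phi.T @ Phi + lam * np.eye(degree + 1), Phi.T @ y_train)

# 3) Query the surrogate instead of the simulator.
X_test = np.linspace(-2, 2, 200)
y_pred = np.vander(X_test, degree + 1) @ w
rmse = np.sqrt(np.mean((y_pred - expensive_simulator(X_test)) ** 2))
print(f"surrogate RMSE on held-out grid: {rmse:.4f}")
```

Once trained, the surrogate amortizes the cost of the simulator: downstream tasks such as inverse problems or optimization can evaluate it thousands of times at negligible cost.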
Bioinformatics
This book is divided into research areas relevant to Bioinformatics, such as biological networks, next-generation sequencing, high-performance computing, molecular modeling, structural bioinformatics, and intelligent data analysis. Each section introduces the basic concepts and then explains their application to problems of great relevance, so that both novice and expert readers can benefit from the information and research works presented here.
Statistical methods for the integrative analysis of single-cell multi-omics data
Single-cell profiling techniques have provided an unprecedented opportunity to study cellular heterogeneity at the molecular level. This represents a remarkable advance over traditional bulk sequencing methods, particularly to study lineage diversification and cell fate commitment events in heterogeneous biological processes. While the large majority of single-cell studies are focused on quantifying RNA expression, transcriptomic readouts provide only a single dimension of cellular heterogeneity. Recently, technological advances have enabled multiple biological layers to be probed in parallel one cell at a time, unveiling a powerful approach for investigating multiple dimensions of cellular heterogeneity. However, the increasing availability of multi-modal data sets needs to be accompanied by the development of suitable integrative strategies to fully exploit the data generated. In this thesis I worked in collaboration with different research groups to introduce innovative experimental and computational strategies for the integrative study of multi-omics at single-cell resolution.
The first contribution is the development of scNMT-seq, a protocol for the simultaneous profiling of RNA expression, DNA methylation and chromatin accessibility in single cells. I demonstrate how this assay provides a powerful approach for investigating regulatory relationships between the epigenome and the transcriptome within individual cells.
The second contribution is Multi-Omics Factor Analysis (MOFA), a statistical framework for the unsupervised integration of multi-omics data sets. MOFA is a Bayesian latent variable model that can be viewed as a statistically rigorous generalization of Principal Component Analysis to multi-omics data. The method provides a principled approach to retrieve, in an unsupervised manner, the underlying sources of sample heterogeneity while at the same time disentangling which axes of heterogeneity are shared across multiple modalities and which are specific to individual data modalities.
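The core idea of such a multi-omics factor model is that each modality's data matrix decomposes as shared latent factors times modality-specific loadings, Y_m ≈ Z W_m^T + noise. The sketch below illustrates only this generative structure; it uses alternating least squares on simulated data with hypothetical dimensions, not MOFA's Bayesian variational inference.

```python
# Toy multi-omics factor model sketch (NOT the MOFA implementation):
# two modalities share latent factors Z; loadings W_m are modality-specific.
# Fit here by alternating least squares rather than Bayesian inference.
import numpy as np

rng = np.random.default_rng(1)
n_samples, n_factors = 100, 3
dims = [50, 30]  # hypothetical feature counts for the two omics layers

# Simulate from the generative model Y_m = Z @ W_m.T + noise.
Z_true = rng.normal(size=(n_samples, n_factors))
W_true = [rng.normal(size=(d, n_factors)) for d in dims]
Ys = [Z_true @ W.T + 0.1 * rng.normal(size=(n_samples, W.shape[0]))
      for W in W_true]

# Alternating least squares: update loadings given factors, then vice versa.
Z = rng.normal(size=(n_samples, n_factors))
for _ in range(50):
    Ws = [np.linalg.lstsq(Z, Y, rcond=None)[0].T for Y in Ys]
    W_cat = np.vstack(Ws)          # stack loadings across modalities
    Y_cat = np.hstack(Ys)          # concatenate observations
    Z = np.linalg.solve(W_cat.T @ W_cat, W_cat.T @ Y_cat.T).T

for m, (Y, W) in enumerate(zip(Ys, Ws)):
    err = np.linalg.norm(Y - Z @ W.T) / np.linalg.norm(Y)
    print(f"modality {m}: relative reconstruction error {err:.3f}")
```

The per-modality loadings make visible which factors carry weight in which omics layer; in the full model this is what disentangles shared from modality-specific axes of heterogeneity.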
The third contribution is the generation of a comprehensive molecular roadmap of mouse gastrulation at single-cell resolution. We employed scNMT-seq to simultaneously profile RNA expression, DNA methylation and chromatin accessibility for hundreds of cells, spanning multiple time points from the exit from pluripotency to primary germ layer specification. Using MOFA, and other tools, I performed an integrative analysis of the multi-modal measurements, revealing novel insights into the role of the epigenome in regulating this key developmental process.
The fourth contribution is an extended formulation of the MOFA model tailored to the analysis of large-scale single-cell data with complex experimental designs. I extended the model to incorporate a flexible regularisation that enables the joint analysis of multiple omics as well as multiple sample groups (batches and/or experimental conditions). In addition, I implemented a GPU-accelerated stochastic variational inference framework, thus enabling the scalable analysis of potentially millions of samples.
CORNN: Convex optimization of recurrent neural networks for rapid inference of neural dynamics
Advances in optical and electrophysiological recording technologies have made
it possible to record the dynamics of thousands of neurons, opening up new
possibilities for interpreting and controlling large neural populations in
behaving animals. A promising way to extract computational principles from
these large datasets is to train data-constrained recurrent neural networks
(dRNNs). Performing this training in real-time could open doors for research
techniques and medical applications to model and control interventions at
single-cell resolution and drive desired forms of animal behavior. However,
existing training algorithms for dRNNs are inefficient and have limited
scalability, making it a challenge to analyze large neural recordings even in
offline scenarios. To address these issues, we introduce a training method
termed Convex Optimization of Recurrent Neural Networks (CORNN). In studies of
simulated recordings, CORNN attained training speeds ~100-fold faster than
traditional optimization approaches while maintaining or enhancing modeling
accuracy. We further validated CORNN on simulations with thousands of cells
that performed simple computations such as those of a 3-bit flip-flop or the
execution of a timed response. Finally, we showed that CORNN can robustly
reproduce network dynamics and underlying attractor structures despite
mismatches between generator and inference models, severe subsampling of
observed neurons, or mismatches in neural time-scales. Overall, by training
dRNNs with millions of parameters in subminute processing times on a standard
computer, CORNN constitutes a first step towards real-time network reproduction
constrained on large-scale neural recordings and a powerful computational tool
for advancing the understanding of neural computation.
Comment: Accepted at NeurIPS 202
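A much-simplified sketch of why data-constrained RNN fitting can be convex (this is an illustrative reduction, not CORNN's actual objective): if the observed rates follow r_{t+1} = tanh(W r_t + noise), applying the inverse nonlinearity to the targets turns weight recovery into one ridge regression per neuron, a convex problem with a closed-form solution.

```python
# Simplified convex dRNN fitting (illustrative, not the CORNN objective):
# invert the tanh nonlinearity, then solve a ridge regression per neuron.
import numpy as np

rng = np.random.default_rng(2)
n_neurons, T = 60, 2000
g = 1.5  # gain > 1 gives rich, non-trivial dynamics

# Ground-truth "generator" network and its simulated rate trajectory.
W_true = rng.normal(scale=g / np.sqrt(n_neurons), size=(n_neurons, n_neurons))
R = np.empty((T, n_neurons))
R[0] = rng.normal(scale=0.1, size=n_neurons)
for t in range(T - 1):
    # small process noise keeps the trajectory exploring state space
    R[t + 1] = np.tanh(W_true @ R[t] + 0.05 * rng.normal(size=n_neurons))

# arctanh(r_{t+1}) = W r_t + noise  ->  one ridge regression per neuron.
X = R[:-1]
Y = np.arctanh(np.clip(R[1:], -1 + 1e-12, 1 - 1e-12))
lam = 1e-6
W_hat = np.linalg.solve(X.T @ X + lam * np.eye(n_neurons), X.T @ Y).T

err = np.linalg.norm(W_hat - W_true) / np.linalg.norm(W_true)
print(f"relative weight-recovery error: {err:.3f}")
```

Because each neuron's regression is independent, the whole fit parallelizes trivially, which is one intuition for how convex formulations reach the training speeds reported in the abstract.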
Recent Advances of Deep Learning in Bioinformatics and Computational Biology
Extracting inherent valuable knowledge from omics big data remains a daunting problem in bioinformatics and computational biology. Deep learning, an emerging branch of machine learning, has exhibited unprecedented performance in quite a few applications from academia and industry. We highlight the differences and similarities among widely utilized deep learning models by discussing their basic structures, and review their diverse applications and disadvantages. We anticipate that this work can serve as a meaningful perspective for the further development of deep learning theory, algorithms, and applications in bioinformatics and computational biology.
Generative Model based Training of Deep Neural Networks for Event Detection in Microscopy Data
Several imaging techniques employed in the life sciences heavily rely on machine learning methods
to make sense of the data that they produce. These include calcium imaging and multi-electrode
recordings of neural activity, single molecule localization microscopy, spatially-resolved transcriptomics and particle tracking, among others. All of them only produce indirect readouts of the
spatiotemporal events they aim to record. The objective when analysing data from these methods
is the identification of patterns that indicate the location of the sought-after events, e.g. spikes in
neural recordings or fluorescent particles in microscopy data.
Existing approaches for this task invert a forward model, i.e. a mathematical description of the
process that generates the observed patterns for a given set of underlying events, using established
methods like MCMC or variational inference. Perhaps surprisingly, for a long time deep learning
saw little use in this domain, even though it became the dominant approach in the field of pattern
recognition over the previous decade. The principal reason is that in the absence of labeled data
needed for supervised optimization it remains unclear how neural networks can be trained to solve
these tasks. To unlock the potential of deep learning, this thesis proposes different methods for
training neural networks using forward models and without relying on labeled data. The thesis
rests on two publications:
In the first publication we introduce an algorithm for spike extraction from calcium imaging
time traces. Building on the variational autoencoder framework, we simultaneously train a neural
network that performs spike inference and optimize the parameters of the forward model. This
approach combines several advantages that previously could not be obtained together: it is fast at test time,
can be applied to different non-linear forward models, and produces samples from the posterior
distribution over spike trains.
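Calcium-imaging forward models in this literature are commonly autoregressive: each spike adds a fluorescence transient that decays geometrically. The sketch below is a generic AR(1) model with hypothetical parameter values, an assumption for illustration rather than the exact model used in the thesis; it shows the generative direction that the inference network must invert.

```python
# Generic AR(1) calcium forward model (an illustrative assumption, not
# necessarily the thesis's exact model): each spike adds a transient
# that decays with factor gamma; fluorescence is a noisy affine readout.
import numpy as np

rng = np.random.default_rng(3)
T, rate, gamma = 1000, 0.05, 0.95  # hypothetical values

spikes = (rng.random(T) < rate).astype(float)   # latent binary spike train
calcium = np.zeros(T)
for t in range(1, T):
    calcium[t] = gamma * calcium[t - 1] + spikes[t]
fluor = 1.5 * calcium + 0.2 + 0.1 * rng.normal(size=T)  # observed trace
print(f"{int(spikes.sum())} spikes simulated, trace mean {fluor.mean():.2f}")
```

In the VAE framing, this forward model plays the role of the decoder, a recognition network approximates the posterior over the latent spike train, and parameters such as gamma and the affine readout are optimized jointly with the network.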
The second publication deals with the localization of fluorescent particles in single molecule
localization microscopy. We show that an accurate forward model can be used to generate simulations that act as a surrogate for labeled training data. Careful design of the output representation
and loss function results in a method with outstanding precision across experimental designs and
imaging conditions.
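The second publication's strategy, using the forward model as a factory for labeled training data, can be sketched generically. The toy below is a 1-D localization problem with a Gaussian spot and ridge regression standing in for the thesis's network and point-spread-function model, which are not specified here: simulate observations with known ground truth, then train supervised on the simulated pairs.

```python
# Sketch of "forward model as labeled-data factory" (toy 1-D localization;
# the spot model and regressor are illustrative stand-ins): simulate images
# with known particle positions, then fit a supervised regressor on them.
import numpy as np

rng = np.random.default_rng(4)
n_pix, sigma = 32, 1.5
grid = np.arange(n_pix)

def forward(pos):
    """Toy imaging model: Gaussian spot at `pos` plus pixel noise."""
    return np.exp(-0.5 * ((grid - pos) / sigma) ** 2) + 0.02 * rng.normal(size=n_pix)

# 1) The simulator plays the role of an annotated training set.
pos_train = rng.uniform(5, n_pix - 5, size=2000)
X_train = np.stack([forward(p) for p in pos_train])

# 2) Supervised training (ridge regression standing in for a deep net).
lam = 1e-3
A = np.hstack([X_train, np.ones((len(X_train), 1))])
w = np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ pos_train)

# 3) Evaluate on fresh simulations with known ground truth.
pos_test = rng.uniform(5, n_pix - 5, size=500)
X_test = np.stack([forward(p) for p in pos_test])
pred = np.hstack([X_test, np.ones((500, 1))]) @ w
mae = np.abs(pred - pos_test).mean()
print(f"mean localization error: {mae:.2f} pixels")
```

Because the simulator supplies unlimited labeled pairs, the accuracy of the trained model is bounded mainly by how faithfully the forward model matches the real imaging process, which is why the thesis emphasizes an accurate forward model.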
Overall this thesis highlights how neural networks can be applied for precise, fast and flexible model inversion on this class of problems and how this opens up new avenues to achieve
performance beyond what was previously possible.