Forward and Backward Bisimulations for Chemical Reaction Networks
We present two quantitative behavioral equivalences over species of a
chemical reaction network (CRN) with semantics based on ordinary differential
equations. Forward CRN bisimulation identifies a partition where each
equivalence class represents the exact sum of the concentrations of the species
belonging to that class. Backward CRN bisimulation relates species that have
identical solutions at all time points when starting from the same initial
conditions. Both notions can be checked using only CRN syntactical information,
i.e., by inspection of the set of reactions. We provide a unified algorithm
that computes the coarsest refinement up to our bisimulations in polynomial
time. Further, we give algorithms to compute quotient CRNs induced by a
bisimulation. As an application, we find significant reductions in a number of
models of biological processes from the literature. In two cases we allow the
analysis of benchmark models which would be otherwise intractable due to their
memory requirements. Comment: Extended version of the CONCUR 2015 paper
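The lumping property described in the abstract above can be illustrated with a minimal sketch. It assumes a hypothetical toy network (A -> C and B -> C, both with rate k) and a simple explicit Euler integrator; neither is taken from the paper, and this is not the paper's partition-refinement algorithm. It only shows that, for the partition {A, B}, {C}, a reduced ODE over the block sum tracks the sum of the original concentrations.

```python
def simulate_full(a0, b0, c0, k, dt, steps):
    """Explicit Euler integration of the toy CRN  A -> C,  B -> C  (rate k each)."""
    a, b, c = a0, b0, c0
    for _ in range(steps):
        da = -k * a
        db = -k * b
        dc = k * a + k * b
        a, b, c = a + dt * da, b + dt * db, c + dt * dc
    return a, b, c


def simulate_reduced(s0, c0, k, dt, steps):
    """Euler integration of the lumped CRN  S -> C, where S stands for A + B."""
    s, c = s0, c0
    for _ in range(steps):
        ds = -k * s
        dc = k * s
        s, c = s + dt * ds, c + dt * dc
    return s, c


if __name__ == "__main__":
    a, b, c = simulate_full(1.0, 3.0, 0.0, k=0.5, dt=0.01, steps=1000)
    s, c_red = simulate_reduced(4.0, 0.0, k=0.5, dt=0.01, steps=1000)
    print("full model, A + B:", a + b)
    print("reduced model, S: ", s)
```

In this toy network, starting A and B from the same initial value also keeps their trajectories identical, which is the kind of relation the abstract attributes to backward CRN bisimulation.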
Model-driven analysis of gene expression control
During this PhD, I worked on three different aspects in the broad field of experimental and theoretical analysis of gene regulation.
The first part, "Quantifying the strength of miRNA-target interactions", addresses the problem of predicting mRNA targets of miRNAs. I show that biochemical measurements of miRNA-mRNA interactions can be used to optimise the parameter inference of a pre-existing model of miRNA target prediction. This model, named MIRZA, predicts miRNA-mRNA binding using 25 energy parameters that describe the miRNA-mRNA hybrid structure, with 2 base pairing parameters for the AU and GC pairs, 3 configuration parameters for the symmetric and asymmetric loops, and 21 positional parameters for the 21 nucleotides of the miRNA sequence. MIRZA was built to infer these parameters from Argonaute protein CLIP data, which captures potential targets of miRNAs. Upon the publication of precise measurements of chemical kinetic constants of miRNA-mRNA binding interactions between an mRNA target and a set of systematically mutated miRNA sequences, we reasoned that such data could be used to improve the parameter inference of the MIRZA model. After showing that the predictions of the existing model on the set of measured miRNA-mRNA pairs correlate highly with the binding energy calculated from the measurements, I used simulations as a proof of principle of the inference procedure and to design the measurements that would be needed to infer the parameters of the MIRZA model.
Staying in the field of miRNA, in "Single cell mRNA profiling reveals the hierarchical response of miRNA targets to miRNA induction", I developed an approach to infer miRNA targets based on scRNA-seq data from cells that express the miRNA at different levels. A miRNA can target several hundred different mRNAs and is present in the cell in limited quantities, implying that the interaction of a target mRNA with a specific miRNA depends on its concentration and on the interactions of the miRNA with its other targets. In other words, since miRNA binding is exclusive, mRNA targets compete for the same miRNA pool. Therefore, the concentrations of the thereby coupled mRNAs depend not only on the miRNA concentration but also on the concentration of every competing mRNA that is targeted by the same miRNA. To study this, HEK 293 cell lines were constructed to inducibly express a miRNA (hsa-miR-199a) as well as the mRNA encoding a green fluorescent protein. Expressed from the same promoter as the miRNA, this mRNA allows the monitoring of the miRNA concentration. The study aimed not only to determine the parameters of individual miRNA-mRNA interactions, but also to assess the degree to which mRNAs act in a competitive manner to influence each other's expression. scRNA-seq was chosen to bring the resolution needed to reach these goals. The effect of the miRNA on a bound target is to increase its decay rate, hence the expression levels of the targets depend on the miRNA concentration and their binding energy. To gain insight into the target binding energy, we constructed a model considering the mRNA transcription rate, the miRNA-mRNA binding/unbinding rate, the mRNA decay rates in the bound and unbound state, and the free/bound concentration of miRNA.
We showed that the model can be factored in terms of the miRNA concentrations in individual cells and the miRNA-mRNA target interaction parameters and we solved the model to obtain estimates of miRNA-mRNA interaction parameters, which we showed explain the mRNA levels in cells more accurately than the sequence-based computationally predicted interaction energies.
Finally, in "Bayesian inference of the gene expression states from single-cell RNA-seq data" I carried out fundamental technical work on the normalisation of count data obtained in scRNA-seq experiments. As introduced above, multiple strategies have been developed with the aim of reducing the high level of noise present in such data, and estimating a 'true' biological state of expression for each gene in each cell. While the project aimed to reconstruct the Waddington landscape of regulator activity based on the single cell gene expression measurements, at the start of the project we realised that there was no satisfactory solution to gene expression normalisation in single cells in the literature. Thus, we tackled this problem with a Bayesian model, considering each gene independently and inferring a posterior probability of gene expression in each cell. Our model assumes a log-normal distribution of gene expression across cells and additional Poisson noise caused by the stochastic process of gene expression and the sampling process introduced by the mRNA capture in experimental protocols. These normalised gene expression values are the basis of a motif-activity response based approach for inferring the activity of TFs and miRNAs in individual cells, and for reconstructing the underlying landscape.
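A model of the kind described above (log-normal expression across cells combined with Poisson sampling noise) can be sketched for a single gene in a single cell as follows. This is an illustrative grid approximation under stated assumptions, not the implementation from the thesis: the function name, the parameters `depth`, `mu` and `sigma`, and the choice of a grid are all hypothetical. The log-expression x has a normal prior with mean `mu` and standard deviation `sigma`, and the observed count is Poisson with rate `depth * exp(x)`.

```python
import math


def posterior_log_expression(count, depth, mu, sigma, grid_size=2001, width=8.0):
    """Grid approximation of the posterior over log-expression x.

    Assumed model (hypothetical, for illustration only):
        x     ~ Normal(mu, sigma^2)        log-normal expression across cells
        count ~ Poisson(depth * exp(x))    capture / sampling noise
    Returns the posterior mean and standard deviation of x.
    """
    lo, hi = mu - width * sigma, mu + width * sigma
    step = (hi - lo) / (grid_size - 1)
    xs = [lo + i * step for i in range(grid_size)]

    log_post = []
    for x in xs:
        rate = depth * math.exp(x)
        log_lik = count * math.log(rate) - rate - math.lgamma(count + 1)
        log_prior = -0.5 * ((x - mu) / sigma) ** 2
        log_post.append(log_lik + log_prior)

    # Subtract the maximum before exponentiating for numerical stability.
    m = max(log_post)
    weights = [math.exp(lp - m) for lp in log_post]
    z = sum(weights)
    mean = sum(x * w for x, w in zip(xs, weights)) / z
    var = sum((x - mean) ** 2 * w for x, w in zip(xs, weights)) / z
    return mean, math.sqrt(var)


if __name__ == "__main__":
    mu = math.log(1e-3)  # assumed prior mean of log-expression (as a fraction)
    for n in (0, 1, 50):
        mean, sd = posterior_log_expression(n, depth=1000, mu=mu, sigma=1.0)
        print(f"count={n:3d}  posterior mean={mean:.3f}  sd={sd:.3f}")
```

With these assumed values, a zero count pulls the estimate below the prior mean rather than to minus infinity, which is the practical motivation for placing a prior on log-expression instead of taking the logarithm of raw counts.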
The application of this normalisation algorithm to reconstruct a landscape is presented in the last part, "Realizing Waddington's metaphor: Inferring regulatory landscapes from single-cell gene expression data". There I present the mathematical principles needed to formally define a landscape following the idea of Waddington from 1957, and I propose two applications of the landscape. First, I show that it defines cell types as local minima, and second, in the case of cells undergoing differentiation, I show how the landscape can be used to find developmental paths and the transcription factors associated with the differentiation process.
Computational Methods in Science and Engineering : Proceedings of the Workshop SimLabs@KIT, November 29 - 30, 2010, Karlsruhe, Germany
In this proceedings volume we provide a compilation of article contributions equally covering applications from different research fields and ranging from capacity up to capability computing. Besides classical computing aspects such as parallelization, the focus of these proceedings is on multi-scale approaches and methods for tackling algorithm and data complexity. Practical aspects regarding the usage of the HPC infrastructure and the tools and software available at the SCC are also presented.
Stochastic fragments: A framework for the exact reduction of the stochastic semantics of rule-based models
In this paper, we propose an abstract interpretation-based framework for reducing the state space of stochastic semantics for protein-protein interaction networks. Our approach consists in quotienting the state space of networks. Interestingly, we do not apply the widely used strong lumpability criterion, which imposes that two equivalent states behave similarly with respect to the quotient, but a weak version of it. More precisely, our framework detects and proves some invariants about the dynamics of the system: the quotient of the state space is such that the probability of being in a given state, knowing that this state is in a given equivalence class, is an invariant of the semantics. Then we introduce an individual-based stochastic semantics (where each agent is identified by a unique identifier) for the programs of a rule-based language (namely Kappa), and we use our abstraction framework for deriving a sound population-based semantics and a sound fragment-based semantics, which give the distribution of the traces respectively for the number of instances of molecular species and for the number of instances of partially defined molecular species. These partially defined species are chosen automatically thanks to a dependency analysis which is also described in the paper.
Structural decomposition and structural relaxation of solvation shells of hydrated molecular ionic liquids and protein solutions
The present work provides new methodological contributions to the investigation of the structure and dynamics
of solvated biomolecules using Voronoi analysis of computer simulations. Both collective and single-particle
properties of the solvation shells and the bulk medium are considered.
The three proteins ubiquitin (PDB code: 1UBQ), calbindin (1CLB) and a phospholipase (2PLD) serve as model systems. The study of their
solvation in water is an integral part of this work. In addition, preliminary studies are carried out on Molecular Ionic Liquids (MIL),
which have come to the fore in recent years as, among other things, environmentally friendly polar solvents. Alkylated imidazolium salts of
trifluoroacetate, tetrafluoroborate and trifluoromethylsulfonate are analysed both in pure form and mixed with water.
What is new in this work is, first, the atom-resolved tessellation, which is computationally very demanding for systems with 30000 atoms
and periodic boundary conditions over hundreds of thousands of time steps, and is therefore feasible only through the efficient implementation
of suitable algorithms.
On this basis, largely parameter-free approaches to local and global structure analysis are developed. They are compared
with conventional methods such as radial distribution functions and orientation correlation functions, and they also
offer additional possibilities for interpretation.
The position and orientation of neighbouring molecules can be described and interpreted directly in terms of graph-theoretical interactions.
A Markov model for the dynamics within and between individual solvation shells is developed and applied to MIL systems.
Artificial Intelligence for Science in Quantum, Atomistic, and Continuum Systems
Advances in artificial intelligence (AI) are fueling a new paradigm of
discoveries in natural sciences. Today, AI has started to advance natural
sciences by improving, accelerating, and enabling our understanding of natural
phenomena at a wide range of spatial and temporal scales, giving rise to a new
area of research known as AI for science (AI4Science). Being an emerging
research paradigm, AI4Science is unique in that it is an enormous and highly
interdisciplinary area. Thus, a unified and technical treatment of this field
is needed yet challenging. This work aims to provide a technically thorough
account of a subarea of AI4Science; namely, AI for quantum, atomistic, and
continuum systems. These areas aim at understanding the physical world from the
subatomic (wavefunctions and electron density), atomic (molecules, proteins,
materials, and interactions), to macro (fluids, climate, and subsurface) scales
and form an important subarea of AI4Science. A unique advantage of focusing on
these areas is that they largely share a common set of challenges, thereby
allowing a unified and foundational treatment. A key common challenge is how to
capture physics first principles, especially symmetries, in natural systems by
deep learning methods. We provide an in-depth yet intuitive account of
techniques to achieve equivariance to symmetry transformations. We also discuss
other common technical challenges, including explainability,
out-of-distribution generalization, knowledge transfer with foundation and
large language models, and uncertainty quantification. To facilitate learning
and education, we provide categorized lists of resources that we found to be
useful. We strive to be thorough and unified and hope this initial effort may
trigger more community interests and efforts to further advance AI4Science
Mapping genome-wide neuropsychiatric mutation effects on functional brain connectivity: copy number variants delineate dimensions contributing to autism and schizophrenia
Research on autism spectrum disorder (ASD) and schizophrenia (SZ) has mainly adopted a "top-down" approach, starting from psychiatric diagnosis and moving to intermediate brain phenotypes and underlying genetic factors. Recent cross-disorder studies have raised questions about diagnostic boundaries and pleiotropic mechanisms. By contrast, the recruitment of groups based on the presence of a genetic risk factor allows for the investigation of molecular pathways related to a particular risk for neuropsychiatric conditions (NPs). Copy number variants (CNVs, loss or gain of a DNA segment), which confer high risk for NPs, are natural candidates for such bottom-up approaches.
Because CNVs have a similar range of adverse effects on NPs, we hypothesized that entire classes of CNVs may converge upon shared connectivity dimensions contributing to mental illness. Resting-state functional MRI (rs-fMRI) studies have provided critical insight into the architecture of brain networks involved in NPs, but so far only a few studies have investigated networks modulated by CNVs.
We aimed to: 1) delineate the effects of neuropsychiatric variants on functional connectivity (FC); 2) investigate whether the alterations associated with CNVs are also found among idiopathic psychiatric populations; 3) test whether deletions reorganize FC along general dimensions, irrespective of their localization in the genome.
We gathered rs-fMRI data on 502 carriers of eight NP-CNVs (high-risk), four CNVs without prior association to NPs as well as carriers of eight scarcer NP-CNVs. We also analyzed 756 subjects with idiopathic ASD, SZ, and attention deficit hyperactivity disorder (ADHD), and 5,377 controls. Connectome-wide analyses showed a positive gene dosage effect for the 22q11.2 and 1q21.1 CNVs, and a negative association for the 16p11.2 CNV. The effect size of CNVs on relative FC (mean-connectivity adjusted) was correlated with the known level of NP-risk conferred by CNVs. Consistent with results on cognition, we also reported that deletions had a larger effect size on FC than duplications. We identified similarities between high-risk CNV profiles and the connectivity architecture of individuals with NPs. The level of similarity was associated with mutation severity and was strongest in SZ, followed by ASD and ADHD. The similarity was driven by the thalamus and the posterior cingulate cortex, previously identified as hubs in transdiagnostic psychiatric studies. These results raised questions about shared mechanisms across CNVs. By comparing deletions at the 16p11.2 and 22q11.2 loci, we identified similarities at the connectivity and gene expression levels. We extended this work by pooling all deletions available for analysis. We asked if connectivity alterations were associated with the severity of deletions scored using pLI, a measure of intolerance to haploinsufficiency. The haploinsufficiency profile involved the thalamus, anterior cingulate cortex, and somatomotor network and was correlated with lower general intelligence and higher autism severity scores in 3 unselected and disease cohorts. An exploratory factor analysis confirmed the contribution of these regions to three latent components shared across CNVs and NPs.
Our results open new avenues for understanding polygenicity in psychiatric conditions, and the pleiotropic effect of CNVs on cognition and on risk for neuropsychiatric disorders.