10 research outputs found
Physically-inspired Gaussian process models for post-transcriptional regulation in Drosophila
The regulatory process of Drosophila has been studied thoroughly as a means of understanding a wide variety of biological principles. While pattern-forming gene networks are analysed at the transcription step, post-transcriptional events (e.g. translation, protein processing) play an important role in establishing protein expression patterns and levels. Since the post-transcriptional regulation of Drosophila depends on spatiotemporal interactions between mRNAs and gap proteins, proper physically-inspired stochastic models are required to study the link between both quantities. Previous research has shown that combining Gaussian processes (GPs) with differential equations leads to promising predictions when analysing regulatory networks. Here we further investigate two types of physically-inspired GP models based on a reaction-diffusion equation, where the main difference lies in where the prior is placed. While one of them has been studied previously using protein data only, the other is novel and yields a simple approach requiring only the differentiation of kernel functions. In contrast to other stochastic frameworks, discretising the spatial domain is not required here. Both GP models are tested under different conditions depending on the availability of gap gene mRNA expression data. Finally, their performances are assessed on a high-resolution dataset describing the blastoderm stage of the early embryo of Drosophila melanogaster.
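As an illustration of what "requiring only the differentiation of kernel functions" can mean in practice, the following minimal sketch differentiates a squared-exponential kernel in closed form. It rests on assumptions not stated in the abstract: the kernel choice, the one-dimensional input, and the function names `rbf`, `rbf_dx1` and `rbf_dx1_dx2` are all hypothetical, and this is not the authors' model. The only fact used is that a linear operator (such as a derivative) applied to a GP yields another GP whose covariance is obtained by applying the operator to the kernel.

```python
import numpy as np

def rbf(x1, x2, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel k(x1, x2) (an assumed kernel choice)."""
    d = x1 - x2
    return variance * np.exp(-0.5 * d**2 / lengthscale**2)

def rbf_dx1(x1, x2, lengthscale=1.0, variance=1.0):
    """Cross-covariance cov(f'(x1), f(x2)) = dk/dx1."""
    d = x1 - x2
    return -(d / lengthscale**2) * rbf(x1, x2, lengthscale, variance)

def rbf_dx1_dx2(x1, x2, lengthscale=1.0, variance=1.0):
    """Covariance cov(f'(x1), f'(x2)) = d^2 k / (dx1 dx2)."""
    d = x1 - x2
    factor = 1.0 / lengthscale**2 - d**2 / lengthscale**4
    return factor * rbf(x1, x2, lengthscale, variance)
```

Higher-order operators, such as the second spatial derivative in a diffusion term, would be handled the same way by differentiating the kernel further; no spatial grid is needed because the derivatives are available analytically.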
Quantitative system drift compensates for altered maternal inputs to the gap gene network of the scuttle fly Megaselia abdita.
Published online. Journal article. This is the final version of the article. Available from eLife Sciences Publications via the DOI in this record.
The segmentation gene network in insects can produce equivalent phenotypic outputs despite differences in upstream regulatory inputs between species. We investigate the mechanistic basis of this phenomenon through a systems-level analysis of the gap gene network in the scuttle fly Megaselia abdita (Phoridae). It combines quantification of gene expression at high spatio-temporal resolution with systematic knock-downs by RNA interference (RNAi). Initiation and dynamics of gap gene expression differ markedly between M. abdita and Drosophila melanogaster, while the output of the system converges to equivalent patterns at the end of the blastoderm stage. Although the qualitative structure of the gap gene network is conserved, there are differences in the strength of regulatory interactions between species. We term such network rewiring 'quantitative system drift'. It provides a mechanistic explanation for the developmental hourglass model in the dipteran lineage. Quantitative system drift is likely to be a widespread mechanism for developmental evolution.
Ministerio de Economía y Competitividad: MEC/EMBL Agreement; BFU2009-10184; BFU2012-33775; SEV-2012-0208
Agència de Gestió d'Ajuts Universitaris i de Recerca: SGR Grant 406
European Commission: FP7-KBBE-2011-5/289434
National Science Foundation: IOS-0719445/IOS-112121
Reverse-engineering post-transcriptional regulation of gap genes in Drosophila melanogaster
16 pages, 6 figures, 1 table.
Systems biology proceeds through repeated cycles of experiment and modeling. One way to implement this is reverse engineering, where models are fit to data to infer and analyse regulatory mechanisms. This requires rigorous methods to determine whether model parameters can be properly identified. Applying such methods in a complex biological context remains challenging. We use reverse engineering to study post-transcriptional regulation in pattern formation. As a case study, we analyse expression of the gap genes Krüppel, knirps, and giant in Drosophila melanogaster. We use detailed, quantitative datasets of gap gene mRNA and protein expression to solve and fit a model of post-transcriptional regulation, and establish its structural and practical identifiability. Our results demonstrate that post-transcriptional regulation is not required for patterning in this system, but is necessary for proper control of protein levels. Our work demonstrates that the uniqueness and specificity of a fitted model can be rigorously determined in the context of spatio-temporal pattern formation. This greatly increases the potential of reverse engineering for the study of development and other, similarly complex, biological processes.
This collaborative project was carried out in the context of the BioPreDyn consortium, which is co-ordinated by JJ and JRB, and funded by European Commission grant FP7-KBBE-2011-5/289434. The laboratory of JJ is funded by the MEC-EMBL agreement for the EMBL/CRG Research Unit in Systems Biology. Additional financial support was provided by SGR Grant 406 from the Catalan funding agency AGAUR, and by grants BFU2009-10184 and BFU2009-09168 from the Spanish Ministerio de Economia y Competitividad (MINECO). The group at IIM-CSIC acknowledges financial support from MINECO and the European Regional Development Fund (ERDF; project “MultiScales”, DPI2011-28112-C04-03). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Peer reviewed.
Modeling spatio-temporal enhancer expression in Drosophila segmentation
Thermodynamic models are a key tool to investigate transcription control in the segmentation of Drosophila. By modeling the binding of transcription factors to DNA sequences and their effect on transcription initiation, thermodynamic models predict expression patterns directly from the enhancer sequence, given the binding motifs and concentrations of all relevant transcription factors (TFs). However, many parameters of the model are impossible to measure, e.g. the interaction strength between the TFs and the core promoter. Hence, it is necessary to estimate these parameters by training the thermodynamic model on known data, i.e. to fit the model predictions to already measured expression patterns of known enhancers. The quality of the parameter training result, evaluated on independent test data, indicates how well the model recapitulates the biological measurements, which can help us to improve our understanding of the underlying mechanisms of transcription control. Therefore, proper parameter training is a crucial step for the construction of thermodynamic models.
In this thesis, I develop a thorough parameter training setup that uses the limited amount of available training data efficiently and reduces parameter overfitting significantly. This optimized training setup combines a global parameter training algorithm, a method to artificially increase the amount of training data (data augmentation), and parameter penalties, a technique to limit overfitting. I apply the novel training setup to expand the scope of thermodynamic models of Drosophila segmentation considerably by incorporating additional TFs into the model, and to investigate many aspects of transcription control in greater detail than was possible before. Among these topics are the specificity of TF binding motifs, the nature of TF cooperativity and DNA accessibility. With the help of the impact score developed here, I assess the influence of all relevant TFs in silico, delineate the cooperativity range of the key TF bcd, and determine the importance of weak binding sites. Finally, I develop and discuss two alternative models of transcription control that lack the prediction quality of thermodynamic models but nevertheless give valuable insights into the architectural principles of enhancers.
This project is part of a larger effort to advance our current understanding of transcription regulation by reconstructing the segmentation network of Drosophila in silico. The results of this thesis facilitate future modeling efforts by optimally leveraging the available data as well as by improving our understanding of thermodynamic models.
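The core idea behind thermodynamic models of TF binding, as described in this abstract, can be sketched with textbook equilibrium statistics. The example below is a generic illustration, not the thesis's model: the function names, the two-site setup, and the parameters `K` (binding constant), `c` (TF concentration) and `omega` (cooperativity factor) are assumptions introduced here. Each binding configuration receives a statistical weight, and probabilities follow by dividing by the sum of all weights.

```python
def site_occupancy(K, c):
    """Equilibrium probability that a single site is bound.

    K is an assumed binding constant and c an assumed TF concentration;
    the statistical weight of the bound state is q = K * c.
    """
    q = K * c
    return q / (1.0 + q)

def two_site_probabilities(q1, q2, omega=1.0):
    """Probabilities of the four configurations of two binding sites.

    q1 and q2 are the statistical weights of each site being bound alone;
    omega > 1 models cooperative binding, omega = 1 independent binding.
    """
    weights = {
        "empty": 1.0,
        "site1": q1,
        "site2": q2,
        "both": omega * q1 * q2,
    }
    Z = sum(weights.values())  # partition function over all configurations
    return {state: w / Z for state, w in weights.items()}
```

In a full model of this kind, parameters such as `K` and `omega` cannot be measured directly and would be the quantities fitted to measured expression patterns.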
Information processing in biology
To survive, organisms must respond appropriately to a variety of challenges posed by a dynamic and uncertain environment. The mechanisms underlying such responses can in general be framed as input-output devices which map environment states (inputs) to associated responses (outputs). In this light, it is appealing to attempt to model these systems using information theory, a well-developed mathematical framework to describe input-output systems.
Under the information-theoretic perspective, an organism’s behavior is fully characterized by the repertoire of its outputs under different environmental conditions. Due to natural selection, it is reasonable to assume this input-output mapping has been fine-tuned in such a way as to maximize the organism’s fitness. If that is the case, it should be possible to abstract away the mechanistic implementation details and obtain the general principles that lead to fitness under a certain environment. These can then be used inferentially both to generate hypotheses about the underlying implementation and to predict novel responses under external perturbations.
In this work I use information theory to address the question of how biological systems generate complex outputs using relatively simple mechanisms in a robust manner. In particular, I will examine how communication and distributed processing can lead to emergent phenomena which allow collective systems to respond in a much richer way than a single organism could.
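To make the input-output framing concrete, a standard quantity for such a mapping is the mutual information between environment states and responses. The sketch below is a generic illustration and not taken from the thesis: the function name `mutual_information` and the representation of the mapping as a discrete joint probability table are assumptions made here.

```python
import numpy as np

def mutual_information(joint):
    """Mutual information I(X; Y) in bits from a joint probability table.

    joint[i, j] is the (possibly unnormalised) probability of input state i
    occurring together with output response j.
    """
    joint = np.asarray(joint, dtype=float)
    joint = joint / joint.sum()                # normalise to a distribution
    px = joint.sum(axis=1, keepdims=True)      # marginal over inputs
    py = joint.sum(axis=0, keepdims=True)      # marginal over outputs
    mask = joint > 0                           # skip zero cells (0 log 0 = 0)
    ratio = joint[mask] / (px @ py)[mask]
    return float(np.sum(joint[mask] * np.log2(ratio)))
```

For two equally likely environment states, a response that tracks the state perfectly carries one bit, while a response that ignores the state carries none.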
Investigating processing body condensation, material properties and function during early Drosophila development
How cells control spatiotemporal protein synthesis is a biological phenomenon with broad implications in physiology, pathology, and therapeutics. Cellular material can be organised into membrane-bound or membrane-less compartments. The latter, commonly known as biomolecular condensates, often contain ribonucleoprotein assemblies that form via liquid-liquid phase separation. Conserved across eukaryotes, processing bodies (P bodies) are cytoplasmic biomolecular condensates which act as hubs for RNA regulation including storage, translation, and degradation. During Drosophila oogenesis, several maternal mRNAs are stored and translationally repressed inside P bodies before they are translated and degraded in the early embryo. How P bodies differentially regulate maternal RNAs is not fully understood.
Using a combination of in vivo and in vitro assays, I show that P bodies in the mature oocyte exist as multilayered viscoelastic condensates that are regulated by synergistic, multivalent interactions between structurally distinct protein and RNA molecules. I also demonstrate that the gel-like biophysical state of P bodies allows for the storage of bicoid, a key maternal transcript needed for embryonic development. Using pharmacological disruption and live imaging, I show that large scale cytoplasmic reorganisation causes P body dispersal during the oocyte to embryo transition to release the stored maternal mRNAs. Finally, in the early embryo, using live imaging and biochemical analyses, I show that P bodies re-form into smaller sized, highly dynamic condensates with altered post-translational and biochemical modifications.
Taken together, developmental cues coordinate synergistic macromolecular interactions and cytoplasmic modifications to differentially regulate RNAs through P body phase transitions during early Drosophila development.