3,698 research outputs found

    Transcriptional regulatory network refinement and quantification through kinetic modeling, gene expression microarray data and information theory

    BACKGROUND: Gene expression microarray and other multiplex data hold promise for addressing the challenges of cellular complexity, refined diagnoses and the discovery of well-targeted treatments. A new approach to the construction and quantification of transcriptional regulatory networks (TRNs) is presented that integrates gene expression microarray data and cell modeling through information theory. Given a partial TRN and time series data, a probability density is constructed that is a functional of the time course of transcription factor (TF) thermodynamic activities at the site of gene control, and is a function of mRNA degradation and transcription rate coefficients, and equilibrium constants for TF/gene binding. RESULTS: Our approach yields more physicochemical information that complements the results of network structure delineation methods, and thereby can serve as an element of a comprehensive TRN discovery/quantification system. The most probable TF time courses and values of the aforementioned parameters are obtained by maximizing the probability obtained through entropy maximization. Observed time delays between mRNA expression and activity are accounted for implicitly since the time course of the activity of a TF is coupled by probability functional maximization, and is not assumed to be proportional to expression level of the mRNA type that translates into the TF. This allows one to investigate post-translational and TF activation mechanisms of gene regulation. Accuracy and robustness of the method are evaluated. A kinetic formulation is used to facilitate the analysis of phenomena with a strongly dynamical character while a physically-motivated regularization of the TF time course is found to overcome difficulties due to omnipresent noise and data sparsity that plague other methods of gene expression data analysis. An application to Escherichia coli is presented.
CONCLUSION: Multiplex time series data can be used for the construction of the network of cellular processes and the calibration of the associated physicochemical parameters. We have demonstrated these concepts in the context of gene regulation understood through the analysis of gene expression microarray time series data. Casting the approach in a probabilistic framework has allowed us to address the uncertainties in gene expression microarray data. Our approach was found to be robust to error in the gene expression microarray data and mistakes in a proposed TRN

    The complexity of gene expression dynamics revealed by permutation entropy

    Background: High complexity is considered a hallmark of living systems. Here we investigate the complexity of temporal gene expression patterns using the concept of Permutation Entropy (PE) first introduced in dynamical systems theory. The analysis of gene expression data has so far focused primarily on the identification of differentially expressed genes, or on the elucidation of pathway and regulatory relationships. We aim to study gene expression time series data from the viewpoint of complexity. Results: Applying the PE complexity metric to abiotic stress response time series data in Arabidopsis thaliana, genes involved in stress response and signaling were found to be associated with the highest complexity not only under stress, but surprisingly, also under reference, non-stress conditions. Genes with house-keeping functions exhibited lower PE complexity. Compared to reference conditions, the PE of temporal gene expression patterns generally increased upon stress exposure. High-complexity genes were found to have longer upstream intergenic regions and more cis-regulatory motifs in their promoter regions indicative of a more complex regulatory apparatus needed to orchestrate their expression, and to be associated with higher correlation network connectivity degree. Arabidopsis genes also present in other plant species were observed to exhibit decreased PE complexity compared to Arabidopsis specific genes. Conclusions: We show that Permutation Entropy is a simple yet robust and powerful approach to identify temporal gene expression profiles of varying complexity that is equally applicable to other types of molecular profile data.
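As a rough illustration of the Permutation Entropy metric named in the abstract above, the following Python sketch computes the entropy of ordinal patterns in a short time series. It follows the standard ordinal-pattern definition and is not taken from the paper: the function name `permutation_entropy`, the parameters `order` and `delay`, the tie-breaking rule, and the normalization by log2(order!) are all assumptions made for this example.

```python
import math
from collections import Counter


def permutation_entropy(series, order=3, delay=1, normalize=True):
    """Entropy (in bits) of the ordinal patterns found in `series`.

    Hypothetical helper for illustration; `order` is the pattern length
    and `delay` the spacing between sampled points (both assumed names).
    """
    n_windows = len(series) - (order - 1) * delay
    if n_windows <= 0:
        raise ValueError("series too short for the chosen order and delay")

    counts = Counter()
    for start in range(n_windows):
        window = [series[start + j * delay] for j in range(order)]
        # Ordinal pattern: the index order that sorts the window.
        # sorted() is stable, so ties are broken by position (an assumption).
        pattern = tuple(sorted(range(order), key=window.__getitem__))
        counts[pattern] += 1

    # Shannon entropy of the observed pattern frequencies.
    probs = [c / n_windows for c in counts.values()]
    entropy = sum(p * math.log2(1.0 / p) for p in probs)

    if normalize:
        # Scale to [0, 1] by the maximum entropy over order! patterns.
        entropy /= math.log2(math.factorial(order))
    return entropy


if __name__ == "__main__":
    # Made-up expression-like profiles, not data from the study.
    steady = [1.0, 1.2, 1.5, 1.9, 2.4, 3.0, 3.7]
    erratic = [1.0, 3.1, 0.4, 2.2, 2.9, 0.1, 1.8]
    print(permutation_entropy(steady), permutation_entropy(erratic))
```

A monotonic profile contains a single ordinal pattern and therefore scores zero, while a profile whose ups and downs vary from window to window scores closer to one. Short time series limit the usable `order`, since the number of possible patterns grows factorially.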

    Signaling network prediction by the Ontology Fingerprint enhanced Bayesian network

    Background: Despite large amounts of available genomic and proteomic data, predicting the structure and response of signaling networks is still a significant challenge. While statistical methods such as Bayesian networks have been explored to meet this challenge, employing existing biological knowledge for network prediction is difficult. The objective of this study is to develop a novel approach that integrates prior biological knowledge in the form of the Ontology Fingerprint to infer cell-type-specific signaling networks via data-driven Bayesian network learning, and to further use the trained model to predict cellular responses. Results: We applied our novel approach to address the Predictive Signaling Network Modeling challenge of the fourth (2009) Dialogue for Reverse Engineering Assessments and Methods (DREAM4) competition. The challenge results showed that our method accurately captured signal transduction of a network of protein kinases and phosphoproteins in that the predicted protein phosphorylation levels under all experimental conditions were highly correlated (R² = 0.93) with the observed results. Based on the evaluation of the DREAM4 organizer, our team was ranked as one of the top five best performers in predicting network structure and protein phosphorylation activity under test conditions. Conclusions: Bayesian networks can be used to simulate the propagation of signals in cellular systems. Incorporating the Ontology Fingerprint as prior biological knowledge allows us to efficiently infer concise signaling network structure and to accurately predict cellular responses. http://deepblue.lib.umich.edu/bitstream/2027.42/109490/1/12918_2012_Article_989.pd

    Synthesis of Biological and Mathematical Methods for Gene Network Control

    abstract: Synthetic biology is an emerging field which melds genetics, molecular biology, network theory, and mathematical systems to understand, build, and predict gene network behavior. As an engineering discipline, developing a mathematical understanding of the genetic circuits being studied is of fundamental importance. In this dissertation, mathematical concepts for understanding, predicting, and controlling gene transcriptional networks are presented and applied to two synthetic gene network contexts. First, this engineering approach is used to improve the function of the guide ribonucleic acid (gRNA)-targeted, dCas9-regulated transcriptional cascades through analysis and targeted modification of the RNA transcript. In so doing, a fluorescent guide RNA (fgRNA) is developed to more clearly observe gRNA dynamics and aid design. It is shown that through careful optimization, RNA Polymerase II (Pol II) driven gRNA transcripts can be strong enough to exhibit measurable cascading behavior, previously only shown in RNA Polymerase III (Pol III) circuits. Second, inherent gene expression noise is used to achieve precise fractional differentiation of a population. Mathematical methods are employed to predict and understand the observed behavior, and metrics for analyzing and quantifying similar differentiation kinetics are presented. Through careful mathematical analysis and simulation, coupled with experimental data, two methods for achieving ratio control are presented, with the optimal schema for any application being dependent on the noisiness of the system under study. Together, these studies push the boundaries of gene network control, with potential applications in stem cell differentiation, therapeutics, and bio-production. Dissertation/Thesis. Doctoral Dissertation, Biomedical Engineering 201

    Decoding Complexity in Metabolic Networks using Integrated Mechanistic and Machine Learning Approaches

    How can we get living cells to do what we want? What do they actually ‘want’? What ‘rules’ do they observe? How can we better understand and manipulate them? Answers to fundamental research questions like these are critical to overcoming bottlenecks in metabolic engineering and optimizing heterologous pathways for synthetic biology applications. Unfortunately, biological systems are too complex to be completely described by physicochemical modeling alone. In this research, I developed and applied integrated mechanistic and data-driven frameworks to help uncover the mysteries of cellular regulation and control. These tools provide a computational framework for seeking answers to pertinent biological questions. Four major tasks were accomplished. First, I developed innovative tools for key areas in the genome-to-phenome mapping pipeline. An efficient gap filling algorithm (called BoostGAPFILL) that integrates mechanistic and machine learning techniques was developed for the refinement of genome-scale metabolic network reconstructions. Genome-scale metabolic network reconstructions are finding ever increasing applications in metabolic engineering for industrial, medical and environmental purposes. Second, I designed a thermodynamics-based framework (called REMEP) for mutant phenotype prediction (integrating metabolomics, fluxomics and thermodynamics data). These tools will go a long way in improving the fidelity of model predictions of microbial cell factories. Third, I designed a data-driven framework for characterizing and predicting the effectiveness of metabolic engineering strategies. This involved building a knowledgebase of historical microbial cell factory performance from published literature. Advanced machine learning concepts, such as ensemble learning and data augmentation, were employed in combination with standard mechanistic models to develop a predictive platform for important industrial biotechnology metrics such as yield, titer, and productivity. 
Fourth, my modeling tools and skills have been used for case studies on fungal lipid metabolism analyses, E. coli resource allocation balances, reconstruction of the genome-scale metabolic network for a non-model species, R. opacus, as well as the rapid prediction of bacterial heterotrophic fluxomics. In the long run, this integrated modeling approach will significantly shorten the “design-build-test-learn” cycle of metabolic engineering, as well as provide a platform for biological discovery

    THE ROLE OF GENE EXPRESSION NOISE IN MAMMALIAN CELL SURVIVAL

    Drug resistance and metastasis remain obstacles to effective cancer treatment. A major challenge contributing to this problem is cellular heterogeneity. Even in the same environment, cells with identical genomes can display cell-to-cell differences in gene expression, also known as gene expression noise. Gene expression noise can vary in magnitude in a population or in fluctuation time scales, which is influenced by gene regulatory networks. Currently, it is unclear how gene expression noise from gene regulatory networks contributes to drug survival outcomes in mammalian cells. An isogenic cell line with a noise-modulating genetic system tuned to the same mean is required. Additionally, how modulating endogenous mean gene expression and noise in living cells influences pro-survival metastatic state transitions remains unanswered. To address these knowledge gaps, I implemented an exogenous synthetic biology approach to control noise for the drug resistance gene PuroR in drug survival while complementing with endogenous expression measurements of the pro-metastatic gene BACH1 as a correlate for metastatic survival. For exogenous control, I developed synthetic gene circuits in Chinese Hamster Ovary (CHO) cells based on positive and negative feedback that tune noise for PuroR at identical mean expression. At a decoupled noise point, isogenic cells were treated with various Puromycin concentrations. Evolution experiments revealed that noise hurts drug resistance during low drug dosage while facilitating resistance at a high Puromycin concentration. Drug adaptation for the low-noise gene circuit relied on intra-circuit mutations while the high-noise circuit did not and became re-sensitized to drug after removing circuit induction. To implement the endogenous approach, I tagged the endogenous BACH1 gene with the mCherry fluorescent protein in six HEK293 clones. 
Molecular perturbations such as serum starvation and long-term hemin treatment altered mean fluorescence in at least one clone. Additionally, monitoring migration after cell wounding revealed increased non-uniform fluorescence at the wound edge. The increased mean fluorescence for the potentially bistable HEK293 clone 2C10 during hemin treatment may reflect altered BACH1 state transitions. Overall, noise enhanced the probability of cells to reach an expression level that confers survival during drug treatment while hemin perturbations may induce a pro-survival metastatic transition via BACH1 expression

    Enhanced characteristics of genetically modified switchgrass (Panicum virgatum L.) for high biofuel production

    Background: Lignocellulosic biomass is one of the most promising renewable and clean energy resources to reduce greenhouse gas emissions and dependence on fossil fuels. However, the resistance to accessibility of sugars embedded in plant cell walls (so-called recalcitrance) is a major barrier to economically viable cellulosic ethanol production. A recent report from the US National Academy of Sciences indicated that, “absent technological breakthroughs”, it was unlikely that the US would meet the congressionally mandated renewable fuel standard of 35 billion gallons of ethanol-equivalent biofuels plus 1 billion gallons of biodiesel by 2022. We here describe the properties of switchgrass (Panicum virgatum) biomass that has been genetically engineered to increase the cellulosic ethanol yield by more than 2-fold. Results: We have increased the cellulosic ethanol yield from switchgrass by 2.6-fold through overexpression of the transcription factor PvMYB4. This strategy reduces carbon deposition into lignin and phenolic fermentation inhibitors while maintaining the availability of potentially fermentable soluble sugars and pectic polysaccharides. Detailed biomass characterization analyses revealed that the levels and nature of phenolic acids embedded in the cell-wall, the lignin content and polymer size, lignin internal linkage levels, linkages between lignin and xylans/pectins, and levels of wall-bound fucose are all altered in PvMYB4-OX lines. Genetically engineered PvMYB4-OX switchgrass therefore provides a novel system for further understanding cell wall recalcitrance. Conclusions: Our results have demonstrated that overexpression of PvMYB4, a general transcriptional repressor of the phenylpropanoid/lignin biosynthesis pathway, can lead to very high yield ethanol production through dramatic reduction of recalcitrance.
MYB4-OX switchgrass is an excellent model system for understanding recalcitrance, and provides new germplasm for developing switchgrass cultivars as biomass feedstocks for biofuel production. Keywords: Switchgrass; Bioenergy; Biofuel; Feedstock; Cellulosic ethanol; PvMYB4; Transcription factor; Cell wall; Recalcitrance; Lignin; Hemicellulose; Pecti