
    Dynamical hybrid modeling of human metabolism

    Human metabolism plays a key role in disease pathogenesis and drug action. Half a century of biochemical literature, leveraged by the advent of genomics, allowed the emergence of computational modeling techniques and the in silico analysis of complex biological systems. In particular, Constraint-Based Reconstruction and Analysis (COBRA) methods address the complexity of metabolism by building tissue-specific networks at steady state. Biological systems are known to respond to perturbations induced by pathogens, drugs or malignant processes by shifting their activity to safeguard key metabolic functions. Extending the modeling framework to consider the dynamics of these complex systems will bring simulations closer to observable human phenotypes. In this thesis, I combined physiologically-based pharmacokinetic (PBPK) models with genome-scale metabolic models (GSMMs) to form hybrid genome-scale dynamical models that provide a hypothesis-free framework to study the perturbations induced by one or more perturbagens on human tissues. In a first stage, these methodologies were applied to decipher the absorption of levodopa and amino acids by the intestinal epithelium, which allowed the derivation of a model-based diet for Parkinson's disease patients. In the next phase, we extended the study to 605 drugs in order to predict the occurrence of gastrointestinal side effects through a machine learning classifier, using a combination of gene expression and metabolic reactions as features. Finally, the approach was scaled up to several tissues, specifically to investigate the genesis of metabolic symptoms in type 1 diabetes and to suggest key metabolic players underlying within- and between-individual variability in insulin action. Taken as a whole, the integration of two modeling techniques constrained by expert biological knowledge and heterogeneous data types will be a step forward in achieving convergence in human biology.

    Deciphering transcriptional patterns of gene regulation : a computational approach

    With rapid advancements in sequencing technology, we now have the ability to sequence the entire human genome, and to quantify expression of tens of thousands of genes from hundreds of individuals. This provides an extraordinary opportunity to learn phenotype-relevant genomic patterns that can improve our understanding of molecular and cellular processes underlying a trait. The high-dimensional nature of genomic data presents a range of computational and statistical challenges. This dissertation presents a compilation of projects that were driven by the motivation to efficiently capture gene regulatory patterns in the human transcriptome, while addressing the statistical and computational challenges that accompany these data. We attempt to address two major difficulties in this domain: a) artifacts and noise in transcriptomic data, and b) limited statistical power. First, we present our work on investigating the effect of artifactual variation in gene expression data and its impact on trans-eQTL discovery. Here we performed an in-depth analysis of diverse pre-recorded covariates and latent confounders to understand their contribution to heterogeneity in gene expression measurements. Next, we discovered 673 trans-eQTLs across 16 human tissues using v6 data from the Genotype-Tissue Expression (GTEx) project. Finally, we characterized two trait-associated trans-eQTLs: one in Skeletal Muscle and another in Thyroid. Second, we present a principal-component-based residualization method to correct gene expression measurements prior to reconstruction of co-expression networks. In this work, we demonstrated theoretically, in simulation, and empirically that principal component correction of gene expression measurements prior to network inference can reduce false positive edges. Using data from the GTEx project in multiple tissues, we showed that this approach reduced false discoveries beyond correcting only for known confounders. Third, we present a multi-study integration approach to identify universal transcriptional patterns underlying epithelial to mesenchymal transition (EMT) across different cancer types. With informed statistical analysis and functional validation, we identified consensus ranked universal EMT genes. This gene list consisted of a) known EMT genes, b) genes studied in a subset of carcinomas, unknown in prostate cancer, and c) novel EMT and cancer genes such as C1orf116. Finally, we present methods to integrate co-expression signals across multiple human RNA-seq datasets to reconstruct networks with increased power. First, we considered multiple aggregation strategies to build context-agnostic networks using data from recount2. These networks captured ubiquitous patterns of gene co-expression shared across tissues and cell types. Next, we briefly describe a hierarchical mixture model, groupNet, that leverages signal from multiple datasets to learn the structure of a Gaussian Markov random field (GMRF) to build context-specific co-expression networks.
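
    The principal component correction mentioned in the abstract above can be illustrated with a minimal sketch. This is not the dissertation's implementation: the function names, the simulated data, the number of removed components, and the correlation threshold are all assumptions chosen for illustration, and a thresholded Pearson correlation stands in for whatever network inference method the author used.

```python
import numpy as np


def remove_top_pcs(expr, n_pcs):
    """Return residuals of a samples x genes matrix after removing the top PCs.

    Hypothetical helper: centers each gene, then subtracts the part of the
    data explained by the first n_pcs principal components.
    """
    centered = expr - expr.mean(axis=0, keepdims=True)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    # Rank-n_pcs reconstruction from the leading singular vectors.
    explained = (u[:, :n_pcs] * s[:n_pcs]) @ vt[:n_pcs]
    return centered - explained


def coexpression_edges(expr, threshold):
    """Boolean gene x gene adjacency from thresholded absolute correlation."""
    corr = np.corrcoef(expr, rowvar=False)
    np.fill_diagonal(corr, 0.0)  # ignore self-correlation
    return np.abs(corr) > threshold


def simulate_confounded_expression(n_samples=200, n_genes=20, seed=0):
    """Independent genes plus one shared confounder (assumed toy setup)."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(size=(n_samples, n_genes))
    confounder = rng.normal(size=n_samples)
    loadings = rng.uniform(1.0, 2.0, size=n_genes)
    return noise + np.outer(confounder, loadings)


if __name__ == "__main__":
    expr = simulate_confounded_expression()
    raw_edges = coexpression_edges(expr, threshold=0.4).sum() // 2
    corrected = remove_top_pcs(expr, n_pcs=1)
    corrected_edges = coexpression_edges(corrected, threshold=0.4).sum() // 2
    print("edges before correction:", raw_edges)
    print("edges after correction:", corrected_edges)
```

    In this toy setup the genes are independent by construction, so every edge found before correction is a false positive induced by the shared confounder. In real data, the number of components to remove has to be chosen with care, since removing too many can also strip out genuine broad co-expression signal.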

    An evaluation protocol for subtype-specific breast cancer event prediction

    In recent years, increasing evidence has appeared that breast cancer may not constitute a single disease at the molecular level, but comprises a heterogeneous set of subtypes. This suggests that instead of building a single monolithic predictor, better predictors might be constructed that solely target samples of a designated subtype, which are believed to represent more homogeneous sets of samples. An unavoidable drawback of developing subtype-specific predictors, however, is that stratification by subtype drastically reduces the number of samples available for their construction. As numerous studies have indicated sample size to be an important factor in predictor construction, it is questionable whether the potential benefit of subtyping can outweigh the drawback of a severe loss in sample size. Factors like unequal class distributions and differences in the number of samples per subtype further complicate comparisons. We present a novel experimental protocol that facilitates a comprehensive comparison between subtype-specific predictors and predictors that do not take subtype information into account. Emphasis lies on careful control of sample size as well as class and subtype distributions. The methodology is applied to a large breast cancer compendium involving over 1500 arrays, using a state-of-the-art subtyping scheme. We show that the resulting subtype-specific predictors outperform those that do not take subtype information into account, especially when sample size is controlled for.
    Mediamatics, Electrical Engineering, Mathematics and Computer Science
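
    One ingredient of the comparison described in the abstract above, controlling sample size and class distribution, can be sketched as follows. This is only an illustration under stated assumptions and not the authors' protocol: the function name, its arguments, and the seeding are hypothetical, and the sketch does not attempt to control the subtype distribution of the pooled subset.

```python
import random
from collections import Counter


def size_matched_indices(labels, in_subtype, seed=0):
    """Pick a subtype-agnostic subset with the subtype's per-class counts.

    labels:     class label per sample (e.g. "event" / "no_event")
    in_subtype: True for samples that belong to the subtype of interest
    Returns sorted indices drawn from ALL samples, so a predictor trained on
    them sees the same sample size and class balance as the subtype-specific
    predictor does.
    """
    rng = random.Random(seed)
    needed = Counter(lab for lab, flag in zip(labels, in_subtype) if flag)
    chosen = []
    for cls, n_needed in needed.items():
        # Every subtype sample is also in the pool, so the pool is large enough.
        pool = [i for i, lab in enumerate(labels) if lab == cls]
        chosen.extend(rng.sample(pool, n_needed))
    return sorted(chosen)
```

    Repeating the draw with different seeds and averaging the resulting performance would give a fairer baseline than a single subset, since any one draw can be unrepresentative.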