17 research outputs found

    Metabolomics (Los Angel)

    Background, cancer significance and question: BioProspecting is a novel approach that enabled our team to mine genetic-marker data from the New England Journal of Medicine (NEJM) using the Systematized Nomenclature of Medicine-Clinical Terms (SNOMED CT) and the Human Gene Ontology (HUGO). Genes were associated with disorders using the Multi-threaded Clinical Vocabulary Server (MCVS) Natural Language Processing (NLP) engine, whose output was represented as an ontology network incorporating the semantic encodings of the literature. Metabolic functions were used to identify potentially novel relationships between (genes or proteins) and (diseases or drugs). To identify genes important to the transformation of normal tissue into a malignancy, we identified the genes linked to multiple cancers and then mapped those genes to metabolic and signaling pathways.
    Findings: Ten genes were related to 30 or more cancers, 72 genes were related to 20 or more cancers, and 191 genes were related to 10 or more cancers. The three pathways most often associated with the top 200 novel cancer markers were Acute Phase Response Signaling, Glucocorticoid Receptor Signaling, and Hepatic Fibrosis/Hepatic Stellate Cell Activation.
    Meaning and implications of the advance: This association highlights the role of inflammation in the induction and perhaps transformation of mortal cells into cancers.
    Major findings: BioProspecting can speed our identification and understanding of synergies between articles in the biomedical literature. In this case we found considerable synergy between the oncology literature and the sepsis literature. By mapping these associations to known metabolic, regulatory and signaling pathways, we were able to identify further evidence for the inflammatory basis of cancer.
    Funding: R01 PH000022/PH/PHPPO CDC HHS/United States; U38 HK000014/HK/PHITPO CDC HHS/United States; UL1 RR029887/RR/NCRR NIH HHS/United States; PHS HHS/United States. 2013-11-27; 24294537; PMC384134

    Correlation analysis reveals the emergence of coherence in the gene expression dynamics following system perturbation

    Time-course gene expression experiments are a popular means to infer co-expression. Many methods have been proposed to cluster genes or to build networks based on similarity measures of their expression dynamics. In this paper we apply a correlation-based approach to network reconstruction to three datasets of time-series gene expression following system perturbation: 1) conditional, tamoxifen-dependent activation of the cMyc proto-oncogene in rat fibroblasts; 2) genomic response to nutrition changes in D. melanogaster; 3) patterns of gene activity as a consequence of ageing, measured over a life-span time series (25y–90y) sampled from T-cells of human donors.
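
The abstract does not say which correlation measure or cutoff the authors use, so the following is only a minimal sketch of correlation-based network reconstruction in general. It assumes Pearson correlation and a fixed absolute threshold; the gene names, toy time courses, and the 0.9 cutoff are hypothetical and are not taken from the paper.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def correlation_network(profiles, threshold=0.9):
    """Return edges (gene_a, gene_b, r) for every pair with |r| >= threshold."""
    genes = sorted(profiles)
    edges = []
    for i, a in enumerate(genes):
        for b in genes[i + 1:]:
            r = pearson(profiles[a], profiles[b])
            if abs(r) >= threshold:
                edges.append((a, b, r))
    return edges

# Hypothetical toy time courses (illustrative only, not data from the paper).
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 3.9, 6.2, 8.0, 9.9],
    "geneC": [5.0, 3.0, 4.0, 1.0, 2.0],
}
edges = correlation_network(profiles, threshold=0.9)
```

With these toy values only the geneA/geneB pair passes the cutoff; geneC is anti-correlated with both, but with |r| near 0.8 it stays below the assumed threshold.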

    Identifying Tmem59 related gene regulatory network of mouse neural stem cell from a compendium of expression profiles

    Background: Neural stem cells offer potential treatment for neurodegenerative disorders such as Alzheimer's disease (AD). While much progress has been made in understanding neural stem cell function, a precise description of the molecular mechanisms regulating neural stem cells is not yet established. This lack of knowledge is a major barrier holding back the discovery of therapeutic uses of neural stem cells. In this paper, the regulatory mechanism of mouse neural stem cell (NSC) differentiation by tmem59 is explored at the genome level.
    Results: We identified regulators of tmem59 during the differentiation of mouse NSCs from a compendium of expression profiles. Based on the microarray experiment, we developed the parallelized SWNI algorithm to reconstruct gene regulatory networks of mouse neural stem cells. From the inferred tmem59-related gene network of 36 genes, pou6f1 was identified as a significant regulator of tmem59 and might play an important role in the differentiation of NSCs in mouse brain. Four pathways appear in the gene network, indicating that tmem59 is located downstream in the signalling pathway. Real-time RT-PCR results showed that over-expression of pou6f1 could significantly up-regulate tmem59 expression in the C17.2 NSC line. 16 out of 36 predicted genes in our constructed network have been reported to be AD-related, including Ace, aqp1, arrdc3, cd14, cd59a, cds1, cldn1, cox8b, defb11, folr1, gdi2, mmp3, mgp, myrip, Ripk4, rnd3, and sncg. The localization of tmem59-related genes and functionally related gene groups, based on Gene Ontology (GO) annotation, was also identified.
    Conclusions: Our findings suggest that the expression of tmem59 is an important factor contributing to AD. The parallelized SWNI algorithm increased the efficiency of network reconstruction significantly. This study enables us to highlight novel genes that may be involved in NSC differentiation and provides a shortcut to identifying genes for AD.

    The Structure of a Gene Co-Expression Network Reveals Biological Functions Underlying eQTLs

    What are the commonalities between genes whose expression level is partially controlled by eQTL, especially with regard to biological functions? Moreover, how are these genes related to a phenotype of interest? These issues are particularly difficult to address when the genome annotation is incomplete, as is the case for mammalian species. Moreover, the direct link between gene expression and a phenotype of interest may be weak, and thus difficult to handle. In this framework, the use of a co-expression network has proven useful: it is a robust approach for modeling a complex system of genetic regulations and for inferring knowledge about as yet unknown genes. In this article, a case study was conducted with a mammalian species. It showed that the use of a co-expression network based on partial correlation, combined with a relevant clustering of nodes, leads to an enrichment of biological functions of around 83%. Moreover, the use of a spatial statistics approach allowed us to superimpose additional information related to a phenotype; this led to highlighting specific genes or gene clusters that are related to the network structure and the phenotype. Three main results are worth noting: first, key genes were highlighted as a potential focus for forthcoming biological experiments; second, a set of biological functions, which support a list of genes under partial eQTL control, was set up by an overview of the global structure of the gene expression network; third, pH was found to be correlated with gene clusters, and then with related biological functions, as a result of a spatial analysis of the network topology.
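
The abstract does not describe how the partial correlations are estimated, so the following is not the authors' procedure. As a small textbook illustration of why partial correlation can remove indirect links from a co-expression network, this sketch uses the first-order formula for three variables; the function name and the correlation values are hypothetical.

```python
import math

def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y, controlling for z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical case 1: x and y correlate only through a shared driver z,
# so the partial correlation is (numerically) zero and no edge is drawn.
indirect = partial_corr(0.48, 0.8, 0.6)

# Hypothetical case 2: x and y are strongly correlated and z explains
# almost none of it, so the partial correlation stays high.
direct = partial_corr(0.9, 0.1, 0.1)
```

In a network built on raw correlation, both pairs would be linked; with partial correlation only the second pair keeps its edge.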

    On the Choice and Number of Microarrays for Transcriptional Regulatory Network Inference

    Background: Transcriptional regulatory network inference (TRNI) from large compendia of DNA microarrays has become a fundamental approach for discovering transcription factor (TF)-gene interactions at the genome-wide level. In correlation-based TRNI, network edges can in principle be evaluated using standard statistical tests. However, while such tests nominally assume independent microarray experiments, we expect dependency between the experiments in microarray compendia, due to both project-specific factors (e.g., microarray preparation, environmental effects) in the multi-project compendium setting and effective dependency induced by gene-gene correlations. Herein, we characterize the nature of dependency in an Escherichia coli microarray compendium and explore its consequences on the problem of determining which and how many arrays to use in correlation-based TRNI.
    Results: We present evidence of substantial effective dependency among microarrays in this compendium, and characterize that dependency with respect to experimental condition factors. We then introduce a measure n_eff of the effective number of experiments in a compendium, and find that, corresponding to the dependency observed in this particular compendium, there is a huge reduction in effective sample size, i.e., n_eff = 14.7 versus n = 376. Furthermore, we found that the n_eff of select subsets of experiments actually exceeded the n_eff of the full compendium, suggesting that the adage 'less is more' applies here. Consistent with this latter result, we observed improved performance in TRNI using subsets of the data compared to results using the full compendium. We identified experimental condition factors that trend with changes in TRNI performance and n_eff, including growth phase and media type. Finally, using the set of known E. coli genetic regulatory interactions from RegulonDB, we demonstrated that false discovery rates (FDR) derived from n_eff-adjusted p-values were well-matched to FDR based on the RegulonDB truth set.
    Conclusions: These results support utilization of n_eff as a potent descriptor of microarray compendia. In addition, they highlight a straightforward correlation-based method for TRNI with demonstrated meaningful statistical testing for significant edges, readily applicable to compendia from any species, even when a truth set is not available. This work facilitates a more refined approach to construction and utilization of mRNA expression compendia in TRNI.
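
To give a sense of how much an effective sample size of 14.7 rather than 376 can change edge significance, here is a minimal sketch using the standard Fisher z-test for a correlation coefficient. Whether the authors plug their effective sample size into this particular test is an assumption; the abstract only states that adjusted p-values were used. The correlation value 0.5 and the function name are hypothetical.

```python
import math

def fisher_z_pvalue(r, n):
    """Two-sided p-value for H0: rho = 0, using the Fisher z-transform.

    n may be a non-integer effective sample size (assumed usage).
    """
    z = math.atanh(r) * math.sqrt(n - 3)
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for a standard normal.
    return math.erfc(abs(z) / math.sqrt(2))

r = 0.5  # hypothetical TF-gene correlation
p_nominal = fisher_z_pvalue(r, 376)    # treats all arrays as independent
p_adjusted = fisher_z_pvalue(r, 14.7)  # uses the reported effective size
```

Under these assumptions the same correlation goes from overwhelmingly significant at the nominal sample size to borderline at the effective one, which is why ignoring dependency among arrays can inflate the number of called edges.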

    Network Medicine in the Age of Biomedical Big Data

    Network medicine is an emerging area of research dealing with molecular and genetic interactions, network biomarkers of disease, and therapeutic target discovery. Large-scale biomedical data generation offers a unique opportunity to assess the effect and impact of cellular heterogeneity and environmental perturbations on the observed phenotype. Marrying the two, network medicine with biomedical data provides a framework to build meaningful models and extract impactful results at a network level. In this review, we survey existing network types and biomedical data sources. More importantly, we delve into ways in which the network medicine approach, aided by phenotype-specific biomedical data, can be gainfully applied. We provide three paradigms, mainly dealing with three major biological network archetypes: protein-protein interaction, expression-based, and gene regulatory networks. For each of these paradigms, we give a broad overview of the philosophies under which various network methods work. We also provide a few examples in each paradigm as test cases of its successful application. Finally, we delineate several opportunities and challenges in the field of network medicine. We hope this review provides a lexicon that helps researchers from the biological sciences and network theory get on the same page when working on research areas that require interdisciplinary expertise. Taken together, the understanding gained from combining biomedical data with networks can be useful for characterizing disease etiologies and identifying therapeutic targets, which, in turn, will lead to better preventive medicine with translational impact on personalized healthcare.