27 research outputs found
Functional Genomics Complements Quantitative Genetics in Identifying Disease-Gene Associations
An ultimate goal of genetic research is to understand the connection between genotype and phenotype in order to improve the diagnosis and treatment of diseases. The quantitative genetics field has developed a suite of statistical methods to associate genetic loci with diseases and phenotypes, including quantitative trait loci (QTL) linkage mapping and genome-wide association studies (GWAS). However, each of these approaches has technical and biological shortcomings. For example, the amount of heritable variation explained by GWAS is often surprisingly small and the resolution of many QTL linkage mapping studies is poor. The predictive power and interpretation of QTL and GWAS results are consequently limited. In this study, we propose a complementary approach to quantitative genetics by interrogating the vast amount of high-throughput genomic data in model organisms to functionally associate genes with phenotypes and diseases. Our algorithm combines the genome-wide functional relationship network for the laboratory mouse and a state-of-the-art machine learning method. We demonstrate the superior accuracy of this algorithm through predicting genes associated with each of 1157 diverse phenotype ontology terms. Comparison between our prediction results and a meta-analysis of quantitative genetic studies reveals both overlapping candidates and distinct, accurate predictions uniquely identified by our approach. Focusing on bone mineral density (BMD), a phenotype related to osteoporotic fracture, we experimentally validated two of our novel predictions (not observed in any previous GWAS/QTL studies) and found significant bone density defects for both Timp2 and Abcg8 deficient mice. Our results suggest that the integration of functional genomics data into networks, which itself is informative of protein function and interactions, can successfully be utilized as a complementary approach to quantitative genetics to predict disease risks.
All supplementary material is available at http://cbfg.jax.org/phenotype
Comparative Microbial Modules Resource: Generation and Visualization of Multi-species Biclusters
The increasing abundance of large-scale, high-throughput datasets for many closely related organisms provides opportunities for comparative analysis via the simultaneous biclustering of datasets from multiple species. These analyses require a reformulation of how to organize multi-species datasets and visualize the results of comparative genomics analyses. Recently, we developed a method, multi-species cMonkey, which integrates heterogeneous high-throughput datatypes from multiple species to identify conserved regulatory modules. Here we present an integrated data visualization system, built upon the Gaggle, enabling exploration of our method's results (available at http://meatwad.bio.nyu.edu/cmmr.html). The system can also be used to explore other comparative genomics datasets and outputs from other data analysis procedures – results from other multiple-species clustering programs or from independent clustering of different single-species datasets. We provide an example use of our system for two bacteria, Escherichia coli and Salmonella Typhimurium. We illustrate the use of our system by exploring conserved biclusters involved in nitrogen metabolism, uncovering a putative function for yjjI, a currently uncharacterized gene that we predict to be involved in nitrogen assimilation.
Low-variance RNAs identify Parkinson’s disease molecular signature in blood
The diagnosis of Parkinson's disease (PD) is usually not established until advanced neurodegeneration leads to clinically detectable symptoms. Previous blood PD transcriptome studies show low concordance, possibly resulting from the use of microarray technology, which has high measurement variation. The Leucine-rich repeat kinase 2 (LRRK2) G2019S mutation predisposes to PD. Using preclinical and clinical studies, we sought to develop a novel statistically motivated transcriptomic-based approach to identify a molecular signature in the blood of Ashkenazi Jewish PD patients, including LRRK2 mutation carriers. Using a digital gene expression platform to quantify 175 messenger RNA (mRNA) markers with low coefficients of variation (CV), we first compared whole-blood transcript levels in mouse models (1) overexpressing wild-type (WT) LRRK2, (2) overexpressing G2019S LRRK2, and (3) lacking LRRK2 (knockout), and (4) in WT controls. We then studied an Ashkenazi Jewish cohort of 34 symptomatic PD patients (both WT LRRK2 and G2019S LRRK2) and 32 asymptomatic controls. The expression profiles distinguished the four mouse groups with different genetic backgrounds. In patients, we detected significant differences in blood transcript levels both between individuals differing in LRRK2 genotype and between PD patients and controls. Discriminatory PD markers included genes associated with innate and adaptive immunity and inflammatory disease. Notably, gene expression patterns in levodopa-treated PD patients were significantly closer to those of healthy controls in a dose-dependent manner. We identify whole-blood mRNA signatures correlating with LRRK2 genotype and with PD disease state. This approach may provide insight into pathogenesis and a route to early disease detection.
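As an illustration of the low-CV idea mentioned in this abstract, the following is a minimal sketch of how transcripts with a low coefficient of variation (standard deviation divided by the mean) might be selected. It is not the authors' pipeline: the function names, the toy expression values, the gene labels, and the 0.05 threshold are all hypothetical and chosen only for the example.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return the sample standard deviation divided by the mean."""
    m = mean(values)
    if m == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return stdev(values) / m


def select_low_cv(expression, max_cv):
    """Return the sorted names of transcripts whose CV is at most max_cv."""
    return sorted(
        name
        for name, values in expression.items()
        if coefficient_of_variation(values) <= max_cv
    )


# Hypothetical toy data: transcript name -> expression level in each sample.
expression = {
    "geneA": [100.0, 102.0, 98.0, 101.0],  # stable across samples
    "geneB": [10.0, 40.0, 5.0, 80.0],      # highly variable
    "geneC": [50.0, 50.5, 49.5, 50.0],     # stable across samples
}

# The 0.05 cutoff is an assumption for illustration only.
low_cv = select_low_cv(expression, max_cv=0.05)
print(low_cv)
```

With this toy data, only the two stable transcripts pass the filter; in practice the threshold would depend on the measurement platform and the study design.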
Implications of Big Data for cell biology
“Big Data” has surpassed “systems biology” and “omics” as the hottest buzzword in the biological sciences, but is there any substance behind the hype? Certainly, we have learned about various aspects of cell and molecular biology from the many individual high-throughput data sets that have been published in the past 15–20 years. These data, although useful as individual data sets, can provide much more knowledge when interrogated with Big Data approaches, such as applying integrative methods that leverage the heterogeneous data compendia in their entirety. Here we discuss the benefits and challenges of such Big Data approaches in biology and how cell and molecular biologists can best take advantage of them.
Accurate Quantification of Functional Analogy among Close Homologs
Correctly evaluating functional similarities among homologous proteins is necessary for accurate transfer of experimental knowledge from one organism to another, and is of particular importance for the development of animal models of human disease. While the fact that sequence similarity implies functional similarity is a fundamental paradigm of molecular biology, sequence comparison does not directly assess the extent to which two proteins participate in the same biological processes, and has limited utility for analyzing families with several paralogous members. Nevertheless, we show that it is possible to provide a cross-organism functional similarity measure in an unbiased way through the exclusive use of high-throughput gene-expression data. Our methodology is based on probabilistic cross-species mapping of functionally analogous proteins based on Bayesian integrative analysis of gene expression compendia. We demonstrate that even among closely related genes, our method is able to predict functionally analogous homolog pairs better than relying on sequence comparison alone. We also demonstrate that the landscape of functional similarity is often complex and that definitive “functional orthologs” do not always exist. Even in these cases, our method and the online interface we provide are designed to allow detailed exploration of sources of inferred functional similarity that can be evaluated by the user.