89 research outputs found

    A Differential Network Approach to Exploring Differences between Biological States: An Application to Prediabetes

    Background: Variations in the pattern of molecular associations are observed during disease development. The comprehensive analysis of molecular association patterns and their changes in relation to different physiological conditions can yield insight into the biological basis of disease-specific phenotype variation. Methodology: Here, we introduce a formal statistical method for the differential analysis of molecular associations via network representation. We illustrate our approach with extensive data on lipoprotein subclasses measured by NMR spectroscopy in 4,406 individuals with normal fasting glucose and 531 subjects with impaired fasting glucose (prediabetes). We estimate the pair-wise association between measures using shrinkage estimates of partial correlations and build the differential network based on this measure of association. We explore the topological properties of the inferred network to gain insight into important metabolic differences between individuals with normal fasting glucose and prediabetes. Conclusions/Significance: Differential networks provide new insights characterizing differences in biological states. Based on conventional statistical methods, few differences in concentration levels of lipoprotein subclasses were found between individuals with normal fasting glucose and individuals with prediabetes. By performing the differential analysis of networks, several characteristic changes in lipoprotein metabolism known to be related to diabetic dyslipidemias were identified. The results demonstrate the applicability of the new approach to identify key molecular changes inaccessible to standard approaches.
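
The idea of a differential network built from shrinkage partial correlations can be sketched roughly as follows. This is a minimal illustration, not the authors' estimator: the fixed shrinkage intensity `lam`, the edge cutoff `threshold`, and the function names are all assumptions made for the example, and the paper's formal statistical test is replaced here by a plain threshold on the difference.

```python
import numpy as np

def shrunk_partial_correlations(X, lam=0.2):
    """Partial correlations from a correlation matrix shrunk toward the identity.

    X is (samples x variables). lam in [0, 1] is a hypothetical fixed
    shrinkage intensity; a real analysis would estimate it from the data.
    """
    R = np.corrcoef(X, rowvar=False)
    R_shrunk = (1.0 - lam) * R + lam * np.eye(R.shape[0])
    P = np.linalg.inv(R_shrunk)           # precision (inverse correlation) matrix
    d = np.sqrt(np.diag(P))
    pcor = -P / np.outer(d, d)            # standard precision-to-partial-correlation map
    np.fill_diagonal(pcor, 1.0)
    return pcor

def differential_network(X_a, X_b, lam=0.2, threshold=0.2):
    """Difference in partial correlations between two states (b minus a).

    threshold is an illustrative cutoff standing in for a significance test.
    Returns the difference matrix and a boolean adjacency matrix.
    """
    diff = shrunk_partial_correlations(X_b, lam) - shrunk_partial_correlations(X_a, lam)
    np.fill_diagonal(diff, 0.0)
    adjacency = np.abs(diff) > threshold
    return diff, adjacency
```

An edge in the returned adjacency matrix marks a pair of variables whose conditional association differs between the two groups, even when the mean levels of both variables are the same in each group.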

    Simple connectome inference from partial correlation statistics in calcium imaging

    In this work, we propose a simple yet effective solution to the problem of connectome inference in calcium imaging data. The proposed algorithm consists of two steps: first, processing the raw signals to detect neural peak activities; second, inferring the degree of association between neurons from partial correlation statistics. This paper summarises the methodology that led us to win the Connectomics Challenge, proposes a simplified version of our method, and finally compares our results with those of other inference methods.

    Arabidopsis thaliana computationally-generated next-state gene interaction models

    The construction of gene interaction models must be a fully collaborative and intentional effort. All aspects of the research, such as growing the plants, extracting the measurements, refining the measured data, developing the statistical framework, and forming and applying the algorithmic techniques, must lend themselves to repeatable and sound practices. This paper holistically focuses on the process of producing gene interaction models based on transcript abundance data from Arabidopsis thaliana after stimulation by a plant hormone.

    Application of new probabilistic graphical models in the genetic regulatory networks studies

    This paper introduces two new probabilistic graphical models for reconstruction of genetic regulatory networks using DNA microarray data. One is an Independence Graph (IG) model with either a forward or a backward search algorithm and the other is a Gaussian Network (GN) model with a novel greedy search method. The performances of both models were evaluated on four MAPK pathways in yeast and three simulated data sets. Generally, an IG model provides a sparse graph, whereas a GN model produces a dense graph in which more information about gene-gene interactions is preserved. Additionally, we found two key limitations in the prediction of genetic regulatory networks using DNA microarray data: the first is whether the sample size is sufficient, and the second is that the complexity of network structures may not be captured without additional data at the protein level. These limitations are present in all prediction methods that use only DNA microarray data. Comment: 38 pages, 3 figures

    Phenotype Prediction Using Regularized Regression on Genetic Data in the DREAM5 Systems Genetics B Challenge

    A major goal of large-scale genomics projects is to enable the use of data from high-throughput experimental methods to predict complex phenotypes such as disease susceptibility. The DREAM5 Systems Genetics B Challenge solicited algorithms to predict soybean plant resistance to the pathogen Phytophthora sojae from training sets including phenotype, genotype, and gene expression data. The challenge test set was divided into three subcategories, one requiring prediction based on only genotype data, another on only gene expression data, and the third on both genotype and gene expression data. Here we present our approach, primarily using regularized regression, which received the best-performer award for subchallenge B2 (gene expression only). We found that despite the availability of 941 genotype markers and 28,395 gene expression features, optimal models determined by cross-validation experiments typically used fewer than ten predictors, underscoring the importance of strong regularization in noisy datasets with far more features than samples. We also present substantial analysis of the training and test setup of the challenge, identifying high variance in performance on the gold standard test sets. Funding: National Science Foundation (U.S.) Graduate Research Fellowship Program; National Defense Science and Engineering Graduate Fellowship.
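
To illustrate how strong regularization chosen by cross-validation can leave only a handful of predictors when features far outnumber samples, here is a minimal sketch using an L1-penalized (lasso) regression fitted by coordinate descent. The abstract says only "regularized regression", so the choice of the lasso, the function names, the grid of penalties, and the fold count are all assumptions made for this example rather than the authors' pipeline.

```python
import numpy as np

def soft_threshold(z, t):
    """Shrink z toward zero by t, returning 0 when |z| <= t."""
    return np.sign(z) * max(abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=100):
    """Lasso by cyclic coordinate descent.

    Minimizes (1/2n)||y - Xw||^2 + lam * ||w||_1, assuming the columns of X
    are standardized and y is centered.
    """
    n, p = X.shape
    w = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    resid = y.astype(float).copy()        # equals y - Xw while w is all zeros
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            rho = X[:, j] @ resid / n + col_sq[j] * w[j]
            w_new = soft_threshold(rho, lam) / col_sq[j]
            if w_new != w[j]:
                resid -= X[:, j] * (w_new - w[j])
                w[j] = w_new
    return w

def cv_choose_lambda(X, y, lambdas, k=5, seed=0):
    """Pick the penalty with the lowest mean squared error over k folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    errors = []
    for lam in lambdas:
        fold_err = []
        for test_idx in folds:
            train_idx = np.setdiff1d(np.arange(len(y)), test_idx)
            # Standardize using training-fold statistics only.
            mu = X[train_idx].mean(axis=0)
            sd = X[train_idx].std(axis=0)
            sd[sd == 0] = 1.0
            y_mean = y[train_idx].mean()
            w = lasso_cd((X[train_idx] - mu) / sd, y[train_idx] - y_mean, lam)
            pred = ((X[test_idx] - mu) / sd) @ w + y_mean
            fold_err.append(np.mean((y[test_idx] - pred) ** 2))
        errors.append(float(np.mean(fold_err)))
    return lambdas[int(np.argmin(errors))], errors
```

Counting the nonzero entries of the weight vector returned by `lasso_cd` at the selected penalty gives the number of predictors the model actually uses.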

    Identifying a Transcription Factor’s Regulatory Targets from its Binding Targets

    ChIP-chip data, which show binding of transcription factors (TFs) to promoter regions in vivo, are widely used by biologists to identify the regulatory targets of TFs. However, the binding of a TF to a gene does not necessarily imply regulation. Thus, it is important to develop computational methods that can extract a TF’s regulatory targets from its binding targets. We developed a method, called REgulatory Targets Extraction Algorithm (RETEA), which uses partial correlation analysis on gene expression data to extract a TF’s regulatory targets from its binding targets inferred from ChIP-chip data. We applied RETEA to yeast cell cycle microarray data and identified the plausible regulatory targets of eleven known cell cycle TFs. We validated our predictions by checking the enrichments for cell cycle-regulated genes, common cellular processes and common molecular functions. Finally, we showed that RETEA performs better than three published methods (MA-Network, TRIA and Garten et al.’s method).

    Transkingdom Networks: A Systems Biology Approach to Identify Causal Members of Host-Microbiota Interactions

    Improvements in sequencing technologies and reduced experimental costs have resulted in a vast number of studies generating high-throughput data. Although the number of methods to analyze these "omics" data has also increased, computational complexity and lack of documentation hinder researchers from analyzing their high-throughput data to its full potential. In this chapter we detail our data-driven, transkingdom network (TransNet) analysis protocol to integrate and interrogate multi-omics data. This systems biology approach has allowed us to successfully identify important causal relationships between different taxonomic kingdoms (e.g. mammals and microbes) using diverse types of data.

    From Knockouts to Networks: Establishing Direct Cause-Effect Relationships through Graph Analysis

    Background: Reverse-engineering gene networks from expression profiles is a difficult problem for which a multitude of techniques have been developed over the last decade. The yearly organized DREAM challenges allow for a fair evaluation and unbiased comparison of these methods. Results: We propose an inference algorithm that combines confidence matrices, computed as the standard scores from single-gene knockout data, with the down-ranking of feed-forward edges. Substantial improvements in the predictions can be obtained after the execution of this second step. Conclusions: Our algorithm was awarded the best overall performance at the DREAM4 In Silico 100-gene network subchallenge, proving to be effective in inferring medium-size gene regulatory networks. This success demonstrates once again the decisive importance of gene expression data obtained after systematic gene perturbations and highlights the usefulness of graph analysis to increase the reliability of inference.
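
The two steps named in this abstract, a standard-score confidence matrix from single-gene knockouts followed by down-ranking of feed-forward edges, might look roughly like the sketch below. The layout of the knockout matrix, the `strong` cutoff, the multiplicative `penalty`, and the function names are assumptions for illustration; the abstract does not specify how the authors scored or penalized edges.

```python
import numpy as np

def knockout_confidence(K):
    """Confidence matrix of absolute standard scores.

    Assumed layout: K[i, j] is the expression of gene j when gene i is
    knocked out. Z[i, j] measures how far that value lies from gene j's
    mean across all knockout experiments, in standard deviations.
    """
    mu = K.mean(axis=0)
    sd = K.std(axis=0)
    sd[sd == 0] = 1.0                     # avoid division by zero for flat genes
    Z = np.abs(K - mu) / sd
    np.fill_diagonal(Z, 0.0)              # a knockout always affects itself
    return Z

def downrank_feed_forward(Z, strong=2.0, penalty=0.5):
    """Reduce confidence in i -> j when a strong path i -> k -> j exists.

    strong and penalty are hypothetical parameters: edges above `strong`
    count as established, and shortcut edges are scaled by `penalty`.
    """
    S = (Z > strong).astype(int)
    two_step = S @ S                      # number of intermediates k per (i, j)
    ranked = Z.copy()
    ranked[two_step > 0] *= penalty
    return ranked
```

In a chain where gene 0 regulates gene 1 and gene 1 regulates gene 2, knocking out gene 0 also shifts gene 2, so the raw scores suggest a direct edge from 0 to 2. The second step lowers the rank of that edge because the effect is already explained by the two-step path.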