32 research outputs found

    Gene network reconstruction from microarray data

    Get PDF
    <p>Abstract</p> <p>Background</p> <p>Often, software available for biological pathways reconstruction rely on literature search to find links between genes. The aim of this study is to reconstruct gene networks from microarray data, using Graphical Gaussian models.</p> <p>Results</p> <p>The <it>GeneNet </it>R package was applied to the Eadgene chicken infection data set. No significant edges were found for the list of differentially expressed genes between conditions MM8 and MA8. On the other hand, a large number of significant edges were found among 85 differentially expressed genes between conditions MM8 and MM24.</p> <p>Conclusion</p> <p>Many edges were inferred from the microarray data. Most of them could, however, not be validated using other pathway reconstruction software. This was partly due to the fact that a quite large proportion of the differentially expressed genes were not annotated. Further biological validation is therefore needed for these networks, using for example in vitro invalidation of genes.</p

    Constructing non-stationary Dynamic Bayesian Networks with a flexible lag choosing mechanism

    Get PDF
    <p>Abstract</p> <p>Background</p> <p>Dynamic Bayesian Networks (DBNs) are widely used in regulatory network structure inference with gene expression data. Current methods assumed that the underlying stochastic processes that generate the gene expression data are stationary. The assumption is not realistic in certain applications where the intrinsic regulatory networks are subject to changes for adapting to internal or external stimuli.</p> <p>Results</p> <p>In this paper we investigate a novel non-stationary DBNs method with a potential regulator detection technique and a flexible lag choosing mechanism. We apply the approach for the gene regulatory network inference on three non-stationary time series data. For the Macrophages and Arabidopsis data sets with the reference networks, our method shows better network structure prediction accuracy. For the Drosophila data set, our approach converges faster and shows a better prediction accuracy on transition times. In addition, our reconstructed regulatory networks on the Drosophila data not only share a lot of similarities with the predictions of the work of other researchers but also provide many new structural information for further investigation.</p> <p>Conclusions</p> <p>Compared with recent proposed non-stationary DBNs methods, our approach has better structure prediction accuracy By detecting potential regulators, our method reduces the size of the search space, hence may speed up the convergence of MCMC sampling.</p

    Using Stochastic Causal Trees to Augment Bayesian Networks for Modeling eQTL Datasets

    Get PDF
    <p>Abstract</p> <p>Background</p> <p>The combination of genotypic and genome-wide expression data arising from segregating populations offers an unprecedented opportunity to model and dissect complex phenotypes. The immense potential offered by these data derives from the fact that genotypic variation is the sole source of perturbation and can therefore be used to reconcile changes in gene expression programs with the parental genotypes. To date, several methodologies have been developed for modeling eQTL data. These methods generally leverage genotypic data to resolve causal relationships among gene pairs implicated as associates in the expression data. In particular, leading studies have augmented Bayesian networks with genotypic data, providing a powerful framework for learning and modeling causal relationships. While these initial efforts have provided promising results, one major drawback associated with these methods is that they are generally limited to resolving causal orderings for transcripts most proximal to the genomic loci. In this manuscript, we present a probabilistic method capable of learning the causal relationships between transcripts at all levels in the network. We use the information provided by our method as a prior for Bayesian network structure learning, resulting in enhanced performance for gene network reconstruction.</p> <p>Results</p> <p>Using established protocols to synthesize eQTL networks and corresponding data, we show that our method achieves improved performance over existing leading methods. For the goal of gene network reconstruction, our method achieves improvements in recall ranging from 20% to 90% across a broad range of precision levels and for datasets of varying sample sizes. Additionally, we show that the learned networks can be utilized for expression quantitative trait loci mapping, resulting in upwards of 10-fold increases in recall over traditional univariate mapping.</p> <p>Conclusions</p> <p>Using the information from our method as a prior for Bayesian network structure learning yields large improvements in accuracy for the tasks of gene network reconstruction and expression quantitative trait loci mapping. In particular, our method is effective for establishing causal relationships between transcripts located both proximally and distally from genomic loci.</p

    Casual Compressive Sensing for Gene Network Inference

    Full text link
    We propose a novel framework for studying causal inference of gene interactions using a combination of compressive sensing and Granger causality techniques. The gist of the approach is to discover sparse linear dependencies between time series of gene expressions via a Granger-type elimination method. The method is tested on the Gardner dataset for the SOS network in E. coli, for which both known and unknown causal relationships are discovered

    Nonparametric identification of regulatory interactions from spatial and temporal gene expression data

    Get PDF
    <p>Abstract</p> <p>Background</p> <p>The correlation between the expression levels of transcription factors and their target genes can be used to infer interactions within animal regulatory networks, but current methods are limited in their ability to make correct predictions.</p> <p>Results</p> <p>Here we describe a novel approach which uses nonparametric statistics to generate ordinary differential equation (ODE) models from expression data. Compared to other dynamical methods, our approach requires minimal information about the mathematical structure of the ODE; it does not use qualitative descriptions of interactions within the network; and it employs new statistics to protect against over-fitting. It generates spatio-temporal maps of factor activity, highlighting the times and spatial locations at which different regulators might affect target gene expression levels. We identify an ODE model for <it>eve </it>mRNA pattern formation in the <it>Drosophila melanogaster </it>blastoderm and show that this reproduces the experimental patterns well. Compared to a non-dynamic, spatial-correlation model, our ODE gives 59% better agreement to the experimentally measured pattern. Our model suggests that protein factors frequently have the potential to behave as both an activator and inhibitor for the same <it>cis</it>-regulatory module depending on the factors' concentration, and implies different modes of activation and repression.</p> <p>Conclusions</p> <p>Our method provides an objective quantification of the regulatory potential of transcription factors in a network, is suitable for both low- and moderate-dimensional gene expression datasets, and includes improvements over existing dynamic and static models.</p

    Evaluation and improvement of the regulatory inference for large co-expression networks with limited sample size

    Get PDF
    Abstract Background Co-expression has been widely used to identify novel regulatory relationships using high throughput measurements, such as microarray and RNA-seq data. Evaluation studies on co-expression network analysis methods mostly focus on networks of small or medium size of up to a few hundred nodes. For large networks, simulated expression data usually consist of hundreds or thousands of profiles with different perturbations or knock-outs, which is uncommon in real experiments due to their cost and the amount of work required. Thus, the performances of co-expression network analysis methods on large co-expression networks consisting of a few thousand nodes, with only a small number of profiles with a single perturbation, which more accurately reflect normal experimental conditions, are generally uncharacterized and unknown. Methods We proposed a novel network inference methods based on Relevance Low order Partial Correlation (RLowPC). RLowPC method uses a two-step approach to select on the high-confidence edges first by reducing the search space by only picking the top ranked genes from an intial partial correlation analysis and, then computes the partial correlations in the confined search space by only removing the linear dependencies from the shared neighbours, largely ignoring the genes showing lower association. Results We selected six co-expression-based methods with good performance in evaluation studies from the literature: Partial correlation, PCIT, ARACNE, MRNET, MRNETB and CLR. The evaluation of these methods was carried out on simulated time-series data with various network sizes ranging from 100 to 3000 nodes. Simulation results show low precision and recall for all of the above methods for large networks with a small number of expression profiles. We improved the inference significantly by refinement of the top weighted edges in the pre-inferred partial correlation networks using RLowPC. We found improved performance by partitioning large networks into smaller co-expressed modules when assessing the method performance within these modules. Conclusions The evaluation results show that current methods suffer from low precision and recall for large co-expression networks where only a small number of profiles are available. The proposed RLowPC method effectively reduces the indirect edges predicted as regulatory relationships and increases the precision of top ranked predictions. Partitioning large networks into smaller highly co-expressed modules also helps to improve the performance of network inference methods. The RLowPC R package for network construction, refinement and evaluation is available at GitHub: https://github.com/wyguo/RLowPC

    Bagging Statistical Network Inference from Large-Scale Gene Expression Data

    Get PDF
    Modern biology and medicine aim at hunting molecular and cellular causes of biological functions and diseases. Gene regulatory networks (GRN) inferred from gene expression data are considered an important aid for this research by providing a map of molecular interactions. Hence, GRNs have the potential enabling and enhancing basic as well as applied research in the life sciences. In this paper, we introduce a new method called BC3NET for inferring causal gene regulatory networks from large-scale gene expression data. BC3NET is an ensemble method that is based on bagging the C3NET algorithm, which means it corresponds to a Bayesian approach with noninformative priors. In this study we demonstrate for a variety of simulated and biological gene expression data from S. cerevisiae that BC3NET is an important enhancement over other inference methods that is capable of capturing biochemical interactions from transcription regulation and protein-protein interaction sensibly. An implementation of BC3NET is freely available as an R package from the CRAN repository
    corecore