518 research outputs found

    Data integration for microarrays: enhanced inference for gene regulatory networks

    Microarray technologies have underpinned numerous important findings regarding gene expression in recent decades. Studies have generated large amounts of data describing various processes, which, thanks to public databases, are widely available for further analysis. Given their lower cost and higher maturity compared to newer sequencing technologies, these data continue to be produced, even though data quality has been the subject of some debate. However, given the large volume of data generated, integration can help overcome some issues related, for example, to noise or reduced time resolution, while providing additional insight on features not directly addressed by sequencing methods. Here we present an integration test case based on public Drosophila melanogaster datasets (gene expression, binding site affinities, known interactions). Using an evolutionary computation framework, we show how integration can enhance the ability to recover transcriptional gene regulatory networks from these data, and we indicate which data types are more important for quantitative and qualitative network inference. Our results show a clear improvement in performance when multiple data sets are integrated, indicating that microarray data will remain a valuable and viable resource for some time to come.

    Altered developmental programming of the mouse mammary gland in female offspring following perinatal dietary exposures: a systems-biology perspective

    Mishaps in prenatal development can influence mammary gland development and, ultimately, affect susceptibility to factors that cause breast cancer. This research was based on the underlying hypothesis that maternal dietary composition during pregnancy can alter developmental (fetal) programming of the mammary gland. We used a computational systems-biology approach and a Bayesian stochastic search variable selection (SSVS) algorithm to identify differentially expressed genes and biological themes and pathways. Postnatal growth trajectories and gene expression in the mammary gland at 10 weeks of age in female mice were investigated following different maternal diet exposures during prenatal, lactational, and early-juvenile development. The analysis correlated a decrease in expression of energy pathways with a reciprocal increase in cytokine and inflammatory-signaling pathways. These findings suggest that maternal dietary fat exposure significantly influences postnatal growth trajectories, metabolic programming, and signaling networks in the mammary gland of female offspring. In addition, the adipocytokine pathway may be a sensitive trigger to dietary changes and may influence or enhance activation of an immune response, a key event in cancer development.

    KPP: KEGG Pathway Painter


    Systems biology approaches to a rational drug discovery paradigm

    The published manuscript is available at EurekaSelect via http://www.eurekaselect.com/openurl/content.php?genre=article&doi=10.2174/1568026615666150826114524. Prathipati P., Mizuguchi K. Systems biology approaches to a rational drug discovery paradigm. Current Topics in Medicinal Chemistry, 16, 9, 1009. https://doi.org/10.2174/1568026615666150826114524

    Integrative Pathway Analysis Pipeline for miRNA and mRNA Data

    The identification of pathways that are involved in a particular phenotype helps us understand the underlying biological processes. Traditional pathway analysis techniques aim to infer the impact on individual pathways using only mRNA levels. However, recent studies have shown that gene expression alone is unable to capture the whole picture of biological phenomena. At the same time, microRNAs (miRNAs) are newly discovered gene regulators that have been shown to play an important role in the diagnosis and prognosis of different types of diseases. Current pathway analysis techniques do not take miRNAs into consideration. In this project, we investigate the effect of integrating miRNA and mRNA expression in pathway analysis. In order to analyze biological pathways using miRNA expression data, we developed a novel method that augments KEGG pathways with the microRNAs that target their genes. To validate our method, we analyzed nine GEO datasets. We also performed the analyses using just mRNA as well as using the integrative state-of-the-art method (microGraphite) to compare the results. In each case, we monitored the position of the pathway describing the given condition. We observed that our method outperforms the state-of-the-art approach.
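
    The augmentation step described in this abstract can be pictured with a minimal sketch. It is not the authors' pipeline: the edge-list representation, the function name augment_pathway, the "repression" label, and all gene and miRNA identifiers below are hypothetical placeholders chosen for illustration. The sketch assumes a pathway is a list of (source, target, interaction) edges and that miRNA targets come as a simple mapping.

```python
def augment_pathway(pathway_edges, mirna_targets):
    """Return the pathway edge list extended with miRNA -> gene edges.

    pathway_edges: list of (source_gene, target_gene, interaction) tuples.
    mirna_targets: dict mapping a miRNA id to an iterable of target genes.
    Only miRNAs targeting a gene already in the pathway are added
    (an assumption of this sketch, not a statement about the paper).
    """
    # Collect every gene that appears as a source or a target.
    genes = {g for src, dst, _ in pathway_edges for g in (src, dst)}
    augmented = list(pathway_edges)
    for mirna, targets in sorted(mirna_targets.items()):
        # Keep only targets that belong to this pathway.
        for gene in sorted(set(targets) & genes):
            augmented.append((mirna, gene, "repression"))
    return augmented


# Toy, made-up pathway and target table (placeholders, not real biology).
pathway = [("GENE_A", "GENE_B", "activation"),
           ("GENE_B", "GENE_C", "inhibition")]
targets = {"miR-x": ["GENE_B", "GENE_Z"], "miR-y": ["GENE_Q"]}
augmented = augment_pathway(pathway, targets)
```

    Here miR-y is dropped because none of its targets lie in the toy pathway, while miR-x contributes a single new edge into GENE_B.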

    Finding disease similarity based on implicit semantic similarity

    Genomics has contributed to a growing collection of gene–function and gene–disease annotations that can be exploited by informatics to study similarity between diseases. This can yield insight into disease etiology, reveal common pathophysiology and/or suggest treatment that can be appropriated from one disease to another. Estimating disease similarity solely on the basis of shared genes can be misleading as variable combinations of genes may be associated with similar diseases, especially for complex diseases. This deficiency can be potentially overcome by looking for common biological processes rather than only explicit gene matches between diseases. The use of semantic similarity between biological processes to estimate disease similarity could enhance the identification and characterization of disease similarity. We present functions to measure similarity between terms in an ontology, and between entities annotated with terms drawn from the ontology, based on both co-occurrence and information content. The similarity measure is shown to outperform other measures used to detect similarity. A manually curated dataset with known disease similarities was used as a benchmark to compare the estimation of disease similarity based on gene-based and Gene Ontology (GO) process-based comparisons. The detection of disease similarity based on semantic similarity between GO Processes (Recall=55%, Precision=60%) performed better than using exact matches between GO Processes (Recall=29%, Precision=58%) or gene overlap (Recall=88% and Precision=16%). The GO-Process based disease similarity scores on an external test set show statistically significant Pearson correlation (0.73) with numeric scores provided by medical residents. GO-Processes associated with similar diseases were found to be significantly regulated in gene expression microarray datasets of related diseases.
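
    To make "similarity based on information content" concrete, the sketch below uses the classic Resnik-style measure, which scores two ontology terms by the information content of their most informative common ancestor. This is a swapped-in, well-known technique, not the measure proposed in the abstract (which also uses co-occurrence). The toy ontology, the entity names, and the function names are all assumptions made for illustration.

```python
import math


def ancestors(term, parents):
    """Return term plus all of its ancestors in a DAG (child -> parents map)."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def information_content(annotations, parents):
    """IC(t) = -log(fraction of entities annotated with t or a descendant)."""
    counts = {}
    for terms in annotations.values():
        covered = set()
        for term in terms:
            covered |= ancestors(term, parents)
        for term in covered:
            counts[term] = counts.get(term, 0) + 1
    total = len(annotations)
    return {term: -math.log(c / total) for term, c in counts.items()}


def resnik(t1, t2, parents, ic):
    """Highest information content among the common ancestors of t1 and t2."""
    common = ancestors(t1, parents) & ancestors(t2, parents)
    return max((ic.get(t, 0.0) for t in common), default=0.0)


# Toy ontology and annotations (hypothetical, not Gene Ontology data).
parents = {"b": ["root"], "c": ["root"], "d": ["b"], "e": ["b"]}
annotations = {"x1": ["d"], "x2": ["e"], "x3": ["c"], "x4": ["d"]}
ic = information_content(annotations, parents)
sim_siblings = resnik("d", "e", parents, ic)  # share the specific ancestor "b"
sim_distant = resnik("d", "c", parents, ic)   # share only the root
```

    Terms that meet only at the root score zero, because the root annotates every entity and therefore carries no information; this is why a process-level match can be more discriminating than raw overlap.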

    An integrated analysis of molecular aberrations in NCI-60 cell lines

    Background: Cancer is a complex disease where various types of molecular aberrations drive the development and progression of malignancies. Large-scale screenings of multiple types of molecular aberrations (e.g., mutations, copy number variations, DNA methylations, gene expressions) become increasingly important in the prognosis and study of cancer. Consequently, a computational model integrating multiple types of information is essential for the analysis of the comprehensive data. Results: We propose an integrated modeling framework to identify the statistical and putative causal relations of various molecular aberrations and gene expressions in cancer. To reduce spurious associations among the massive number of probed features, we sequentially applied three layers of logistic regression models with increasing complexity and uncertainty regarding the possible mechanisms connecting molecular aberrations and gene expressions. Layer 1 models associate gene expressions with the molecular aberrations on the same loci. Layer 2 models associate expressions with the aberrations on different loci but have known mechanistic links. Layer 3 models associate expressions with nonlocal aberrations which have unknown mechanistic links. We applied the layered models to the integrated datasets of NCI-60 cancer cell lines and validated the results with large-scale statistical analysis. Furthermore, we discovered/reaffirmed the following prominent links: (1) Protein expressions are generally consistent with mRNA expressions. (2) Several gene expressions are modulated by composite local aberrations. For instance, CDKN2A expressions are repressed by either frame-shift mutations or DNA methylations. (3) Amplification of chromosome 6q in leukemia elevates the expression of MYB, and the downstream targets of MYB on other chromosomes are up-regulated accordingly. (4) Amplification of chromosome 3p and hypo-methylation of PAX3 together elevate MITF expression in melanoma, which up-regulates the downstream targets of MITF. (5) Mutations of TP53 are negatively associated with its direct target genes. Conclusions: The analysis results on NCI-60 data justify the utility of the layered models for the incoming flow of cancer genomic data. Experimental validations on selected prominent links and application of the layered modeling framework to other integrated datasets will be carried out subsequently.
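
    The three-layer ordering in this abstract amounts to sorting each (expressed gene, aberration) pair by how much is known about the connecting mechanism. The sketch below shows only that sorting step, under assumptions: it does not fit any logistic regression model, and the names assign_layer, locus_of, and known_links, as well as the placeholder identifiers, are hypothetical rather than taken from the paper.

```python
def assign_layer(gene, aberration, locus_of, known_links):
    """Return 1, 2 or 3 for an (expressed gene, aberration) pair.

    Layer 1: the aberration sits on the same locus as the gene.
    Layer 2: different locus, but a mechanistic link is already known.
    Layer 3: different locus and no known mechanistic link.
    """
    if locus_of[aberration] == locus_of[gene]:
        return 1
    if (aberration, gene) in known_links:
        return 2
    return 3


def group_by_layer(pairs, locus_of, known_links):
    """Bucket candidate pairs so simpler layers can be examined first."""
    layers = {1: [], 2: [], 3: []}
    for gene, aberration in pairs:
        layer = assign_layer(gene, aberration, locus_of, known_links)
        layers[layer].append((gene, aberration))
    return layers


# Placeholder loci and links (hypothetical, for illustration only).
locus_of = {"GENE_A": "locus1", "GENE_B": "locus2", "GENE_C": "locus3",
            "mut_A": "locus1", "amp_B": "locus2"}
known_links = {("amp_B", "GENE_C")}
pairs = [("GENE_A", "mut_A"), ("GENE_C", "amp_B"), ("GENE_C", "mut_A")]
layers = group_by_layer(pairs, locus_of, known_links)
```

    Examining the buckets in order 1, 2, 3 mirrors the stated goal of reducing spurious associations: the least speculative explanations are considered before nonlocal links with unknown mechanisms.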

    Genomic applications of statistical signal processing

    Biological phenomena in cells can be explained in terms of the interactions among biological macromolecules, e.g., DNA, RNA and proteins. These interactions can be modeled by genetic regulatory networks (GRNs). This dissertation proposes to reverse engineer GRNs from heterogeneous biological data sets, including time-series and time-independent gene expression data, Chromatin ImmunoPrecipitation (ChIP) data, gene sequences and motifs, and other possible sources of knowledge. The objective of this research is to propose novel computational methods that keep pace with fast-evolving biological databases. Signal processing techniques are exploited to develop computationally efficient, accurate and robust algorithms, which deal individually or collectively with various data sets. Methods of power spectral density estimation are discussed to identify genes participating in various biological processes. Information theoretic methods are applied for non-parametric inference. Bayesian methods are adopted to incorporate several sources with prior knowledge. This work aims to construct an inference system which takes into account different sources of information such that the absence of some components will not interfere with the rest of the system. It has been verified that the proposed algorithms achieve better inference accuracy and higher computational efficiency compared with other state-of-the-art schemes, e.g. REVEAL, ARACNE, Bayesian Networks and Relevance Networks, in the presence of artificial time series and steady state microarray measurements. The proposed algorithms are especially appealing when the sample size is small. In addition, they are able to integrate multiple heterogeneous data sources, e.g. ChIP and sequence data, so that a unified GRN can be inferred. The analysis of biological literature and in silico experiments on real data sets for fruit fly, yeast and human have corroborated part of the inferred GRN. The research has also produced a set of potential control targets for designing gene therapy strategies.
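
    As a rough illustration of how power spectral density estimation can flag a periodically expressed gene, the sketch below computes a plain periodogram with a direct discrete Fourier transform and reports the dominant period. It is only a textbook estimator under the assumption of evenly spaced samples, not one of the dissertation's algorithms; the function names and the synthetic sine-wave "expression profile" are made up for the example.

```python
import cmath
import math


def periodogram(series):
    """Power at Fourier frequencies k/N for k = 1..N//2, after mean removal."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    power = []
    for k in range(1, n // 2 + 1):
        # Direct DFT coefficient at frequency k/N (O(N^2) overall).
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centered))
        power.append(abs(coeff) ** 2 / n)
    return power


def dominant_period(series):
    """Period (in samples) of the frequency carrying the most power."""
    power = periodogram(series)
    k = max(range(len(power)), key=power.__getitem__) + 1
    return len(series) / k


# Synthetic profile: 32 evenly spaced samples of a cycle with period 8.
profile = [math.sin(2 * math.pi * t / 8) for t in range(32)]
period = dominant_period(profile)
```

    Real expression time series are usually short, noisy, and often unevenly sampled, so a practical analysis would need an estimator suited to those conditions and a significance test for the spectral peak.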