8 research outputs found

    Identification of dilated cardiomyopathy signature genes through gene expression and network data integration

    Dilated cardiomyopathy (DCM) is a leading cause of heart failure (HF) and cardiac transplantations in Western countries. Single-source gene expression analysis studies have identified potential disease biomarkers and drug targets. However, because of the diversity of experimental settings and relative lack of data, concerns have been raised about the robustness and reproducibility of the predictions. This study presents the identification of robust and reproducible DCM signature genes based on the integration of several independent data sets and functional network information. Gene expression profiles from three public data sets containing DCM and non-DCM samples were integrated and analyzed, which allowed the implementation of clinical diagnostic models. Differentially expressed genes were evaluated in the context of a global protein–protein interaction network, constructed as part of this study. Potential associations with HF were identified by searching the scientific literature. From these analyses, classification models were built and their effectiveness in differentiating between DCM and non-DCM samples was estimated. The main outcome was a set of integrated, potentially novel DCM signature genes, which may be used as reliable disease biomarkers. An empirical demonstration of the power of the integrative classification models against single-source models is also given.

    Linking Gene Expression and Functional Network Data in Human Heart Failure

    BACKGROUND: Gene expression profiling and the analysis of protein-protein interaction (PPI) networks may support the identification of disease biomarkers and potential drug targets. Thus, a step forward in the development of systems approaches to medicine is the integrative analysis of these data sources in specific pathological conditions. We report such an integrative bioinformatics analysis in human heart failure (HF). A global PPI network in HF was assembled, which by itself represents a useful compendium of the current status of human HF-relevant interactions. This provided the basis for the analysis of interaction connectivity patterns in relation to an HF gene expression data set.
    RESULTS: Relationships between the significance of the differentiation of gene expression and connectivity degrees in the PPI network were established. In addition, relationships between gene co-expression and PPI network connectivity were analysed. Highly connected proteins are not necessarily encoded by significantly differentially expressed genes. Genes that are not significantly differentially expressed may encode proteins that exhibit diverse network connectivity patterns. Furthermore, genes that were not defined as significantly differentially expressed may encode proteins with many interacting partners. Genes encoding network hubs may exhibit weak co-expression with the genes encoding their interacting protein partners. We also found that hubs and superhubs display a significant diversity of co-expression patterns in comparison to peripheral nodes. Gene Ontology (GO) analysis established that highly connected proteins are likely to be engaged in higher-level GO biological process terms, while low-connectivity proteins tend to be engaged in more specific disease-related processes.
    CONCLUSION: This investigation supports the hypothesis that the integrative analysis of differential gene expression and PPI network analysis may facilitate a better understanding of functional roles and the identification of potential drug targets in human heart failure.
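
The comparison between network connectivity and differential expression described in the abstract above can be illustrated with a minimal sketch. It is not the authors' pipeline: the gene labels, the p-values, the hub threshold `hub_degree`, the significance cutoff `alpha`, and the choice to treat genes without a p-value as non-significant are all assumptions made for illustration.

```python
def node_degrees(edges):
    """Count distinct interaction partners per protein in an undirected edge list."""
    partners = {}
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        partners.setdefault(a, set()).add(b)
        partners.setdefault(b, set()).add(a)
    return {node: len(p) for node, p in partners.items()}


def non_significant_hubs(edges, p_values, hub_degree=3, alpha=0.05):
    """Return hubs (degree >= hub_degree) whose genes are not significantly
    differentially expressed (p >= alpha).

    Assumption: a gene with no p-value is treated as non-significant (p = 1.0).
    """
    degrees = node_degrees(edges)
    return sorted(
        gene
        for gene, degree in degrees.items()
        if degree >= hub_degree and p_values.get(gene, 1.0) >= alpha
    )


# Hypothetical toy data: G1..G5 are placeholder gene labels, not real results.
edges = [
    ("G1", "G2"), ("G1", "G3"), ("G1", "G4"),
    ("G2", "G3"), ("G5", "G4"), ("G4", "G2"),
]
p_values = {"G1": 0.40, "G2": 0.01, "G3": 0.20, "G4": 0.03, "G5": 0.60}

print(non_significant_hubs(edges, p_values))
```

In this toy network G1, G2 and G4 each have three partners, but only G1 fails the significance cutoff, so it is the one hub reported as not differentially expressed.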

    Systems-based biological concordance and predictive reproducibility of gene set discovery methods in cardiovascular disease

    The discovery of novel disease biomarkers is a crucial challenge for translational bioinformatics. Demonstrating both their classification power and their reproducibility across independent datasets is essential for assessing their potential clinical relevance. Small datasets and the multiplicity of putative biomarker sets may explain the lack of predictive reproducibility. Studies based on pathway-driven discovery approaches have suggested that, despite such discrepancies, the resulting putative biomarkers tend to be implicated in common biological processes. Investigations of this problem have mainly focused on datasets derived from cancer research. We investigated the predictive and functional concordance of five methods for discovering putative biomarkers in four independently generated datasets from the cardiovascular disease domain. A diversity of biosignatures was identified by the different methods. However, we found strong biological process concordance between them, especially in the case of methods based on gene set analysis. With a few exceptions, we observed a lack of classification reproducibility using independent datasets. Our putative biomarker sets partially overlap with those reported in the primary studies. Despite the observed limitations, pathway-driven or gene set analysis can predict potentially novel biomarkers and can jointly point to biomedically relevant underlying molecular mechanisms.

    Computational Modeling of the Regulatory Network Organizing the Wound Response in Arabidopsis thaliana

    Plants are frequently wounded by mechanical impact or by insects, and their ability to adequately respond to wounding is essential for their survival and reproductive success. The wound response is mediated by a signal transduction and regulatory network. Molecular studies in Arabidopsis have identified the COI1 gene as a central component of this network. Current models of these networks qualitatively describe the wound response, but they are not directly assessed using quantitative gene expression data. We built a model comprising the key components of the Arabidopsis wound response using the transsys framework. For comparison, we constructed a null model that is devoid of any regulatory interactions, and various alternative models by rewiring the wound response model. All models were parametrized by computational optimization to generate synthetic gene expression profiles that approximate the empirical data set. We scored the fit of the synthetic to the empirical data with various distance measures, and used the median distance after optimization to directly and quantitatively assess the wound response model and its alternatives. Discrimination of candidate models depends substantially on the measure of gene expression profile distance. Using the null model to assess the quality of the distance measures for discrimination, we identify correlation of log-ratio profiles as the most suitable distance. Our wound response model fits the empirical data significantly better than the alternative models. Gradual perturbation of the wound response model results in a corresponding gradual decline in fit. The optimization approach provides insights into biologically relevant features, such as robustness. It is a step toward enabling integrative studies of multiple cross-talking pathways, and thus may help to develop our understanding of how the genome informs the mapping of environmental signals to phenotypic traits.
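
The abstract above names correlation of log-ratio profiles as the preferred distance but does not spell out the formula. The following is a minimal sketch under stated assumptions: log ratios are taken relative to one reference condition (`ref_index`, hypothetical), the correlation is Pearson's, and the distance is defined as one minus the correlation. The function names are illustrative and are not part of the transsys framework.

```python
import math


def log_ratio_profile(expression, ref_index=0):
    """Convert positive expression values into log2 ratios against a reference condition."""
    reference = expression[ref_index]
    return [math.log2(value / reference) for value in expression]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / math.sqrt(var_x * var_y)


def log_ratio_correlation_distance(empirical, synthetic, ref_index=0):
    """Distance in [0, 2]: 0 for perfectly correlated log-ratio profiles,
    2 for perfectly anti-correlated ones (assumed definition: 1 - r)."""
    r = pearson(
        log_ratio_profile(empirical, ref_index),
        log_ratio_profile(synthetic, ref_index),
    )
    return 1.0 - r


# Hypothetical profiles for one gene across four conditions.
empirical = [1.0, 2.0, 4.0, 8.0]
same_shape = [2.0, 4.0, 8.0, 16.0]   # different scale, same fold changes
opposite = [8.0, 4.0, 2.0, 1.0]      # reversed trend

print(log_ratio_correlation_distance(empirical, same_shape))
print(log_ratio_correlation_distance(empirical, opposite))
```

Because the measure works on fold changes rather than absolute levels, a synthetic profile that differs from the empirical one only by a constant scale factor scores as a perfect fit, which is one plausible reason such a distance discriminates between candidate models well.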