
    Improvement of Reproducibility in Cancer Classification Based on Pathway Markers and Subnetwork Markers

    Identification of robust biomarkers for cancer prognosis based on gene expression data is an important research problem in translational genomics. The high-dimensional, small-sample-size data setting makes the prediction of biomarkers very challenging. Early studies identified biomarkers based solely on gene expression data. However, very few of these biomarkers are shared among independent studies. To overcome this irreproducibility, an integrative approach has been proposed that identifies better biomarkers by overlaying gene expression data with available biological knowledge and investigating genes at the modular level. These module-based markers jointly analyze the gene expression activities of closely associated genes; for example, genes that belong to a common biological pathway or genes whose protein products form a subnetwork module in a protein-protein interaction network. Several studies have shown that modular biomarkers lead to more accurate and reproducible prognostic predictions than single-gene markers and also provide a better understanding of the disease mechanisms. We propose novel methods for identifying modular markers that can be used to predict breast cancer prognosis. First, to improve the identification of pathway markers, we propose using probabilistic pathway activity inference and relative expression analysis. Then, we propose a new method to identify subnetwork markers based on a message-passing clustering algorithm, and we further improve this method by incorporating topological attributes using association coefficients. Through extensive evaluations using multiple publicly available datasets, we demonstrate that all of the proposed methods can identify modular markers that are more reliable and reproducible across independent datasets than those identified by existing methods; hence, they have the potential to become more effective prognostic cancer classifiers.

    Incorporating topological information for predicting robust cancer subnetwork markers in human protein-protein interaction network

    BACKGROUND: Discovering robust markers for cancer prognosis based on gene expression data is an important yet challenging problem in translational bioinformatics. By integrating additional information from biological pathways or a protein-protein interaction (PPI) network, we can find better biomarkers that lead to more accurate and reproducible prognostic predictions. In fact, recent studies have shown that “modular markers” that integrate multiple genes with potential interactions can improve disease classification and also provide a better understanding of the disease mechanisms. RESULTS: In this work, we propose a novel algorithm for finding robust and effective subnetwork markers that can accurately predict cancer prognosis. To simultaneously discover multiple synergistic subnetwork markers in a human PPI network, we build on our previous work that uses affinity propagation, an efficient clustering algorithm based on a message-passing scheme. Using affinity propagation, we identify potential subnetwork markers that consist of discriminative genes that display coherent expression patterns and whose protein products are closely located on the PPI network. Furthermore, we incorporate the topological information from the PPI network to evaluate the potential of a given set of proteins to be involved in a functional module. Primarily, we adopt the widely made assumptions that densely connected subnetworks are likely to be functional modules and that proteins that are not directly connected but interact with similar sets of other proteins may share similar functionalities. CONCLUSIONS: Incorporating topological attributes based on these assumptions can enhance the prediction of potential subnetwork markers. We evaluate the performance of the proposed subnetwork marker identification method by performing classification experiments using multiple independent breast cancer gene expression datasets and PPI networks. We show that our method leads to the discovery of robust subnetwork markers that can improve cancer classification. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-016-1224-1) contains supplementary material, which is available to authorized users.
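
The second topological assumption in the abstract above (proteins that are not directly connected but interact with similar sets of partners may share functionality) can be illustrated with a small sketch. The abstract does not say which association coefficient the authors use, so the Jaccard coefficient here is an assumption chosen for illustration, and the protein names, edge list, and function names are all hypothetical.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)} from an edge list."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def jaccard(neighbors_a, neighbors_b):
    """Jaccard coefficient of two neighbor sets (an assumed choice of coefficient)."""
    union = neighbors_a | neighbors_b
    if not union:
        return 0.0
    return len(neighbors_a & neighbors_b) / len(union)


def topological_similarity(edges):
    """Score every pair of proteins by the overlap of their interaction partners."""
    adjacency = build_adjacency(edges)
    return {
        (u, v): jaccard(adjacency[u], adjacency[v])
        for u, v in combinations(sorted(adjacency), 2)
    }


# Hypothetical toy PPI network: P1 and P2 never interact directly,
# but both interact with P3 and P4.
toy_edges = [
    ("P1", "P3"),
    ("P1", "P4"),
    ("P2", "P3"),
    ("P2", "P4"),
    ("P4", "P5"),
]
scores = topological_similarity(toy_edges)
```

In this toy network, P1 and P2 receive the highest possible score because their partner sets are identical, even though no edge joins them. A score of this kind could, in principle, be combined with expression-based similarity before clustering; how the published method combines them is not described in the abstract.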

    Additional file 1 of Incorporating topological information for predicting robust cancer subnetwork markers in human protein-protein interaction network

    Supplementary materials. Figure S1: Discriminative power of subnetwork markers identified on NKI295 by different methods. Figure S2: Discriminative power of subnetwork markers across independent gene expression datasets. (PDF 1260 kb)