    Feature Selection for Gene Expression Using Model-Based Entropy

    Feature selection for microarray gene expression data using simulated annealing guided by the multivariate joint entropy

    This work presents a new way to calculate the multivariate joint entropy. This measure is the basis for a fast information-theoretic evaluation of gene relevance in microarray gene expression data. Its low complexity comes from reusing previous computations to calculate the relevance of the current feature. The mu-TAFS algorithm (so named to differentiate it from previous TAFS algorithms) implements a simulated annealing technique specially designed for feature subset selection. The algorithm is applied to maximizing gene subset relevance in several public-domain microarray data sets. The experimental results show notably high classification performance and small subsets formed by biologically meaningful genes. Postprint (published version)
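
    The general idea in this abstract, simulated annealing over feature subsets scored by an entropy-based relevance, can be sketched as follows. This is a minimal illustration and not the mu-TAFS algorithm: the relevance score (mutual information computed from empirical joint entropies), the size penalty, the cooling schedule, and all function and parameter names are assumptions made for the example, and it recomputes entropies from scratch rather than reusing previous computations.

```python
import math
import random
from collections import Counter


def joint_entropy(rows):
    """Shannon entropy (in bits) of the empirical distribution over hashable rows."""
    n = len(rows)
    counts = Counter(rows)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def relevance(X, y, subset):
    """Mutual information I(X_S; Y) = H(X_S) + H(Y) - H(X_S, Y) for discrete data."""
    if not subset:
        return 0.0
    cols = sorted(subset)
    xs = [tuple(row[j] for j in cols) for row in X]
    xsy = [x + (label,) for x, label in zip(xs, y)]
    return joint_entropy(xs) + joint_entropy(list(y)) - joint_entropy(xsy)


def anneal_select(X, y, t0=1.0, cooling=0.95, steps=500, size_penalty=0.01, seed=0):
    """Simulated annealing over feature subsets (hypothetical parameters)."""
    rng = random.Random(seed)
    n_features = len(X[0])

    def score(subset):
        # Reward relevance, lightly penalise subset size (an assumption of this sketch).
        return relevance(X, y, subset) - size_penalty * len(subset)

    current = set()
    cur_score = score(current)
    best, best_score = set(current), cur_score
    t = t0
    for _ in range(steps):
        # Neighbour move: toggle one randomly chosen feature in or out of the subset.
        candidate = current ^ {rng.randrange(n_features)}
        cand_score = score(candidate)
        delta = cand_score - cur_score
        # Always accept improvements; accept worse moves with probability exp(delta / t).
        if delta >= 0 or rng.random() < math.exp(delta / t):
            current, cur_score = candidate, cand_score
            if cur_score > best_score:
                best, best_score = set(current), cur_score
        t *= cooling  # geometric cooling
    return best, best_score


# Toy data: all 8 combinations of three binary features; the label copies feature 0.
X = [(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)]
y = [row[0] for row in X]
best_subset, best_value = anneal_select(X, y)
```

    On the toy data, feature 0 alone carries all the information about the label, so the size penalty steers the search toward the single-feature subset.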

    A Survey on Soft Subspace Clustering

    Subspace clustering (SC) is a promising clustering technique that identifies clusters based on their associations with subspaces in high-dimensional spaces. SC can be classified into hard subspace clustering (HSC) and soft subspace clustering (SSC). While HSC algorithms have been extensively studied and are well accepted by the scientific community, SSC algorithms are relatively new but have gained more attention in recent years because of their better adaptability. This paper presents a comprehensive survey of existing SSC algorithms and recent developments. The SSC algorithms are classified systematically into three main categories, namely conventional SSC (CSSC), independent SSC (ISSC) and extended SSC (XSSC). The characteristics of these algorithms are highlighted, and the potential future development of SSC is also discussed. Comment: This paper has been published in Information Sciences Journal in 201
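
    As a rough illustration of the "soft" idea, the sketch below gives each cluster its own weight vector over all dimensions and assigns points by weighted squared distance. It is not taken from the survey: the function names, centers, and weight values are hypothetical, and real SSC algorithms also learn the weights rather than fixing them.

```python
def weighted_sq_dist(x, center, w):
    """Squared Euclidean distance with one non-negative weight per dimension."""
    return sum(wj * (xj - cj) ** 2 for xj, cj, wj in zip(x, center, w))


def assign(points, centers, weights):
    """Assign each point to the cluster with the smallest weighted distance.

    weights[k] is the weight vector of cluster k; in a soft subspace setting
    each such vector typically sums to 1 and emphasises the dimensions in
    which that cluster is compact.
    """
    labels = []
    for x in points:
        dists = [weighted_sq_dist(x, c, w) for c, w in zip(centers, weights)]
        labels.append(dists.index(min(dists)))
    return labels


centers = [(0.0, 0.0), (2.0, 5.0)]
soft_weights = [(0.99, 0.01), (0.01, 0.99)]   # cluster 0 cares about dim 0, cluster 1 about dim 1
uniform_weights = [(0.5, 0.5), (0.5, 0.5)]    # ordinary (unweighted) assignment, up to scaling

point = [(0.1, 4.0)]
soft_label = assign(point, centers, soft_weights)
uniform_label = assign(point, centers, uniform_weights)
```

    Here the point matches cluster 0 closely in the dimension that cluster weights most heavily, so the soft weights assign it to cluster 0, while uniform weights assign it to cluster 1.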

    Intra-tumour signalling entropy determines clinical outcome in breast and lung cancer.

    The cancer stem cell hypothesis, that a small population of tumour cells is responsible for tumorigenesis and cancer progression, is becoming widely accepted, and recent evidence has suggested a prognostic and predictive role for such cells. Intra-tumour heterogeneity, the diversity of the cancer cell population within the tumour of an individual patient, is related to cancer stem cells and is also considered a potential prognostic indicator in oncology. However, measuring cancer stem cell abundance and intra-tumour heterogeneity in a clinically relevant manner currently presents a challenge. Here we propose signalling entropy, a measure of signalling pathway promiscuity derived from a sample's genome-wide gene expression profile, as an estimate of the stemness of a tumour sample. By considering over 500 mixtures of diverse cellular expression profiles, we reveal that signalling entropy also associates with intra-tumour heterogeneity. By analysing 3668 breast cancer and 1692 lung adenocarcinoma samples, we further demonstrate that signalling entropy correlates negatively with survival, outperforming leading clinical gene-expression-based prognostic tools. Signalling entropy is found to be a general prognostic measure, valid in different breast cancer clinical subgroups, as well as within stage I lung adenocarcinoma. We find that its prognostic power is driven by genes involved in cancer stem cells and treatment resistance. In summary, by approximating both stemness and intra-tumour heterogeneity, signalling entropy provides a powerful prognostic measure across different epithelial cancers.