Abstract

Background: MicroRNA (miRNA) expression profiling data has recently been found to be particularly important in cancer research and can be used as a diagnostic and prognostic tool. Current approaches of tumor classification using miRNA expression data do not integrate the experimental knowledge available in the literature. A judicious integration of such knowledge with effective miRNA and sample selection through a biclustering approach could be an important step in improving the accuracy of tumor classification. Results: In this article, a novel classification technique called SFSSClass is developed that judiciously integrates a biclustering technique SAMBA for simultaneous feature (miRNA) and sample (tissue) selection (SFSS), a cancer-miRNA network that we have developed by mining the literature of experimentally verified cancer-miRNA relationships and a classifier uncorrelated shrunken centroid (USC). SFSSClass is used for classifying multiple classes of tumors and cancer cell lines. In a part of the investigation, poorly differentiated tumors (PDT) having non diagnostic histological appearance are classified while training on more differentiated tumor (MDT) samples. The proposed method is found to outperform the best known accuracy in the literature on the experimental data sets. For example, while the best accuracy reported in the literature for classifying PDT samples is similar to 76.5%, the accuracy of SFSSClass is found to be similar to 82.3%. The advantage of incorporating biclustering integrated with the cancer-miRNA network is evident from the consistently better performance of SFSSClass (integration of SAMBA, cancer-miRNA network and USC) over USC (eg., similar to 70.5% for SFSSClass versus similar to 58.8% in classifying a set of 17 MDT samples from 9 tumor types, similar to 91.7% for SFSSClass versus similar to 75% in classifying 12 cell lines from 6 tumor types and similar to 382.3% for SFSSClass versus similar to 41.2% in classifying 17 PDT samples from 11 tumor types). Conclusion: In this article, we develop the SFSSClass algorithm which judiciously integrates a biclustering technique for simultaneous feature (miRNA) and sample (tissue) selection, the cancer-miRNA network and a classifier. The novel integration of experimental knowledge with computational tools efficiently selects relevant features that have high intra-class and low interclass similarity. The performance of the SFSSClass is found to be significantly improved with respect to the other existing approaches

    Similar works