Sparse Representation for Classification of Tumors Using Gene Expression Data

By Xiyi Hang and Fang-Xiang Wu


Personalized drug design requires classifying cancer patients as accurately as possible. With advances in genome sequencing and microarray technology, a large amount of gene expression data has been, and will continue to be, produced from patients with various cancers. Such cancer-alerted gene expression data allow us to classify tumors at the genome-wide level. However, cancer-alerted gene expression datasets typically have far more genes (features) than samples (patients), which poses a challenge for tumor classification. In this paper, a new method is proposed for cancer diagnosis using gene expression data by casting the classification problem as finding sparse representations of test samples with respect to training samples. The sparse representation is computed by the l1-regularized least squares method. To investigate its performance, the proposed method is applied to six tumor gene expression datasets and compared with various support vector machine (SVM) methods. The experimental results show that the performance of the proposed method is comparable with or better than that of SVMs. In addition, the proposed method is more efficient than SVMs, as it requires no model selection.
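
The idea of classifying a test sample through its sparse representation over the training samples can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the decision rule, so the sketch assumes the common choice of assigning the class whose training samples yield the smallest reconstruction residual. It also swaps in a plain ISTA (proximal gradient) loop as the l1-regularized least squares solver for the sake of a self-contained example. The function names, the value of `lam`, the iteration count, and the synthetic data are all hypothetical.

```python
import numpy as np

def l1_ls_ista(A, y, lam=0.01, n_iter=500):
    """Approximately minimise 0.5*||A x - y||_2^2 + lam*||x||_1 with ISTA.

    ISTA is used here only for illustration; it is an assumption,
    not the solver named in the paper.
    """
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        z = x - A.T @ (A @ x - y) / L      # gradient step on the least squares term
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft-thresholding
    return x

def src_classify(A, labels, y, lam=0.01):
    """Classify y from its sparse representation over the columns of A.

    A      : (genes x training samples) matrix, one training sample per column
    labels : class label of each column of A
    y      : test sample (vector of gene expression values)
    The minimum-residual rule below is an assumed decision rule.
    """
    A = A / np.linalg.norm(A, axis=0, keepdims=True)  # unit-norm columns
    y = y / np.linalg.norm(y)
    x = l1_ls_ista(A, y, lam)
    classes = np.unique(labels)
    # Residual when only the coefficients of one class are kept
    residuals = [np.linalg.norm(y - A[:, labels == c] @ x[labels == c])
                 for c in classes]
    return classes[int(np.argmin(residuals))]

# Synthetic stand-in for expression data: 200 "genes", 10 samples per class.
rng = np.random.default_rng(0)
mu = rng.standard_normal((200, 2))                     # one mean profile per class
labels = np.repeat([0, 1], 10)
A = mu[:, labels] + 0.1 * rng.standard_normal((200, 20))
y = mu[:, 1] + 0.1 * rng.standard_normal(200)          # test sample drawn from class 1

pred = src_classify(A, labels, y)
print(pred)
```

Note that the sketch has a single regularization weight and no kernel or margin parameters to tune, which is in the spirit of the abstract's remark that no model selection is needed. How `lam` should be chosen for real data is not addressed here.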

Topics: Research Article
Publisher: Hindawi Publishing Corporation
Provided by: PubMed Central
