An integrated approach of particle swarm optimization and support vector machine for gene signature selection and cancer prediction

Abstract

Classifying tumor types from genomic information is important for improving cancer diagnosis and drug development. DNA microarray studies produce large amounts of data, but the expression data are highly redundant and noisy, and most genes are believed to be uninformative with respect to the studied classes. Only a fraction of genes may present distinct profiles for different classes of samples. Classification tools that can deal with these issues are therefore important: they should learn to robustly identify a subset of informative genes embedded in a large dataset contaminated with high-dimensional noise. In this paper, an integrated approach of support vector machine (SVM) and particle swarm optimization (PSO) is proposed for this purpose. The proposed approach simultaneously optimizes the selection of the feature subset and the classifier through a common solution coding mechanism. As an illustration, the approach is applied to search for combinational gene signatures that predict the histologic response of osteosarcoma patients to chemotherapy. Cross-validation results show that the proposed approach outperforms other existing methods in terms of classification accuracy. Further validation on an independent dataset misclassifies only one of fourteen patient samples, suggesting that the selected gene signatures can reflect chemoresistance in osteosarcoma.
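The idea of a common solution coding can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes that each PSO particle is a vector in [0, 1] whose first n_genes coordinates are thresholded into a gene mask and whose last two coordinates are mapped onto SVM hyperparameters. The names decode, pso and toy_fitness, the hyperparameters C and gamma, their ranges, and the PSO coefficients are all hypothetical choices made for illustration; the abstract does not specify any of them. The fitness used here is a toy stand-in. In practice it would be the cross-validated accuracy of an SVM trained on the selected genes.

```python
import math
import random

def decode(position, n_genes):
    """Split one particle into a gene mask and SVM hyperparameters.
    The layout and the ranges below are assumptions, not taken from the paper."""
    mask = [x > 0.5 for x in position[:n_genes]]
    C = 10 ** (-2 + 6 * position[n_genes])          # assumed range 1e-2 .. 1e4
    gamma = 10 ** (-4 + 5 * position[n_genes + 1])  # assumed range 1e-4 .. 1e1
    return mask, C, gamma

def pso(fitness, n_genes, n_particles=20, n_iter=50,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Plain global-best PSO that maximizes fitness(mask, C, gamma)."""
    rng = random.Random(seed)
    dim = n_genes + 2  # gene coordinates plus two hyperparameter coordinates
    pos = [[rng.random() for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(*decode(p, n_genes)) for p in pos]
    g = max(range(n_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Clamp so every coordinate stays decodable.
                pos[i][d] = min(1.0, max(0.0, pos[i][d] + vel[i][d]))
            fit = fitness(*decode(pos[i], n_genes))
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
    return decode(gbest, n_genes), gbest_fit

def toy_fitness(mask, C, gamma):
    """Toy stand-in for cross-validated SVM accuracy: rewards selecting
    genes 0 and 3, penalizes extra genes and extreme C; gamma is ignored."""
    informative = (0, 3)
    hits = sum(1 for j in informative if mask[j])
    extras = sum(mask) - hits
    return hits - 0.1 * extras - 0.01 * abs(math.log10(C))

(best_mask, best_C, best_gamma), best_fit = pso(toy_fitness, n_genes=8)
```

Because the gene mask and the classifier settings live in the same particle, a single fitness evaluation scores both at once, which is one way to read "simultaneously optimize" in the abstract.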
