3,864 research outputs found

Recursive SVM Feature Selection and Sample Classification for Mass-Spectrometry and Microarray Data

Author: Harris Lyndsay N
Iglehart James Dirk
Leung Hon-chiu E
Liu Jun
Lu Xin
Miron Alexander
Shi Qian
Wong Wing H.
Xu Xiu-qin
Zhang Xuegong
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 29/09/2010

Background: Like microarray-based investigations, high-throughput proteomics techniques require machine learning algorithms to identify biomarkers that are informative for biological classification problems. Feature selection and classification algorithms need to be robust to noise and outliers in the data. Results: We developed a recursive support vector machine (R-SVM) algorithm to select important genes/biomarkers for the classification of noisy data. We compared its performance to a similar, state-of-the-art method (SVM recursive feature elimination, or SVM-RFE), paying special attention to the ability to recover the true informative genes/biomarkers and to robustness to outliers in the data. Simulation experiments show that a 5% to 20% improvement over SVM-RFE can be achieved with regard to these properties. The SVM-based methods are also compared with a conventional univariate method, and their respective strengths and weaknesses are discussed. R-SVM was applied to two sets of SELDI-TOF-MS proteomics data, one from a human breast cancer study and the other from a study on rat liver cirrhosis. Important biomarkers found by the algorithm were validated by follow-up biological experiments. Conclusion: The proposed R-SVM method is suitable for analyzing noisy high-throughput proteomics and microarray data, and it outperforms SVM-RFE in robustness to noise and in the ability to recover informative features. The multivariate SVM-based method outperforms the univariate method in classification performance, but univariate methods can reveal more of the differentially expressed features, especially when there are correlations between the features.

Harvard University - DASH
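
The abstract above compares R-SVM with SVM-RFE, the baseline that repeatedly trains a linear SVM and discards the feature with the smallest squared weight. The sketch below illustrates only that baseline loop. It is not the authors' R-SVM, whose ranking rule the abstract does not describe. The function names, the batch subgradient trainer, the hyperparameters, and the synthetic data are all assumptions made for illustration.

```python
import random

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=100):
    """Fit a linear SVM (hinge loss + L2 penalty) by batch subgradient descent.
    Labels must be +1 or -1. Hyperparameters are illustrative assumptions."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1.0:  # only margin violators contribute to the hinge subgradient
                for j, xj in enumerate(xi):
                    grad_w[j] -= yi * xj / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def svm_rfe(X, y, n_keep):
    """Drop the feature with the smallest squared weight until n_keep remain."""
    remaining = list(range(len(X[0])))
    eliminated = []
    while len(remaining) > n_keep:
        X_sub = [[row[j] for j in remaining] for row in X]
        w, _ = train_linear_svm(X_sub, y)
        worst = min(range(len(remaining)), key=lambda k: w[k] ** 2)
        eliminated.append(remaining.pop(worst))
    return remaining, eliminated

# Synthetic data (an assumption): features 0 and 1 carry the class signal,
# the other eight features are pure noise.
rng = random.Random(0)
y = [1 if i % 2 == 0 else -1 for i in range(40)]
X = [[3 * label + rng.gauss(0, 1), 3 * label + rng.gauss(0, 1)]
     + [rng.gauss(0, 1) for _ in range(8)] for label in y]

selected, eliminated = svm_rfe(X, y, n_keep=2)
print("kept features:", selected)
```

Retraining after every elimination is what makes the procedure recursive: a feature's weight is always judged in the context of the features still in play.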

Sentiment Analysis of Social Media Users Toward the Entertainment Tax Increase Policy Using the SVM (Support Vector Machine) Method

Author: Isnain Auliya Rahman
Romadhona Waldy
Publication venue: STKIP PGRI Tulungagung
Publication date: 19/11/2024

The Indonesian government has imposed an entertainment tax increase of 40-75% on karaoke, discotheque, bar, and steam bath or spa activities through Law No. 1 of 2022 on Financial Relations between the Central Government and Regional Governments (HKPD). This policy has drawn mixed sentiment from the public, both for and against. Given this issue, the authors carried out this analysis to determine public sentiment toward the entertainment tax increase policy, using data obtained from the social media platform Twitter. The method used is Support Vector Machine (SVM). The RFE (Recursive Feature Elimination) method is then used to measure the performance of the SVM classification. The results show that SVM with RFE achieved an accuracy of 95%, precision of 99%, recall of 94%, and F1-score of 97%, whereas SVM classification without RFE achieved an accuracy of 93%, precision of 85%, recall of 94%, and F1-score of 88%.

JIPI (Jurnal Ilmiah Penelitian dan Pembelajaran Informatika)
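
The abstract above reports accuracy, precision, recall, and F1-score. As a reminder of how these four metrics relate, the following minimal sketch computes them from binary confusion-matrix counts. The function name and the counts are hypothetical and chosen only for illustration. They are not taken from the study, and the abstract does not say whether its figures are per-class or averaged.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall and F1 from binary confusion counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical counts, not data from the paper.
metrics = classification_metrics(tp=90, fp=10, fn=30, tn=70)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```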

Lower Limb Movements Recognition Based on Feature Recursive Elimination and Backpropagation Neural Network

Author: Chen Zekun
Liang Shili
Ma Yongkai
Publication date: 17/04/2024

Surface electromyographic (sEMG) signals are a commonly used signal source for lower limb movement recognition, reflecting the intent of human movement. However, it has been a challenge in this area of research to improve the movement recognition rate while using fewer features. In this paper, a method for lower limb movement recognition based on support vector machine recursive feature elimination and a backpropagation neural network is proposed. First, the sEMG signals of five subjects performing eight different lower limb movements were recorded using a BIOPAC collector. The optimal feature subset consists of 25 feature vectors, determined using Recursive Feature Elimination based on Support Vector Machine (SVM-RFE). Finally, this study used five supervised classification algorithms to recognize these eight different lower limb movements. The experimental results show that the combination of the BPNN classifier and the SVM-RFE feature selection algorithm achieves an action recognition accuracy of 95%, which supports the feasibility of this approach.

arXiv.org e-Print Archive

Correcting for selection bias via cross-validation in the classification of microarray data

Author: Chevelu J.
McLachlan G. J.
Zhu J.
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 01/01/2008

There is increasing interest in the use of diagnostic rules based on microarray data. These rules are formed by considering the expression levels of thousands of genes in tissue samples taken from patients of known classification with respect to a number of classes, representing, say, disease status or treatment strategy. As the final versions of these rules are usually based on a small subset of the available genes, there is a selection bias that has to be corrected for in the estimation of the associated error rates. We consider the problem using cross-validation. In particular, we present explicit formulae that are useful in explaining the layers of validation that have to be performed in order to avoid improperly cross-validated estimates. Comment: Published at http://dx.doi.org/10.1214/193940307000000284 in the IMS Collections (http://www.imstat.org/publications/imscollections.htm) by the Institute of Mathematical Statistics (http://www.imstat.org).

arXiv.org e-Print Archive

Crossref

UQ eSpace (University of Queensland)
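
The selection bias described in the abstract above can be reproduced on pure noise. The sketch below is an illustration under stated assumptions, not the paper's formulae: it ranks features by the difference of class means, classifies with a nearest-centroid rule, and compares leave-one-out accuracy when features are selected once on all samples against selection repeated inside every fold. All names and settings here are hypothetical.

```python
import random

def top_features(X, y, k):
    """Rank features by absolute difference of class means; return the top k."""
    pos = [row for row, label in zip(X, y) if label == 1]
    neg = [row for row, label in zip(X, y) if label == 0]
    def score(j):
        mean_pos = sum(r[j] for r in pos) / len(pos)
        mean_neg = sum(r[j] for r in neg) / len(neg)
        return abs(mean_pos - mean_neg)
    return sorted(range(len(X[0])), key=score, reverse=True)[:k]

def nearest_centroid_predict(X_train, y_train, x, features):
    """Predict the class whose centroid (on the given features) is closest to x."""
    best_label, best_dist = None, float("inf")
    for label in (0, 1):
        rows = [r for r, l in zip(X_train, y_train) if l == label]
        dist = 0.0
        for j in features:
            centroid_j = sum(r[j] for r in rows) / len(rows)
            dist += (x[j] - centroid_j) ** 2
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

def loo_accuracy(X, y, k, select_inside_fold):
    """Leave-one-out accuracy with selection inside or outside the folds."""
    fixed = None if select_inside_fold else top_features(X, y, k)
    correct = 0
    for i in range(len(X)):
        X_train, y_train = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        features = (top_features(X_train, y_train, k)
                    if select_inside_fold else fixed)
        pred = nearest_centroid_predict(X_train, y_train, X[i], features)
        correct += (pred == y[i])
    return correct / len(X)

# Pure-noise data (an assumption): labels are unrelated to any feature,
# so an honest error estimate should sit near chance level.
rng = random.Random(0)
n_samples, n_features = 30, 1000
X = [[rng.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(n_samples)]
y = [i % 2 for i in range(n_samples)]

biased = loo_accuracy(X, y, k=10, select_inside_fold=False)
honest = loo_accuracy(X, y, k=10, select_inside_fold=True)
print(f"selection outside CV: {biased:.2f}, selection inside CV: {honest:.2f}")
```

Selecting features before splitting lets every held-out sample influence which features are kept, which is why the first estimate looks optimistic even though the labels carry no signal.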

Correlation and variable importance in random forests

Author: Gregorutti Baptiste
Michel Bertrand
Saint-Pierre Philippe
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 18/04/2016

This paper is about variable selection with the random forests algorithm in the presence of correlated predictors. In high-dimensional regression or classification frameworks, variable selection is a difficult task that becomes even more challenging in the presence of highly correlated predictors. First, we provide a theoretical study of the permutation importance measure for an additive regression model. This allows us to describe how the correlation between predictors impacts the permutation importance. Our results motivate the use of the Recursive Feature Elimination (RFE) algorithm for variable selection in this context. This algorithm recursively eliminates variables using the permutation importance measure as a ranking criterion. Next, various simulation experiments illustrate the efficiency of the RFE algorithm for selecting a small number of variables together with a good prediction error. Finally, this selection algorithm is tested on the Landsat Satellite data from the UCI Machine Learning Repository.

arXiv.org e-Print Archive

Portail HAL Nantes Université

HAL: Hyper Article en Ligne
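
The RFE procedure described in the abstract above ranks variables by permutation importance, refits, and drops the weakest one. The sketch below shows that loop under a loud assumption: an ordinary least-squares model stands in for the random forest so that the example needs no external library, and importance is measured on a held-out split. The paper itself studies random forests, and the abstract does not say what data the importance is computed on. All function names and the synthetic data are hypothetical.

```python
import random

def fit_ols(X, y):
    """Least-squares fit with intercept via the normal equations
    (Gauss-Jordan elimination). Stand-in model, not a random forest."""
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    # Augmented matrix [A^T A | A^T y]
    A = [[sum(r[a] * r[b] for r in rows) for b in range(p)]
         + [sum(r[a] * t for r, t in zip(rows, y))] for a in range(p)]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [v - factor * u for v, u in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]

def mse(beta, X, y):
    """Mean squared error of the linear model beta on (X, y)."""
    preds = [beta[0] + sum(b * v for b, v in zip(beta[1:], r)) for r in X]
    return sum((q - t) ** 2 for q, t in zip(preds, y)) / len(y)

def permutation_importance(beta, X, y, j, rng):
    """Increase in MSE after shuffling column j of X."""
    column = [r[j] for r in X]
    rng.shuffle(column)
    X_perm = [r[:j] + [v] + r[j + 1:] for r, v in zip(X, column)]
    return mse(beta, X_perm, y) - mse(beta, X, y)

def rfe_permutation(X_train, y_train, X_val, y_val, n_keep, seed=0):
    """Refit, score every remaining variable, drop the least important one."""
    rng = random.Random(seed)
    remaining = list(range(len(X_train[0])))
    while len(remaining) > n_keep:
        sub_train = [[r[j] for j in remaining] for r in X_train]
        sub_val = [[r[j] for j in remaining] for r in X_val]
        beta = fit_ols(sub_train, y_train)
        scores = [permutation_importance(beta, sub_val, y_val, k, rng)
                  for k in range(len(remaining))]
        remaining.pop(scores.index(min(scores)))
    return remaining

# Synthetic data (an assumption): only variables 0 and 1 drive the response.
rng = random.Random(1)
X = [[rng.gauss(0, 1) for _ in range(5)] for _ in range(200)]
y = [3 * r[0] + 2 * r[1] + rng.gauss(0, 0.5) for r in X]

kept = rfe_permutation(X[:100], y[:100], X[100:], y[100:], n_keep=2)
print("kept variables:", kept)
```

Recomputing the importances after each removal matters most when predictors are correlated, which is the setting the abstract addresses. This uncorrelated toy example does not exercise that case.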

Hybrid optimal feature selection approach for internet of things based medical data analysis for prognosis

Author: Bel Felcia
Selvaraj Sabeen
Publication venue: Institute of Advanced Engineering and Science
Publication date: 01/06/2024

Healthcare is a very important application domain for the internet of things (IoT). The aim is to provide novel combined feature selection (FS) methods: univariate (UV) with tree-based (TB) methods, recursive feature elimination (RFE) with the least absolute shrinkage and selection operator (LASSO), mutual information (MI) with a genetic algorithm (GA), and embedded methods (EM) with univariate selection. These were applied to an internet of medical things (IoMT)-based heart disease dataset. The machine learning algorithms well suited to IoT medical data are logistic regression (LR) and support vector machine (SVM). Each combined method was applied to these machine learning algorithms to find the best classifier for prognosis. Various performance metrics were calculated for all the combined feature selection methods with logistic regression and support vector machine. RFE with LASSO applied to logistic regression achieved better performance than all other combined methods, with high accuracy, sensitivity, and area under the curve. After comparing all other combined methods with the LR and SVM classifiers, the data analysis indicates that the RFE+LASSO feature selection method using LR provides better overall performance for the IoT-based medical heart disease dataset.

IAES International Journal of Artificial Intelligence (IJ-AI)