3,864 research outputs found
Recursive SVM Feature Selection and Sample Classification for Mass-Spectrometry and Microarray Data
Background: Like microarray-based investigations, high-throughput proteomics techniques require machine learning algorithms to identify biomarkers that are informative for biological classification problems. Feature selection and classification algorithms need to be robust to noise and outliers in the data. Results: We developed a recursive support vector machine (R-SVM) algorithm to select important genes/biomarkers for the classification of noisy data. We compared its performance to a similar, state-of-the-art method (SVM recursive feature elimination, or SVM-RFE), paying special attention to the ability to recover the true informative genes/biomarkers and the robustness to outliers in the data. Simulation experiments show that a 5% to 20% improvement over SVM-RFE can be achieved with regard to these properties. The SVM-based methods are also compared with a conventional univariate method, and their respective strengths and weaknesses are discussed. R-SVM was applied to two sets of SELDI-TOF-MS proteomics data, one from a human breast cancer study and the other from a study on rat liver cirrhosis. Important biomarkers found by the algorithm were validated by follow-up biological experiments. Conclusion: The proposed R-SVM method is suitable for analyzing noisy high-throughput proteomics and microarray data, and it outperforms SVM-RFE in robustness to noise and in the ability to recover informative features. The multivariate SVM-based method outperforms the univariate method in classification performance, but univariate methods can reveal more of the differentially expressed features, especially when there are correlations between the features.
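As a rough illustration of the SVM-RFE baseline named in this abstract, the sketch below trains a linear SVM, ranks features by squared weight, and drops the weakest feature until a target count remains. It is not the R-SVM algorithm and not the authors' code. The Pegasos-style trainer, the function names (`train_linear_svm`, `svm_rfe`, `make_data`), the regularization strength `lam`, and the synthetic data are all assumptions made for this example.

```python
import random

def train_linear_svm(X, y, features, lam=1.0, epochs=20, seed=0):
    """Pegasos-style subgradient descent for a linear SVM without a bias term.
    Labels must be +1 / -1. lam is an assumed regularization strength."""
    rng = random.Random(seed)
    w = {j: 0.0 for j in features}
    order = list(range(len(X)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            margin = y[i] * sum(w[j] * X[i][j] for j in features)
            for j in features:
                w[j] *= 1.0 - eta * lam          # shrink (regularization)
                if margin < 1.0:                 # hinge loss is active
                    w[j] += eta * y[i] * X[i][j]
    return w

def svm_rfe(X, y, n_keep):
    """Retrain, then eliminate the feature with the smallest squared weight,
    until only n_keep features remain."""
    features = list(range(len(X[0])))
    eliminated = []
    while len(features) > n_keep:
        w = train_linear_svm(X, y, features)
        worst = min(features, key=lambda j: w[j] ** 2)
        features.remove(worst)
        eliminated.append(worst)
    return features, eliminated

def make_data(n=200, n_features=8, informative=(0, 1), shift=1.5, seed=1):
    """Synthetic two-class data: only the 'informative' columns carry signal."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = 1 if i % 2 == 0 else -1
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for j in informative:
            row[j] += label * shift
        X.append(row)
        y.append(label)
    return X, y

X, y = make_data()
selected, eliminated = svm_rfe(X, y, n_keep=2)
print("kept features:", sorted(selected))
print("elimination order:", eliminated)
```

Removing one feature per round is the simplest schedule; implementations for microarray-scale data commonly remove many features per round to keep the number of retraining steps manageable.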
Sentiment Analysis of Social Media Users Toward the Entertainment Tax Increase Policy Using the SVM (Support Vector Machine) Method
The Indonesian government has imposed an entertainment tax increase of 40-75% on karaoke, discotheque, bar, and steam bath or spa activities through UU No 1 Tahun 2022 (Law No. 1 of 2022) on Financial Relations between the Central Government and Regional Governments (HKPD). The policy has drawn mixed sentiment from the public, both for and against. Based on this problem, the authors carry out this analysis to determine public sentiment toward the entertainment tax increase policy, using data obtained from the social media platform Twitter. The method used is Support Vector Machine (SVM). The RFE (Recursive Feature Elimination) method is then used to measure SVM classification performance. The results show that SVM with RFE reaches an accuracy of 95%, precision of 99%, recall of 94%, and F1-score of 97%, whereas SVM classification without RFE reaches an accuracy of 93%, precision of 85%, recall of 94%, and F1-score of 88%.
Lower Limb Movements Recognition Based on Feature Recursive Elimination and Backpropagation Neural Network
Surface electromyographic (sEMG) signals are a commonly used signal source
for lower limb movement recognition because they reflect the intent of human
movement. However, improving the movement recognition rate while using fewer
features remains a challenge in this research area. In this paper, a
method for lower limb movement recognition based on support vector machine
recursive feature elimination (SVM-RFE) and a backpropagation neural network
(BPNN) is proposed. First, the sEMG signals of five subjects performing eight different
lower limb movements were recorded using a BIOPAC collector. The optimal feature
subset, consisting of 25 feature vectors, was then determined using
SVM-RFE. Finally, this study used
five supervised classification algorithms to recognize these eight different
lower limb movements. The results of the experimental study show that the
combination of the BPNN classifier and the SVM-RFE feature selection algorithm
is able to achieve an action recognition accuracy of 95%, which
supports the feasibility of this approach.
Correcting for selection bias via cross-validation in the classification of microarray data
There is increasing interest in the use of diagnostic rules based on
microarray data. These rules are formed by considering the expression levels of
thousands of genes in tissue samples taken on patients of known classification
with respect to a number of classes, representing, say, disease status or
treatment strategy. As the final versions of these rules are usually based on a
small subset of the available genes, there is a selection bias that has to be
corrected for in the estimation of the associated error rates. We consider the
problem using cross-validation. In particular, we present explicit formulae
that are useful in explaining the layers of validation that have to be
performed in order to avoid improperly cross-validated estimates. Comment:
Published at http://dx.doi.org/10.1214/193940307000000284 in the IMS
Collections (http://www.imstat.org/publications/imscollections.htm) by the
Institute of Mathematical Statistics (http://www.imstat.org)
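The selection bias described in this abstract can be reproduced on pure-noise data. The sketch below is a minimal illustration, not the paper's formulae: it compares leave-one-out error when features are selected once on all samples (outside the cross-validation loop) with the error when selection is repeated inside every fold. The nearest-centroid classifier, the mean-difference ranking, the function names, and the data sizes are assumptions chosen for brevity.

```python
import random

def top_features(X, y, k):
    """Rank features by absolute difference of class means; keep the top k."""
    pos = [row for row, label in zip(X, y) if label == 1]
    neg = [row for row, label in zip(X, y) if label == 0]

    def gap(j):
        mean_pos = sum(r[j] for r in pos) / len(pos)
        mean_neg = sum(r[j] for r in neg) / len(neg)
        return abs(mean_pos - mean_neg)

    return sorted(range(len(X[0])), key=gap, reverse=True)[:k]

def nearest_centroid_predict(X, y, features, x_new):
    """Predict the class whose centroid (on the given features) is closest."""
    best_label, best_dist = None, float("inf")
    for label in (0, 1):
        rows = [row for row, lab in zip(X, y) if lab == label]
        dist = 0.0
        for j in features:
            centroid_j = sum(r[j] for r in rows) / len(rows)
            dist += (x_new[j] - centroid_j) ** 2
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

def loo_error(X, y, k, select_inside):
    """Leave-one-out error. If select_inside is False, features are chosen
    once using ALL samples, which leaks the held-out sample into selection."""
    outer = None if select_inside else top_features(X, y, k)
    errors = 0
    for i in range(len(X)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        feats = top_features(X_train, y_train, k) if select_inside else outer
        if nearest_centroid_predict(X_train, y_train, feats, X[i]) != y[i]:
            errors += 1
    return errors / len(X)

# Pure noise: labels carry no information, so the true error rate is 50%.
rng = random.Random(0)
n, p, k = 30, 1000, 10
X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
y = [i % 2 for i in range(n)]

biased = loo_error(X, y, k, select_inside=False)
honest = loo_error(X, y, k, select_inside=True)
print("selection outside CV:", biased)
print("selection inside CV: ", honest)
```

Because the labels are unrelated to the features, any estimate well below 50% is an artifact of selecting features with the held-out sample in view.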
Correlation and variable importance in random forests
This paper is about variable selection with the random forests algorithm in
the presence of correlated predictors. In high-dimensional regression or
classification frameworks, variable selection is a difficult task that becomes
even more challenging in the presence of highly correlated predictors. First,
we provide a theoretical study of the permutation importance measure for an
additive regression model. This allows us to describe how the correlation
between predictors impacts the permutation importance. Our results motivate the
use of the Recursive Feature Elimination (RFE) algorithm for variable selection
in this context. This algorithm recursively eliminates the variables using
the permutation importance measure as a ranking criterion. Next, various simulation
experiments illustrate the efficiency of the RFE algorithm for selecting a
small number of variables together with a good prediction error. Finally, this
selection algorithm is tested on the Landsat Satellite data from the UCI
Machine Learning Repository.
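The elimination loop described in this abstract can be sketched as follows: fit a model, score each remaining variable by how much the error rises when that variable's values are shuffled, drop the lowest-scoring variable, and repeat. To stay short and dependency-free, a nearest-centroid classifier stands in for the random forest, and importance is measured on a held-out set rather than on out-of-bag samples. Both substitutions, along with all function names and the synthetic data, are assumptions for illustration only.

```python
import random

def fit_centroids(X, y, features):
    """Stand-in model: one centroid per class over the active features."""
    model = {}
    for label in sorted(set(y)):
        rows = [r for r, lab in zip(X, y) if lab == label]
        model[label] = {j: sum(r[j] for r in rows) / len(rows) for j in features}
    return model

def error_rate(model, X, y, features):
    wrong = 0
    for row, label in zip(X, y):
        pred = min(model, key=lambda c: sum((row[j] - model[c][j]) ** 2
                                            for j in features))
        wrong += pred != label
    return wrong / len(X)

def permutation_importance(model, X, y, features, rng, repeats=5):
    """Mean increase in error after shuffling one column at a time."""
    base = error_rate(model, X, y, features)
    scores = {}
    for j in features:
        total = 0.0
        for _ in range(repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            total += error_rate(model, X_perm, y, features) - base
        scores[j] = total / repeats
    return scores

def rfe_permutation(X_train, y_train, X_val, y_val, n_keep, seed=0):
    """Refit and re-rank after every elimination, dropping the least
    important variable each round."""
    rng = random.Random(seed)
    features = list(range(len(X_train[0])))
    while len(features) > n_keep:
        model = fit_centroids(X_train, y_train, features)
        scores = permutation_importance(model, X_val, y_val, features, rng)
        features.remove(min(features, key=scores.get))
    return features

def make_data(n, n_features, informative, shift, rng):
    """Synthetic two-class data: only the 'informative' columns carry signal."""
    X, y = [], []
    for i in range(n):
        label = i % 2
        sign = 1.0 if label == 1 else -1.0
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for j in informative:
            row[j] += sign * shift
        X.append(row)
        y.append(label)
    return X, y

data_rng = random.Random(42)
X_train, y_train = make_data(200, 6, (0, 1), 1.5, data_rng)
X_val, y_val = make_data(200, 6, (0, 1), 1.5, data_rng)
kept = rfe_permutation(X_train, y_train, X_val, y_val, n_keep=2)
print("kept variables:", sorted(kept))
```

Recomputing the importances after each elimination is the step that matters when predictors are correlated, since removing one variable changes the importance of the variables correlated with it.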
Hybrid optimal feature selection approach for internet of things based medical data analysis for prognosis
Healthcare is a very important application domain for the internet of things (IoT). The aim is to evaluate novel combined feature selection (FS) methods on an internet of medical things (IoMT) based heart disease dataset: univariate (UV) with tree-based (TB) methods, recursive feature elimination (RFE) with the least absolute shrinkage and selection operator (LASSO), mutual information (MI) with a genetic algorithm (GA), and embedded methods (EM) with univariate selection. The machine learning algorithms considered well suited to IoT medical data are logistic regression (LR) and support vector machine (SVM). Each combined method was applied with both algorithms to find the best classifier for prognosis. Various performance metrics were calculated for all combined feature selection methods with logistic regression and support vector machine. RFE with LASSO applied to logistic regression performed better than all other combined methods, with high accuracy, sensitivity, and area under the curve. The analysis concludes that RFE+LASSO feature selection with LR gives the best overall performance on the IoT-based medical heart disease dataset among all combined methods compared with the LR and SVM classifiers.
