50,897 research outputs found
Binarized support vector machines
The widely used Support Vector Machine (SVM) method has shown to yield very good results in
Supervised Classification problems. Other methods such as Classification Trees have become
more popular among practitioners than SVM thanks to their interpretability, which is an important
issue in Data Mining.
In this work, we propose an SVM-based method that automatically detects the most important
predictor variables, and the role they play in the classifier. In particular, the proposed method is
able to detect those values and intervals which are critical for the classification. The method
involves the optimization of a Linear Programming problem, with a large number of decision
variables. The numerical experience reported shows that a rather direct use of the standard
Column-Generation strategy leads to a classification method which, in terms of classification
ability, is competitive against the standard linear SVM and Classification Trees. Moreover, the
proposed method is robust, i.e., it is stable in the presence of outliers and invariant to change of
scale or measurement units of the predictor variables.
When the complexity of the classifier is an important issue, a wrapper feature selection method is
applied, yielding simpler, still competitive, classifiers
A real time classification algorithm for EEG-based BCI driven by self-induced emotions
Background and objective: The aim of this paper is to provide an efficient, parametric, general, and completely automatic real time classification method of electroencephalography (EEG) signals obtained from self-induced emotions. The particular characteristics of the considered low-amplitude signals (a self-induced emotion produces a signal whose amplitude is about 15% of a really experienced emotion) require exploring and adapting strategies like the Wavelet Transform, the Principal Component Analysis (PCA) and the Support Vector Machine (SVM) for signal processing, analysis and classification. Moreover, the method is thought to be used in a multi-emotions based Brain Computer Interface (BCI) and, for this reason, an ad hoc shrewdness is assumed. Method: The peculiarity of the brain activation requires ad-hoc signal processing by wavelet decomposition, and the definition of a set of features for signal characterization in order to discriminate different self-induced emotions. The proposed method is a two stages algorithm, completely parameterized, aiming at a multi-class classification and may be considered in the framework of machine learning. The first stage, the calibration, is off-line and is devoted at the signal processing, the determination of the features and at the training of a classifier. The second stage, the real-time one, is the test on new data. The PCA theory is applied to avoid redundancy in the set of features whereas the classification of the selected features, and therefore of the signals, is obtained by the SVM. Results: Some experimental tests have been conducted on EEG signals proposing a binary BCI, based on the self-induced disgust produced by remembering an unpleasant odor. Since in literature it has been shown that this emotion mainly involves the right hemisphere and in particular the T8 channel, the classification procedure is tested by using just T8, though the average accuracy is calculated and reported also for the whole set of the measured channels. Conclusions: The obtained classification results are encouraging with percentage of success that is, in the average for the whole set of the examined subjects, above 90%. An ongoing work is the application of the proposed procedure to map a large set of emotions with EEG and to establish the EEG headset with the minimal number of channels to allow the recognition of a significant range of emotions both in the field of affective computing and in the development of auxiliary communication tools for subjects affected by severe disabilities
Detection of atrial fibrillation episodes in long-term heart rhythm signals using a support vector machine
Atrial fibrillation (AF) is a serious heart arrhythmia leading to a significant increase of the risk for occurrence of ischemic stroke. Clinically, the AF episode is recognized in an electrocardiogram. However, detection of asymptomatic AF, which requires a long-term monitoring, is more efficient when based on irregularity of beat-to-beat intervals estimated by the heart rate (HR) features. Automated classification of heartbeats into AF and non-AF by means of the Lagrangian Support Vector Machine has been proposed. The classifier input vector consisted of sixteen features, including four coefficients very sensitive to beat-to-beat heart changes, taken from the fetal heart rate analysis in perinatal medicine. Effectiveness of the proposed classifier has been verified on the MIT-BIH Atrial Fibrillation Database. Designing of the LSVM classifier using very large number of feature vectors requires extreme computational efforts. Therefore, an original approach has been proposed to determine a training set of the smallest possible size that still would guarantee a high quality of AF detection. It enables to obtain satisfactory results using only 1.39% of all heartbeats as the training data. Post-processing stage based on aggregation of classified heartbeats into AF episodes has been applied to provide more reliable information on patient risk. Results obtained during the testing phase showed the sensitivity of 98.94%, positive predictive value of 98.39%, and classification accuracy of 98.86%.Web of Science203art. no. 76
Intrusion Detection in Mobile Ad Hoc Networks Using Classification Algorithms
In this paper we present the design and evaluation of intrusion detection
models for MANETs using supervised classification algorithms. Specifically, we
evaluate the performance of the MultiLayer Perceptron (MLP), the Linear
classifier, the Gaussian Mixture Model (GMM), the Naive Bayes classifier and
the Support Vector Machine (SVM). The performance of the classification
algorithms is evaluated under different traffic conditions and mobility
patterns for the Black Hole, Forging, Packet Dropping, and Flooding attacks.
The results indicate that Support Vector Machines exhibit high accuracy for
almost all simulated attacks and that Packet Dropping is the hardest attack to
detect.Comment: 12 pages, 7 figures, presented at MedHocNet 200
Classification of Radiology Reports Using Neural Attention Models
The electronic health record (EHR) contains a large amount of
multi-dimensional and unstructured clinical data of significant operational and
research value. Distinguished from previous studies, our approach embraces a
double-annotated dataset and strays away from obscure "black-box" models to
comprehensive deep learning models. In this paper, we present a novel neural
attention mechanism that not only classifies clinically important findings.
Specifically, convolutional neural networks (CNN) with attention analysis are
used to classify radiology head computed tomography reports based on five
categories that radiologists would account for in assessing acute and
communicable findings in daily practice. The experiments show that our CNN
attention models outperform non-neural models, especially when trained on a
larger dataset. Our attention analysis demonstrates the intuition behind the
classifier's decision by generating a heatmap that highlights attended terms
used by the CNN model; this is valuable when potential downstream medical
decisions are to be performed by human experts or the classifier information is
to be used in cohort construction such as for epidemiological studies
DC-Prophet: Predicting Catastrophic Machine Failures in DataCenters
When will a server fail catastrophically in an industrial datacenter? Is it
possible to forecast these failures so preventive actions can be taken to
increase the reliability of a datacenter? To answer these questions, we have
studied what are probably the largest, publicly available datacenter traces,
containing more than 104 million events from 12,500 machines. Among these
samples, we observe and categorize three types of machine failures, all of
which are catastrophic and may lead to information loss, or even worse,
reliability degradation of a datacenter. We further propose a two-stage
framework-DC-Prophet-based on One-Class Support Vector Machine and Random
Forest. DC-Prophet extracts surprising patterns and accurately predicts the
next failure of a machine. Experimental results show that DC-Prophet achieves
an AUC of 0.93 in predicting the next machine failure, and a F3-score of 0.88
(out of 1). On average, DC-Prophet outperforms other classical machine learning
methods by 39.45% in F3-score.Comment: 13 pages, 5 figures, accepted by 2017 ECML PKD
- …