Thumbs up, thumbs down: non-verbal human-robot interaction through real-time EMG classification via inductive and supervised transductive transfer learning
In this study, we present a transfer learning method for gesture classification via an inductive and supervised transductive approach with an electromyographic dataset gathered via the Myo armband. A ternary gesture classification problem is posed with the states 'thumbs up', 'thumbs down', and 'relax', allowing affirmative or negative communication with a machine in a non-verbal fashion. Of the nine statistical learning paradigms benchmarked over 10-fold cross-validation (with three methods of feature selection), an ensemble of Random Forest and Support Vector Machine combined through voting achieves the best score of 91.74% with a rule-based feature selection method. When new subjects are considered, this machine learning approach fails to generalise to new data, so the processes of Inductive and Supervised Transductive Transfer Learning are introduced together with a short (15 s) calibration exercise. Without calibration, 5 s of data per class yields the strongest classification of unseen subjects (versus one through seven seconds) yet reaches only 55% accuracy; when a short 5 s per-class calibration task is added via the suggested transfer method, a Random Forest can classify unseen data from the calibrated subject at around 97% accuracy, outperforming the 83% accuracy of the proprietary Myo system. Finally, a preliminary application is presented through social interaction with a humanoid Pepper robot, where our approach combined with a most-common-class metaclassifier achieves 100% accuracy across all trials of a '20 Questions' game.
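The voting ensemble the abstract describes can be sketched with scikit-learn. This is an illustrative sketch only, not the authors' code: synthetic data stands in for the Myo EMG features, and hyperparameters are assumptions. It shows a soft-voting Random Forest + SVM ensemble evaluated with 10-fold cross-validation, as benchmarked in the abstract.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Synthetic 3-class problem standing in for 'thumbs up' / 'thumbs down' / 'relax'
X, y = make_classification(n_samples=600, n_features=20, n_informative=10,
                           n_classes=3, random_state=0)

# Soft voting averages the two models' class probabilities
ensemble = VotingClassifier(
    estimators=[("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
                ("svm", SVC(probability=True, random_state=0))],
    voting="soft")

# 10-fold cross-validation, as in the abstract
scores = cross_val_score(ensemble, X, y, cv=10)
print(f"10-fold CV accuracy: {scores.mean():.3f}")
```

Soft voting requires `probability=True` on the SVM so that both members expose `predict_proba`; with hard voting the ensemble would instead take the majority of the two predicted labels.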
High-dimensional and one-class classification
When dealing with high-dimensional data and, in particular, when the number of attributes p is large relative to the sample size n, several classification methods cannot be applied. Fisher's linear discriminant rule and the quadratic discriminant rule are infeasible, as the inverse of the involved covariance matrices cannot be computed. A recent approach to overcome this problem is based on Random Projections (RPs), which have emerged as a powerful method for dimensionality reduction. In 2017, Cannings and Samworth introduced the RP method in the ensemble context to extend classification methods originally designed for low-dimensional data to the high-dimensional domain. Although the RP ensemble classifier improves classification accuracy, it may still include redundant information. Moreover, unlike other ensemble classifiers (e.g. Random Forest), it does not provide any insight into the actual importance of the input features for classification. To account for these aspects, in the first part of this thesis we investigate two new directions for the RP ensemble classifier. Firstly, combining the original idea of using the Multiplicative Binomial distribution as the reference model to describe and predict the ensemble accuracy with an important result on that distribution, we introduce a stepwise strategy for post-pruning (called the Ensemble Selection Algorithm). Secondly, we propose a criterion (called Variable Importance in Projection) that uses the feature coefficients in the best discriminant projections to measure variable importance in classification. In the second part, we face the new challenges posed by high-dimensional data in a recently emerging classification context: one-class classification. This is a special classification task where only one class is fully known (the target class), while information on the others is completely missing.
In particular, we address this task by using Gini's transvariation probability as a measure of typicality, aimed at identifying the best boundary around the target class.
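The random-projection ensemble idea can be illustrated in a few lines. This is a hedged sketch of the general mechanism (project p >> n data to a low dimension many times, fit a base classifier on each projection, aggregate by majority vote), not the thesis code nor the exact Cannings-Samworth procedure, which additionally selects the best projection within blocks; dimensions and data here are invented for illustration.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.random_projection import GaussianRandomProjection

rng = np.random.default_rng(0)
n, p, d, B = 80, 500, 5, 50   # n samples, p features (p >> n), d-dim projections, B members
X = rng.standard_normal((n, p))
y = (X[:, 0] + X[:, 1] > 0).astype(int)   # class signal lives in the first two features

votes = np.zeros((n, 2))
for b in range(B):
    proj = GaussianRandomProjection(n_components=d, random_state=b)
    Z = proj.fit_transform(X)
    # LDA is feasible in d dimensions even though it is infeasible in the original p
    clf = LinearDiscriminantAnalysis().fit(Z, y)
    pred = clf.predict(Z)
    votes[np.arange(n), pred] += 1

# Majority vote over the B projected classifiers
y_hat = votes.argmax(axis=1)
print("training accuracy of the RP ensemble:", (y_hat == y).mean())
```

The thesis's two contributions slot into this loop: post-pruning removes redundant ensemble members before the vote, and Variable Importance in Projection reads the discriminant coefficients of the best projections back onto the original p features.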
On Machine-Learned Classification of Variable Stars with Sparse and Noisy Time-Series Data
With the coming data deluge from synoptic surveys, there is a growing need
for frameworks that can quickly and automatically produce calibrated
classification probabilities for newly-observed variables based on a small
number of time-series measurements. In this paper, we introduce a methodology
for variable-star classification, drawing from modern machine-learning
techniques. We describe how to homogenize the information gleaned from light
curves by selection and computation of real-numbered metrics ("features"),
detail methods to robustly estimate periodic light-curve features, introduce
tree-ensemble methods for accurate variable star classification, and show how
to rigorously evaluate the classification results using cross validation. On a
25-class data set of 1542 well-studied variable stars, we achieve a 22.8%
overall classification error using the random forest classifier; this
represents a 24% improvement over the best previous classifier on these data.
This methodology is effective for identifying samples of specific science
classes: for pulsational variables used in Milky Way tomography we obtain a
discovery efficiency of 98.2% and for eclipsing systems we find an efficiency
of 99.1%, both at 95% purity. We show that the random forest (RF) classifier is
superior to other machine-learned methods in terms of accuracy, speed, and
relative immunity to features with no useful class information; the RF
classifier can also be used to estimate the importance of each feature in
classification. Additionally, we present the first astronomical use of
hierarchical classification methods to incorporate a known class taxonomy in
the classifier, which further reduces the catastrophic error rate to 7.8%.
Excluding low-amplitude sources, our overall error rate improves to 14%, with a
catastrophic error rate of 3.5%.
Comment: 23 pages, 9 figures
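One property the abstract highlights, that a random forest can estimate the importance of each feature, is easy to demonstrate. This sketch uses synthetic data in place of the paper's light-curve features (period, amplitude, and so on are not modelled here); only the mechanism is shown.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic multi-class data standing in for light-curve features
X, y = make_classification(n_samples=500, n_features=8, n_informative=3,
                           n_classes=4, n_clusters_per_class=1, random_state=0)
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Impurity-based importances sum to 1; higher means more useful for separating classes
for i, imp in enumerate(rf.feature_importances_):
    print(f"feature {i}: {imp:.3f}")
```

This also illustrates the "relative immunity" claim: features carrying no class information receive importances near zero, so they barely influence the trees' splits.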
Combining Static and Dynamic Features for Multivariate Sequence Classification
Model precision in a classification task is highly dependent on the feature
space that is used to train the model. Moreover, whether the features are
sequential or static will dictate which classification method can be applied as
most of the machine learning algorithms are designed to deal with either one or
another type of data. In real-life scenarios, however, it is often the case
that both static and dynamic features are present, or can be extracted from the
data. In this work, we demonstrate how generative models such as Hidden Markov
Models (HMM) and Long Short-Term Memory (LSTM) artificial neural networks can
be used to extract temporal information from the dynamic data. We explore how
the extracted information can be combined with the static features in order to
improve the classification performance. We evaluate the existing techniques and
suggest a hybrid approach, which outperforms other methods on several public
datasets.
Comment: Presented at IEEE DSAA 201
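The hybrid idea above can be sketched in a minimal form: map each variable-length sequence to a fixed-length representation, concatenate it with the static features, and train one classifier on the combined space. For runnability, simple summary statistics stand in here for the HMM- or LSTM-derived temporal features the paper uses; the data, the `temporal_features` helper, and all sizes are invented for illustration, but the combination step is the same.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

def temporal_features(seq):
    """Fixed-length stand-in for an HMM log-likelihood or LSTM embedding."""
    return np.array([seq.mean(), seq.std(), seq.min(), seq.max()])

# Synthetic data: each sample has a variable-length sequence plus 3 static features
n = 200
labels = rng.integers(0, 2, size=n)
sequences = [rng.standard_normal(rng.integers(20, 50)) + labels[i] for i in range(n)]
static = rng.standard_normal((n, 3)) + labels[:, None] * 0.5

# Concatenate dynamic (sequence-derived) and static features into one matrix
dynamic = np.vstack([temporal_features(s) for s in sequences])
X = np.hstack([dynamic, static])

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, labels)
print("training accuracy on the combined feature space:", clf.score(X, labels))
```

The design point is that once the sequences are reduced to fixed-length vectors, any standard classifier can consume both feature types at once, which is exactly what a purely sequential or purely static method cannot do.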