A pragmatic approach to multi-class classification
We present a novel hierarchical approach to multi-class classification which
is generic in that it can be applied to different classification models (e.g.,
support vector machines, perceptrons), and makes no explicit assumptions about
the probabilistic structure of the problem, as is usually done in multi-class
classification. By adding a cascade of additional classifiers, each of which
receives the previous classifier's output in addition to regular input data,
the approach harnesses unused information that manifests itself in the form of,
e.g., correlations between predicted classes. Using multilayer perceptrons as a
classification model, we demonstrate the validity of this approach by testing
it on a complex ten-class 3D gesture recognition task. Comment: European Symposium on Artificial Neural Networks (ESANN), Apr 2015,
Bruges, Belgium. 201
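The cascade idea in the abstract above can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: it assumes a simple multi-class perceptron in place of the multilayer perceptrons used in the paper, and it assumes the "previous classifier's output" is passed on as a one-hot encoding of the predicted class. All function names and the toy data are hypothetical.

```python
def scores(W, x):
    """Linear score per class; the last weight in each row is the bias."""
    xb = x + [1.0]
    return [sum(w * v for w, v in zip(row, xb)) for row in W]

def predict(W, x):
    s = scores(W, x)
    return max(range(len(s)), key=s.__getitem__)

def train_perceptron(X, y, n_classes, epochs=20):
    """Multi-class perceptron: one weight vector (plus bias) per class."""
    W = [[0.0] * (len(X[0]) + 1) for _ in range(n_classes)]
    for _ in range(epochs):
        for x, target in zip(X, y):
            pred = predict(W, x)
            if pred != target:
                # Move the true class towards x and the wrong class away.
                for j, v in enumerate(x + [1.0]):
                    W[target][j] += v
                    W[pred][j] -= v
    return W

def one_hot(k, n):
    return [1.0 if i == k else 0.0 for i in range(n)]

def train_cascade(X, y, n_classes, n_stages=2):
    """Each later stage sees the raw input plus the previous stage's output."""
    stages, inputs = [], X
    for _ in range(n_stages):
        W = train_perceptron(inputs, y, n_classes)
        stages.append(W)
        inputs = [x + one_hot(predict(W, z), n_classes)
                  for x, z in zip(X, inputs)]
    return stages

def cascade_predict(stages, x, n_classes):
    z, k = x, None
    for W in stages:
        k = predict(W, z)
        z = x + one_hot(k, n_classes)  # augmented input for the next stage
    return k

# Hypothetical toy data: three well-separated 2D clusters.
X = [[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.2, 4.9], [0.0, 5.0], [0.2, 5.1]]
y = [0, 0, 1, 1, 2, 2]
stages = train_cascade(X, y, n_classes=3)
label = cascade_predict(stages, [5.1, 5.0], n_classes=3)
```

The first stage here has 2 inputs plus a bias, while the second has 2 + 3 inputs plus a bias, which is where the extra information about predicted classes enters.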
Learning the LMP-Load Coupling From Data: A Support Vector Machine Based Approach
This paper investigates the fundamental coupling between loads and locational
marginal prices (LMPs) in security-constrained economic dispatch (SCED).
Theoretical analysis based on multi-parametric programming theory points out
the unique one-to-one mapping between load and LMP vectors. Such one-to-one
mapping is depicted by the concept of system pattern region (SPR) and
identifying SPRs is the key to understanding the LMP-load coupling. Built upon
the characteristics of SPRs, the SPR identification problem is modeled as a
classification problem from a market participant's viewpoint, and a Support
Vector Machine based data-driven approach is proposed. It is shown that even
without the knowledge of system topology and parameters, the SPRs can be
estimated by learning from historical load and price data. Visualization and
illustration of the proposed data-driven approach are performed on a 3-bus
system as well as the IEEE 118-bus system.
Online Robot Introspection via Wrench-based Action Grammars
Robotic failure is all too common in unstructured robot tasks. Despite
well-designed controllers, robots often fail due to unexpected events. How do
robots measure unexpected events? Many do not. Most robots are driven by the
sense-plan-act paradigm; more recently, however, robots are adopting a
sense-plan-act-verify paradigm. In this work, we present a principled
methodology to bootstrap online robot introspection for contact tasks. In
effect, we are trying to enable the robot to answer the question: what did I
do? Is my behavior as expected or not? To this end, we analyze noisy wrench
data and postulate that the latter inherently contains patterns that can be
effectively represented by a vocabulary. The vocabulary is generated by
segmenting and encoding the data. When the wrench information represents a
sequence of sub-tasks, we can think of the vocabulary as forming a sentence (a
set of words with grammar rules) for a given sub-task, allowing the latter to be
uniquely represented. The grammar, which can also include unexpected events,
was classified in offline and online scenarios as well as for simulated and
real robot experiments. Multiclass Support Vector Machines (SVMs) were used
offline, while probabilistic SVMs were used online to give temporal
confidence to the introspection result. The contribution of our work is the
presentation of a generalizable online semantic scheme that enables a robot to
understand its high-level state, whether nominal or abnormal. It is shown to
work in offline and online scenarios for a particularly challenging contact
task: snap assemblies. We perform the snap assembly in simulated and real
one-arm experiments and in a simulated two-arm experiment. This verification
mechanism can be used by high-level planners or reasoning systems to enable
intelligent failure recovery or to determine the next optimal manipulation
skill to be used. Comment: arXiv admin note: substantial text overlap with arXiv:1609.0494
On Machine-Learned Classification of Variable Stars with Sparse and Noisy Time-Series Data
With the coming data deluge from synoptic surveys, there is a growing need
for frameworks that can quickly and automatically produce calibrated
classification probabilities for newly-observed variables based on a small
number of time-series measurements. In this paper, we introduce a methodology
for variable-star classification, drawing from modern machine-learning
techniques. We describe how to homogenize the information gleaned from light
curves by selection and computation of real-numbered metrics ("features"),
detail methods to robustly estimate periodic light-curve features, introduce
tree-ensemble methods for accurate variable star classification, and show how
to rigorously evaluate the classification results using cross validation. On a
25-class data set of 1542 well-studied variable stars, we achieve a 22.8%
overall classification error using the random forest classifier; this
represents a 24% improvement over the best previous classifier on these data.
This methodology is effective for identifying samples of specific science
classes: for pulsational variables used in Milky Way tomography we obtain a
discovery efficiency of 98.2% and for eclipsing systems we find an efficiency
of 99.1%, both at 95% purity. We show that the random forest (RF) classifier is
superior to other machine-learned methods in terms of accuracy, speed, and
relative immunity to features with no useful class information; the RF
classifier can also be used to estimate the importance of each feature in
classification. Additionally, we present the first astronomical use of
hierarchical classification methods to incorporate a known class taxonomy in
the classifier, which further reduces the catastrophic error rate to 7.8%.
Excluding low-amplitude sources, our overall error rate improves to 14%, with a
catastrophic error rate of 3.5%. Comment: 23 pages, 9 figures
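The figures quoted in this abstract can be related with a little arithmetic. The sketch below assumes that the "24% improvement" is a relative reduction in error rate, that efficiency is the fraction of true class members recovered, and that purity is the fraction of selected objects that truly belong to the class. These are the usual conventions, but the abstract does not define them. The function names and the counts are hypothetical.

```python
def efficiency(tp, fn):
    """Fraction of true members of a class that the classifier recovers."""
    return tp / (tp + fn)

def purity(tp, fp):
    """Fraction of objects assigned to a class that truly belong to it."""
    return tp / (tp + fp)

def implied_baseline_error(new_error, relative_improvement):
    """Previous error rate, if the improvement is a relative error reduction."""
    return new_error / (1.0 - relative_improvement)

# 22.8% error after a 24% relative improvement implies a baseline near 30%.
baseline = implied_baseline_error(0.228, 0.24)

# Hypothetical counts, chosen only to illustrate a sample at 95% purity.
sample_purity = purity(tp=95, fp=5)
sample_efficiency = efficiency(tp=95, fn=5)
```

Under that reading, the best previous classifier would have had an overall error of roughly 30% on the same data.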
A study of hierarchical and flat classification of proteins
Automatic classification of proteins using machine learning is an important problem that has received significant attention in the literature. One feature of this problem is that expert-defined hierarchies of protein classes exist and can potentially be exploited to improve classification performance. In this article we investigate empirically whether this is the case for two such hierarchies. We compare multi-class classification techniques that exploit the information in those class hierarchies and those that do not, using logistic regression, decision trees, bagged decision trees, and support vector machines as the underlying base learners. In particular, we compare hierarchical and flat variants of ensembles of nested dichotomies. The latter have been shown to deliver strong classification performance in multi-class settings. We present experimental results for synthetic, fold recognition, enzyme classification, and remote homology detection data. Our results show that exploiting the class hierarchy improves performance on the synthetic data, but not in the case of the protein classification problems. Based on this we recommend that strong flat multi-class methods be used as a baseline to establish the benefit of exploiting class hierarchies in this area.
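For readers unfamiliar with nested dichotomies, the following is a minimal sketch of how one such tree turns binary decisions into multi-class probabilities. It is an illustration under stated assumptions, not the code used in the study: the tree layout, the class labels, and the fixed branch probabilities are hypothetical, and in practice each branch probability would come from a base learner such as logistic regression applied to the instance being classified.

```python
def class_probabilities(node, branch_prob):
    """Class probabilities from a nested dichotomy.

    node: a class label (leaf) or a pair (left_subtree, right_subtree).
          Class labels must not themselves be tuples.
    branch_prob(node): probability of the left branch at an internal node.
    """
    if not isinstance(node, tuple):
        return {node: 1.0}
    left, right = node
    p_left = branch_prob(node)
    probs = {}
    # A class's probability is the product of branch probabilities on its path.
    for label, p in class_probabilities(left, branch_prob).items():
        probs[label] = p_left * p
    for label, p in class_probabilities(right, branch_prob).items():
        probs[label] = (1.0 - p_left) * p
    return probs

# Hypothetical four-class tree: {a, b} versus {c, d}, then one split on each side.
tree = (("a", "b"), ("c", "d"))
fixed = {tree: 0.7, ("a", "b"): 0.6, ("c", "d"): 0.5}
probs = class_probabilities(tree, fixed.__getitem__)
```

In a hierarchical variant the tree would follow the expert-defined class hierarchy, whereas a flat variant would sample the tree structure without regard to it. An ensemble averages the class probabilities over several trees.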