Totally Corrective Multiclass Boosting with Binary Weak Learners
In this work, we propose a new optimization framework for multiclass boosting
learning. In the literature, AdaBoost.MO and AdaBoost.ECC are two
successful multiclass boosting algorithms that can use binary weak learners.
We explicitly derive these two algorithms' Lagrange dual problems based on
their regularized loss functions. We show that the Lagrange dual formulations
enable us to design totally-corrective multiclass algorithms by using the
primal-dual optimization technique. Experiments on benchmark data sets suggest
that our multiclass boosting can achieve generalization capability comparable
to the state of the art, while converging much faster than stage-wise
gradient descent boosting. In other words, the new totally corrective
algorithms can maximize the margin more aggressively.
Comment: 11 pages
Solving for multi-class using orthogonal coding matrices
A common method of generalizing binary to multi-class classification is the
error correcting code (ECC). ECCs may be optimized in a number of ways, for
instance by making them orthogonal. Here we test two types of orthogonal ECCs
on seven different datasets using three types of binary classifier and compare
them with three other multi-class methods: 1 vs. 1, one-versus-the-rest and
random ECCs. The first type of orthogonal ECC, in which the codes contain no
zeros, admits a fast and simple method of solving for the probabilities.
Orthogonal ECCs are always more accurate than random ECCs, as predicted by
recent literature. Improvements in uncertainty coefficient (U.C.) range between
0.4--17.5% (0.004--0.139, absolute), while improvements in Brier score between
0.7--10.7%. Unfortunately, orthogonal ECCs are rarely more accurate than 1 vs.
1. Disparities are worst when the methods are paired with logistic regression,
with orthogonal ECCs never beating 1 vs. 1. When the methods are paired with
SVM, the losses are less significant, peaking at 1.5% relative (0.011 absolute)
in uncertainty coefficient and 6.5% in Brier score. Orthogonal ECCs are always
the fastest of the five multi-class methods when paired with linear
classifiers. When paired with a piecewise linear classifier, whose
classification speed does not depend on the number of training samples,
classifications using orthogonal ECCs were always more accurate than the
remaining three methods and also faster than 1 vs. 1. Losses against 1 vs. 1
here were higher, peaking at 1.9% (0.017, absolute), in U.C. and 39% in Brier
score. Gains in speed ranged between 1.1% and over 100%. Whether the speed
increase is worth the penalty in accuracy will depend on the application.
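
The idea of an orthogonal coding matrix without zeros can be illustrated with a minimal sketch. It assumes codewords taken from a Sylvester-type Hadamard matrix and a simple inner-product decoder; neither choice is stated in the abstract, and the function names are hypothetical. Each column would correspond to one binary classifier. The Sylvester construction leaves the first column constant, which a real design would have to handle.

```python
def hadamard(n):
    """Sylvester construction of an n x n matrix of +1/-1 (n a power of two)."""
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h


def orthogonal_codes(num_classes):
    """One +1/-1 codeword per class; codewords are mutually orthogonal.

    Assumption: the all-ones first row is skipped because it would put
    every class-independent output on the same side.
    """
    size = 1
    while size < num_classes + 1:
        size *= 2
    return hadamard(size)[1:num_classes + 1]


def decode(codes, outputs):
    """Return the index of the class whose codeword correlates best
    with the vector of binary-classifier outputs."""
    scores = [sum(c * o for c, o in zip(code, outputs)) for code in codes]
    return max(range(len(codes)), key=scores.__getitem__)


codes = orthogonal_codes(4)      # 4 classes, codewords of length 8
noisy = list(codes[2])
noisy[3] = -noisy[3]             # simulate one wrong binary classifier
predicted = decode(codes, noisy)
```

Because distinct codewords have zero inner product, a clean codeword scores 8 against itself and 0 against the others, so a single flipped output still leaves the true class with the highest score.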
Maximum Margin Multiclass Nearest Neighbors
We develop a general framework for margin-based multicategory classification
in metric spaces. The basic work-horse is a margin-regularized version of the
nearest-neighbor classifier. We prove generalization bounds that match the
state of the art in sample size and significantly improve the dependence on
the number of classes. Our point of departure is a nearly Bayes-optimal
finite-sample risk bound independent of the number of classes. Although free of
this dependence, the bound is unregularized and non-adaptive, which motivates
our main result: Rademacher and scale-sensitive margin bounds with a
logarithmic dependence on the number of classes. Our bound is exponentially
sharper than the best previous risk estimates in this setting. From the
algorithmic standpoint, in doubling metric
spaces our classifier may be trained on examples in time and
evaluated on new points in time
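
As a rough illustration of the objects involved, the sketch below shows a plain nearest-neighbor rule over an arbitrary metric, together with one common notion of a point's margin: the distance to the nearest differently labeled point minus the distance to the nearest point with the same label. This is an assumption made for illustration only; it is not the margin-regularized algorithm of the paper, and the function names are hypothetical.

```python
import math


def euclidean(a, b):
    """Default metric; any function satisfying the metric axioms would do."""
    return math.dist(a, b)


def nn_predict(sample, x, metric=euclidean):
    """Label of the sample point closest to x.

    sample is a list of (point, label) pairs.
    """
    _, label = min(sample, key=lambda pair: metric(pair[0], x))
    return label


def margin(sample, x, y, metric=euclidean):
    """Distance from x to the nearest point with a different label,
    minus the distance to the nearest point labeled y."""
    same = min(metric(p, x) for p, lab in sample if lab == y)
    other = min(metric(p, x) for p, lab in sample if lab != y)
    return other - same


sample = [((0, 0), "a"), ((1, 0), "a"), ((5, 0), "b"), ((0, 5), "c")]
label = nn_predict(sample, (4, 0))
m = margin(sample, (0.5, 0), "a")
```

A positive margin means the point sits closer to its own class than to any other; a brute-force search like this takes time linear in the sample size per query.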
TwiSE at SemEval-2016 Task 4: Twitter Sentiment Classification
This paper describes the participation of the team "TwiSE" in the SemEval
2016 challenge. Specifically, we participated in Task 4, namely "Sentiment
Analysis in Twitter" for which we implemented sentiment classification systems
for subtasks A, B, C and D. Our approach consists of two steps. In the first
step, we generate and validate diverse feature sets for twitter sentiment
evaluation, inspired by the work of participants of previous editions of such
challenges. In the second step, we focus on the optimization of the evaluation
measures of the different subtasks. To this end, we examine different learning
strategies by validating them on the data provided by the task organisers. For
our final submissions we used an ensemble learning approach (stacked
generalization) for Subtask A and single linear models for the rest of the
subtasks. In the official leaderboard we were ranked 9/35, 8/19, 1/11 and 2/14
for subtasks A, B, C and D respectively. We make the code available
for research purposes at
https://github.com/balikasg/SemEval2016-Twitter_Sentiment_Evaluation.
Benchmark of structured machine learning methods for microbial identification from mass-spectrometry data
Microbial identification is a central issue in microbiology, in particular in
the fields of infectious diseases diagnosis and industrial quality control. The
concept of species is tightly linked to the concept of biological and clinical
classification where the proximity between species is generally measured in
terms of evolutionary distances and/or clinical phenotypes. Surprisingly, the
information provided by this well-known hierarchical structure is rarely used
by machine learning-based automatic microbial identification systems.
Structured machine learning methods were recently proposed for taking into
account the structure embedded in a hierarchy and using it as additional a
priori information, and could therefore help improve microbial
identification systems. We test and compare several state-of-the-art machine
learning methods for microbial identification on a new Matrix-Assisted Laser
Desorption/Ionization Time-of-Flight mass spectrometry (MALDI-TOF MS) dataset.
The benchmark includes standard methods and structured methods that leverage
knowledge of the underlying hierarchical structure in the learning process. Our
results show that although some methods perform better than others, structured
methods do not consistently perform better than their "flat" counterparts. We
postulate that this is partly due to the fact that standard methods already
reach a high level of accuracy in this context, and that they mainly confuse
species close to each other in the tree, a case where using the known hierarchy
is not helpful.
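
One way to quantify how close two species are in the tree is the number of edges on the path between them. The sketch below assumes the hierarchy is given as a child-to-parent mapping; the taxonomy, node names, and function names are hypothetical placeholders, not data from the benchmark.

```python
def path_to_root(parent, node):
    """List of nodes from node up to the root, following parent links."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path


def tree_distance(parent, a, b):
    """Number of edges between a and b through their lowest common ancestor."""
    depth_from_a = {n: i for i, n in enumerate(path_to_root(parent, a))}
    for steps_from_b, n in enumerate(path_to_root(parent, b)):
        if n in depth_from_a:
            return depth_from_a[n] + steps_from_b
    raise ValueError("nodes are not in the same tree")


# Hypothetical two-level taxonomy: species -> genus -> family.
parent = {
    "species_a1": "genus_a",
    "species_a2": "genus_a",
    "species_b1": "genus_b",
    "genus_a": "family",
    "genus_b": "family",
}

near = tree_distance(parent, "species_a1", "species_a2")  # same genus
far = tree_distance(parent, "species_a1", "species_b1")   # different genera
```

Averaging this distance over a classifier's errors would show whether confusions concentrate among neighboring species, which is the situation the abstract describes.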