Totally Corrective Multiclass Boosting with Binary Weak Learners
In this work, we propose a new optimization framework for multiclass boosting
learning. In the literature, AdaBoost.MO and AdaBoost.ECC are two
successful multiclass boosting algorithms that can use binary weak learners.
We explicitly derive these two algorithms' Lagrange dual problems based on
their regularized loss functions. We show that the Lagrange dual formulations
enable us to design totally-corrective multiclass algorithms by using the
primal-dual optimization technique. Experiments on benchmark data sets suggest
that our multiclass boosting can achieve generalization capability comparable
to the state of the art, while converging much faster than stage-wise
gradient descent boosting. In other words, the new totally corrective
algorithms can maximize the margin more aggressively.
Comment: 11 pages
Mathematical Programming Decoding of Binary Linear Codes: Theory and Algorithms
Mathematical programming is a branch of applied mathematics and has recently
been used to derive new decoding approaches, challenging established but often
heuristic algorithms based on iterative message passing. Concepts from
mathematical programming used in the context of decoding include linear,
integer, and nonlinear programming, network flows, notions of duality as well
as matroid and polyhedral theory. This survey article reviews and categorizes
decoding methods based on mathematical programming approaches for binary linear
codes over binary-input memoryless symmetric channels.
Comment: 17 pages, submitted to the IEEE Transactions on Information Theory.
Published July 201
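The mathematical programming view in the abstract above rests on a standard fact: for a binary-input memoryless symmetric channel, maximum-likelihood decoding is equivalent to minimizing a linear cost (the sum of log-likelihood ratios over the positions set to 1) over all codewords. The sketch below only illustrates that objective by brute-force enumeration; it is not one of the surveyed methods, which typically work with relaxations of this problem. The (7,4) Hamming parity-check matrix, the binary symmetric channel, the crossover probability, and the names `is_codeword` and `ml_decode` are all assumptions chosen for illustration.

```python
import itertools
import math

# Parity-check matrix of the (7,4) Hamming code (illustrative choice).
H = [
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]


def is_codeword(x):
    """Return True if every parity check of H is satisfied modulo 2."""
    return all(sum(h * b for h, b in zip(row, x)) % 2 == 0 for row in H)


def ml_decode(received, crossover=0.1):
    """Brute-force ML decoding on a binary symmetric channel.

    Minimizes sum_i llr[i] * x[i] over all codewords x, where llr[i] is
    the log-likelihood ratio log(P(y_i | 0) / P(y_i | 1)).
    """
    scale = math.log((1 - crossover) / crossover)
    llr = [scale if y == 0 else -scale for y in received]
    codewords = (
        x
        for x in itertools.product((0, 1), repeat=len(received))
        if is_codeword(x)
    )
    return min(codewords, key=lambda x: sum(l * b for l, b in zip(llr, x)))


# The all-zero codeword with a single flipped bit.
decoded_zero = ml_decode([0, 0, 1, 0, 0, 0, 0])
# The codeword (1, 1, 1, 0, 0, 0, 0) with its fourth bit flipped.
decoded_other = ml_decode([1, 1, 1, 1, 0, 0, 0])
```

Enumeration costs time exponential in the block length, which is why practical approaches replace the codeword set with a tractable outer approximation.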
Solving for multi-class using orthogonal coding matrices
A common method of generalizing binary to multi-class classification is the
error correcting code (ECC). ECCs may be optimized in a number of ways, for
instance by making them orthogonal. Here we test two types of orthogonal ECCs
on seven different datasets using three types of binary classifier and compare
them with three other multi-class methods: 1 vs. 1, one-versus-the-rest and
random ECCs. The first type of orthogonal ECC, in which the codes contain no
zeros, admits a fast and simple method of solving for the probabilities.
Orthogonal ECCs are always more accurate than random ECCs as predicted by
recent literature. Improvements in uncertainty coefficient (U.C.) range from
0.4% to 17.5% (0.004 to 0.139, absolute), while improvements in Brier score range from
0.7% to 10.7%. Unfortunately, orthogonal ECCs are rarely more accurate than 1 vs.
1. Disparities are worst when the methods are paired with logistic regression,
with orthogonal ECCs never beating 1 vs. 1. When the methods are paired with
SVM, the losses are less significant, peaking at 1.5% relative (0.011 absolute)
in uncertainty coefficient and 6.5% in Brier scores. Orthogonal ECCs are always
the fastest of the five multi-class methods when paired with linear
classifiers. When paired with a piecewise linear classifier, whose
classification speed does not depend on the number of training samples,
classifications using orthogonal ECCs were always more accurate than the
remaining three methods and also faster than 1 vs. 1. Losses against 1 vs. 1
here were higher, peaking at 1.9% (0.017 absolute) in U.C. and 39% in Brier
score. Gains in speed ranged between 1.1% and over 100%. Whether the speed
increase is worth the penalty in accuracy will depend on the application.
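One way the "fast and simple" solution for zero-free orthogonal codes could work is sketched below. It assumes a coding matrix A with entries of +1 or -1 (rows are binary classifiers, columns are classes), that each binary classifier returns r_i = sum_j A[i][j] * p_j, the difference between the probabilities of its two class groups, and that the columns of A are orthogonal so that A^T A = n I. Under those assumptions the class probabilities follow directly as p = A^T r / n. The Sylvester Hadamard construction, the function names, and the test probabilities are illustrative and are not taken from the abstract.

```python
def hadamard(order):
    """Sylvester construction of a Hadamard matrix; order must be a power of two."""
    h = [[1]]
    while len(h) < order:
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h


def solve_probabilities(code, r):
    """Recover class probabilities as p = A^T r / n for orthogonal columns."""
    n = len(code)      # number of binary classifiers (rows)
    m = len(code[0])   # number of classes (columns)
    return [sum(code[i][j] * r[i] for i in range(n)) / n for j in range(m)]


# Four classes coded with a 4 x 4 Hadamard matrix (assumed example).
code = hadamard(4)
true_p = [0.5, 0.25, 0.125, 0.125]

# Simulate ideal binary classifier outputs r = A p.
r = [sum(a * p for a, p in zip(row, true_p)) for row in code]

est_p = solve_probabilities(code, r)
```

The first row of this particular matrix is all ones, so its output is always 1 and it needs no trained classifier; it simply encodes the constraint that the probabilities sum to one. With real, noisy classifier outputs the estimate may fall outside [0, 1] and would need to be corrected, which this sketch does not attempt.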
Enhanced Integrated Scoring for Cleaning Dirty Texts
An increasing number of approaches for ontology engineering from text are
gearing towards the use of online sources such as company intranets and the
World Wide Web. Despite this rise, little work can be found on
preprocessing and cleaning dirty texts from online sources. This paper presents
an enhancement of an Integrated Scoring for Spelling error correction,
Abbreviation expansion and Case restoration (ISSAC). ISSAC is implemented as
part of a text preprocessing phase in an ontology engineering system. New
evaluations performed on the enhanced ISSAC using 700 chat records reveal an
improved accuracy of 98%, compared to 96.5% and 71% when using only
basic ISSAC and Aspell, respectively.
Comment: More information is available at
http://explorer.csse.uwa.edu.au/reference