Greedy Algorithms for Cone Constrained Optimization with Convergence Guarantees
Greedy optimization methods such as Matching Pursuit (MP) and Frank-Wolfe
(FW) algorithms regained popularity in recent years due to their simplicity,
effectiveness and theoretical guarantees. MP and FW address optimization over
the linear span and the convex hull of a set of atoms, respectively. In this
paper, we consider the intermediate case of optimization over the convex cone,
parametrized as the conic hull of a generic atom set, leading to the first
principled definitions of non-negative MP algorithms for which we give explicit
convergence rates and demonstrate excellent empirical performance. In
particular, we derive sublinear ($\mathcal{O}(1/t)$) convergence on general
smooth and convex objectives, and linear convergence ($\mathcal{O}(e^{-t})$) on
strongly convex objectives, in both cases for general sets of atoms.
Furthermore, we establish a clear correspondence of our algorithms to known
algorithms from the MP and FW literature. Our novel algorithms and analyses
target general atom sets and general objective functions, and hence are
directly applicable to a large variety of learning settings. Comment: NIPS 2017
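
To make the cone-constrained setting concrete, here is a minimal sketch of a non-negative MP loop for the simplest objective, 0.5*||y - x||^2, over the conic hull of a finite atom set stored as columns of a matrix. All names are illustrative assumptions; the paper's algorithms and analysis cover general smooth objectives and general atom sets, not just this quadratic.

    import numpy as np

    def non_negative_mp(y, atoms, n_iters=100):
        # Greedy sketch of non-negative MP for min_x 0.5*||y - x||^2
        # over the conic hull of `atoms` (one atom per column).
        # Illustrative only; the paper treats general smooth objectives.
        d, n = atoms.shape
        coeffs = np.zeros(n)
        x = np.zeros(d)
        for _ in range(n_iters):
            residual = y - x             # negative gradient of the quadratic
            scores = atoms.T @ residual  # alignment of each atom with the residual
            i = int(np.argmax(scores))
            if scores[i] <= 0:           # no atom gives feasible descent in the cone
                break
            a = atoms[:, i]
            step = scores[i] / (a @ a)   # exact line search; positive by construction
            coeffs[i] += step
            x = x + step * a
        return x, coeffs

The stopping test scores[i] <= 0 is where the conic constraint shows up: an atom only helps if it can enter with a non-negative coefficient, which is exactly what distinguishes this from unconstrained MP over the linear span.
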
An extended orthogonal forward regression algorithm for system identification using entropy
In this paper, a fast algorithm for the identification of nonlinear dynamic stochastic systems is presented. The algorithm extends the classical Orthogonal Forward Regression (OFR) algorithm so that, instead of using the Error Reduction Ratio (ERR) for term selection, a new optimality criterion, Shannon's Entropy Power Reduction Ratio (EPRR), is introduced to deal with both Gaussian and non-Gaussian signals. It is shown that the new algorithm is both fast and reliable, and examples are provided to illustrate the effectiveness of the new approach.
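
A rough sketch of how such a selection loop could look is below, with a crude histogram plug-in estimate standing in for the entropy power (for Gaussian residuals it approaches the variance, recovering ERR-like behaviour). Both helper functions are assumptions for illustration; the paper defines the exact EPRR criterion and estimator.

    import numpy as np

    def entropy_power(x, bins=32):
        # Plug-in estimate of (1/(2*pi*e)) * exp(2*h(x)) from a histogram;
        # crude, but it equals the variance in the Gaussian limit.
        hist, edges = np.histogram(x, bins=bins, density=True)
        mass = hist * np.diff(edges)
        nz = mass > 0
        h = -np.sum(mass[nz] * np.log(hist[nz]))   # differential entropy estimate
        return np.exp(2.0 * h) / (2.0 * np.pi * np.e)

    def forward_regression(y, candidates, n_terms):
        # Greedy OFR-style loop: orthogonalize each unselected candidate
        # against the chosen basis, then keep the term whose inclusion
        # most reduces the residual's entropy power (EPRR-style criterion).
        residual, basis, selected = y.copy(), [], []
        for _ in range(n_terms):
            best, best_ep = None, np.inf
            for j, c in enumerate(candidates):
                if j in selected:
                    continue
                q = c.copy()
                for b in basis:                     # Gram-Schmidt step
                    q -= (b @ q) / (b @ b) * b
                if q @ q < 1e-12:                   # candidate already spanned
                    continue
                r = residual - (q @ residual) / (q @ q) * q
                ep = entropy_power(r)
                if ep < best_ep:
                    best, best_ep, best_q, best_r = j, ep, q, r
            if best is None:
                break
            selected.append(best)
            basis.append(best_q)
            residual = best_r
        return selected
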
Solving for multi-class using orthogonal coding matrices
A common method of generalizing binary to multi-class classification is the
error correcting code (ECC). ECCs may be optimized in a number of ways, for
instance by making them orthogonal. Here we test two types of orthogonal ECCs
on seven different datasets using three types of binary classifier and compare
them with three other multi-class methods: 1 vs. 1, one-versus-the-rest and
random ECCs. The first type of orthogonal ECC, in which the codes contain no
zeros, admits a fast and simple method of solving for the probabilities.
Orthogonal ECCs are always more accurate than random ECCs, as predicted by
recent literature. Improvements in uncertainty coefficient (U.C.) range between
0.4--17.5% (0.004--0.139, absolute), while improvements in Brier score range
between 0.7--10.7%. Unfortunately, orthogonal ECCs are rarely more accurate than 1 vs.
1. Disparities are worst when the methods are paired with logistic regression,
with orthogonal ECCs never beating 1 vs. 1. When the methods are paired with
SVM, the losses are less significant, peaking at 1.5% relative (0.011 absolute)
in uncertainty coefficient and 6.5% in Brier scores. Orthogonal ECCs are always
the fastest of the five multi-class methods when paired with linear
classifiers. When paired with a piecewise linear classifier, whose
classification speed does not depend on the number of training samples,
classifications using orthogonal ECCs were always more accurate than the
remaining three methods and also faster than 1 vs. 1. Losses against 1 vs. 1
here were higher, peaking at 1.9% (0.017, absolute) in U.C. and 39% in Brier
score. Gains in speed ranged between 1.1% and over 100%. Whether the speed
increase is worth the penalty in accuracy will depend on the application.
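
A minimal sketch of the zero-free orthogonal variant, assuming a Hadamard construction for the coding matrix and a logistic base learner (both illustrative choices, not taken from the paper): the rows of a Hadamard matrix minus the constant row give mutually orthogonal +/-1 codes, and orthogonality lets decoding collapse to one matrix product.

    import numpy as np
    from scipy.linalg import hadamard
    from sklearn.linear_model import LogisticRegression

    def train_orthogonal_ecc(X, y, n_classes):
        # Orthogonal +/-1 coding matrix: Hadamard rows minus the all-ones
        # row (requires n_classes to be a power of two in this sketch).
        C = hadamard(n_classes)[1:, :]
        models = []
        for row in C:
            labels = row[y]                    # relabel samples +/-1 for this split
            models.append(LogisticRegression().fit(X, labels))
        return C, models

    def predict_orthogonal_ecc(X, C, models):
        # Binary probabilities rescaled to [-1, 1], then projected onto the
        # class codewords; the rows of C are orthogonal, so this single
        # product plays the role of a least-squares decode (up to scaling).
        R = np.stack([2 * m.predict_proba(X)[:, 1] - 1 for m in models], axis=1)
        return np.argmax(R @ C, axis=1)
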
On fast multiplication of a matrix by its transpose
We present a non-commutative algorithm for the multiplication of a
2x2-block-matrix by its transpose using 5 block products (3 recursive calls and
2 general products) over C or any finite field. We use geometric considerations
on the space of bilinear forms describing 2x2 matrix products to obtain this
algorithm, and we show how to reduce the number of involved additions. The
resulting algorithm for arbitrary dimensions is a reduction of multiplication
of a matrix by its transpose to general matrix product, improving by a constant
factor previously known reductions. Finally, we propose schedules with low memory
footprint that support a fast and memory-efficient practical implementation
over a finite field. To conclude, we show how to use our result in LDL^T
factorization. Comment: ISSAC 2020, Jul 2020, Kalamata, Greece
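
For orientation, a plain divide-and-conquer baseline that already exploits the symmetry of A*A^T is sketched below: per level it spends 2 general block products plus 4 symmetric recursive calls, and mirrors the lower triangle for free. The paper improves on exactly this count, needing only 5 block products (3 recursive, 2 general); this numpy sketch is a baseline for comparison, not the paper's scheme.

    import numpy as np

    def syrk(A):
        # Recursive A @ A.T exploiting symmetry: the two diagonal result
        # blocks are themselves symmetric products (recurse), the
        # off-diagonal block takes two general products, and its
        # transpose is copied rather than recomputed.
        n, m = A.shape
        if min(n, m) <= 64 or n % 2 or m % 2:   # small or odd sizes: fall back
            return A @ A.T
        n2, m2 = n // 2, m // 2
        A11, A12 = A[:n2, :m2], A[:n2, m2:]
        A21, A22 = A[n2:, :m2], A[n2:, m2:]
        C = np.empty((n, n))
        C[:n2, :n2] = syrk(A11) + syrk(A12)     # symmetric blocks: recurse
        C[n2:, n2:] = syrk(A21) + syrk(A22)
        C21 = A21 @ A11.T + A22 @ A12.T         # 2 general block products
        C[n2:, :n2], C[:n2, n2:] = C21, C21.T   # mirror by symmetry
        return C
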