Fast Conical Hull Algorithms for Near-separable Non-negative Matrix Factorization
The separability assumption (Donoho & Stodden, 2003; Arora et al., 2012)
turns non-negative matrix factorization (NMF) into a tractable problem.
Recently, a new class of provably-correct NMF algorithms has emerged under
this assumption. In this paper, we reformulate the separable NMF problem as
that of finding the extreme rays of the conical hull of a finite set of
vectors. From this geometric perspective, we derive new separable NMF
algorithms that are highly scalable and empirically noise robust, and have
several other favorable properties in relation to existing methods. A parallel
implementation of our algorithm demonstrates high scalability on shared- and
distributed-memory machines. Comment: 15 pages, 6 figures
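To make the separability idea concrete, here is a minimal sketch of anchor selection on a toy matrix. It uses the Successive Projection Algorithm (SPA), a well-known greedy baseline, and not the conical hull algorithms proposed in the paper. The function names and the toy data are assumptions made for illustration; the columns are assumed to be non-negative and scaled to sum to 1.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def spa_anchor_columns(columns, r):
    """Return indices of up to r columns chosen by SPA (a baseline, not the paper's method)."""
    residual = [list(c) for c in columns]
    anchors = []
    for _ in range(r):
        norms = [dot(c, c) for c in residual]
        # Greedy step: the column with the largest residual norm is taken as an anchor.
        j = max(range(len(residual)), key=lambda i: norms[i])
        if norms[j] == 0.0:
            break  # every remaining column is already explained by the chosen anchors
        anchors.append(j)
        u, scale = residual[j], norms[j]
        # Project every column onto the orthogonal complement of the chosen anchor.
        residual = [
            [ci - dot(c, u) / scale * ui for ci, ui in zip(c, u)]
            for c in residual
        ]
    return anchors


# Hypothetical toy data: three unit vectors (the anchors) hidden among their mixtures.
columns = [
    [0.5, 0.5, 0.0],
    [0.0, 1.0, 0.0],
    [0.2, 0.3, 0.5],
    [1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0],
    [1 / 3, 1 / 3, 1 / 3],
]
anchors = spa_anchor_columns(columns, 3)
```

On this toy input the selected indices are the three unit vectors (positions 1, 3 and 4); every other column is a non-negative combination of them, which is what separability asserts.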
Relative Comparison Kernel Learning with Auxiliary Kernels
In this work we consider the problem of learning a positive semidefinite
kernel matrix from relative comparisons of the form: "object A is more similar
to object B than it is to C", where comparisons are given by humans. Existing
solutions to this problem assume many comparisons are provided to learn a high
quality kernel. However, this can be considered unrealistic for many real-world
tasks since relative assessments require human input, which is often costly or
difficult to obtain. Because of this, only a limited number of these
comparisons may be provided. In this work, we explore methods for aiding the
process of learning a kernel with the help of auxiliary kernels built from more
easily extractable information regarding the relationships among objects. We
propose a new kernel learning approach in which the target kernel is defined as
a conic combination of auxiliary kernels and a kernel whose elements are
learned directly. We formulate a convex optimization to solve for this target
kernel that adds only minor overhead to methods that use no auxiliary
information. Empirical results show that in the presence of few training
relative comparisons, our method can learn kernels that generalize to more
out-of-sample comparisons than methods that do not utilize auxiliary
information, as well as similar methods that learn metrics over objects.
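The structure of the target kernel described above can be sketched as follows. This only illustrates how a conic combination is formed and how one relative comparison is checked against it; it does not implement the paper's convex optimization. All function names, the weights and the toy kernel are assumptions, and the comparison is tested with the squared distance induced by the kernel, which is one common choice.

```python
def conic_combination(aux_kernels, weights, direct_kernel):
    """Return direct_kernel + sum_i weights[i] * aux_kernels[i], with weights >= 0."""
    if any(w < 0 for w in weights):
        raise ValueError("a conic combination requires non-negative weights")
    n = len(direct_kernel)
    target = [row[:] for row in direct_kernel]
    for w, kernel in zip(weights, aux_kernels):
        for i in range(n):
            for j in range(n):
                target[i][j] += w * kernel[i][j]
    return target


def kernel_distance_sq(k, a, b):
    """Squared distance between objects a and b in the feature space of kernel k."""
    return k[a][a] + k[b][b] - 2.0 * k[a][b]


def satisfies_comparison(k, a, b, c):
    """True if, under kernel k, object a is more similar to b than to c."""
    return kernel_distance_sq(k, a, b) < kernel_distance_sq(k, a, c)


# Hypothetical toy data: one auxiliary kernel over three objects, and an
# all-zero placeholder for the directly learned kernel.
aux = [
    [1.0, 0.9, 0.1],
    [0.9, 1.0, 0.1],
    [0.1, 0.1, 1.0],
]
direct = [[0.0] * 3 for _ in range(3)]
target = conic_combination([aux], [2.0], direct)
holds = satisfies_comparison(target, 0, 1, 2)
```

Because non-negative weights preserve positive semidefiniteness, the combined matrix remains a valid kernel whenever each summand is one.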
OSQP: An Operator Splitting Solver for Quadratic Programs
We present a general-purpose solver for convex quadratic programs based on
the alternating direction method of multipliers, employing a novel operator
splitting technique that requires the solution of a quasi-definite linear
system with the same coefficient matrix at almost every iteration. Our
algorithm is very robust, placing no requirements on the problem data such as
positive definiteness of the objective function or linear independence of the
constraint functions. It can be configured to be division-free once an initial
matrix factorization is carried out, making it suitable for real-time
applications in embedded systems. In addition, our technique is the first
operator splitting method for quadratic programs able to reliably detect primal
and dual infeasible problems from the algorithm iterates. The method also
supports factorization caching and warm starting, making it particularly
efficient when solving parametrized problems arising in finance, control, and
machine learning. Our open-source C implementation OSQP has a small footprint,
is library-free, and has been extensively tested on many problem instances from
a wide variety of application areas. It is typically ten times faster than
competing interior-point methods, and sometimes much more when factorization
caching or warm start is used. OSQP has already shown a large impact with tens
of thousands of users both in academia and in large corporations.
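For orientation, the class of problems such a solver targets can be written in the standard form below. The symbols are not given in the abstract and are assumed here: $x \in \mathbb{R}^n$ is the decision variable, $P$ is a symmetric positive semidefinite matrix, and $l \le u$ are bound vectors whose entries may be infinite.

```latex
\begin{aligned}
\text{minimize}\quad   & \tfrac{1}{2}\, x^{\top} P x + q^{\top} x \\
\text{subject to}\quad & l \le A x \le u
\end{aligned}
```

An equality constraint corresponds to setting $l_i = u_i$ for the relevant row of $A$. Requiring only positive semidefiniteness of $P$ is consistent with the abstract's remark that the objective need not be positive definite.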
A Stochastic View of Optimal Regret through Minimax Duality
We study the regret of optimal strategies for online convex optimization
games. Using von Neumann's minimax theorem, we show that the optimal regret in
this adversarial setting is closely related to the behavior of the empirical
minimization algorithm in a stochastic process setting: it is equal to the
maximum, over joint distributions of the adversary's action sequence, of the
difference between a sum of minimal expected losses and the minimal empirical
loss. We show that the optimal regret has a natural geometric interpretation,
since it can be viewed as the gap in Jensen's inequality for a concave
functional--the minimizer over the player's actions of expected loss--defined
on a set of probability distributions. We use this expression to obtain upper
and lower bounds on the regret of an optimal strategy for a variety of online
learning problems. Our method provides upper bounds without the need to
construct a learning algorithm; the lower bounds provide explicit optimal
strategies for the adversary.
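One way to write the characterization stated above is sketched below. The notation is assumed rather than taken from the abstract: $\mathcal{A}$ is the player's action set, $x_1, \dots, x_T$ are the adversary's actions, $\ell$ is the loss, and $P$ ranges over joint distributions of the adversary's sequence.

```latex
R_T \;=\; \sup_{P}\;
\mathbb{E}_{(x_1,\dots,x_T)\sim P}
\left[\,
  \sum_{t=1}^{T} \inf_{a_t \in \mathcal{A}}
    \mathbb{E}\big[\ell(a_t, x_t) \,\big|\, x_1,\dots,x_{t-1}\big]
  \;-\;
  \inf_{a \in \mathcal{A}} \sum_{t=1}^{T} \ell(a, x_t)
\right]
```

The first term is the sum of minimal expected losses and the second is the minimal empirical loss. Since $\inf_{a} \mathbb{E}[\ell(a, x)]$ is concave as a function of the distribution of $x$, the difference can be read as a gap in Jensen's inequality.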
Efficient Multi-Template Learning for Structured Prediction
Conditional random field (CRF) and Structural Support Vector Machine
(Structural SVM) are two state-of-the-art methods for structured prediction
which captures the interdependencies among output variables. The success of
these methods is attributed to the fact that their discriminative models are
able to account for overlapping features on the whole input observations. These
features are usually generated by applying a given set of templates on labeled
data, but improper templates may lead to degraded performance. To alleviate
this issue, in this paper, we propose a novel multiple template learning
paradigm to learn structured prediction and the importance of each template
simultaneously, so that hundreds of arbitrary templates could be added into the
learning model without caution. This paradigm can be formulated as a special
multiple kernel learning problem with exponential number of constraints. Then
we introduce an efficient cutting plane algorithm to solve this problem in the
primal, and its convergence is presented. We also evaluate the proposed
learning paradigm on two widely-studied structured prediction tasks,
i.e., sequence labeling and dependency parsing. Extensive experimental
results show that the proposed method outperforms CRFs and Structural SVMs due
to exploiting the importance of each template. Our complexity analysis and
empirical results also show that our proposed method is more efficient than
OnlineMKL on very sparse and high-dimensional data. We further extend this
paradigm for structured prediction using generalized $p$-block norm
regularization with $p > 1$, and experiments show competitive performances when
Algorithm Engineering in Robust Optimization
Robust optimization is a young and emerging field of research having received
a considerable increase of interest over the last decade. In this paper, we
argue that the algorithm engineering methodology fits very well to the
field of robust optimization and yields a rewarding new perspective on both the
current state of research and open research directions.
To this end we go through the algorithm engineering cycle of design and
analysis of concepts, development and implementation of algorithms, and
theoretical and experimental evaluation. We show that many ideas of algorithm
engineering have already been applied in publications on robust optimization.
Most work on robust optimization is devoted to analysis of the concepts and the
development of algorithms, some papers deal with the evaluation of a particular
concept in case studies, and work on comparing concepts is only just starting. What
is still a drawback in many papers on robustness is the missing link to include
the results of the experiments again in the design
- …