Conic Optimization Theory: Convexification Techniques and Numerical Algorithms
Optimization is at the core of control theory and appears in several areas of
this field, such as optimal control, distributed control, system
identification, robust control, state estimation, model predictive control and
dynamic programming. Recent advances in various areas of modern optimization
have also been reshaping the field of machine learning. Motivated
by the crucial role of optimization theory in the design, analysis, control and
operation of real-world systems, this tutorial paper offers a detailed overview
of some major advances in this area, namely conic optimization and its emerging
applications. First, we discuss the importance of conic optimization in
different areas. Then, we explain seminal results on the design of hierarchies
of convex relaxations for a wide range of nonconvex problems. Finally, we study
different numerical algorithms for large-scale conic optimization problems.
Comment: 18 pages
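A toy illustration of the lifting step behind such convex relaxations (a sketch under our own assumptions, not code from the paper): a quadratic form x^T Q x becomes the linear function trace(Q X) after the change of variables X = x x^T; the relaxation then drops the nonconvex rank-one constraint and keeps only positive semidefiniteness of X.

```python
import numpy as np

# Lifting a quadratic form to a linear function of the matrix variable
# X = x x^T.  Q and x are arbitrary illustrative data, not from the paper.
rng = np.random.default_rng(0)
n = 4
Q = rng.standard_normal((n, n))
Q = (Q + Q.T) / 2          # symmetrize: only the symmetric part matters
x = rng.standard_normal(n)

quadratic = x @ Q @ x      # nonconvex in x when Q is indefinite
X = np.outer(x, x)         # lifted variable, rank one by construction
linear = np.trace(Q @ X)   # linear (hence convex) in X

assert np.isclose(quadratic, linear)
```

The relaxation is exact precisely when the optimal X happens to be rank one; hierarchies of such relaxations tighten this gap at increasing cost.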
Convex Optimization for Binary Classifier Aggregation in Multiclass Problems
Multiclass problems are often decomposed into multiple binary problems that
are solved by individual binary classifiers whose results are integrated into a
final answer. Various methods, including all-pairs (APs), one-versus-all (OVA),
and error correcting output code (ECOC), have been studied, to decompose
multiclass problems into binary problems. However, little study has been made
to optimally aggregate binary problems to determine a final answer to the
multiclass problem. In this paper we present a convex optimization method for
an optimal aggregation of binary classifiers to estimate class membership
probabilities in multiclass problems. We model the class membership probability
as a softmax function that takes as input a conic combination of the
discrepancies induced by the individual binary classifiers. With this model, we formulate
the regularized maximum likelihood estimation as a convex optimization problem,
which is solved by the primal-dual interior point method. Connections of our
method to large margin classifiers are presented, showing that the large margin
formulation can be considered as a limiting case of our convex formulation.
Numerical experiments on synthetic and real-world data sets demonstrate that
our method outperforms existing aggregation methods as well as direct methods,
in terms of the classification accuracy and the quality of class membership
probability estimates.
Comment: Appeared in Proceedings of the 2014 SIAM International Conference on
Data Mining (SDM 2014)
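A minimal sketch of the aggregation model described above (illustrative values of our own, not the authors' code): class membership probabilities are a softmax of a conic, i.e. nonnegative, combination of the discrepancies produced by the binary classifiers.

```python
import numpy as np

def aggregate_probabilities(D, w):
    """D: (num_classes, num_binary) discrepancies for one sample;
    w: nonnegative weights, one per binary classifier."""
    assert np.all(w >= 0), "a conic combination requires nonnegative weights"
    scores = D @ w                 # conic combination per class
    scores -= scores.max()         # stabilize the exponentials
    p = np.exp(scores)
    return p / p.sum()             # softmax -> valid probability vector

# Made-up discrepancy matrix and weights for three classes, three classifiers.
D = np.array([[ 1.2, -0.3,  0.5],
              [-0.4,  0.9,  0.1],
              [ 0.0,  0.2, -0.8]])
w = np.array([0.7, 0.2, 0.1])
p = aggregate_probabilities(D, w)
```

In the paper the weights are fit by regularized maximum likelihood (a convex problem); here they are simply fixed to show the probability model.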
Linear system identification using stable spline kernels and PLQ penalties
The classical approach to linear system identification is given by parametric
Prediction Error Methods (PEM). In this context, model complexity is often
unknown, so that a model order selection step is needed to suitably trade off
bias and variance. Recently, a different approach to linear system
identification has been introduced, where model order determination is avoided
by using a regularized least squares framework. In particular, the penalty term
on the impulse response is defined by so-called stable spline kernels. These
embed information on regularity and BIBO stability, and depend on a small
number of parameters that can be estimated from data. In this paper, we
provide new nonsmooth formulations of the stable spline estimator. In
particular, we consider linear system identification problems in a very broad
context, where regularization functionals and data misfits can come from a rich
set of piecewise linear-quadratic (PLQ) functions. Moreover, our analysis includes
polyhedral inequality constraints on the unknown impulse response. For any
formulation in this class, we show that interior point methods can be used to
solve the system identification problem, with complexity O(n^3) + O(mn^2) in each
iteration, where n and m are the number of impulse response coefficients and
measurements, respectively. The usefulness of the framework is illustrated via
a numerical experiment where output measurements are contaminated by outliers.
Comment: 8 pages, 2 figures
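A sketch of the regularized least-squares step with a first-order stable spline kernel, K[i,j] = alpha^max(i,j), which encodes exponentially decaying (BIBO-stable) impulse responses. All sizes, the system, and alpha below are illustrative; this is the smooth l2 special case, not the nonsmooth PLQ variants the paper develops.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, alpha, gamma = 20, 50, 0.8, 0.1   # illustrative problem sizes

i = np.arange(1, n + 1)
K = alpha ** np.maximum.outer(i, i)      # stable spline kernel

g_true = 0.5 * alpha ** i                # decaying true impulse response
u = rng.standard_normal(m + n)           # input signal
# Toeplitz-style regressor: column k holds the input delayed by k samples.
Phi = np.column_stack([u[n - k - 1 : n - k - 1 + m] for k in range(n)])
y = Phi @ g_true + 0.05 * rng.standard_normal(m)

# ghat = argmin_g ||y - Phi g||^2 + gamma * g^T K^{-1} g  (closed form)
ghat = K @ Phi.T @ np.linalg.solve(Phi @ K @ Phi.T + gamma * np.eye(m), y)
```

The closed form above is the standard kernel ridge solution; the paper's contribution is replacing the quadratic loss and penalty with general PLQ pieces and solving the result by interior point methods.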
Data-Driven Estimation in Equilibrium Using Inverse Optimization
Equilibrium modeling is common in a variety of fields such as game theory and
transportation science. The inputs for these models, however, are often
difficult to estimate, while their outputs, i.e., the equilibria they are meant
to describe, are often directly observable. By combining ideas from inverse
optimization with the theory of variational inequalities, we develop an
efficient, data-driven technique for estimating the parameters of these models
from observed equilibria. We use this technique to estimate the utility
functions of players in a game from their observed actions and to estimate the
congestion function on a road network from traffic count data. A distinguishing
feature of our approach is that it supports both parametric and
\emph{nonparametric} estimation by leveraging ideas from statistical learning
(kernel methods and regularization operators). In computational experiments
involving Nash and Wardrop equilibria in a nonparametric setting, we find that
a) we effectively estimate the unknown demand or congestion function,
respectively, and b) our proposed regularization technique substantially
improves the out-of-sample performance of our estimators.
Comment: 36 pages, 5 figures. Additional theorems for generalization guarantees
and statistical analysis added
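A toy instance of the inverse-equilibrium idea (our own illustrative numbers, not from the paper): recover a congestion parameter from an observed Wardrop equilibrium on two parallel links with latency l_i(x) = a_i + b*x. At equilibrium both used links have equal latency, so a single observed flow split identifies b.

```python
import numpy as np

a = np.array([1.0, 3.0])       # free-flow latencies (assumed known)
b_true = 0.5                   # congestion slope we pretend not to know
demand = 10.0

# Forward model: equal latencies a1 + b x1 = a2 + b x2, with x1 + x2 = demand.
x1 = (demand + (a[1] - a[0]) / b_true) / 2
x_obs = np.array([x1, demand - x1])    # the "observed" equilibrium flows

# Inverse step: the equilibrium condition is linear in b, so solve for it.
b_hat = (a[1] - a[0]) / (x_obs[0] - x_obs[1])
```

The paper generalizes this to variational-inequality characterizations of equilibria and to nonparametric function classes; this two-link example only shows why observed equilibria constrain the unknown primitives.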
Hessian barrier algorithms for linearly constrained optimization problems
In this paper, we propose an interior-point method for linearly constrained
optimization problems (possibly nonconvex). The method, which we call the
Hessian barrier algorithm (HBA), combines a forward Euler discretization of
Hessian Riemannian gradient flows with an Armijo backtracking step-size policy.
In this way, HBA can be seen as an alternative to mirror descent (MD), and
contains as special cases the affine scaling algorithm, regularized Newton
processes, and several other iterative solution methods. Our main result is
that, modulo a non-degeneracy condition, the algorithm converges to the
problem's set of critical points; hence, in the convex case, the algorithm
converges globally to the problem's minimum set. In the case of linearly
constrained quadratic programs (not necessarily convex), we also show that the
method's convergence rate is $O(1/k^\rho)$ for some $\rho \in (0,1]$
that depends only on the choice of kernel function (i.e., not on the problem's
primitives). These theoretical results are validated by numerical experiments
on standard non-convex test functions and large-scale traffic assignment
problems.
Comment: 27 pages, 6 figures
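A sketch of one instance of the update described above, under our own simplifying assumptions: on the positive orthant with the log-barrier kernel h(x) = -sum(log x), the Hessian of h is diag(1/x^2), so the Hessian-Riemannian gradient step is x+ = x - t * x^2 * grad f(x) (the affine scaling direction), with Armijo backtracking that also keeps the iterate strictly feasible. The quadratic objective is an arbitrary test function, not from the paper.

```python
import numpy as np

c = np.array([0.2, 2.0])               # minimizer, inside the orthant

def f(x):      return 0.5 * np.sum((x - c) ** 2)
def grad_f(x): return x - c

x = np.array([1.0, 1.0])               # strictly feasible start
for _ in range(500):
    g = grad_f(x)
    if np.allclose(g, 0.0):
        break
    d = -(x ** 2) * g                  # inverse-barrier-Hessian direction
    t = 1.0
    # Armijo backtracking: shrink t until x + t d is strictly feasible
    # and gives sufficient decrease.
    while np.any(x + t * d <= 0) or f(x + t * d) > f(x) + 0.5 * t * (g @ d):
        t *= 0.5
    x = x + t * d
```

Because the direction vanishes quadratically at the boundary, the iterates stay in the open orthant without an explicit projection; that is the barrier mechanism the abstract refers to.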
Quadratic Projection Based Feature Extraction with Its Application to Biometric Recognition
This paper presents a novel quadratic projection based feature extraction
framework, where a set of quadratic matrices is learned to distinguish each
class from all other classes. We formulate quadratic matrix learning (QML) as a
standard semidefinite programming (SDP) problem. However, conventional
interior-point SDP solvers do not scale well to the QML problem for
high-dimensional data. To address this scalability issue, we develop an efficient
algorithm, termed DualQML, based on the Lagrange duality theory, to extract
nonlinear features. To evaluate the feasibility and effectiveness of the
proposed framework, we conduct extensive experiments on biometric recognition.
Experimental results on three representative biometric recognition tasks,
including face, palmprint, and ear recognition, demonstrate the superiority of
the DualQML-based feature extraction algorithm compared to current
state-of-the-art algorithms.
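A sketch of the quadratic-projection scoring itself (toy matrices of our own, standing in for the ones QML would learn by SDP): each class c is associated with a symmetric matrix Q_c, and a sample x is scored by the quadratic form x^T Q_c x.

```python
import numpy as np

def quadratic_scores(x, Qs):
    """Score x against one symmetric matrix per class."""
    return np.array([x @ Q @ x for Q in Qs])

# Two toy "class" matrices: Q_0 rewards energy in the first coordinate,
# Q_1 in the second.  Learned matrices would replace these.
Qs = [np.diag([1.0, -1.0]), np.diag([-1.0, 1.0])]

x = np.array([2.0, 0.5])
scores = quadratic_scores(x, Qs)
label = int(np.argmax(scores))
```

The quadratic form makes the decision surfaces curved in the input space, which is what distinguishes this family from linear projection methods.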