An Efficient Dual Approach to Distance Metric Learning
Distance metric learning is of fundamental interest in machine learning
because the distance metric employed can significantly affect the performance
of many learning methods. Quadratic Mahalanobis metric learning is a popular
approach to the problem, but typically requires solving a semidefinite
programming (SDP) problem, which is computationally expensive. Standard
interior-point SDP solvers typically have a complexity of O(D^6.5) (with D
the dimension of input data), and can thus only practically solve problems
exhibiting fewer than a few thousand variables. Since the number of variables is
D(D+1)/2, this implies a limit upon the size of problem that can
practically be solved of around a few hundred dimensions. The complexity of the
popular quadratic Mahalanobis metric learning approach thus limits the size of
problem to which metric learning can be applied. Here we propose a
significantly more efficient approach to the metric learning problem based on
the Lagrange dual formulation of the problem. The proposed formulation is much
simpler to implement, and therefore allows much larger Mahalanobis metric
learning problems to be solved. The time complexity of the proposed method is
O(D^3), which is significantly lower than that of the SDP approach.
Experiments on a variety of datasets demonstrate that the proposed method
achieves an accuracy comparable to the state-of-the-art, but is applicable to
significantly larger problems. We also show that the proposed method can be
applied to solve more general Frobenius-norm regularized SDP problems
approximately.
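As a rough illustration of where a cubic cost can come from, the sketch below projects a symmetric matrix onto the positive semidefinite cone with one full eigen-decomposition and then evaluates a squared Mahalanobis distance. It assumes the dominant per-iteration cost is that eigen-decomposition; the abstract does not spell this out, and the function names `project_psd` and `mahalanobis_sq` are hypothetical, not taken from the paper.

```python
import numpy as np

def project_psd(S):
    """Nearest PSD matrix to S in Frobenius norm (illustrative helper)."""
    S = (S + S.T) / 2.0            # symmetrize to guard against round-off
    w, V = np.linalg.eigh(S)       # full eigen-decomposition, O(D^3)
    w = np.clip(w, 0.0, None)      # discard negative eigenvalues
    return (V * w) @ V.T           # V diag(w) V^T

def mahalanobis_sq(x, y, M):
    """Squared Mahalanobis distance (x - y)^T M (x - y)."""
    diff = x - y
    return float(diff @ M @ diff)

rng = np.random.default_rng(0)
D = 5
A = rng.standard_normal((D, D))
S = (A + A.T) / 2.0                # symmetric, generally indefinite
M = project_psd(S)

x, y = rng.standard_normal(D), rng.standard_normal(D)
dist = mahalanobis_sq(x, y, M)
min_eig = np.linalg.eigvalsh(M).min()
```

Because `M` is positive semidefinite after the projection, `dist` is non-negative for any pair of points, which is what makes it usable as a squared distance.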
Large-scale Binary Quadratic Optimization Using Semidefinite Relaxation and Applications
In computer vision, many problems such as image segmentation, pixel
labelling, and scene parsing can be formulated as binary quadratic programs
(BQPs). For submodular problems, cuts based methods can be employed to
efficiently solve large-scale problems. However, general nonsubmodular problems
are significantly more challenging to solve. Finding a solution when the
problem is large enough to be of practical interest typically requires
relaxation. Two standard relaxation methods are widely used for
solving general BQPs--spectral methods and semidefinite programming (SDP), each
with its own advantages and disadvantages. Spectral relaxation is simple and
easy to implement, but its bound is loose. Semidefinite relaxation has a
tighter bound, but its computational complexity is high, especially for large
scale problems. In this work, we present a new SDP formulation for BQPs, with
two desirable properties. First, it has a similar relaxation bound to
conventional SDP formulations. Second, compared with conventional SDP methods,
the new SDP formulation leads to a significantly more efficient and scalable
dual optimization approach, which has the same degree of complexity as spectral
methods. We then propose two solvers, namely, quasi-Newton and smoothing Newton
methods, for the dual problem. Both are significantly more efficient
than standard interior-point methods. In practice, the smoothing Newton solver
is faster than the quasi-Newton solver for dense or medium-sized problems,
while the quasi-Newton solver is preferable for large sparse/structured
problems. Our experiments on a few computer vision applications including
clustering, image segmentation, co-segmentation and registration show the
potential of our SDP formulation for solving large-scale BQPs.
Comment: Fixed some typos. 18 pages. Accepted to IEEE Transactions on Pattern
Analysis and Machine Intelligence
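The spectral relaxation mentioned in this abstract can be sketched for the model problem of minimizing x^T A x over x in {-1, +1}^n. This is a minimal sketch under that assumed problem form, not the paper's SDP formulation: the binary constraint is relaxed to ||x||^2 = n, which gives the lower bound n * lambda_min(A), and the corresponding eigenvector is rounded by sign. The function name `spectral_relaxation` is hypothetical.

```python
import itertools
import numpy as np

def spectral_relaxation(A):
    """Lower bound and sign-rounded solution for min x^T A x, x in {-1,+1}^n."""
    n = A.shape[0]
    w, V = np.linalg.eigh(A)       # eigenvalues in ascending order
    lower_bound = n * w[0]         # optimum of the relaxed problem ||x||^2 = n
    x = np.sign(V[:, 0])           # round the leading eigenvector to +/-1
    x[x == 0] = 1.0                # break ties arbitrarily
    return lower_bound, x

rng = np.random.default_rng(1)
n = 8
B = rng.standard_normal((n, n))
A = (B + B.T) / 2.0                # symmetric cost matrix

lower_bound, x_round = spectral_relaxation(A)
rounded_value = float(x_round @ A @ x_round)

# Brute force over all 2^n sign vectors; only feasible for tiny n.
best_value = min(
    float(np.array(s) @ A @ np.array(s))
    for s in itertools.product([-1.0, 1.0], repeat=n)
)
```

The true optimum `best_value` always lies between `lower_bound` and `rounded_value`. The gap between the first two is the looseness of the spectral bound that the abstract refers to.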
Regularization and Kernelization of the Maximin Correlation Approach
Robust classification becomes challenging when each class consists of
multiple subclasses. Examples include multi-font optical character recognition
and automated protein function prediction. In correlation-based
nearest-neighbor classification, the maximin correlation approach (MCA)
provides the worst-case optimal solution by minimizing the maximum
misclassification risk through an iterative procedure. Despite the optimality,
the original MCA has drawbacks that have limited its wide applicability in
practice. That is, the MCA tends to be sensitive to outliers, cannot
effectively handle nonlinearities in datasets, and suffers from having high
computational complexity. To address these limitations, we propose an improved
solution, named regularized maximin correlation approach (R-MCA). We first
reformulate MCA as a quadratically constrained linear programming (QCLP)
problem, incorporate regularization by introducing slack variables in the
primal problem of the QCLP, and derive the corresponding Lagrangian dual. The
dual formulation enables us to apply the kernel trick to R-MCA so that it can
better handle nonlinearities. Our experimental results demonstrate that the
regularization and kernelization make the proposed R-MCA more robust and
accurate for various classification tasks than the original MCA. Furthermore,
when the data size or dimensionality grows, R-MCA runs substantially faster by
solving either the primal or dual (whichever has a smaller variable dimension)
of the QCLP.
Comment: Submitted to IEEE Access
Parameter Selection and Pre-Conditioning for a Graph Form Solver
In a recent paper, Parikh and Boyd describe a method for solving a convex
optimization problem, where each iteration involves evaluating a proximal
operator and projection onto a subspace. In this paper we address the critical
practical issues of how to select the proximal parameter in each iteration, and
how to scale the original problem variables, so as to achieve reliable
practical performance. The resulting method has been implemented as an
open-source software package called POGS (Proximal Graph Solver), that targets
multi-core and GPU-based systems, and has been tested on a wide variety of
practical problems. Numerical results show that POGS can solve very large
problems (with, say, more than a billion coefficients in the data), to modest
accuracy in a few tens of seconds. As just one example, a radiation treatment
planning problem with around 100 million coefficients in the data can be solved
in a few seconds, as compared to around one hour with an interior-point method.
Comment: 28 pages, 1 figure, 1 open source implementation
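The two building blocks named in this abstract, a proximal operator and a projection onto a subspace, can be illustrated in a few lines. The sketch below is not the POGS API. It assumes a graph-form constraint y = Ax, uses the l1 norm as an example function for the proximal step, and solves the projection with a dense linear system, whereas a practical solver would cache a factorization. The names `soft_threshold` and `project_onto_graph` are hypothetical.

```python
import numpy as np

def soft_threshold(v, lam):
    """Proximal operator of lam * ||.||_1 (example choice of function)."""
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

def project_onto_graph(A, c, d):
    """Euclidean projection of (c, d) onto the subspace {(x, y) : y = A x}."""
    n = A.shape[1]
    # Normal equations of min ||x - c||^2 + ||A x - d||^2.
    x = np.linalg.solve(np.eye(n) + A.T @ A, c + A.T @ d)
    return x, A @ x

rng = np.random.default_rng(2)
A = rng.standard_normal((6, 3))
c = rng.standard_normal(3)
d = rng.standard_normal(6)

x, y = project_onto_graph(A, c, d)
# At the projection, the residual is orthogonal to the subspace.
optimality_residual = (c - x) + A.T @ (d - y)

shrunk = soft_threshold(np.array([3.0, -0.5, 1.0]), 1.0)
```

The threshold `lam` here plays the role of the proximal parameter whose per-iteration selection the paper addresses.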
Worst-Case Linear Discriminant Analysis as Scalable Semidefinite Feasibility Problems
In this paper, we propose an efficient semidefinite programming (SDP)
approach to worst-case linear discriminant analysis (WLDA). Compared with the
traditional LDA, WLDA considers the dimensionality reduction problem from the
worst-case viewpoint, which is in general more robust for classification.
However, the original problem of WLDA is non-convex and difficult to optimize.
In this paper, we reformulate the optimization problem of WLDA into a sequence
of semidefinite feasibility problems. To efficiently solve the semidefinite
feasibility problems, we design a new scalable optimization method with
quasi-Newton methods and eigen-decomposition being the core components. The
proposed method is orders of magnitude faster than standard interior-point
based SDP solvers.
Experiments on a variety of classification problems demonstrate that our
approach achieves better performance than standard LDA. Our method is also much
faster and more scalable than WLDA based on standard interior-point SDP solvers.
The computational complexity for an SDP with m constraints and matrices of
size d by d is roughly reduced from O(m^3 + m d^3 + m^2 d^2) to
O(d^3) (m > d in our case).
Comment: 14 pages