12,175 research outputs found
Large-scale Binary Quadratic Optimization Using Semidefinite Relaxation and Applications
In computer vision, many problems such as image segmentation, pixel
labelling, and scene parsing can be formulated as binary quadratic programs
(BQPs). For submodular problems, cuts based methods can be employed to
efficiently solve large-scale problems. However, general nonsubmodular problems
are significantly more challenging to solve. Finding a solution when the
problem is of large size to be of practical interest, however, typically
requires relaxation. Two standard relaxation methods are widely used for
solving general BQPs--spectral methods and semidefinite programming (SDP), each
with their own advantages and disadvantages. Spectral relaxation is simple and
easy to implement, but its bound is loose. Semidefinite relaxation has a
tighter bound, but its computational complexity is high, especially for large
scale problems. In this work, we present a new SDP formulation for BQPs, with
two desirable properties. First, it has a similar relaxation bound to
conventional SDP formulations. Second, compared with conventional SDP methods,
the new SDP formulation leads to a significantly more efficient and scalable
dual optimization approach, which has the same degree of complexity as spectral
methods. We then propose two solvers, namely, quasi-Newton and smoothing Newton
methods, for the dual problem. Both of them are significantly more efficiently
than standard interior-point methods. In practice, the smoothing Newton solver
is faster than the quasi-Newton solver for dense or medium-sized problems,
while the quasi-Newton solver is preferable for large sparse/structured
problems. Our experiments on a few computer vision applications including
clustering, image segmentation, co-segmentation and registration show the
potential of our SDP formulation for solving large-scale BQPs.Comment: Fixed some typos. 18 pages. Accepted to IEEE Transactions on Pattern
Analysis and Machine Intelligenc
Doubly Robust Smoothing of Dynamical Processes via Outlier Sparsity Constraints
Coping with outliers contaminating dynamical processes is of major importance
in various applications because mismatches from nominal models are not uncommon
in practice. In this context, the present paper develops novel fixed-lag and
fixed-interval smoothing algorithms that are robust to outliers simultaneously
present in the measurements {\it and} in the state dynamics. Outliers are
handled through auxiliary unknown variables that are jointly estimated along
with the state based on the least-squares criterion that is regularized with
the -norm of the outliers in order to effect sparsity control. The
resultant iterative estimators rely on coordinate descent and the alternating
direction method of multipliers, are expressed in closed form per iteration,
and are provably convergent. Additional attractive features of the novel doubly
robust smoother include: i) ability to handle both types of outliers; ii)
universality to unknown nominal noise and outlier distributions; iii)
flexibility to encompass maximum a posteriori optimal estimators with reliable
performance under nominal conditions; and iv) improved performance relative to
competing alternatives at comparable complexity, as corroborated via simulated
tests.Comment: Submitted to IEEE Trans. on Signal Processin
Efficient First Order Methods for Linear Composite Regularizers
A wide class of regularization problems in machine learning and statistics
employ a regularization term which is obtained by composing a simple convex
function \omega with a linear transformation. This setting includes Group Lasso
methods, the Fused Lasso and other total variation methods, multi-task learning
methods and many more. In this paper, we present a general approach for
computing the proximity operator of this class of regularizers, under the
assumption that the proximity operator of the function \omega is known in
advance. Our approach builds on a recent line of research on optimal first
order optimization methods and uses fixed point iterations for numerically
computing the proximity operator. It is more general than current approaches
and, as we show with numerical simulations, computationally more efficient than
available first order methods which do not achieve the optimal rate. In
particular, our method outperforms state of the art O(1/T) methods for
overlapping Group Lasso and matches optimal O(1/T^2) methods for the Fused
Lasso and tree structured Group Lasso.Comment: 19 pages, 8 figure
Multilevel Artificial Neural Network Training for Spatially Correlated Learning
Multigrid modeling algorithms are a technique used to accelerate relaxation
models running on a hierarchy of similar graphlike structures. We introduce and
demonstrate a new method for training neural networks which uses multilevel
methods. Using an objective function derived from a graph-distance metric, we
perform orthogonally-constrained optimization to find optimal prolongation and
restriction maps between graphs. We compare and contrast several methods for
performing this numerical optimization, and additionally present some new
theoretical results on upper bounds of this type of objective function. Once
calculated, these optimal maps between graphs form the core of Multiscale
Artificial Neural Network (MsANN) training, a new procedure we present which
simultaneously trains a hierarchy of neural network models of varying spatial
resolution. Parameter information is passed between members of this hierarchy
according to standard coarsening and refinement schedules from the multiscale
modelling literature. In our machine learning experiments, these models are
able to learn faster than default training, achieving a comparable level of
error in an order of magnitude fewer training examples.Comment: Manuscript (24 pages) and Supplementary Material (4 pages). Updated
January 2019 to reflect new formulation of MsANN structure and new training
procedur
- …