4 research outputs found
Relaxed Majorization-Minimization for Non-smooth and Non-convex Optimization
We propose a new majorization-minimization (MM) method for non-smooth and
non-convex programs, which is general enough to include the existing MM
methods. Besides the local majorization condition, we only require that the
difference between the directional derivatives of the objective function and
its surrogate function vanishes as the number of iterations approaches
infinity, which is a very weak condition. Our method can therefore use a surrogate
function that directly approximates the non-smooth objective function. In
comparison, all the existing MM methods construct the surrogate function by
approximating the smooth component of the objective function. We apply our
relaxed MM methods to the robust matrix factorization (RMF) problem with
different regularizations, where our locally majorant algorithm shows
advantages over the state-of-the-art approaches for RMF. This is the first
algorithm for RMF ensuring, without extra assumptions, that any limit point of
the iterates is a stationary point.
Comment: AAAI1
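To make the MM principle behind this abstract concrete, here is a minimal
NumPy sketch of a classical majorization-minimization scheme for the
non-smooth objective f(x) = ||Ax - b||_1, where each |r| is majorized by a
quadratic that touches it at the current residual (the IRLS view of MM). It
illustrates generic MM with a surrogate of the non-smooth objective, not the
paper's relaxed algorithm; the eps floor on the weights is an assumed
stabilization.

```python
import numpy as np

def mm_l1_regression(A, b, iters=100, eps=1e-8):
    """Minimize f(x) = ||Ax - b||_1 by majorization-minimization.

    At iterate x_k, the non-smooth |r| is majorized by the quadratic
    r^2 / (2|r_k|) + |r_k| / 2, tight at r = r_k. Minimizing the
    surrogate is a weighted least-squares solve (the IRLS update).
    """
    x = np.linalg.lstsq(A, b, rcond=None)[0]   # least-squares warm start
    for _ in range(iters):
        r = A @ x - b
        w = 1.0 / np.maximum(np.abs(r), eps)   # surrogate weights, eps-stabilized
        AW = A * w[:, None]
        # Solve the weighted normal equations (A^T W A) x = A^T W b.
        x = np.linalg.solve(A.T @ AW, AW.T @ b)
    return x

# Usage: robust line fit with gross outliers in the first few points.
rng = np.random.default_rng(0)
A = np.column_stack([np.ones(50), rng.uniform(0, 1, 50)])
b = A @ np.array([1.0, 2.0]) + 0.01 * rng.normal(size=50)
b[:5] += 5.0
print(mm_l1_regression(A, b))   # close to [1, 2] despite the outliers
```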
Proximal Alternating Direction Network: A Globally Converged Deep Unrolling Framework
Deep learning models have gained great success in many real-world
applications. However, most existing networks are designed in a heuristic
manner and thus lack rigorous mathematical principles and
derivations. Several recent studies build deep structures by unrolling a
particular optimization model that involves task information. Unfortunately,
due to the dynamic nature of network parameters, the resulting deep
propagation networks do \emph{not} possess the convergence properties of the
original optimization scheme. This paper provides a novel proximal
unrolling framework to establish deep models by integrating experimentally
verified network architectures and rich cues of the tasks. More importantly, we
\emph{prove in theory} that 1) the propagation generated by our unrolled deep
model globally converges to a critical point of a given variational energy, and
2) the proposed framework is still able to learn priors from training data to
generate a convergent propagation even when task information is only partially
available. Indeed, these theoretical results are the best we can ask for,
unless stronger assumptions are enforced. Extensive experiments on various
real-world applications verify the theoretical convergence and demonstrate the
effectiveness of the designed deep models.
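For readers unfamiliar with unrolling, the sketch below unrolls plain ISTA
(proximal gradient for a sparse-coding energy) into a fixed number of
"layers". This only illustrates the generic unrolling idea discussed in the
abstract, not the paper's proximal alternating direction framework; the
comment marks where learned per-layer parameters would replace the analytic
ones and thereby sever the link to the original algorithm's convergence
guarantee.

```python
import numpy as np

def soft_threshold(z, tau):
    """Proximal operator of tau * ||.||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def unrolled_ista(D, y, n_layers=10, lam=0.1):
    """Unroll n_layers ISTA steps for min_x 0.5*||Dx - y||^2 + lam*||x||_1.

    Each 'layer' is one proximal-gradient step. In a learned unrolled
    network (e.g., LISTA-style), W1, W2, and tau become trainable
    per-layer parameters, which is what breaks the convergence guarantee
    of the original iteration.
    """
    L = np.linalg.norm(D, 2) ** 2              # Lipschitz constant of the gradient
    W1 = np.eye(D.shape[1]) - (D.T @ D) / L
    W2 = D.T / L
    tau = lam / L
    x = np.zeros(D.shape[1])
    for _ in range(n_layers):
        x = soft_threshold(W1 @ x + W2 @ y, tau)
    return x
```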
Truncated Inference for Latent Variable Optimization Problems: Application to Robust Estimation and Learning
Optimization problems with an auxiliary latent variable structure in addition
to the main model parameters occur frequently in computer vision and machine
learning. The additional latent variables make the underlying optimization task
expensive, either in terms of memory (by maintaining the latent variables), or
in terms of runtime (repeated exact inference of latent variables). We aim to
remove the need to maintain the latent variables and propose two formally
justified methods that dynamically adapt the required accuracy of latent
variable inference. These methods have applications in large-scale robust
estimation and in learning energy-based models from labeled data.
Comment: 16 pages
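As a toy illustration of truncated latent-variable inference (an assumed
setup, not the paper's formally justified schemes): estimate a robust mean
with one latent outlier variable per data point, running only a few inner
inference steps per outer iteration and increasing that number over time,
which mimics a dynamically adapted inference accuracy.

```python
import numpy as np

def truncated_robust_mean(y, lam=1.0, outer_iters=60, inner_lr=0.2):
    """Toy energy E(theta, z) = sum_i (y_i - theta - z_i)^2
                                + lam * log(1 + z_i^2),
    where z_i absorbs outliers. Exact inference of z has no closed form
    here, so we truncate it: a few gradient steps per outer iteration,
    with the step count growing over time (an assumed schedule).
    """
    theta = np.median(y)                 # cheap robust initialization
    z = np.zeros_like(y)
    for t in range(outer_iters):
        inner_steps = 1 + t // 10        # truncate early, refine later
        for _ in range(inner_steps):
            r = y - theta - z
            grad_z = -2.0 * r + lam * 2.0 * z / (1.0 + z ** 2)
            z -= inner_lr * grad_z       # partial (truncated) inference of z
        theta = np.mean(y - z)           # exact minimizer of E in theta
    return theta

# Usage: the latent z soaks up the shifted points, so theta stays near 0.
rng = np.random.default_rng(1)
y = rng.normal(size=100)
y[:10] += 8.0                            # gross outliers
print(truncated_robust_mean(y))
```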