Learning Infinite RBMs with Frank-Wolfe
In this work, we propose an infinite restricted Boltzmann machine (RBM), whose maximum likelihood estimation (MLE) corresponds to a constrained convex optimization problem. We consider the Frank-Wolfe algorithm to solve this program, which yields a sparse solution that can be interpreted as inserting a hidden unit at each iteration, so that the optimization takes the form of a sequence of finite models of increasing complexity. As a side benefit, this can be used to easily and efficiently identify an appropriate number of hidden units during optimization. The resulting model can also serve as an initialization for state-of-the-art RBM training algorithms such as contrastive divergence, leading to models with consistently higher test likelihood than random initialization.
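The mechanism the abstract describes, one new atom per Frank-Wolfe iteration, can be seen in a generic FW sketch. The snippet below is not the paper's RBM objective; it is a minimal illustration over the probability simplex, where the linear minimization oracle (LMO) returns a single vertex, so the support of the iterate grows by at most one coordinate per iteration, mirroring the insertion of one hidden unit at a time.

```python
import numpy as np

def frank_wolfe_simplex(grad_f, x0, n_iters=50):
    """Plain Frank-Wolfe over the probability simplex.

    The LMO over the simplex returns a single vertex (a one-hot vector),
    so each iteration adds at most one coordinate to the support of x,
    the analogue of inserting one hidden unit per iteration.
    """
    x = x0.copy()
    for t in range(n_iters):
        g = grad_f(x)
        s = np.zeros_like(x)
        s[np.argmin(g)] = 1.0            # LMO: best simplex vertex for -g
        gamma = 2.0 / (t + 2.0)          # standard FW step-size schedule
        x = (1.0 - gamma) * x + gamma * s
    return x

# Toy problem: minimize ||x - b||^2 over the simplex (b need not lie in it).
b = np.array([0.5, 0.4, 0.1, -0.2])
x_star = frank_wolfe_simplex(lambda x: 2.0 * (x - b),
                             np.array([1.0, 0.0, 0.0, 0.0]))
```

Since each iterate is a convex combination of the starting vertex and the vertices returned so far, the number of "active units" after t iterations is at most t + 1.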
Recursive Frank-Wolfe algorithms
In the last decade there has been a resurgence of interest in Frank-Wolfe (FW) style methods for optimizing a smooth convex function over a polytope. Examples of recently developed techniques include Decomposition-invariant Conditional Gradient (DiCG), Blended Conditional Gradient (BCG), and Frank-Wolfe with in-face directions (IF-FW). We introduce two extensions of these techniques. First, we augment DiCG with a working-set strategy, and show how to optimize over the working set using shadow simplex steps. Second, we generalize in-face Frank-Wolfe directions to polytopes whose faces cannot be computed efficiently, and describe a generic recursive procedure that can be used in conjunction with several FW-style techniques. Experimental results indicate that these extensions can speed up the original algorithms by orders of magnitude for certain applications.
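The working-set idea has a classical ancestor worth keeping in mind: away-step Frank-Wolfe maintains the iterate as an explicit convex combination over an active vertex set, and away steps shift mass off bad active vertices. The sketch below is that classical variant, not the DiCG/BCG/IF-FW algorithms of the abstract, specialized to a quadratic objective so the line search is exact.

```python
import numpy as np

def away_step_fw(b, vertices, n_iters=200):
    """Away-step Frank-Wolfe minimizing ||x - b||^2 over conv(vertices).

    The iterate is kept as an explicit convex combination over an active
    vertex set (a simple "working set"); away steps remove weight from
    the worst active vertex, which plain FW cannot do.
    """
    V = np.asarray(vertices, dtype=float)    # one vertex per row
    w = np.zeros(len(V)); w[0] = 1.0         # convex-combination weights
    x = V[0].copy()
    for _ in range(n_iters):
        g = 2.0 * (x - b)                    # gradient of ||x - b||^2
        s = int(np.argmin(V @ g))            # FW vertex via the LMO
        active = np.where(w > 1e-12)[0]
        a = int(active[np.argmax(V[active] @ g)])   # worst active vertex
        if g @ (V[s] - x) <= g @ (x - V[a]):        # forward step is better
            d, gmax, fw_step = V[s] - x, 1.0, True
        else:                                       # away step
            d, gmax, fw_step = x - V[a], w[a] / (1.0 - w[a] + 1e-16), False
        denom = d @ d
        if denom < 1e-16:                    # zero direction: converged
            break
        gamma = float(np.clip(-(g @ d) / (2.0 * denom), 0.0, gmax))
        x = x + gamma * d
        if fw_step:
            w *= (1.0 - gamma); w[s] += gamma
        else:
            w *= (1.0 + gamma); w[a] -= gamma
    return x, w

# Project a point onto the unit square, described by its four corners.
corners = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
x_opt, w_opt = away_step_fw(np.array([1.5, 0.5]), corners)
```

Because the optimum here lies on a face of the square, plain FW would zigzag; tracking the active set is what lets away-type (and working-set) methods avoid that.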
Approximate Frank-Wolfe Algorithms over Graph-structured Support Sets
In this paper, we propose approximate Frank-Wolfe (FW) algorithms to solve convex optimization problems over graph-structured support sets where the linear minimization oracle (LMO) cannot be computed efficiently in general. We first demonstrate that two popular approximation assumptions (additive and multiplicative gap errors) are not valid for our problem, in that no cheap gap-approximate LMO exists in general. Instead, we propose a new approximate dual maximization oracle (DMO), which approximates the inner product rather than the gap. When the objective is L-smooth, we prove that the standard FW method using a δ-approximate DMO converges, both in general and, at a faster rate, over a δ-relaxation of the constraint set. Additionally, when the objective is μ-strongly convex and the solution is unique, a variant of FW converges at an improved rate with the same per-iteration complexity. Our empirical results suggest that even these improved bounds are pessimistic, showing significant improvements in recovering real-world images with graph-structured sparsity.
Comment: 30 pages, 8 figures.