
    Learning Infinite RBMs with Frank-Wolfe

    Abstract: In this work, we propose an infinite restricted Boltzmann machine (RBM) whose maximum likelihood estimation (MLE) corresponds to a constrained convex optimization problem. We consider the Frank-Wolfe algorithm to solve this program, which yields a sparse solution that can be interpreted as inserting a hidden unit at each iteration, so that the optimization process takes the form of a sequence of finite models of increasing complexity. As a side benefit, this can be used to easily and efficiently identify an appropriate number of hidden units during the optimization. The resulting model can also be used as an initialization for typical state-of-the-art RBM training algorithms such as contrastive divergence, leading to models with consistently higher test likelihood than random initialization.
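    As an illustrative aside (not the authors' code): the mechanism described above is the generic Frank-Wolfe loop, in which each iteration's linear minimization oracle (LMO) selects a single atom, so the iterate stays sparse and the model grows one unit at a time. The Python sketch below assumes a simplex constraint and a toy quadratic objective; the function name frank_wolfe_simplex and the objective are hypothetical stand-ins for the paper's RBM likelihood.

        # Minimal sketch: Frank-Wolfe over the probability simplex, where each
        # iteration activates at most one new atom (analogous to inserting one
        # hidden unit per iteration). Toy objective, not the RBM MLE program.
        import numpy as np

        def frank_wolfe_simplex(grad, dim, iters=50):
            """Minimize a smooth convex f over the simplex, given a gradient oracle."""
            x = np.zeros(dim)
            x[0] = 1.0                          # start at a single vertex (one "atom")
            for t in range(1, iters + 1):
                g = grad(x)
                s = np.zeros(dim)
                s[np.argmin(g)] = 1.0           # LMO over the simplex: best single atom
                gamma = 2.0 / (t + 2.0)         # standard FW step size
                x = (1 - gamma) * x + gamma * s # convex combination stays feasible
            return x                            # sparse: at most iters + 1 nonzeros

        # Toy usage: minimize ||x - c||^2 over the simplex.
        c = np.array([0.1, 0.7, 0.2, 0.0])
        x_hat = frank_wolfe_simplex(lambda x: 2 * (x - c), dim=4)

    Because only one atom is added per step, stopping the loop early directly corresponds to choosing a smaller model, which is the sense in which the number of hidden units can be identified during optimization.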

    Recursive Frank-Wolfe algorithms

    In the last decade there has been a resurgence of interest in Frank-Wolfe (FW) style methods for optimizing a smooth convex function over a polytope. Examples of recently developed techniques include {\em Decomposition-invariant Conditional Gradient} (DiCG), {\em Blended Conditional Gradient} (BCG), and {\em Frank-Wolfe with in-face directions} (IF-FW) methods. We introduce two extensions of these techniques. First, we augment DiCG with the {\em working set} strategy, and show how to optimize over the working set using {\em shadow simplex steps}. Second, we generalize in-face Frank-Wolfe directions to polytopes in which faces cannot be efficiently computed, and also describe a generic recursive procedure that can be used in conjunction with several FW-style techniques. Experimental results indicate that these extensions are capable of speeding up the original algorithms by orders of magnitude for certain applications.
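    The hedged sketch below shows only the common skeleton these FW-style methods share: an LMO over a polytope plus bookkeeping of the vertices visited so far, a crude stand-in for the working set mentioned above. It does not implement DiCG, BCG, shadow simplex steps, or in-face directions; the vertex-enumeration LMO, the function name fw_over_vertices, and the toy projection objective are assumptions for the example only.

        # Sketch of a plain FW loop over a polytope given by an explicit vertex
        # list, recording which vertices have been used (a naive "working set").
        import numpy as np

        def fw_over_vertices(grad, vertices, iters=100):
            vertices = np.asarray(vertices, dtype=float)
            x = vertices[0].copy()
            working_set = {0}                        # indices of vertices used so far
            for t in range(1, iters + 1):
                g = grad(x)
                i = int(np.argmin(vertices @ g))     # LMO: vertex minimizing <g, v>
                working_set.add(i)
                gamma = 2.0 / (t + 2.0)
                x = (1 - gamma) * x + gamma * vertices[i]
            return x, working_set

        # Toy usage: project a point onto the convex hull of four vertices.
        V = [[0, 0], [1, 0], [0, 1], [1, 1]]
        target = np.array([0.8, 0.3])
        x_hat, ws = fw_over_vertices(lambda x: 2 * (x - target), V)

    The methods in the abstract differ from this skeleton precisely in how they exploit the working set and the face structure of the polytope instead of repeatedly calling a global LMO.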

    Approximate Frank-Wolfe Algorithms over Graph-structured Support Sets

    In this paper, we propose approximate Frank-Wolfe (FW) algorithms to solve convex optimization problems over graph-structured support sets where the \textit{linear minimization oracle} (LMO) cannot be efficiently obtained in general. We first demonstrate that two popular approximation assumptions (\textit{additive} and \textit{multiplicative} gap errors) are not valid for our problem, in that no cheap gap-approximate LMO exists in general. Instead, a new \textit{approximate dual maximization oracle} (DMO) is proposed, which approximates the inner product rather than the gap. When the objective is $L$-smooth, we prove that the standard FW method using a $\delta$-approximate DMO converges as $\mathcal{O}(L/(\delta t) + (1-\delta)(\delta^{-1} + \delta^{-2}))$ in general, and as $\mathcal{O}(L/(\delta^2(t+2)))$ over a $\delta$-relaxation of the constraint set. Additionally, when the objective is $\mu$-strongly convex and the solution is unique, a variant of FW converges as $\mathcal{O}(L^2\log(t)/(\mu \delta^6 t^2))$ with the same per-iteration complexity. Our empirical results suggest that even these improved bounds are pessimistic, with significant improvement in recovering real-world images with graph-structured sparsity. Comment: 30 pages, 8 figures.
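    For intuition only, here is a minimal Python sketch of a standard FW loop whose exact LMO is replaced by a $\delta$-approximate dual maximization oracle in the sense described above: the oracle guarantees a fraction $\delta$ of the best achievable inner product rather than a bound on the FW gap. The simplex feasible set, the toy oracle approx_dmo_simplex, and the quadratic objective are assumptions for illustration; the paper targets graph-structured support sets, where such an oracle must be constructed differently.

        # Sketch, assuming a simplex feasible set: FW with an inexact oracle that
        # returns an atom whose score -g[i] is within a factor delta of the best.
        import numpy as np

        def approx_dmo_simplex(g, delta=0.8, rng=np.random.default_rng(0)):
            """Toy delta-approximate DMO: pick any vertex within a factor delta of
            the best score, falling back to the exact maximizer if none qualifies."""
            scores = -g
            best_idx = int(np.argmax(scores))
            ok = np.flatnonzero(scores >= delta * scores[best_idx])
            idx = int(rng.choice(ok)) if ok.size > 0 else best_idx
            s = np.zeros_like(g)
            s[idx] = 1.0
            return s

        def fw_with_dmo(grad, dim, dmo=approx_dmo_simplex, iters=200):
            x = np.full(dim, 1.0 / dim)
            for t in range(1, iters + 1):
                g = grad(x)
                s = dmo(g)                       # inexact oracle instead of exact LMO
                gamma = 2.0 / (t + 2.0)
                x = (1 - gamma) * x + gamma * s
            return x

        # Toy usage: recover a sparse vector on the simplex under a quadratic loss.
        c = np.array([0.6, 0.0, 0.4, 0.0, 0.0])
        x_hat = fw_with_dmo(lambda x: 2 * (x - c), dim=5)

    The point of the abstract's analysis is that even with such an inexact oracle, convergence rates of the form quoted above can still be guaranteed, at the price of factors depending on $\delta$.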