2,155 research outputs found

    Manifold Optimization Over the Set of Doubly Stochastic Matrices: A Second-Order Geometry

    Convex optimization is a well-established research area with applications in almost all fields. Over the decades, multiple approaches have been proposed to solve convex programs. The development of interior-point methods allowed solving a more general set of convex programs known as semi-definite programs and second-order cone programs. However, it has been established that these methods are excessively slow in high dimensions, i.e., they suffer from the curse of dimensionality. On the other hand, optimization algorithms on manifolds have shown great ability to find solutions to nonconvex problems in reasonable time. This paper is interested in solving a subset of convex optimization problems using a different approach. The main idea behind Riemannian optimization is to view the constrained optimization problem as an unconstrained one over a restricted search space. The paper introduces three manifolds to solve convex programs under particular box constraints. The manifolds, called the doubly stochastic, the symmetric, and the definite multinomial manifolds, generalize the simplex, also known as the multinomial manifold. The proposed manifolds and algorithms are well adapted to solving convex programs in which the variable of interest is a multidimensional probability distribution function. Theoretical analysis and simulation results testify to the efficiency of the proposed method over state-of-the-art methods. In particular, they reveal that the proposed framework outperforms conventional generic and specialized solvers, especially in high dimensions.
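
    The abstract only names the manifolds and the general Riemannian idea; the sketch below is an illustration, not the paper's implementation, of one Riemannian gradient step on the simplest related geometry, the multinomial manifold (the open probability simplex), assuming the Fisher-information metric and an entrywise-exponential retraction. The test objective and all names are placeholders.

```python
import numpy as np

def riemannian_grad_multinomial(x, euclid_grad):
    """Riemannian gradient under the Fisher metric on the simplex:
    grad f(x) = x * df - (x . df) * x, whose entries sum to zero (tangent)."""
    return x * euclid_grad - np.dot(x, euclid_grad) * x

def retract_multinomial(x, xi):
    """Retraction that keeps the iterate strictly positive and summing to one."""
    y = x * np.exp(xi / x)
    return y / y.sum()

# Illustrative convex objective over the simplex: f(x) = 0.5 x'Qx + b'x.
rng = np.random.default_rng(0)
n = 5
A = rng.standard_normal((n, n))
Q = A @ A.T                      # positive semi-definite
b = rng.standard_normal(n)

x = np.full(n, 1.0 / n)          # start at the barycenter
for _ in range(200):
    g = Q @ x + b                # Euclidean gradient
    rg = riemannian_grad_multinomial(x, g)
    x = retract_multinomial(x, -0.05 * rg)

print(x, x.sum())                # iterate stays feasible: positive, sums to 1
```

    The point of the retraction is that feasibility is maintained by construction, which is how the Riemannian view turns the constrained program into an unconstrained search over the manifold.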

    Geometrical methods for non-negative ICA: Manifolds, Lie groups and toral subalgebras

    We explore the use of geometrical methods to tackle the non-negative independent component analysis (non-negative ICA) problem, without assuming the reader has an existing background in differential geometry. We concentrate on methods that achieve this by minimizing a cost function over the space of orthogonal matrices. We introduce the idea of the manifold and Lie group SO(n) of special orthogonal matrices that we wish to search over, and explain how this is related to the Lie algebra so(n) of skew-symmetric matrices. We describe how familiar optimization methods such as steepest descent and conjugate gradients can be transformed into this Lie group setting, and how the Newton update step has an alternative Fourier version in SO(n). Finally we introduce the concept of a toral subgroup generated by a particular element of the Lie group or Lie algebra, and explore how this commutative subgroup might be used to simplify searches on our constraint surface. No proofs are presented in this article.
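
    As a hedged illustration of the Lie-group machinery described above (not code from the article), the sketch below takes geodesic steepest-descent steps on SO(n): the Euclidean gradient is projected onto the Lie algebra so(n) of skew-symmetric matrices, and the iterate moves along the one-parameter subgroup it generates via the matrix exponential. The cost function is a placeholder; a non-negative ICA cost would slot in the same way.

```python
import numpy as np
from scipy.linalg import expm

def skew(M):
    """Projection onto the Lie algebra so(n) of skew-symmetric matrices."""
    return 0.5 * (M - M.T)

def so_n_descent_step(W, euclid_grad, step):
    """One geodesic steepest-descent step on SO(n): B = skew(W^T G) is the
    search direction in so(n); expm maps it back into the group."""
    B = skew(W.T @ euclid_grad)
    return W @ expm(-step * B)

# Placeholder cost f(W) = 0.5 * ||W - A||_F^2 with Euclidean gradient W - A.
n = 4
rng = np.random.default_rng(1)
A = rng.standard_normal((n, n))

W = np.eye(n)
for _ in range(100):
    W = so_n_descent_step(W, W - A, step=0.1)

# The iterate never leaves SO(n): W^T W = I up to rounding error.
print(np.allclose(W.T @ W, np.eye(n)))
```

    Because expm of a skew-symmetric matrix is a rotation, orthogonality is preserved exactly by the update rather than being enforced by a separate projection.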

    Efficient Methods for Unsupervised Learning of Probabilistic Models

    In this thesis I develop a variety of techniques to train, evaluate, and sample from intractable and high dimensional probabilistic models. Abstract exceeds arXiv space limitations -- see PDF

    Learning Algorithms for Connectionist Networks: Applied Gradient Methods of Nonlinear Optimization

    The problem of learning using connectionist networks, in which network connection strengths are modified systematically so that the response of the network increasingly approximates the desired response, can be structured as an optimization problem. The widely used back propagation method of connectionist learning [19, 21, 18] is set in the context of nonlinear optimization. In this framework, the issues of stability, convergence and parallelism are considered. As a form of gradient descent with fixed step size, back propagation is known to be unstable, which is illustrated using Rosenbrock's function. This is contrasted with stable methods which involve a line search in the gradient direction. The convergence criterion for connectionist problems involving binary functions is discussed relative to the behavior of gradient descent in the vicinity of local minima. A minimax criterion is compared with the least-squares criterion. The contribution of the momentum term [19, 18] to more rapid convergence is interpreted relative to the geometry of the weight space. It is shown that in plateau regions of relatively constant gradient, the momentum term acts to increase the step size by a factor of 1/(1-μ), where μ is the momentum constant. In valley regions with steep sides, the momentum constant acts to focus the search direction toward the local minimum by averaging oscillations in the gradient.
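
    Two of the claims above lend themselves to a quick numerical check; the snippet below is an illustrative sketch, not code from the report. The first part shows that with a roughly constant gradient (a plateau) the momentum update settles at an effective step 1/(1-μ) times larger; the second shows fixed-step gradient descent blowing up on Rosenbrock's function.

```python
import numpy as np

# Plateau: with a constant gradient g, the momentum velocity v <- mu*v + g
# converges to g / (1 - mu), i.e. the step grows by a factor of 1/(1 - mu).
mu, g = 0.9, 1.0
v = 0.0
for _ in range(200):
    v = mu * v + g
print(v, 1.0 / (1.0 - mu))        # both approximately 10

# Instability of fixed-step gradient descent on Rosenbrock's function
# f(x, y) = (1 - x)^2 + 100 (y - x^2)^2, from the usual start (-1.2, 1).
def rosenbrock_grad(w):
    x, y = w
    return np.array([-2.0 * (1.0 - x) - 400.0 * x * (y - x * x),
                     200.0 * (y - x * x)])

w = np.array([-1.2, 1.0])
for k in range(5):
    w = w - 0.01 * rosenbrock_grad(w)
    print(k, np.linalg.norm(w))   # the iterates diverge within a few steps
```

    A line search in the gradient direction, by contrast, would shrink the step whenever the trial point increased the objective, which is the stabilization the abstract contrasts with fixed-step back propagation.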