141 research outputs found

    Fast and Guaranteed Tensor Decomposition via Sketching

    Tensor CANDECOMP/PARAFAC (CP) decomposition has wide applications in statistical learning of latent variable models and in data mining. In this paper, we propose fast and randomized tensor CP decomposition algorithms based on sketching. We build on the idea of count sketches, but introduce many novel ideas which are unique to tensors. We develop novel methods for randomized computation of tensor contractions via FFTs, without explicitly forming the tensors. Such tensor contractions are encountered in decomposition methods such as tensor power iterations and alternating least squares. We also design novel colliding hashes for symmetric tensors to further save time in computing the sketches. We then combine these sketching ideas with existing whitening and tensor power iterative techniques to obtain the fastest algorithm on both sparse and dense tensors. The quality of approximation under our method does not depend on properties such as sparsity, uniformity of elements, etc. We apply the method to topic modeling and obtain competitive results. Comment: 29 pages. Appeared in Proceedings of Advances in Neural Information Processing Systems (NIPS), held at Montreal, Canada in 201
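
    The abstract's idea of sketching a tensor "via FFTs, without explicitly forming the tensors" can be illustrated on the simplest case, a rank-one tensor. The block below is a minimal sketch of the standard count-sketch identity (the sketch of u ⊗ v ⊗ w equals the circular convolution of the three vector sketches), not the paper's full algorithm. All function names, the bucket count, and the randomly drawn hashes and signs are assumptions made for illustration.

```python
import numpy as np

def count_sketch(x, h, s, b):
    """Count sketch of a vector: bucket h[i] accumulates s[i] * x[i]."""
    out = np.zeros(b)
    np.add.at(out, h, s * x)  # unbuffered add, so colliding indices sum up
    return out

def sketch_rank_one(u, v, w, hashes, signs, b):
    """Sketch u (x) v (x) w via FFT, never forming the n x n x n tensor."""
    prod = np.ones(b, dtype=complex)
    for x, h, s in zip((u, v, w), hashes, signs):
        prod *= np.fft.fft(count_sketch(x, h, s, b))
    # A product in the Fourier domain is a circular convolution of the sketches.
    return np.fft.ifft(prod).real

def sketch_dense(T, hashes, signs, b):
    """Reference only: sketch an explicit tensor entry by entry."""
    h1, h2, h3 = hashes
    s1, s2, s3 = signs
    H = (h1[:, None, None] + h2[None, :, None] + h3[None, None, :]) % b
    S = s1[:, None, None] * s2[None, :, None] * s3[None, None, :]
    out = np.zeros(b)
    np.add.at(out, H.ravel(), (S * T).ravel())
    return out

# Illustrative sizes and random hashes/signs (assumptions, not from the paper).
rng = np.random.default_rng(0)
n, b = 6, 16
hashes = [rng.integers(0, b, size=n) for _ in range(3)]
signs = [rng.choice([-1.0, 1.0], size=n) for _ in range(3)]
u, v, w = (rng.standard_normal(n) for _ in range(3))

fast = sketch_rank_one(u, v, w, hashes, signs, b)
slow = sketch_dense(np.einsum("i,j,k->ijk", u, v, w), hashes, signs, b)
```

    Here `fast` costs three vector sketches plus FFTs of length `b`, while `slow` touches all n^3 entries. The two should agree up to floating-point error.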

    Bayesian Methods in Tensor Analysis

    Tensors, also known as multidimensional arrays, are useful data structures in machine learning and statistics. In recent years, Bayesian methods have emerged as a popular direction for analyzing tensor-valued data since they provide a convenient way to introduce sparsity into the model and conduct uncertainty quantification. In this article, we provide an overview of frequentist and Bayesian methods for solving tensor completion and regression problems, with a focus on Bayesian methods. We review common Bayesian tensor approaches including model formulation, prior assignment, posterior computation, and theoretical properties. We also discuss potential future directions in this field. Comment: 32 pages, 8 figures, 2 tables

    Structured Mixture Models

    Finite mixture models are a staple of model-based clustering approaches for distinguishing subgroups. A common mixture model is the finite Gaussian mixture model, whose degrees of freedom scale quadratically with increasing data dimension. Methods in the literature often tackle the degrees of freedom of the Gaussian mixture model by sharing parameters between the eigendecomposition of covariance matrices across all mixture components. We posit finite Gaussian mixture models with alternate forms of parameter sharing by imposing additional structure on the parameters, such as sharing parameters with other components as a convex combination of the corresponding parent components or by imposing a sequence of hierarchical clustering structures in orthogonal subspaces with common parameters across levels. Estimation procedures using the Expectation-Maximization (EM) algorithm are derived throughout, with application to simulated and real-world datasets. In addition, the proposed model structures have an interpretable meaning that can shed light on clustering analyses performed by practitioners in the context of their data. The EM algorithm is a popular estimation method for tackling issues of latent data, such as in finite mixture models where component memberships are often latent. One aspect of the EM algorithm that hampers estimation is a slow rate of convergence, which affects the estimation of finite Gaussian mixture models. To find avenues of improvement, we explore the extrapolation of the sequence of conditional expectations admitting general EM procedures, with minimal modifications for many common models. With the same mindset of accelerating iterative algorithms, we also examine the use of approximate sketching methods in estimating generalized linear models via iteratively re-weighted least squares, with emphasis on practical data infrastructure constraints. We propose a sketching method that controls for both data transfer and computation costs, the former of which is often overlooked in asymptotic complexity analyses. The method achieves an approximate result in much less wall-clock time than the exact solution on real-world hardware, and it can estimate standard errors in addition to point estimates.

    Marriages of Mathematics and Physics: A Challenge for Biology

    The human attempts to access, measure and organize physical phenomena have led to a manifold construction of mathematical and physical spaces. We will survey the evolution of geometries from Euclid to the Algebraic Geometry of the 20th century. The role of Persian/Arabic Algebra in this transition and its Western symbolic development is emphasized. In this relation, we will also discuss changes in the ontological attitudes toward mathematics and its applications. Historically, the encounter of geometric and algebraic perspectives enriched the mathematical practices and their foundations. Yet, the collapse of Euclidean certitudes, of over 2300 years, and the crisis in the mathematical analysis of the 19th century, led to the exclusion of “geometric judgments” from the foundations of Mathematics. After the success and the limits of the logico-formal analysis, it is necessary to broaden our foundational tools and re-examine the interactions with natural sciences. In particular, the way the geometric and algebraic approaches organize knowledge is analyzed as a cross-disciplinary and cross-cultural issue and will be examined in Mathematical Physics and Biology. We finally discuss how the current notions of mathematical (phase) “space” should be revisited for the purposes of life sciences