8,603 research outputs found

    Geometric Inference in Bayesian Hierarchical Models with Applications to Topic Modeling

    Unstructured data is available in abundance as the volume of digital information grows rapidly. Labeling such data is expensive and impractical, making unsupervised learning an increasingly important field. Big data collections often have rich latent structure that the statistical modeler is challenged to uncover. Bayesian hierarchical modeling is a particularly suitable approach for complex latent patterns. The graphical model formalism has been prominent in developing various procedures for inference in Bayesian models; however, the corresponding computational limits often fall behind the demands of modern data sizes. In this thesis we develop new approaches for scalable approximate Bayesian inference. In particular, our approaches are driven by the analysis of latent geometric structures induced by the models. Our specific contributions include the following. We develop a full geometric recipe of the Latent Dirichlet Allocation topic model. Next, we study several approaches for exploiting the latent geometry: first, a fast weighted clustering procedure augmented with geometric corrections for topic inference, and then a nonparametric approach based on the analysis of the concentration of mass and angular geometry of the topic simplex, a convex polytope constructed by taking the convex hull of vertices representing the latent topics. Estimates produced by our methods are shown to be statistically consistent under some conditions. Finally, we develop a series of models for temporal dynamics of the latent geometric structures where inference can be performed in an online and distributed fashion. All our algorithms are evaluated with extensive experiments on simulated and real datasets, culminating in a method several orders of magnitude faster than existing state-of-the-art topic modeling approaches, as demonstrated by experiments working with several million documents in a dozen minutes.
    PhD, Statistics, University of Michigan, Horace H. Rackham School of Graduate Studies
    https://deepblue.lib.umich.edu/bitstream/2027.42/146051/1/moonfolk_1.pd
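
    The topic simplex described in this abstract can be illustrated with a small sketch. It is not the thesis's inference procedure; it only shows the geometric fact the abstract relies on, namely that a document's word distribution is a convex combination of the topic vertices and therefore lies inside their convex hull. The function name, the two toy topics, and the proportions below are hypothetical.

```python
def document_word_distribution(topics, proportions):
    """Mix topic vertices (word distributions) with weights on the simplex.

    Hypothetical helper for illustration; the weights must be non-negative
    and sum to one, so the result is a point in the convex hull of `topics`.
    """
    if any(p < 0 for p in proportions) or abs(sum(proportions) - 1.0) > 1e-9:
        raise ValueError("proportions must lie on the probability simplex")
    vocab_size = len(topics[0])
    return [
        sum(p * topic[w] for p, topic in zip(proportions, topics))
        for w in range(vocab_size)
    ]


# Two toy topics over a three-word vocabulary (assumed values).
topics = [[0.7, 0.2, 0.1], [0.1, 0.1, 0.8]]
theta = [0.25, 0.75]  # topic proportions of one document (assumed values)
doc = document_word_distribution(topics, theta)
print(doc)
```

    Because the weights are non-negative and sum to one, the result is again a probability vector over the vocabulary.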

    Online Tensor Methods for Learning Latent Variable Models

    We introduce an online tensor decomposition based approach for two latent variable modeling problems: (1) community detection, in which we learn the latent communities that the social actors in social networks belong to, and (2) topic modeling, in which we infer hidden topics of text articles. We consider decomposition of moment tensors using stochastic gradient descent (SGD). We optimize the multilinear operations in SGD and avoid directly forming the tensors, to save computational and storage costs. We present optimized algorithms on two platforms. Our GPU-based implementation exploits the parallelism of SIMD architectures to allow for maximum speed-up by a careful optimization of storage and data transfer, whereas our CPU-based implementation uses efficient sparse matrix computations and is suitable for large sparse datasets. For the community detection problem, we demonstrate accuracy and computational efficiency on Facebook, Yelp and DBLP datasets, and for the topic modeling problem, we also demonstrate good performance on the New York Times dataset. We compare our results to state-of-the-art algorithms such as the variational method, and report a gain in accuracy and a gain of several orders of magnitude in execution time.
    Comment: JMLR 201
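
    The idea of applying multilinear operations without forming the tensor can be sketched as follows. This is a simplified illustration, not the paper's algorithm: it assumes a plain empirical third moment T = (1/n) Σ x ⊗ x ⊗ x rather than the moment tensors the paper builds for topic and community models, and the function names and sample values are hypothetical. The contraction T(I, v, v) then equals (1/n) Σ (x·v)² x, which needs O(nd) work instead of the O(d³) storage of the full tensor.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def implicit_contraction(samples, v):
    """Return T(I, v, v) for T = mean of x (x) x (x) x, without building T."""
    n, d = len(samples), len(v)
    out = [0.0] * d
    for x in samples:
        weight = dot(x, v) ** 2  # (x . v)^2 replaces two tensor modes
        for j in range(d):
            out[j] += weight * x[j] / n
    return out


def explicit_contraction(samples, v):
    """Same quantity via the full d x d x d tensor; for checking only."""
    n, d = len(samples), len(v)
    tensor = [
        [[sum(x[a] * x[b] * x[c] for x in samples) / n for c in range(d)]
         for b in range(d)]
        for a in range(d)
    ]
    return [
        sum(tensor[a][b][c] * v[b] * v[c] for b in range(d) for c in range(d))
        for a in range(d)
    ]


# Toy data (assumed values).
samples = [[1.0, 0.0, 2.0], [0.5, 1.5, 0.0], [2.0, 1.0, 1.0]]
v = [0.2, -0.4, 0.6]
fast = implicit_contraction(samples, v)
slow = explicit_contraction(samples, v)
print(fast)
```

    Both routines compute the same vector; only the implicit one scales to large dimensions.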

    Online Bayesian Learning in Probabilistic Graphical Models using Moment Matching with Applications

    Probabilistic Graphical Models are often used to efficiently encode uncertainty in real world problems as probability distributions. Bayesian learning allows us to compute a posterior distribution over the parameters of these distributions based on observed data. One of the main challenges in Bayesian learning is that the posterior distribution can become exponentially complex as new data becomes available. Secondly, many algorithms require all the data to be present in memory before the parameters can be learned and may require retraining when new data becomes available. This is problematic for big data and expensive for streaming applications where new data arrives constantly. In this work I have proposed an online moment matching algorithm for Bayesian learning called Bayesian Moment Matching (BMM). This algorithm is based on Assumed Density Filtering (ADF) and allows us to update the posterior in a constant amount of time as new data arrives. In BMM, after new data is received, the exact posterior is projected onto a family of distributions indexed by a set of parameters. This projection is accomplished by matching the moments of this approximate posterior with those of the exact one. The effectiveness of this technique has been demonstrated on two real world problems:
    - Topic Modelling: Latent Dirichlet Allocation (LDA) is a statistical topic model that examines a set of documents and, based on the statistics of the words in each document, discovers the distribution over topics for each document.
    - Activity Recognition: Tung et al. have developed an instrumented rolling walker with sensors and cameras to autonomously monitor the user outside the laboratory setting. I have developed automated techniques to identify the activities performed by users with respect to the walker (e.g., walking, standing, turning) using a Bayesian network called a Hidden Markov Model. This problem is significant for applied health scientists who are studying the effectiveness of walkers in preventing falls.
    My main contributions in this work are:
    - I have given a novel interpretation of moment matching by showing that there exists a set of initial distributions (different from the prior) for which exact Bayesian learning yields the same first and second order moments in the posterior as moment matching. Hence the Bayesian Moment Matching algorithm is exact with respect to an implicit posterior.
    - Label switching is a problem which arises in unsupervised learning because labels can be assigned to hidden variables in a Hidden Markov Model in all possible permutations without changing the model. I also show that even though the exact posterior has n! components, each corresponding to a permutation of the hidden states, moment matching for a slightly different distribution can allow us to compute the moments without enumerating all the permutations.
    - In traditional ADF, the approximate posterior at every time step is constructed by minimizing the KL divergence between the approximate and exact posterior. In case the prior is from the exponential family, this boils down to matching the "natural" moments. This can lead to a time complexity of the order of the number of variables in the problem at every time step. This can become problematic particularly in LDA, where the number of variables is of the order of the dictionary size, which can be very large. I have derived an algorithm for moment matching called Linear Moment Matching which updates all the moments in O(n), where n is the number of hidden states.
    - I have derived a Bayesian Moment Matching algorithm (BMM) for LDA and compared the performance of BMM against existing techniques for topic modelling using multiple real world data sets.
    - I have developed a model for activity recognition using Hidden Markov Models (HMMs). I also analyse existing parameter learning techniques for HMMs in terms of accuracy. The accuracy of the generative HMM model is also compared to that of a discriminative CRF model.
    - I have also derived a Bayesian Moment Matching algorithm for Activity Recognition. The effectiveness of this algorithm on learning model parameters is analysed using two experiments conducted with real patients and a control group of walker users
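
    The projection step described in this abstract can be sketched in the simplest conjugate setting. This is an assumption-laden illustration, not the thesis's BMM algorithm for LDA or HMMs: it assumes the exact posterior is a mixture of Beta distributions and projects it onto a single Beta by matching the first and second moments. For Beta(a, b) with mean m and second moment s, solving the two moment equations gives a + b = (m - s) / (s - m²). The function names and the mixture below are hypothetical.

```python
def beta_moments(a, b):
    """First and second raw moments of a Beta(a, b) distribution."""
    mean = a / (a + b)
    second = a * (a + 1) / ((a + b) * (a + b + 1))
    return mean, second


def project_mixture_to_beta(weights, components):
    """Match the first two moments of a Beta mixture with a single Beta."""
    mean = sum(w * beta_moments(a, b)[0] for w, (a, b) in zip(weights, components))
    second = sum(w * beta_moments(a, b)[1] for w, (a, b) in zip(weights, components))
    total = (mean - second) / (second - mean ** 2)  # this is a + b
    return mean * total, (1.0 - mean) * total


# Toy two-component "exact posterior" (assumed values).
weights = [0.3, 0.7]
components = [(3.0, 2.0), (2.0, 3.0)]
a, b = project_mixture_to_beta(weights, components)
print(a, b)
```

    The projected distribution has a fixed number of parameters however many mixture components the exact posterior has, which is what keeps each update constant in size.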

    Tensor Computation: A New Framework for High-Dimensional Problems in EDA

    Many critical EDA problems suffer from the curse of dimensionality, i.e. the very fast-scaling computational burden produced by a large number of parameters and/or unknown variables. This phenomenon may be caused by multiple spatial or temporal factors (e.g. 3-D field solver discretizations and multi-rate circuit simulation), nonlinearity of devices and circuits, a large number of design or optimization parameters (e.g. full-chip routing/placement and circuit sizing), or extensive process variations (e.g. variability/reliability analysis and design for manufacturability). The computational challenges generated by such high-dimensional problems are generally hard to handle efficiently with traditional EDA core algorithms that are based on matrix and vector computation. This paper presents "tensor computation" as an alternative general framework for the development of efficient EDA algorithms and tools. A tensor is a high-dimensional generalization of a matrix and a vector, and is a natural choice for efficiently storing and solving high-dimensional EDA problems. This paper gives a basic tutorial on tensors, demonstrates some recent examples of EDA applications (e.g., nonlinear circuit modeling and high-dimensional uncertainty quantification), and suggests further open EDA problems where the use of tensor computation could be of advantage.
    Comment: 14 figures. Accepted by IEEE Trans. CAD of Integrated Circuits and System