844 research outputs found

    Topic Uncovering and Image Annotation via Scalable Probit Normal Correlated Topic Models

    Uncovering latent topics has been an active research area for more than a decade and continues to receive contributions from all disciplines, including computer science, information science and statistics. Since the introduction of Latent Dirichlet Allocation in 2003, many intriguing extension models have been proposed. One such extension is the logistic normal correlated topic model, which not only uncovers the hidden topics of a document but also extracts meaningful topical relationships among a large number of topics. In this model, the logistic normal distribution is adapted via the transformation of multivariate Gaussian variables to model the topical distribution of documents in the presence of correlations among topics. In this thesis, we propose a Probit normal alternative approach to modelling correlated topical structures. Our use of the Probit model in the context of topic discovery is novel, as many authors have so far concentrated solely on the logistic model, partly due to the formidable inefficiency of the multinomial Probit model even in the case of very small topical spaces. We circumvent the inefficiency of multinomial Probit estimation by adapting the Diagonal Orthant Multinomial Probit (DO-Probit) to the topic model context, which allows our topic modelling scheme to handle corpora with a large number of latent topics. In addition, we extend our model to image annotation by developing an efficient collapsed Gibbs sampling scheme. Furthermore, we employ various high-performance computing techniques, such as memory-aware MapReduce, a SparseLDA implementation, vectorization and block sampling, as well as some numerical efficiency strategies, to allow fast and efficient sampling.

    Probit Normal Correlated Topic Models

    The logistic normal distribution has recently been adapted via the transformation of multivariate Gaussian variables to model the topical distribution of documents in the presence of correlations among topics. In this paper, we propose a probit normal alternative approach to modelling correlated topical structures. Our use of the probit model in the context of topic discovery is novel, as many authors have so far concentrated solely on the logistic model, partly due to the formidable inefficiency of the multinomial probit model even in the case of very small topical spaces. We circumvent the inefficiency of multinomial probit estimation by adapting the diagonal orthant multinomial probit to the topic model context, which allows our topic modelling scheme to handle corpora with a large number of latent topics. An additional and very important benefit of our method is that, unlike the logistic normal model, whose non-conjugacy requires sophisticated sampling schemes, our approach exploits the natural conjugacy inherent in the auxiliary formulation of the probit model to achieve greater simplicity. Applying our proposed scheme to the well-known Associated Press corpus not only helps discover a large number of meaningful topics but also captures compellingly intuitive correlations among certain topics. Moreover, our proposed approach lends itself to even further scalability thanks to various existing high-performance algorithms and architectures capable of handling millions of documents.
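    The logistic normal construction mentioned in the abstract can be illustrated with a minimal sketch: draw a correlated Gaussian vector, then map it to topic proportions with a softmax. This is only an illustration of the baseline transformation, not the authors' probit method; the function names, the three-topic mean and the covariance values below are assumptions chosen for the example.

```python
import math
import random


def cholesky(cov):
    """Return lower-triangular L with L L^T = cov (cov must be symmetric positive definite)."""
    n = len(cov)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(cov[i][i] - s)
            else:
                L[i][j] = (cov[i][j] - s) / L[j][j]
    return L


def sample_topic_proportions(mu, cov, rng):
    """Draw eta ~ N(mu, cov), then map eta to the simplex with a softmax."""
    L = cholesky(cov)
    z = [rng.gauss(0.0, 1.0) for _ in mu]
    eta = [m + sum(L[i][k] * z[k] for k in range(i + 1)) for i, m in enumerate(mu)]
    peak = max(eta)  # subtract the max before exponentiating, for numerical stability
    weights = [math.exp(e - peak) for e in eta]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical example: topics 0 and 1 are positively correlated, topic 2 is independent.
rng = random.Random(0)
mu = [0.0, 0.0, 0.0]
cov = [[1.0, 0.8, 0.0],
       [0.8, 1.0, 0.0],
       [0.0, 0.0, 1.0]]
theta = sample_topic_proportions(mu, cov, rng)
```

    The off-diagonal covariance entries are what let two topics rise and fall together across documents, which a Dirichlet prior cannot express.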

    Towards zero re-training for long-term hand gesture recognition via ultrasound sensing


    Estimation of Reference Voltages for Time-difference Electrical Impedance Tomography
