5 research outputs found

    Dirichlet belief networks for topic structure learning

    Recently, considerable research effort has been devoted to developing deep architectures for topic models to learn topic structures. Although several deep models have been proposed to learn better topic proportions of documents, how to leverage the benefits of deep structures for learning word distributions of topics has not yet been rigorously studied. Here we propose a new multi-layer generative process on word distributions of topics, where each layer consists of a set of topics and each topic is drawn from a mixture of the topics of the layer above. As the topics in all layers can be directly interpreted by words, the proposed model is able to discover interpretable topic hierarchies. As a self-contained module, our model can be flexibly adapted to different kinds of topic models to improve their modelling accuracy and interpretability. Extensive experiments on text corpora demonstrate the advantages of the proposed model. Comment: accepted in NIPS 201

    On Measuring Social Dynamics of Online Social Media

    Due to the complex nature of human behaviour and to our inability to directly measure thoughts and feelings, social psychology has long struggled for empirical grounding for its theories and models. Traditional techniques involving groups of people in controlled environments are limited to small numbers and may not be a good analogue for real social interactions in natural settings due to their controlled and artificial nature. Their application as a foundation for simulation of social processes suffers similarly. The proliferation of online social media offers new opportunities to observe social phenomena “in the wild” that have only just begun to be realised. To date, analysis of social media data has been largely focussed on specific, commercially relevant goals (such as sentiment analysis) that are of limited use to social psychology, and the dynamics critical to an understanding of social processes are rarely addressed or even present in collected data. This thesis addresses such shortfalls by presenting: (i) a novel data collection strategy and system for rich dynamic data from communities operating on Twitter; (ii) a data set encompassing longitudinal dynamic information over two and a half years from the online pro-ana (pro-anorexia) movement; and (iii) two approaches to identifying active social psychological processes in collections of online text and network metadata: one linking traditional psychometric studies with topic models, and an algorithm combining community detection in user networks with topic models of the social media text they generate, enabling identification of community-specific topic usage.

    Modelling Sequential Text with an Adaptive Topic Model

    Topic models are increasingly being used for text analysis tasks, often replacing earlier semantic techniques such as latent semantic analysis. In this paper, we develop a novel adaptive topic model with the ability to adapt topics from both the previous segment and the parent document. For this proposed model, a Gibbs sampler is developed for posterior inference. Experimental results show that with topic adaptation, our model significantly improves over existing approaches in terms of perplexity, and is able to uncover clear sequential structure in, for example, Herman Melville’s book “Moby Dick”.
