Dirichlet belief networks for topic structure learning
Recently, considerable research effort has been devoted to developing deep
architectures for topic models to learn topic structures. Although several deep
models have been proposed to learn better topic proportions of documents, how
to leverage the benefits of deep structures for learning word distributions of
topics has not yet been rigorously studied. Here we propose a new multi-layer
generative process on word distributions of topics, where each layer consists
of a set of topics and each topic is drawn from a mixture of the topics of the
layer above. As the topics in all layers can be directly interpreted by words,
the proposed model is able to discover interpretable topic hierarchies. As a
self-contained module, our model can be flexibly adapted to different kinds of
topic models to improve their modelling accuracy and interpretability.
Extensive experiments on text corpora demonstrate the advantages of the
proposed model.
Comment: accepted in NIPS 201
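The layered generative process described in this abstract can be illustrated with a minimal sketch. It assumes only what the abstract states (each topic is drawn from a mixture of the topics in the layer above); the symmetric Dirichlet prior on the top layer, the gamma-distributed mixing weights, the numerical floor, and the function name `sample_topic_hierarchy` are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def sample_topic_hierarchy(vocab_size, topics_per_layer, eta=1.0, seed=0):
    """Sample word distributions of topics layer by layer, top layer first.

    topics_per_layer lists the number of topics from the top layer down.
    Returns one array per layer, each of shape (num_topics, vocab_size).
    """
    rng = np.random.default_rng(seed)
    # Top layer: symmetric Dirichlet prior over the vocabulary (assumed).
    layers = [rng.dirichlet(np.full(vocab_size, eta), size=topics_per_layer[0])]
    for num_lower in topics_per_layer[1:]:
        upper = layers[-1]
        # Non-negative weights linking upper topics to lower topics
        # (gamma prior is an assumption made for this sketch).
        weights = rng.gamma(shape=1.0, scale=1.0, size=(upper.shape[0], num_lower))
        # Each lower topic's Dirichlet parameter is a weighted mixture of
        # the upper-layer topics; the small floor is a numerical safeguard.
        concentration = weights.T @ upper + 1e-8
        lower = np.vstack([rng.dirichlet(c) for c in concentration])
        layers.append(lower)
    return layers

layers = sample_topic_hierarchy(vocab_size=20, topics_per_layer=[3, 5, 8])
```

Every row in every layer is a distribution over the same vocabulary, which is why topics at any depth can be read directly as word lists.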
On Measuring Social Dynamics of Online Social Media
Due to the complex nature of human behaviour and to our inability
to directly measure thoughts and feelings, social psychology has
long struggled for empirical grounding for its theories and
models. Traditional techniques involving groups of people in
controlled environments are limited to small numbers and may not
be a good analogue for real social interactions in natural
settings due to their controlled and artificial nature. Their
application as a foundation for simulation of social processes
suffers similarly.
The proliferation of online social media offers new opportunities
to observe social phenomena “in the wild” that have only just
begun to be realised. To date, analysis of social media data has
been largely focussed on specific, commercially relevant goals
(such as sentiment analysis) that are of limited use to social
psychology, and the dynamics critical to an understanding of
social processes are rarely addressed or even present in collected
data.
This thesis addresses such shortfalls by presenting: (i) a novel
data collection strategy and system for rich dynamic data from
communities operating on Twitter; (ii) a data set encompassing
longitudinal dynamic information over two and a half years from
the online pro-ana (pro-anorexia) movement; and (iii) two
approaches to identifying active social psychological processes
in collections of online text and network metadata: an approach
linking traditional psychometric studies with topic models, and an
algorithm combining community detection in user networks with
topic models of the social media text they generate, enabling
identification of community-specific topic usage.
Modelling Sequential Text with an Adaptive Topic Model
Topic models are increasingly being used for text analysis tasks, often replacing earlier semantic techniques such as latent semantic analysis. In this paper, we develop a novel adaptive topic model with the ability to adapt topics from both the previous segment and the parent document. For this proposed model, a Gibbs sampler is developed for posterior inference. Experimental results show that with topic adaptation, our model significantly improves over existing approaches in terms of perplexity, and is able to uncover clear sequential structure on, for example, Herman Melville’s book “Moby Dick”.
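Perplexity, the evaluation metric mentioned in this abstract, is conventionally the exponential of the negative average log-likelihood per held-out token, so lower values are better. The sketch below shows that standard definition only; the function name `perplexity` and the input format (a list of natural-log token probabilities) are assumptions for illustration, and the paper's exact evaluation protocol is not given here.

```python
import math

def perplexity(token_log_probs):
    """Perplexity from per-token natural-log probabilities of held-out text."""
    if not token_log_probs:
        raise ValueError("need at least one token")
    # exp of the negative mean log-likelihood per token.
    return math.exp(-sum(token_log_probs) / len(token_log_probs))

# A model that assigns probability 1/4 to every token behaves like a
# uniform choice among four words.
uniform_score = perplexity([math.log(0.25)] * 10)
```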