585 research outputs found

Integrating Document Clustering and Topic Modeling

Author: Xie Pengtao
Xing Eric P.
Publication date: 26/09/2013

Document clustering and topic modeling are two closely related tasks that can benefit each other. Topic modeling projects documents into a topic space, which facilitates effective document clustering. Cluster labels discovered by document clustering can in turn be incorporated into topic models to extract local topics specific to each cluster and global topics shared by all clusters. In this paper, we propose a multi-grain clustering topic model (MGCTM), which integrates document clustering and topic modeling into a unified framework and performs the two tasks jointly to achieve the overall best performance. Our model tightly couples two components: a mixture component for discovering latent groups in a document collection, and a topic model component for mining multi-grain topics, including local topics specific to each cluster and global topics shared across clusters. We employ variational inference to approximate the posterior of hidden variables and learn model parameters. Experiments on two datasets demonstrate the effectiveness of our model. Comment: Appears in Proceedings of the Twenty-Ninth Conference on Uncertainty in Artificial Intelligence (UAI2013)

arXiv.org e-Print Archive

CiteSeerX

From the User to the Medium: Neural Profiling Across Web Communities

Author: Akbari Mohammad
Chunara Rumi
Elghafari Anas
Relia Kunal
Publication date: 15/06/2018

Online communities provide a unique way for individuals to access information from others in similar circumstances, which can be critical for health conditions that require daily and personalized management. Because these groups and topics often arise organically, identifying the types of topics discussed is necessary to understand the communities' needs. These communities and the people in them can also be quite diverse, yet existing community detection methods have not been extended to evaluate this heterogeneity. One limitation is that community detection methodologies have not focused on semantic relations between textual features of user-generated content. We therefore develop an approach, NeuroCom, that optimally finds dense groups of users as communities in a latent space inferred from a neural representation of the content users publish. By embedding words and messages, we show that NeuroCom improves clustering and identifies more nuanced discussion topics than other common unsupervised learning approaches.

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Learning Behavioural Context

Author: A. Gupta
A. Rabinovich
C. Galleguillos
C.C. Loy
D.M. Blei
G. Heitz
H. Buxton
I. Biederman
J. Li
J. Sherrah
K.P. Murphy
L. Wolf
L. Zelnik-Manor
M. Bar
M. Marszalek
M. Yang
P. Carbonetto
S. Ali
S. Gong
S. Kumar
S. Palmer
T. Hofmann
W. Zheng
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2011

The original publication is available at www.springerlink.co

Crossref

Queen Mary Research Online