3 research outputs found

Comparison of Two-pass Algorithms for Dynamic Topic Modelling Based on Matrix Decompositions

Authors: Alexandrov Mikhail, Cardiff John, Skitalinskaya Gabriella
Publication venue: Dublin Institute of Technology
Publication date: 01/01/2017

In this paper we present a two-pass algorithm based on different matrix decompositions, such as LSI, PCA, ICA and NMF, which allows the evolution of topics to be tracked over time. The resulting dynamic topic models give an easily interpreted overview of the topics found in a sequentially organized set of documents, with no further processing required. Each topic is represented by a user-specified number of top terms. Applied to, for example, a news article data set, this approach to topic modeling can be convenient and useful for economists, sociologists and political scientists. The proposed approach achieves results comparable to those obtained using complex probabilistic models, such as LDA.
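The abstract does not say what the two passes do, so the following is only a minimal sketch of one common two-pass scheme, using NMF as the decomposition. It assumes that pass 1 factors each time window's document-term matrix into window topics, and that pass 2 factors the stacked window topics into dynamic topics. The helper names (`nmf`, `two_pass_topics`, `top_terms`), the toy vocabulary and the counts are all hypothetical.

```python
import random

def matmul(a, b):
    """Multiply two dense matrices given as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def nmf(v, k, iters=200, seed=0, eps=1e-9):
    """Factor a non-negative matrix v (n x m) as w (n x k) times h (k x m)
    with multiplicative updates for the squared-error objective."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(k)]
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(n)]
    return w, h

def two_pass_topics(windows, k_window, k_dynamic):
    """windows: one document-term matrix per time window, sharing a vocabulary."""
    # Pass 1 (assumed): extract k_window topics from each window separately.
    window_topics = []
    for t, docs in enumerate(windows):
        _, h = nmf(docs, k_window, seed=t)
        window_topics.extend(h)
    # Pass 2 (assumed): factor the stacked window topics into dynamic topics.
    # Row i of `weights` relates window topic i to each dynamic topic.
    weights, dynamic = nmf(window_topics, k_dynamic, seed=len(windows))
    return weights, dynamic

def top_terms(topic_row, vocab, n):
    """Return the n highest-weighted terms of one topic."""
    ranked = sorted(range(len(vocab)), key=lambda j: topic_row[j], reverse=True)
    return [vocab[j] for j in ranked[:n]]

# Toy data: two time windows, four documents each, four-term vocabulary.
vocab = ["election", "vote", "market", "stock"]
windows = [
    [[3, 2, 0, 0], [2, 3, 0, 0], [0, 0, 2, 3], [0, 1, 3, 2]],
    [[4, 1, 0, 0], [0, 0, 3, 3], [0, 0, 2, 4], [3, 3, 1, 0]],
]
weights, dynamic = two_pass_topics(windows, k_window=2, k_dynamic=2)
for row in dynamic:
    print(top_terms(row, vocab, 2))
```

Swapping `nmf` for another decomposition would give the LSI, PCA or ICA variants the abstract lists. Those can produce negative loadings, so the top-term ranking would then need its own convention.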

Arrow@TUDublin

Modeling term associations for ad-hoc retrieval performance within language modeling framework

Authors: Croft B, Wei X
Publication venue: Springer (Berlin, Germany)
Publication date: 01/01/2007

Previous research has shown that using term associations could improve the effectiveness of information retrieval (IR) systems. However, most of the existing approaches focus on query reformulation; document reformulation has only recently begun to be studied. In this paper, we study how to utilize term association measures for document modeling, and what types of measures are effective in document language models. We propose a probabilistic term association measure, compare it to some traditional methods, such as the similarity coefficient and window-based methods, in the language modeling (LM) framework, and show that significant improvements over query likelihood (QL) retrieval can be obtained. We also compare the method with state-of-the-art document modeling techniques based on latent mixture models.
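For orientation, the query likelihood (QL) baseline named in the abstract can be sketched as follows. This is a minimal illustration that assumes a unigram document model with Dirichlet smoothing, one common formulation of QL. It does not reproduce the paper's term association measure, and the function name `ql_score`, the value of `mu` and the toy documents are hypothetical.

```python
import math
from collections import Counter

def ql_score(query, doc, collection, mu=2000.0):
    """Log-likelihood of the query under a Dirichlet-smoothed unigram
    language model of the document.

    query, doc: lists of tokens; collection: all tokens of all documents.
    """
    doc_tf = Counter(doc)
    coll_tf = Counter(collection)
    coll_len = len(collection)
    score = 0.0
    for term in query:
        p_coll = coll_tf[term] / coll_len
        if p_coll == 0.0:
            # Assumption of this sketch: ignore terms unseen in the collection.
            continue
        p_term = (doc_tf[term] + mu * p_coll) / (len(doc) + mu)
        score += math.log(p_term)
    return score

# Toy collection: rank two documents for a one-term query.
docs = [
    ["topic", "model", "matrix"],
    ["retrieval", "language", "model"],
]
collection = [tok for d in docs for tok in d]
query = ["retrieval"]
scores = [ql_score(query, d, collection) for d in docs]
ranking = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
print(ranking)
```

A document model built on term associations would change how the per-term probability is estimated, for instance by giving weight to terms related to those in the document, while leaving the ranking step the same.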

CiteSeerX

RMIT Research Repository

Extracting and exploiting word relationships for information retrieval

Author: Cao Guihong
Publication date: 01/01/2008

Thesis digitized by the Records Management and Archives Division (Division de la gestion de documents et des archives) of the Université de Montréal.

Dépôt Institutionnel Numérique