
12,323 research outputs found

Language Models

Author: Hiemstra D.
Publication venue: Springer Verlag
Publication date: 01/01/2009

Contains fulltext: 227630.pdf (preprint version) (Open Access)

Radboud Repository

University of Twente Research Information

Exploring Topic-based Language Models for Effective Web Information Retrieval

Author: Hiemstra Djoerd
Kamps Jaap
Kaptein Rianne
Li Rongmei
Publication venue: Neslia Paniculata
Publication date: 01/01/2008

The main obstacle to providing focused search is the relative opaqueness of search requests: searchers tend to express their complex information needs in only a couple of keywords. Our overall aim is to find out if, and how, topic-based language models can lead to more effective web information retrieval. In this paper we explore the retrieval performance of a topic-based model that combines topical models with other language models based on cross-entropy. We first define our topical categories and train our topical models on the .GOV2 corpus by building parsimonious language models. We then test the topic-based model on the TREC8 small Web data collection for ad-hoc search. Our experimental results show that the topic-based model outperforms the standard language model and the parsimonious model.
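As a rough illustration of the idea in this abstract, the sketch below ranks documents by cross-entropy against a query model that has been mixed with a topical model. It is not the authors' implementation: the toy documents, the mixing weights `lam` and `mu`, the probability floor, and all function names are assumptions made for the example, and plain maximum-likelihood unigram models stand in for the parsimonious language models the paper builds.

```python
import math
from collections import Counter


def unigram_model(tokens):
    """Maximum-likelihood unigram distribution over a token list."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {term: n / total for term, n in counts.items()}


def interpolate(primary, background, lam):
    """Linear mixture: lam * primary + (1 - lam) * background."""
    vocab = set(primary) | set(background)
    return {
        t: lam * primary.get(t, 0.0) + (1 - lam) * background.get(t, 0.0)
        for t in vocab
    }


def cross_entropy_score(query_model, doc_model, collection_model, mu=0.5, floor=1e-12):
    """Negative cross-entropy of the smoothed document model under the
    query model. Higher (closer to zero) means a better match."""
    score = 0.0
    for term, p_q in query_model.items():
        p_d = mu * doc_model.get(term, 0.0) + (1 - mu) * collection_model.get(term, 0.0)
        # Floor avoids log(0) for terms unseen in the whole collection (assumption).
        score += p_q * math.log(max(p_d, floor))
    return score


# Toy data (hypothetical, for illustration only).
docs = {
    "d1": "tax forms federal tax filing".split(),
    "d2": "national park hiking trails".split(),
}
collection_model = unigram_model([t for toks in docs.values() for t in toks])

# A short keyword query expanded with a hypothetical "finance" topical model.
topic_model = unigram_model("tax revenue federal budget".split())
query_model = interpolate(unigram_model(["tax"]), topic_model, lam=0.7)

ranking = sorted(
    ((name, cross_entropy_score(query_model, unigram_model(toks), collection_model))
     for name, toks in docs.items()),
    key=lambda pair: pair[1],
    reverse=True,
)
```

Here the topical model adds probability mass for related terms such as "federal", so a document can gain score from terms the searcher never typed.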

University of Twente Research Information

UvA-DARE

International Migration, Integration and Social Cohesion online publications

Rhetorical relations for information retrieval

Author: Larsen Birger
Lioma Christina
Lu Wei
Publication date: 05/04/2017

Typically, every part in most coherent text has some plausible reason for its presence, some function that it performs to the overall semantics of the text. Rhetorical relations, e.g. contrast, cause, explanation, describe how the parts of a text are linked to each other. Knowledge about this so-called discourse structure has been applied successfully to several natural language processing tasks. This work studies the use of rhetorical relations for Information Retrieval (IR): Is there a correlation between certain rhetorical relations and retrieval performance? Can knowledge about a document's rhetorical relations be useful to IR? We present a language model modification that considers rhetorical relations when estimating the relevance of a document to a query. Empirical evaluation of different versions of our model on TREC settings shows that certain rhetorical relations can benefit retrieval effectiveness notably (> 10% in mean average precision over a state-of-the-art baseline).

arXiv.org e-Print Archive

CiteSeerX

Word-Entity Duet Representations for Document Ranking

Author: Callan Jamie
Liu Tie-Yan
Xiong Chenyan
Publication venue: Association for Computing Machinery (ACM)
Publication date: 20/06/2017

This paper presents a word-entity duet framework for utilizing knowledge bases in ad-hoc retrieval. In this work, the query and documents are modeled by word-based representations and entity-based representations. Ranking features are generated by the interactions between the two representations, incorporating information from the word space, the entity space, and the cross-space connections through the knowledge graph. To handle the uncertainties from the automatically constructed entity representations, an attention-based ranking model, AttR-Duet, is developed. With back-propagation from ranking labels, the model learns simultaneously how to demote noisy entities and how to rank documents with the word-entity duet. Evaluation results on the TREC Web Track ad-hoc task demonstrate that all of the four-way interactions in the duet are useful, the attention mechanism successfully steers the model away from noisy entities, and together they significantly outperform both word-based and entity-based learning-to-rank systems.

arXiv.org e-Print Archive

Crossref