536 research outputs found

Explicit relevance models in intent-oriented information retrieval diversification

Author: Castells Pablo
Vallet Weadon David Jordi
Vargas Saúl
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2012

This is the author's version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in SIGIR '12 Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval, http://dx.doi.org/10.1145/2348283.2348297. The intent-oriented search diversification methods developed in the field so far tend to build on generative views of the retrieval system to be diversified. Core algorithm components, in particular redundancy assessment, are expressed in terms of the probability of observing documents, rather than the probability that the documents are relevant. This has sometimes been described as a view that considers the selection of a single document in the underlying task model. In this paper we propose an alternative formulation of aspect-based diversification algorithms which explicitly includes a formal relevance model. We develop means for the effective computation of the new formulation, and we test the resulting algorithm empirically. We report experiments on search and recommendation tasks showing competitive or better performance than the original diversification algorithms. The relevance-based formulation has further interesting properties, such as unifying two well-known state-of-the-art algorithms into a single version. The relevance-based approach opens alternative possibilities for further formal connections and developments as natural extensions of the framework. We illustrate this by modeling tolerance to redundancy as an explicit configurable parameter, which can be set to better suit the characteristics of the IR task or the evaluation metrics, as we show empirically. This work was supported by the national Spanish projects TIN2011-28538-C02-01 and S2009TIC-1542.

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Biblos-e Archivo

Descriptive document clustering via discriminant learning in a co-embedded space of multilevel similarities

Author: Ananiadou
Arai
Baraldi
Baraldi
Beil
Belkin
Bengio
Bengio
Bharambe
Carmel
Carpineto
Chang
Chen
Cheng
Cover
Cribbin
Cristianini
Cutting
Deerwester
Domingos
Drineas
Dubin
Duda
Eckart
Frantzi
Geraci
Globerson
Hatzivassiloglou
Haykin
Hearst
Hussain
Jain
Jayabharathy
Jones
Kohonen
Korkontzelos
Koshman
Kovács
Lagus
Lam
Lan
Li
Li
Luxburg
Mu
Mu
Mu
Noel
Osiński
Osiński
Ouyang
Rooneya
Salton
Stefanowski
Syed
Theodosiou
Thomas
Torgerson
Tseng
Wang
Xu
Xu
Zeng
Zhang
Publication venue: 'Wiley'
Publication date: 03/12/2014

Descriptive document clustering aims at discovering clusters of semantically interrelated documents together with meaningful labels to summarize the content of each document cluster. In this work, we propose a novel descriptive clustering framework, referred to as CEDL. It relies on the formulation and generation of 2 types of heterogeneous objects, which correspond to documents and candidate phrases, using multilevel similarity information. CEDL is composed of 5 main processing stages. First, it simultaneously maps the documents and candidate phrases into a common co‐embedded space that preserves higher‐order, neighbor‐based proximities between the combined sets of documents and phrases. Then, it discovers an approximate cluster structure of documents in the common space. The third stage extracts promising topic phrases by constructing a discriminant model where documents along with their cluster memberships are used as training instances. Subsequently, the final cluster labels are selected from the topic phrases by a ranking scheme that combines multiple scores based on the extracted co‐embedding information and the discriminant output. The final stage polishes the initial clusters to reduce noise and accommodate the multitopic nature of documents. The effectiveness and competitiveness of CEDL are demonstrated qualitatively and quantitatively with experiments using document databases from different application fields.

University of Liverpool Repository

Crossref

Edge Hill University Research Information Repository

The University of Manchester - Institutional Repository

Web Page Classification and Hierarchy Adaptation

Author: Qi Xiaoguang
Publication venue: Lehigh Preserve
Publication date

Lehigh University: Lehigh Preserve

DIR 2011: Dutch-Belgian Information Retrieval Workshop Amsterdam

Author: Boscarino C.
de Rijke M.
Hofmann K.
Jijkoun V.
Meij E.
Weerkamp W.
Publication venue: University of Amsterdam, Information and Language Processing group
Publication date: 01/01/2011

International Migration, Integration and Social Cohesion online publications