5,770 research outputs found

Improving Term Frequency Normalization for Multi-topical Documents, and Application to Language Modeling Approaches

Author: Kang In-Su
Lee Jong-Hyeok
Na Seung-Hoon
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 08/02/2015

Term frequency normalization is a serious issue because document lengths vary. Documents generally become long for two different reasons: verbosity and multi-topicality. First, verbosity means that the same topic is mentioned repeatedly by terms related to it, so that term frequencies are higher than in a well-summarized document. Second, multi-topicality means that a document discusses several topics broadly rather than a single topic. Although these document characteristics should be handled differently, all previous methods of term frequency normalization have ignored the difference and used a simplified length-driven approach that decreases term frequency by document length alone, causing unreasonable penalization. To attack this problem, we propose a novel TF normalization method that is a type of partially axiomatic approach. We first formulate two formal constraints that a retrieval model should satisfy for documents with verbosity and multi-topicality characteristics, respectively. Then, we modify language modeling approaches to better satisfy these two constraints and derive novel smoothing methods. Experimental results show that the proposed method significantly increases precision for keyword queries and substantially improves MAP (Mean Average Precision) for verbose queries.

Comment: 8 pages, conference paper, published in ECIR '0
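For reference, the length-driven normalization that the abstract above criticizes can be illustrated with standard Dirichlet-smoothed query likelihood. This is a minimal sketch of that common baseline, not the smoothing methods the paper proposes; the function name `dirichlet_score`, the toy collection statistics, and the default `mu` are assumptions made for illustration.

```python
import math
from collections import Counter


def dirichlet_score(query, doc, collection_tf, collection_len, mu=2000.0):
    """Log query likelihood of `doc` under Dirichlet-smoothed language modeling.

    query, doc: lists of terms. collection_tf: term -> count in the collection.
    The document length enters only through the denominator (len(doc) + mu),
    which is the purely length-driven normalization.
    """
    tf = Counter(doc)
    score = 0.0
    for term in query:
        p_collection = collection_tf.get(term, 0) / collection_len
        if p_collection == 0.0:
            continue  # assumption: ignore query terms unseen in the collection
        score += math.log((tf[term] + mu * p_collection) / (len(doc) + mu))
    return score


# Toy statistics (hypothetical): a 100-token collection.
collection_tf = {"a": 10, "b": 10, "c": 80}
short_doc = ["a", "b"]
padded_doc = ["a", "b", "c", "c"]  # same match on "a", extra off-topic text

s_short = dirichlet_score(["a"], short_doc, collection_tf, 100)
s_padded = dirichlet_score(["a"], padded_doc, collection_tf, 100)
```

With the same count for the query term, the longer document scores lower regardless of whether its extra length comes from verbosity or from additional topics, which is the behavior the abstract argues should be differentiated.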

arXiv.org e-Print Archive

CiteSeerX

An Axiomatic Analysis of Diversity Evaluation Metrics: Introducing the Rank-Biased Utility Metric

Author: Alistair Moffat
Collins-Thompson Kevyn
Sakai Tetsuya
Voorhees Ellen M.
Yang Hui
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 19/08/2018

Many evaluation metrics have been defined to evaluate the effectiveness of ad-hoc retrieval and search result diversification systems. However, it is often unclear which evaluation metric should be used to analyze the performance of retrieval systems given a specific task. Axiomatic analysis is an informative mechanism to understand the fundamentals of metrics and their suitability for particular scenarios. In this paper, we define a constraint-based axiomatic framework to study the suitability of existing metrics in search result diversification scenarios. The analysis informed the definition of Rank-Biased Utility (RBU) -- an adaptation of the well-known Rank-Biased Precision metric -- that takes into account redundancy and the user effort associated with the inspection of documents in the ranking. Our experiments over standard diversity evaluation campaigns show that the proposed metric captures quality criteria reflected by different metrics, making it suitable in the absence of knowledge about particular features of the scenario under study.

Comment: Original version: 10 pages. Preprint of full paper to appear at SIGIR'18: The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, July 8-12, 2018, Ann Arbor, MI, USA. ACM, New York, NY, US
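The Rank-Biased Precision metric that RBU adapts can be sketched in a few lines. The block below implements only the standard RBP formula, (1 - p) times the sum of r_i * p^(i-1) over ranks i; it does not include the redundancy or user-effort components that the abstract attributes to RBU. The function name and the default persistence `p` are assumptions for illustration.

```python
def rank_biased_precision(relevances, p=0.8):
    """Rank-Biased Precision for a ranked list of relevance gains in [0, 1].

    p is the persistence: the probability that the user moves on from one
    ranked document to the next. Rank 1 has weight p**0, rank 2 has p**1, ...
    """
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    return (1.0 - p) * sum(r * p**i for i, r in enumerate(relevances))


# A relevant document at rank 1 contributes more than one at rank 2.
top_hit = rank_biased_precision([1, 0], p=0.5)
second_hit = rank_biased_precision([0, 1], p=0.5)
```

Lower values of `p` model an impatient user and concentrate the weight on the top ranks; higher values spread it deeper into the ranking.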

arXiv.org e-Print Archive

Crossref

Investigating Retrieval Method Selection with Axiomatic Features

Author: Arora S.
Yates A.
Publication date: 01/01/2019

We consider algorithm selection in the context of ad-hoc information retrieval. Given a query and a pair of retrieval methods, we propose a meta-learner that predicts how to combine the methods' relevance scores into an overall relevance score. Inspired by neural models' different properties with regard to IR axioms, these predictions are based on features that quantify axiom-related properties of the query and its top-ranked documents. We conduct an evaluation on TREC Web Track data and find that the meta-learner often significantly improves over the individual methods. Finally, we conduct feature and query weight analyses to investigate the meta-learner's behavior.
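One simple way to combine two methods' relevance scores with a per-query prediction is linear interpolation. The abstract does not state the form of the combination, so the sketch below is an assumption and not the authors' method: `combine_scores` is a hypothetical helper, and `weight` stands in for whatever a meta-learner would predict for the query.

```python
def combine_scores(scores_a, scores_b, weight):
    """Linearly interpolate two methods' per-document relevance scores.

    scores_a, scores_b: dicts mapping document id -> score for one query.
    weight: assumed per-query weight in [0, 1] given to method A; method B
    receives 1 - weight. Documents missing from one method score 0.0 there.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be in [0, 1]")
    doc_ids = set(scores_a) | set(scores_b)
    return {
        doc_id: weight * scores_a.get(doc_id, 0.0)
        + (1.0 - weight) * scores_b.get(doc_id, 0.0)
        for doc_id in doc_ids
    }


# Hypothetical scores from two retrieval methods for the same query.
combined = combine_scores({"d1": 1.0, "d2": 0.0}, {"d1": 0.0, "d2": 1.0}, 0.75)
ranking = sorted(combined, key=combined.get, reverse=True)
```

In practice the two score distributions would usually need to be normalized to a common scale before interpolation; that step is omitted here.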

MPG.PuRe

The Sustainability, Preservation and Accessibility of Internal and External Communities by Universities

Author: Barragan Salvador
Trimble Jeffrey A.
Publication venue: Georgia Institute of Technology
Publication date: 20/05/2009

4th International Conference on Open Repositories. This presentation was part of the session: DSpace User Group Presentations. Date: 2009-05-20, 03:30 PM – 05:00 PM.

This paper will provide three different cases or examples of how a mid-size university is able to implement DSpace across diverse groups of users. Additionally, one of the cases will show how the DSpace software has been 'repurposed' to serve as the university library's Electronic Reserve and how it has been linked to the library's ILS. The paper will show how the university has obtained a consistent level of sustainability, preservation and accessibility using DSpace with limited resources.

Scholarly Materials And Research @ Georgia Tech