Improving Term Frequency Normalization for Multi-topical Documents, and Application to Language Modeling Approaches
Term frequency normalization is a serious issue because document lengths
vary. Generally, documents become long for two different reasons:
verbosity and multi-topicality. First, verbosity means that the same topic is
repeatedly mentioned by terms related to the topic, so that term frequency is
higher than in a well-summarized document. Second, multi-topicality indicates
that a document has a broad discussion of multiple topics, rather than a single
topic. Although these document characteristics should be handled differently,
all previous methods of term frequency normalization have ignored these
differences and have used a simplified length-driven approach which decreases
the term frequency by only the length of a document, causing an unreasonable
penalization. To attack this problem, we propose a novel TF normalization
method which is a type of partially-axiomatic approach. We first formulate two
formal constraints that the retrieval model should satisfy for documents having
verbosity and multi-topicality characteristics, respectively. Then, we modify
language modeling approaches to better satisfy these two constraints, and
derive novel smoothing methods. Experimental results show that the proposed
method significantly increases precision for keyword queries and
substantially improves MAP (Mean Average Precision) for verbose queries.
Comment: 8 pages, conference paper, published in ECIR '0
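To make the "length-driven" penalization mentioned in the abstract concrete, the sketch below scores a document with standard Dirichlet-prior smoothing, a common language modeling baseline in which the document length appears in the denominator. It is not the paper's proposed smoothing method. The function name, the toy collection statistics, and the value of `mu` are assumptions chosen for illustration.

```python
import math
from collections import Counter


def dirichlet_score(query_terms, doc_terms, collection_tf, collection_len, mu=2000.0):
    """Query log-likelihood under Dirichlet-prior smoothing (standard baseline).

    p(w|d) = (tf(w, d) + mu * p(w|C)) / (|d| + mu)
    The |d| term in the denominator is the length-driven normalization.
    """
    tf = Counter(doc_terms)
    doc_len = len(doc_terms)
    score = 0.0
    for term in query_terms:
        p_coll = collection_tf.get(term, 0) / collection_len
        if p_coll == 0.0:
            continue  # assumption: ignore query terms unseen in the collection
        p_doc = (tf[term] + mu * p_coll) / (doc_len + mu)
        score += math.log(p_doc)
    return score


# Hypothetical toy collection statistics.
collection_tf = {"ir": 10, "pad": 90}
collection_len = 100

short_doc = ["ir", "pad"]
long_doc = ["ir"] + ["pad"] * 50  # same count of "ir", many more other terms

short_score = dirichlet_score(["ir"], short_doc, collection_tf, collection_len, mu=10.0)
long_score = dirichlet_score(["ir"], long_doc, collection_tf, collection_len, mu=10.0)
```

Both documents contain the query term once, yet the longer one scores lower purely because of its length, regardless of whether that length comes from verbosity or from covering several topics.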
An Axiomatic Analysis of Diversity Evaluation Metrics: Introducing the Rank-Biased Utility Metric
Many evaluation metrics have been defined to evaluate the effectiveness of
ad-hoc retrieval and search result diversification systems. However, it is
often unclear which evaluation metric should be used to analyze the performance
of retrieval systems given a specific task. Axiomatic analysis is an
informative mechanism to understand the fundamentals of metrics and their
suitability for particular scenarios. In this paper, we define a
constraint-based axiomatic framework to study the suitability of existing
metrics in search result diversification scenarios. The analysis informed the
definition of Rank-Biased Utility (RBU) -- an adaptation of the well-known
Rank-Biased Precision metric -- that takes into account redundancy and the user
effort associated with the inspection of documents in the ranking. Our
experiments over standard diversity evaluation campaigns show that the proposed
metric captures quality criteria reflected by different metrics, being suitable
in the absence of knowledge about particular features of the scenario under
study.
Comment: Original version: 10 pages. Preprint of full paper to appear at
SIGIR'18: The 41st International ACM SIGIR Conference on Research &
Development in Information Retrieval, July 8-12, 2018, Ann Arbor, MI, USA.
ACM, New York, NY, US
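For orientation, the sketch below computes plain Rank-Biased Precision, the metric that the abstract says RBU adapts. It does not implement RBU itself: the redundancy and user-effort components are not specified in the abstract and are not reproduced here. The function name and the default persistence value `p` are assumptions.

```python
def rank_biased_precision(relevances, p=0.8):
    """Rank-Biased Precision: (1 - p) * sum_i rel_i * p^(i - 1), ranks from 1.

    `relevances` lists relevance values in rank order (e.g. 0 or 1);
    `p` is the probability that the user continues to the next result.
    """
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    # enumerate() starts at 0, which matches the exponent i - 1 for rank i.
    return (1.0 - p) * sum(rel * p ** k for k, rel in enumerate(relevances))


top_heavy = rank_biased_precision([1, 0, 0], p=0.5)
bottom_heavy = rank_biased_precision([0, 0, 1], p=0.5)
```

A relevant document at rank 1 contributes more than the same document at rank 3, which is the rank bias that RBU inherits.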
Investigating Retrieval Method Selection with Axiomatic Features
We consider algorithm selection in the context of ad-hoc information retrieval. Given a query and a pair of retrieval methods, we propose a meta-learner that predicts how to combine the methods' relevance scores into an overall relevance score. Inspired by neural models' different properties with regard to IR axioms, these predictions are based on features that quantify axiom-related properties of the query and its top-ranked documents. We conduct an evaluation on TREC Web Track data and find that the meta-learner often significantly improves over the individual methods. Finally, we conduct feature and query weight analyses to investigate the meta-learner's behavior.
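One simple way to combine two methods' relevance scores is linear interpolation with a per-query weight. The abstract does not state the combination form the meta-learner uses, so the sketch below is an assumption for illustration only; `combine_scores`, its parameters, and the toy scores are hypothetical, and the weight is supplied by hand rather than predicted.

```python
def combine_scores(scores_a, scores_b, weight):
    """Linearly interpolate per-document scores from two retrieval methods.

    `weight` is the share given to method A (assumed to come from some
    per-query predictor); documents missing from one method score 0.0 there.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be in [0, 1]")
    doc_ids = set(scores_a) | set(scores_b)
    return {
        doc_id: weight * scores_a.get(doc_id, 0.0)
        + (1.0 - weight) * scores_b.get(doc_id, 0.0)
        for doc_id in doc_ids
    }


# Hypothetical scores for two documents from two methods.
combined = combine_scores({"d1": 1.0, "d2": 0.0}, {"d1": 0.0, "d2": 1.0}, weight=0.75)
ranking = sorted(combined, key=combined.get, reverse=True)
```

With a weight of 0.75 the ranking follows method A; lowering the weight below 0.5 would flip it toward method B.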
The Sustainability, Preservation and Accessibility of Internal and External Communities by Universities
4th International Conference on Open Repositories
This presentation was part of the session: DSpace User Group Presentations
Date: 2009-05-20 03:30 PM – 05:00 PM
This paper will provide three different cases or examples of how a mid-size university is able to implement DSpace across diverse groups of users. Additionally, one of the cases will show how the DSpace software has been 'repurposed' to serve as the university library's Electronic Reserve and how it has been linked to the library's ILS. The paper will show how the university has obtained a consistent level of sustainability, preservation and accessibility using DSpace with limited resources.