Evaluating implicit feedback models using searcher simulations
In this article we describe an evaluation of relevance feedback (RF) algorithms using searcher simulations. Since these algorithms select additional terms for query modification based on inferences made from searcher interaction, rather than on relevance information searchers explicitly provide (as in traditional RF), we refer to them as implicit feedback models. We introduce six models that base their decisions on the interactions of searchers and use different approaches to rank query modification terms. The aim of this article is to determine which of these models should be used to assist searchers in the systems we develop. To evaluate the models we used searcher simulations, which afforded us more control over the experimental conditions than experiments with human subjects and allowed complex interaction to be modeled without the need for costly human experimentation. The simulation-based evaluation methodology measures how well the models learn the distribution of terms across relevant documents (i.e., learn what information is relevant) and how well they improve search effectiveness (i.e., create effective search queries). Our findings show that an implicit feedback model based on Jeffrey's rule of conditioning outperformed the other models under investigation.
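The abstract does not reproduce the model's equations, but Jeffrey's rule of conditioning has a standard form: when belief in a partition of events shifts from P(E_i) to P'(E_i), any proposition A is revised as P'(A) = Σ_i P(A|E_i)·P'(E_i). A minimal Python sketch of how such an update could reweight a candidate query term; all probabilities and the dwell-time interpretation here are invented for illustration, not taken from the article:

```python
# Hedged sketch: Jeffrey's rule of conditioning applied to implicit
# feedback. The numbers and the interaction signal are assumptions.

def jeffrey_update(p_term_given_rel, p_term_given_nonrel, p_rel_new):
    """Revise P(term) when the belief in relevance itself shifts.

    Jeffrey's rule with the partition {rel, nonrel}:
        P'(term) = P(term|rel) * P'(rel) + P(term|nonrel) * P'(nonrel)
    where P'(rel) is the revised, uncertain belief that the viewed
    document is relevant, inferred from searcher interaction.
    """
    return (p_term_given_rel * p_rel_new
            + p_term_given_nonrel * (1.0 - p_rel_new))

# Example: a term common in the viewed text but rare elsewhere.
# Suppose interaction evidence (e.g., a long dwell time) raises the
# belief in relevance to P'(rel) = 0.8.
p_new = jeffrey_update(p_term_given_rel=0.30,
                       p_term_given_nonrel=0.05,
                       p_rel_new=0.8)
print(f"revised term probability: {p_new:.3f}")  # 0.250

# Candidate terms could then be ranked by their revised probabilities
# when selecting query modification terms.
```

In an implicit feedback setting, P'(rel) would be inferred from interaction signals rather than from explicit judgments, which is exactly the distinction the article draws from traditional RF.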
The Effect of Adding Relevance Information in a Relevance Feedback Environment
The effects of adding information from relevant documents are examined in the TREC routing environment. A modified Rocchio relevance feedback approach is used, with a varying number of relevant documents retrieved by an initial SMART search, and a varying number of terms from those relevant documents used to expand the initial query. Recall-precision evaluation reveals that effectiveness increases as the query is expanded with more terms from relevant documents. For this particular experiment there appears to be a linear relationship between the log of the number of terms added and the recall-precision effectiveness, and likewise between the log of the number of known relevant documents and the recall-precision effectiveness.

1 Introduction

Relevance feedback is a commonly accepted method of improving interactive retrieval effectiveness [1, 2]. An initial search is made by the system with a user-supplied ...
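The abstract does not restate the Rocchio formula; its classic modified form is q' = α·q + (β/|R|)·Σ_{d∈R} d − (γ/|S|)·Σ_{d∈S} d, where R and S are the known relevant and nonrelevant documents. A minimal sketch, with weight values chosen for illustration rather than taken from the TREC routing runs:

```python
import numpy as np

def rocchio(query_vec, rel_docs, nonrel_docs,
            alpha=1.0, beta=0.75, gamma=0.15):
    """Classic Rocchio query modification. The weights alpha, beta,
    gamma here are common textbook defaults, not the paper's values.

    q' = alpha*q + beta*mean(relevant) - gamma*mean(nonrelevant)
    """
    q_new = alpha * query_vec
    if len(rel_docs):
        q_new = q_new + beta * np.mean(rel_docs, axis=0)
    if len(nonrel_docs):
        q_new = q_new - gamma * np.mean(nonrel_docs, axis=0)
    # Negative term weights are usually clipped before retrieval.
    return np.maximum(q_new, 0.0)

# Toy 5-term vocabulary: expand a one-term query with evidence from
# two known-relevant documents.
q = np.array([1.0, 0.0, 0.0, 0.0, 0.0])
rel = np.array([[1.0, 1.0, 0.0, 1.0, 0.0],
                [1.0, 1.0, 1.0, 0.0, 0.0]])
print(rocchio(q, rel, nonrel_docs=[]))
```

In the experiments described above, the two quantities being varied are the number of documents in R and the number of expansion terms kept from them; a full implementation would rank candidate terms and retain only the top n.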
An evaluation of term dependence models in information retrieval
In practical retrieval environments the assumption is normally made that the terms assigned to the documents of a collection occur independently of each other. The term independence assumption is unrealistic in many cases, but its use leads to a simple retrieval algorithm. More realistic retrieval systems take into account dependencies between certain term pairs and possibly between term triples. In this study, methods are outlined for generating dependency factors for term pairs and term triples and for using them in retrieval. Evaluation output is included to demonstrate the effectiveness of the suggested methodologies.

1. Term Dependency Models

From a decision-theoretic viewpoint, the information retrieval task is controlled by two probabilistic parameters which specify for each document of a collection the probability of relevance, and the probability of nonrelevance, with respect to a particular query. The larger the probability of relevance, and the smaller the probability of nonrelevance, the higher the retrieval priority of the given item. Consider in particular an item x in the database represented by binary attributes (x_1, x_2, ..., x_n), where x_i takes on the value 1 or 0 depending on whether the i-th attribute is or is not assigned to item x. For each item x and each query Q, it is in principle possible to generate the two parameters P(x|rel) and P(x|nonrel), representing the probabilities that a relevant and a nonrelevant item, respectively, has vector representation x. Using decision-theoretic considerations, it is easy to show that an optimal retrieval rule will rank the documents in decreasing order of

    P(x|rel) / P(x|nonrel)    (1)

That is, given two items x and y, x should be retrieved ahead of y whenever the value of expression (1) for x exceeds the corresponding value for y. [1-5]
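The excerpt cuts off before the dependency factors are derived. For orientation, under the independence assumption expression (1) is usually scored in log form, log P(x|rel) − log P(x|nonrel), as a sum of per-term contributions; a pairwise dependency factor then corrects the independent product for co-occurring terms. In the sketch below, the log-odds sum is the standard binary independence form, while the specific pairwise correction is an assumption for illustration, not the construction the paper evaluates:

```python
import math

def log_odds_independent(x, p_rel, p_nonrel):
    """log of expression (1) under the term independence assumption.

    x            : binary document vector
    p_rel[i]     : P(x_i = 1 | relevant)
    p_nonrel[i]  : P(x_i = 1 | nonrelevant)
    """
    score = 0.0
    for xi, p, q in zip(x, p_rel, p_nonrel):
        if xi:
            score += math.log(p / q)
        else:
            score += math.log((1 - p) / (1 - q))
    return score

def pair_dependency_factor(x, i, j, p_pair_rel, p_pair_nonrel,
                           p_rel, p_nonrel):
    """Illustrative correction when terms i and j co-occur: replace the
    independent product p_i * p_j with a joint estimate P(x_i, x_j).
    This specific form is an assumption of the sketch.
    """
    if x[i] and x[j]:
        return (math.log(p_pair_rel / (p_rel[i] * p_rel[j]))
                - math.log(p_pair_nonrel / (p_nonrel[i] * p_nonrel[j])))
    return 0.0

# Toy example: two terms that co-occur in relevant documents more often
# than independence would predict, so the factor raises the rank score.
x = [1, 1, 0]
p_rel, p_nonrel = [0.6, 0.5, 0.2], [0.2, 0.2, 0.3]
score = log_odds_independent(x, p_rel, p_nonrel)
score += pair_dependency_factor(x, 0, 1, p_pair_rel=0.45,
                                p_pair_nonrel=0.05,
                                p_rel=p_rel, p_nonrel=p_nonrel)
print(f"rank score: {score:.3f}")
```

Ranking documents by this score is equivalent to ranking by expression (1), since the logarithm is monotonic; the dependency factors simply add further terms to the sum for selected pairs (and, analogously, triples).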