Evaluating implicit feedback models using searcher simulations
In this article we describe an evaluation of relevance feedback (RF) algorithms using searcher simulations. Since these algorithms select additional terms for query modification based on inferences made from searcher interaction, rather than on relevance information searchers explicitly provide (as in traditional RF), we refer to them as implicit feedback models. We introduce six models that base their decisions on the interactions of searchers and use different approaches to rank query modification terms. The aim of this article is to determine which of these models should be used to assist searchers in the systems we develop. To evaluate the models we used searcher simulations, which afforded us more control over the experimental conditions than experiments with human subjects and allowed complex interaction to be modeled without the need for costly human experimentation. The simulation-based evaluation methodology measures how well the models learn the distribution of terms across relevant documents (i.e., learn what information is relevant) and how well they improve search effectiveness (i.e., create effective search queries). Our findings show that an implicit feedback model based on Jeffrey's rule of conditioning outperformed the other models under investigation.
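The abstract does not reproduce the model's equations, but Jeffrey's rule of conditioning has a standard form: when belief in a partition of events shifts from P(E_i) to P'(E_i), any proposition A is revised as P'(A) = Σ_i P(A|E_i)·P'(E_i). A minimal Python sketch of how such an update could reweight a candidate query term; all probabilities and the dwell-time interpretation here are invented for illustration, not taken from the article:

```python
# Hedged sketch: Jeffrey's rule of conditioning applied to implicit
# feedback. The numbers and the interaction signal are assumptions.

def jeffrey_update(p_term_given_rel, p_term_given_nonrel, p_rel_new):
    """Revise P(term) when the belief in relevance itself shifts.

    Jeffrey's rule with the partition {rel, nonrel}:
        P'(term) = P(term|rel) * P'(rel) + P(term|nonrel) * P'(nonrel)
    where P'(rel) is the revised, uncertain belief that the viewed
    document is relevant, inferred from searcher interaction.
    """
    return (p_term_given_rel * p_rel_new
            + p_term_given_nonrel * (1.0 - p_rel_new))

# Example: a term common in the viewed text but rare elsewhere.
# Suppose interaction evidence (e.g., a long dwell time) raises the
# belief in relevance to P'(rel) = 0.8.
p_new = jeffrey_update(p_term_given_rel=0.30,
                       p_term_given_nonrel=0.05,
                       p_rel_new=0.8)
print(f"revised term probability: {p_new:.3f}")  # 0.250

# Candidate terms could then be ranked by their revised probabilities
# when selecting query modification terms.
```

In an implicit feedback setting, P'(rel) would be inferred from interaction signals rather than from explicit judgments, which is exactly the distinction the article draws from traditional RF.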
The Effect of Adding Relevance Information in a Relevance Feedback Environment
The effects of adding information from relevant documents are examined in the TREC routing environment. A modified Rocchio relevance feedback approach is used, with a varying number of relevant documents retrieved by an initial SMART search, and a varying number of terms from those relevant documents used to expand the initial query. Recall-precision evaluation reveals that effectiveness increases as the query is expanded with more terms from relevant documents. For this particular experiment there appears to be a linear relationship between the log of the number of terms added and the recall-precision effectiveness, and likewise between the log of the number of known relevant documents and the recall-precision effectiveness.

1 Introduction

Relevance feedback is a commonly accepted method of improving interactive retrieval effectiveness [1, 2]. An initial search is made by the system with a user-supplied ...
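The abstract does not restate the Rocchio formula; its classic modified form is q' = α·q + (β/|R|)·Σ_{d∈R} d − (γ/|S|)·Σ_{d∈S} d, where R and S are the known relevant and nonrelevant documents. A minimal sketch, with weight values chosen for illustration rather than taken from the TREC routing runs:

```python
import numpy as np

def rocchio(query_vec, rel_docs, nonrel_docs,
            alpha=1.0, beta=0.75, gamma=0.15):
    """Classic Rocchio query modification. The weights alpha, beta,
    gamma here are common textbook defaults, not the paper's values.

    q' = alpha*q + beta*mean(relevant) - gamma*mean(nonrelevant)
    """
    q_new = alpha * query_vec
    if len(rel_docs):
        q_new = q_new + beta * np.mean(rel_docs, axis=0)
    if len(nonrel_docs):
        q_new = q_new - gamma * np.mean(nonrel_docs, axis=0)
    # Negative term weights are usually clipped before retrieval.
    return np.maximum(q_new, 0.0)

# Toy 5-term vocabulary: expand a one-term query with evidence from
# two known-relevant documents.
q = np.array([1.0, 0.0, 0.0, 0.0, 0.0])
rel = np.array([[1.0, 1.0, 0.0, 1.0, 0.0],
                [1.0, 1.0, 1.0, 0.0, 0.0]])
print(rocchio(q, rel, nonrel_docs=[]))
```

In the experiments described above, the two quantities being varied are the number of documents in R and the number of expansion terms kept from them; a full implementation would rank candidate terms and retain only the top n.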
An evaluation of term dependence models in information retrieval
In practical retrieval environments the assumption is normally made that the terms assigned to the documents of a collection occur independently of each other. The term independence assumption is unrealistic in many cases, but its use leads to a simple retrieval algorithm. More realistic retrieval systems take into account dependencies between certain term pairs and possibly between term triples. In this study, methods are outlined for generating dependency factors for term pairs and term triples and for using them in retrieval. Evaluation output is included to demonstrate the effectiveness of the suggested methodologies.

1. Term Dependency Models

From a decision-theoretic viewpoint, the information retrieval task is controlled by two probabilistic parameters which specify for each document of a collection the probability of relevance, and the probability of nonrelevance, with respect to a particular query. The larger the probability of relevance, and the smaller the probability of nonrelevance, the higher the retrieval priority of the given item. Consider in particular an item x in the database represented by binary attributes (x_1, x_2, ..., x_n), where x_i takes on the value 1 or 0 depending on whether the i-th attribute is or is not assigned to item x. For each item x and each query Q, it is in principle possible to generate the two parameters P(x|rel) and P(x|nonrel), representing the probabilities that a relevant and a nonrelevant item, respectively, has vector representation x. Using decision-theoretic considerations, it is easy to show that an optimal retrieval rule will rank the documents in decreasing order of

    P(x|rel) / P(x|nonrel)    (1)

That is, given two items x and y, x should be retrieved ahead of y whenever the value of expression (1) for x exceeds the corresponding value for y. [1-5]
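The excerpt cuts off before the dependency factors are derived. For orientation, under the independence assumption expression (1) is usually scored in log form, log P(x|rel) − log P(x|nonrel), as a sum of per-term contributions; a pairwise dependency factor then corrects the independent product for co-occurring terms. In the sketch below, the log-odds sum is the standard binary independence form, while the specific pairwise correction is an assumption for illustration, not the construction the paper evaluates:

```python
import math

def log_odds_independent(x, p_rel, p_nonrel):
    """log of expression (1) under the term independence assumption.

    x            : binary document vector
    p_rel[i]     : P(x_i = 1 | relevant)
    p_nonrel[i]  : P(x_i = 1 | nonrelevant)
    """
    score = 0.0
    for xi, p, q in zip(x, p_rel, p_nonrel):
        if xi:
            score += math.log(p / q)
        else:
            score += math.log((1 - p) / (1 - q))
    return score

def pair_dependency_factor(x, i, j, p_pair_rel, p_pair_nonrel,
                           p_rel, p_nonrel):
    """Illustrative correction when terms i and j co-occur: replace the
    independent product p_i * p_j with a joint estimate P(x_i, x_j).
    This specific form is an assumption of the sketch.
    """
    if x[i] and x[j]:
        return (math.log(p_pair_rel / (p_rel[i] * p_rel[j]))
                - math.log(p_pair_nonrel / (p_nonrel[i] * p_nonrel[j])))
    return 0.0

# Toy example: two terms that co-occur in relevant documents more often
# than independence would predict, so the factor raises the rank score.
x = [1, 1, 0]
p_rel, p_nonrel = [0.6, 0.5, 0.2], [0.2, 0.2, 0.3]
score = log_odds_independent(x, p_rel, p_nonrel)
score += pair_dependency_factor(x, 0, 1, p_pair_rel=0.45,
                                p_pair_nonrel=0.05,
                                p_rel=p_rel, p_nonrel=p_nonrel)
print(f"rank score: {score:.3f}")
```

Ranking documents by this score is equivalent to ranking by expression (1), since the logarithm is monotonic; the dependency factors simply add further terms to the sum for selected pairs (and, analogously, triples).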