481 research outputs found
The use of implicit evidence for relevance feedback in web retrieval
In this paper we report on the application of two contrasting types of relevance feedback for web retrieval. We compare two systems: one using explicit relevance feedback (where searchers must explicitly mark documents as relevant) and one using implicit relevance feedback (where the system endeavours to estimate relevance by mining the searcher's interaction). The feedback is used to update the display according to the user's interaction. Our research focuses on the degree to which implicit evidence of document relevance can be substituted for explicit evidence. We examine the two variations in terms of both user opinion and search effectiveness.
One-Shot Labeling for Automatic Relevance Estimation
Dealing with unjudged documents ("holes") in relevance assessments is a
perennial problem when evaluating search systems with offline experiments.
Holes can reduce the apparent effectiveness of retrieval systems during
evaluation and introduce biases in models trained with incomplete data. In this
work, we explore whether large language models can help us fill such holes to
improve offline evaluations. We examine an extreme, albeit common, evaluation
setting wherein only a single known relevant document per query is available
for evaluation. We then explore various approaches for predicting the relevance
of unjudged documents with respect to a query and the known relevant document,
including nearest neighbor, supervised, and prompting techniques. We find that
although the predictions of these One-Shot Labelers (1SL) frequently disagree
with human assessments, the labels they produce yield a far more reliable
ranking of systems than the single labels do alone. Specifically, the strongest
approaches can consistently reach system ranking correlations of over 0.86 with
the full rankings over a variety of measures. Meanwhile, the approach
substantially increases the reliability of t-tests due to filling holes in
relevance assessments, giving researchers more confidence in results they find
to be significant. Alongside this work, we release an easy-to-use software
package to enable the use of 1SL for evaluation of other ad-hoc collections or
systems. Comment: SIGIR 202
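As a rough illustration of what a "system ranking correlation" measures, the sketch below computes Kendall's tau between two rankings of the same systems: one from full relevance judgments and one from hole-filled judgments. The abstract does not name the correlation statistic, so Kendall's tau is an assumption here, and the system names and scores are hypothetical values invented for the example.

```python
from itertools import combinations


def kendall_tau(scores_a, scores_b):
    """Kendall's tau-a between two score dicts keyed by system name.

    Ties count as neither concordant nor discordant (no tie correction).
    """
    systems = sorted(scores_a)
    concordant = discordant = 0
    for s, t in combinations(systems, 2):
        # Positive product: both score sets order the pair the same way.
        sign = (scores_a[s] - scores_a[t]) * (scores_b[s] - scores_b[t])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(systems) * (len(systems) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical effectiveness scores, for illustration only.
full_judgments = {"bm25": 0.30, "dense": 0.42, "hybrid": 0.47, "rerank": 0.51}
filled_judgments = {"bm25": 0.28, "dense": 0.45, "hybrid": 0.44, "rerank": 0.55}

tau = kendall_tau(full_judgments, filled_judgments)
print(f"Kendall's tau: {tau:.3f}")
```

In this toy data only one of the six system pairs (dense vs. hybrid) swaps order, so five pairs are concordant and one is discordant.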
Graph Exploration Matters: Improving both individual-level and system-level diversity in WeChat Feed Recommender
There are roughly three stages in real industrial recommendation systems:
candidate generation (retrieval), ranking, and reranking. Individual-level
diversity and system-level diversity are both important for industrial
recommender systems. The former focuses on each single user's experience, while
the latter focuses on the difference among users. Graph-based retrieval
strategies are inevitably hijacked by heavy users and popular items, leading to
the convergence of candidates for users and the lack of system-level diversity.
Meanwhile, in the reranking phase, Determinantal Point Process (DPP) is
deployed to increase individual-level diversity. Heavily relying on the
semantic information of items, DPP suffers from clickbait and inaccurate
attributes. Besides, most studies only focus on one of the two levels of
diversity, and ignore the mutual influence among different stages in real
recommender systems. We argue that individual-level diversity and system-level
diversity should be viewed as an integrated problem, and we provide an
efficient and deployable solution for web-scale recommenders. Generally, we
propose to employ the retrieval graph information in diversity-based reranking,
by which to weaken the hidden similarity of items exposed to users, and
consequently gain more graph explorations to improve the system-level
diversity. Besides, we argue that users' propensity for diversity changes over
time in content feed recommendation. Therefore, with the explored graph, we
also propose to capture the user's real-time personalized propensity for
diversity. We implement and deploy the combined system in WeChat App's Top
Stories used by hundreds of millions of users. Offline simulations and online
A/B tests show our solution can effectively improve both user engagement and
system revenue.
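For readers unfamiliar with DPP-based reranking, the following is a minimal sketch of the textbook greedy approximation, not the deployed WeChat method: the abstract's use of retrieval graph information and real-time user propensity is not modeled. The kernel construction (quality times similarity times quality), the function names, and the toy values are all assumptions made for illustration.

```python
def det(matrix):
    """Determinant via Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n = len(m)
    result = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result  # a row swap flips the sign
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return result


def greedy_dpp_rerank(quality, similarity, k):
    """Greedily pick up to k items, maximizing det of the selected kernel submatrix."""
    n = len(quality)
    # Assumed kernel: L[i][j] = q_i * S[i][j] * q_j.
    kernel = [[quality[i] * similarity[i][j] * quality[j] for j in range(n)]
              for i in range(n)]
    selected = []
    while len(selected) < min(k, n):
        best_item, best_det = None, 0.0
        for i in range(n):
            if i in selected:
                continue
            idx = selected + [i]
            sub = [[kernel[a][b] for b in idx] for a in idx]
            d = det(sub)
            if d > best_det:
                best_item, best_det = i, d
        if best_item is None:  # no candidate adds positive volume
            break
        selected.append(best_item)
    return selected


# Toy data: items 0 and 1 are near-duplicates, item 2 is dissimilar to both.
quality = [1.0, 0.95, 0.7]
similarity = [
    [1.0, 0.95, 0.1],
    [0.95, 1.0, 0.1],
    [0.1, 0.1, 1.0],
]
top2 = greedy_dpp_rerank(quality, similarity, 2)
print(top2)
```

Sorting by quality alone would return items 0 and 1; the determinant penalizes their high similarity, so the lower-quality but dissimilar item 2 is chosen second.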
Evaluation in natural language processing
European Summer School on Language Logic and Information (ESSLLI 2007) (Trinity College Dublin, Ireland, 6-17 August 2007)
Cross-language Information Retrieval
Two key assumptions shape the usual view of ranked retrieval: (1) that the
searcher can choose words for their query that might appear in the documents
that they wish to see, and (2) that ranking retrieved documents will suffice
because the searcher will be able to recognize those which they wished to find.
When the documents to be searched are in a language not known by the searcher,
neither assumption is true. In such cases, Cross-Language Information Retrieval
(CLIR) is needed. This chapter reviews the state of the art for CLIR and
outlines some open research questions. Comment: 49 pages, 0 figures