Contextualised Browsing in a Digital Library's Living Lab
Contextualisation has proven to be effective in tailoring search
results towards the users' information need. While this is true for a basic
query search, the usage of contextual session information during exploratory
search especially on the level of browsing has so far been underexposed in
research. In this paper, we present two approaches that contextualise browsing
on the level of structured metadata in a Digital Library (DL): (1) one variant
is based on document similarity and (2) one variant utilises implicit session
information, such as queries and different document metadata encountered during
the session of a user. We evaluate our approaches in a living lab environment
using a DL in the social sciences and compare our contextualisation approaches
against a non-contextualised approach. For a period of more than three months
we analysed 47,444 unique retrieval sessions that contain search activities on
the level of browsing. Our results show that a contextualisation of browsing
significantly outperforms our baseline in terms of the position of the first
clicked item in the result set. The mean rank of the first clicked document
(measured as mean first relevant - MFR) was 4.52 using a non-contextualised
ranking compared to 3.04 when re-ranking the result lists based on similarity
to the previously viewed document. Furthermore, we observed that both
contextual approaches show a noticeably higher click-through rate. A
contextualisation based on document similarity leads to almost twice as many
document views compared to the non-contextualised ranking. Comment: 10 pages, 2 figures, paper accepted at JCDL 201
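The MFR figure reported above can be illustrated with a minimal sketch. It assumes each session is available as a list of the 1-based ranks the user clicked, in chronological order; the function name and the input layout are hypothetical and not taken from the paper.

```python
from statistics import mean


def mean_first_relevant(sessions):
    """Mean rank of the first clicked item over all sessions with a click.

    `sessions` is an iterable of lists; each list holds the 1-based result
    ranks clicked in one session, in the order the clicks happened
    (an assumed log layout). Sessions without clicks are skipped.
    """
    first_clicks = [clicks[0] for clicks in sessions if clicks]
    if not first_clicks:
        raise ValueError("no session contains a click")
    return mean(first_clicks)


# Toy data: the first click lands on rank 5 in one session and on rank 4
# in another; the third session has no click and is ignored.
toy_sessions = [[5, 2], [4], []]
mfr = mean_first_relevant(toy_sessions)
```

A lower value means that users found something worth clicking nearer the top of the list, which is the sense in which 3.04 improves on 4.52.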
Personalized content retrieval in context using ontological knowledge
Personalized content retrieval aims at improving the retrieval process by taking into account the particular interests of individual users. However, not all user preferences are relevant in all situations. It is well known that human preferences are complex, multiple, heterogeneous, changing, even contradictory, and should be understood in context with the user goals and tasks at hand. In this paper, we propose a method to build a dynamic representation of the semantic context of ongoing retrieval tasks, which is used to activate different subsets of user interests at runtime, in a way that out-of-context preferences are discarded. Our approach is based on an ontology-driven representation of the domain of discourse, providing enriched descriptions of the semantics involved in retrieval actions and preferences, and enabling the definition of effective means to relate preferences and context
PRESY: A Context Based Query Reformulation Tool for Information Retrieval on the Web
Problem Statement: The huge amount of information on the web, together with the
growing number of inexperienced users, creates new challenges for information
retrieval. It has become increasingly difficult for these users to find
relevant documents that satisfy their individual needs. Certainly the current
search engines (such as Google, Bing and Yahoo) offer an efficient way to
browse the web content. However, the result quality depends heavily on users'
queries, which need to be more precise to find relevant documents. This task is
still complicated for the majority of inept users who cannot express their
needs with significant words in the query. For that reason, we believe that a
reformulation of the initial user's query can be a good alternative to improve
the information selectivity. This study proposes a novel approach and presents
a prototype system called PRESY (Profile-based REformulation SYstem) for
information retrieval on the web. Approach: It uses an incremental approach to
categorize users by constructing a contextual base. The latter is composed of
two types of context (static and dynamic) obtained using the users' profiles.
The proposed architecture was implemented in the .NET environment to perform
query reformulation tests. Results: The experiments given at the end of this
article show that the precision of the returned content is effectively
improved. The tests were performed with the most popular search engines (i.e.
Google, Bing and Yahoo), selected in particular for their high selectivity.
Among the given results, we found that query reformulation improves the first
three results by 10.7% and the next seven returned elements by 11.7%. So as we
can see the reformulation of users' initial queries improves the pertinence of
returned content. Comment: 8 page
Closing the loop: assisting archival appraisal and information retrieval in one sweep
In this article, we examine the similarities between the concept of appraisal, a process that takes place within the archives, and the concept of relevance judgement, a process fundamental to the evaluation of information retrieval systems. More specifically, we revisit selection criteria proposed as a result of archival research and work within the digital curation communities, and compare them to relevance criteria as discussed within information retrieval's literature-based discovery. We illustrate how closely these criteria relate to each other and discuss how understanding the relationships between these disciplines could form a basis for proposing automated selection for archival processes and initiating multi-objective learning with respect to information retrieval
Calendar based contextual information as an Internet content pre-caching tool
Motivated by the need to access internet content on mobile devices with expensive or non-existent network access, this paper discusses the possibility for contextual information extracted from electronic calendars to be used as sources for Internet content predictive retrieval (pre-caching). Our results show that calendar based contextual information is useful for this purpose and that calendar based information can produce web queries that are relevant to the users' task supportive information needs
Counterfactual Estimation and Optimization of Click Metrics for Search Engines
Optimizing an interactive system against a predefined online metric is
particularly challenging, when the metric is computed from user feedback such
as clicks and payments. The key challenge is the counterfactual nature: in the
case of Web search, any change to a component of the search engine may result
in a different search result page for the same query, but we normally cannot
infer reliably from search log how users would react to the new result page.
Consequently, it appears impossible to accurately estimate online metrics that
depend on user feedback, unless the new engine is run to serve users and
compared with a baseline in an A/B test. This approach, while valid and
successful, is unfortunately expensive and time-consuming. In this paper, we
propose to address this problem using causal inference techniques, under the
contextual-bandit framework. This approach effectively allows one to run
(potentially infinitely) many A/B tests offline from search log, making it
possible to estimate and optimize online metrics quickly and inexpensively.
Focusing on an important component in a commercial search engine, we show how
these ideas can be instantiated and applied, and obtain very promising results
that suggest the wide applicability of these techniques
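Offline estimation from logged interactions can be sketched with inverse propensity scoring (IPS), a standard estimator in the contextual-bandit setting. The abstract does not state which estimator the authors use, so this is an illustrative stand-in; the log layout and all names are assumptions.

```python
def ips_estimate(log, target_policy):
    """Inverse-propensity-scoring estimate of a new policy's mean reward.

    `log` is a list of (context, action, reward, logging_prob) tuples, where
    `logging_prob` is the probability with which the logging policy chose
    `action` in `context` (an assumed log format).
    `target_policy(context, action)` returns the probability that the new
    policy would choose `action` in `context`.
    """
    if not log:
        raise ValueError("empty log")
    total = 0.0
    for context, action, reward, logging_prob in log:
        if logging_prob <= 0:
            raise ValueError("logged actions need a positive propensity")
        # Reweight each logged reward by how much more (or less) likely
        # the new policy is to take the logged action.
        total += reward * target_policy(context, action) / logging_prob
    return total / len(log)


def always_a(context, action):
    """Hypothetical new policy: deterministically pick action "a"."""
    return 1.0 if action == "a" else 0.0


# Toy log from a uniformly random logging policy over two actions.
toy_log = [
    ("q1", "a", 1.0, 0.5),
    ("q1", "b", 0.0, 0.5),
    ("q2", "a", 0.0, 0.5),
    ("q2", "b", 1.0, 0.5),
]
estimate = ips_estimate(toy_log, always_a)
```

The estimate is only unbiased when the logging policy gave every action the new policy might take a non-zero probability, which is why the logged propensities have to be recorded at serving time.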
Exploring Topic-based Language Models for Effective Web Information Retrieval
The main obstacle for providing focused search is the relative opaqueness of search requests: searchers tend to express their complex information needs in only a couple of keywords. Our overall aim is to find out if, and how, topic-based language models can lead to more effective web information retrieval. In this paper we explore the retrieval performance of a topic-based model that combines topical models with other language models based on cross-entropy. We first define our topical categories and train our topical models on the .GOV2 corpus by building parsimonious language models. We then test the topic-based model on the TREC8 small Web data collection for ad-hoc search. Our experimental results show that the topic-based model outperforms the standard language model and the parsimonious model
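One way to picture a cross-entropy ranking over combined language models is the sketch below. It assumes unigram models and a fixed linear mixture of document, topic and background models; the mixture weights, the function names and the toy texts are hypothetical, and the paper's actual combination and its parsimonious estimation step are not reproduced here.

```python
import math
from collections import Counter


def unigram(tokens):
    """Maximum-likelihood unigram model as a word -> probability dict."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def cross_entropy_score(query_tokens, doc_lm, topic_lm, background_lm,
                        weights=(0.6, 0.2, 0.2)):
    """Negative cross-entropy between the query model and a mixed model.

    The mixed model interpolates document, topic and background unigram
    models with the given weights (assumed values). Higher scores mean
    the mixed model explains the query better.
    """
    w_doc, w_topic, w_bg = weights
    score = 0.0
    for word, p_query in unigram(query_tokens).items():
        p_mixed = (w_doc * doc_lm.get(word, 0.0)
                   + w_topic * topic_lm.get(word, 0.0)
                   + w_bg * background_lm.get(word, 0.0))
        if p_mixed == 0.0:
            return float("-inf")  # query word unseen in every model
        score += p_query * math.log(p_mixed)
    return score


# Toy models built from made-up token lists.
background = unigram("tax form health care tax law".split())
topic = unigram("tax law form".split())
doc_a = unigram("tax form tax".split())
doc_b = unigram("health care".split())

query = ["tax", "form"]
score_a = cross_entropy_score(query, doc_a, topic, background)
score_b = cross_entropy_score(query, doc_b, topic, background)
```

In this toy setup the document that shares vocabulary with the query receives the higher score, while the topic and background components keep the other document's score finite.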