6,418 research outputs found
User centred evaluation of a recommendation based image browsing system
In this paper, we introduce a novel approach to recommend images by mining user interactions based on implicit feedback of user browsing. The underlying hypothesis is that the interaction implicitly indicates the interests of the users for meeting practical image retrieval tasks. The algorithm mines interaction data and also low-level content of the clicked images to choose diverse images by clustering heterogeneous features. A user-centred, task-oriented, comparative evaluation was undertaken to verify the validity of our approach where two versions of systems { one set up to enable diverse image recommendation { the other allowing browsing only { were compared. Use was made of the two systems by users in simulated work task situations and quantitative and qualitative data collected as indicators of recommendation results and the levels of user's satisfaction. The responses from the users indicate that they nd the more diverse recommendation highly useful
Efficient Diversification of Web Search Results
In this paper we analyze the efficiency of various search results
diversification methods. While efficacy of diversification approaches has been
deeply investigated in the past, response time and scalability issues have been
rarely addressed. A unified framework for studying performance and feasibility
of result diversification solutions is thus proposed. First we define a new
methodology for detecting when, and how, query results need to be diversified.
To this purpose, we rely on the concept of "query refinement" to estimate the
probability of a query to be ambiguous. Then, relying on this novel ambiguity
detection method, we deploy and compare on a standard test set, three different
diversification methods: IASelect, xQuAD, and OptSelect. While the first two
are recent state-of-the-art proposals, the latter is an original algorithm
introduced in this paper. We evaluate both the efficiency and the effectiveness
of our approach against its competitors by using the standard TREC Web
diversification track testbed. Results shown that OptSelect is able to run two
orders of magnitude faster than the two other state-of-the-art approaches and
to obtain comparable figures in diversification effectiveness.Comment: VLDB201
Stochastic Query Covering for Fast Approximate Document Retrieval
We design algorithms that, given a collection of documents and a distribution over user queries, return a
small subset of the document collection in such a way that we can efficiently provide high-quality answers
to user queries using only the selected subset. This approach has applications when space is a constraint
or when the query-processing time increases significantly with the size of the collection. We study our
algorithms through the lens of stochastic analysis and prove that even though they use only a small fraction
of the entire collection, they can provide answers to most user queries, achieving a performance close to the
optimal. To complement our theoretical findings, we experimentally show the versatility of our approach
by considering two important cases in the context of Web search. In the first case, we favor the retrieval of
documents that are relevant to the query, whereas in the second case we aim for document diversification.
Both the theoretical and the experimental analysis provide strong evidence of the potential value of query
covering in diverse application scenarios
Exploiting multimedia in creating and analysing multimedia Web archives
The data contained on the web and the social web are inherently multimedia and consist of a mixture of textual, visual and audio modalities. Community memories embodied on the web and social web contain a rich mixture of data from these modalities. In many ways, the web is the greatest resource ever created by human-kind. However, due to the dynamic and distributed nature of the web, its content changes, appears and disappears on a daily basis. Web archiving provides a way of capturing snapshots of (parts of) the web for preservation and future analysis. This paper provides an overview of techniques we have developed within the context of the EU funded ARCOMEM (ARchiving COmmunity MEMories) project to allow multimedia web content to be leveraged during the archival process and for post-archival analysis. Through a set of use cases, we explore several practical applications of multimedia analytics within the realm of web archiving, web archive analysis and multimedia data on the web in general
On the suitability of intent spaces for IR diversification
This is an electronic version of the paper presented at the International Workshop on Diversity in Document Retrieval (DDR 2012), held in Seattle on 2012Recent developments in Information Retrieval diversity are based on the consideration of a space of information need aspects, a notion which takes different forms in the literature. The choice of a suitable aspect space for diversification is a critical issue when designing an IR diversification strategy, which has not been explicitly addressed to some depth in the literature. This paper aims to identify relevant properties of the aspect space which may help the system designer in making a suitable choice in selecting and configuring this space, and diagnosing malfunctions of the diversification algorithms. In particular, we identify the mutual information between aspects and documents as a meaningful magnitude, in terms of which anomalous cases can be characterized. We further seek to discern favorable cases through a combination of theoretic and empirical analysis.This work is supported by the Spanish Government (TIN2011-28538-C02-01), and the Government of Madrid (S2009TIC-1542)
The impact of result diversification on search behaviour and performance
Result diversification aims to provide searchers with a broader view of a given topic while attempting to maximise the chances of retrieving relevant material. Diversifying results also aims to reduce search bias by increasing the coverage over different aspects of the topic. As such, searchers should learn more about the given topic in general. Despite diversification algorithms being introduced over two decades ago, little research has explicitly examined their impact on search behaviour and performance in the context of Interactive Information Retrieval (IIR). In this paper, we explore the impact of diversification when searchers undertake complex search tasks that require learning about different aspects of a topic (aspectual retrieval). We hypothesise that by diversifying search results, searchers will be exposed to a greater number of aspects. In turn, this will maximise their coverage of the topic (and thus reduce possible search bias). As a consequence, diversification should lead to performance benefits, regardless of the task, but how does diversification affect search behaviours and search satisfaction?
Based on Information Foraging Theory (IFT), we infer two hypotheses regarding search behaviours due to diversification, namely that (i) it will lead to searchers examining fewer documents per query, and (ii) it will also mean searchers will issue more queries overall. To this end, we performed a within-subjects user study using the TREC AQUAINT collection with 51 participants, examining the differences in search performance and behaviour when using (i) a non-diversified system (BM25) versus (ii) a diversified system (BM25+xQuAD) when the search task is either (a) ad-hoc or (b) aspectual. Our results show a number of notable findings in terms of search behaviour: participants on the diversified system issued more queries and examined fewer documents per query when performing the aspectual search task. Furthermore, we showed that when using the diversified system, participants were: more successful in marking relevant documents, and obtained a greater awareness of the topics (i.e. identified relevant documents containing more novel aspects). These findings show that search behaviour is influenced by diversification and task complexity. They also motivate further research into complex search tasks such as aspectual retrieval -- and how diversity can play an important role in improving the search experience, by providing greater coverage of a topic and mitigating potential bias in search results
Intent-aware search result diversification
Search result diversification has gained momentum as a way to tackle ambiguous queries. An effective approach to this problem is to explicitly model the possible aspects underlying a query, in order to maximise the estimated relevance of the retrieved documents with respect to the different aspects. However, such aspects themselves may represent information needs with rather distinct intents (e.g., informational or navigational). Hence, a diverse ranking could benefit from applying intent-aware retrieval models when estimating the relevance of documents to different aspects. In this paper, we propose to diversify the results retrieved for a given query, by learning the appropriateness of different retrieval models for each of the aspects underlying this query. Thorough experiments within the evaluation framework provided by the diversity task of the TREC 2009 and 2010 Web tracks show that the proposed approach can significantly improve state-of-the-art diversification approaches
Using score differences for search result diversification
We investigate the application of a light-weight approach to result list clustering for the purposes of diversifying search results. We introduce a novel post-retrieval approach, which is independent of external information or even the full-text content of retrieved documents; only the retrieval score of a document is used. Our experiments show that this novel approach is bene cial to e ectiveness, albeit only on certain baseline systems. The fact that the method works indicates that the retrieval score is potentially exploitable in diversity
- …