32 research outputs found

    Query-Based Sampling using Snippets

    Query-based sampling is a commonly used approach to model the content of servers. Conventionally, queries are sent to a server and the documents in the returned search results are downloaded in full as a representation of the server’s content. We present an approach that uses the document snippets in the search results as samples instead of downloading the entire documents. We show that, depending on collection characteristics such as document length distribution and homogeneity, this yields equal or better modeling performance for the same bandwidth consumption. Query-based sampling using snippets is a useful approach for real-world systems, since it requires no extra operations beyond exchanging queries and search results.
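    As a rough illustration of the idea, the sketch below builds a term-frequency model of a remote server from result snippets alone, never fetching full documents. The search_server function and its (title, snippet) result format are hypothetical stand-ins for whatever interface the sampled server actually exposes.

```python
# Minimal sketch of query-based sampling using snippets, under the
# assumptions stated above. search_server(query) is a hypothetical
# callable returning a list of (title, snippet) pairs.
import random

def sample_server_with_snippets(search_server, seed_vocabulary, num_queries=100):
    """Build a term-frequency model of a server from result snippets only."""
    term_counts = {}
    vocabulary = list(seed_vocabulary)
    seen = set(vocabulary)
    for _ in range(num_queries):
        query = random.choice(vocabulary)            # single-term probe query
        for title, snippet in search_server(query):  # hypothetical API
            for term in snippet.lower().split():     # the snippet is the sample
                term_counts[term] = term_counts.get(term, 0) + 1
                if term not in seen:                 # grow the probe vocabulary
                    seen.add(term)
                    vocabulary.append(term)
    return term_counts
```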

    Learning to merge search results for efficient Distributed Information Retrieval

    Merging search results from different servers is a major problem in Distributed Information Retrieval. We used Regression-SVM and Ranking-SVM to learn a function that merges results based on information that is readily available: the ranks, titles, summaries and URLs contained in the result pages. By not downloading additional information, such as the full documents, we decrease bandwidth usage. CORI and Round Robin merging were used as our baselines; surprisingly, our results show that the SVM methods do not improve over those baselines.
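    A minimal sketch of the regression flavour of this approach, assuming scikit-learn and an illustrative feature set (the paper's exact features are not reproduced here): per-result features are computed from rank, title, summary and URL, a Support Vector Regression model maps them to a merging score, and results from all servers are sorted by that score.

```python
# Sketch of regression-based result merging; the feature choices are
# illustrative assumptions, not the paper's exact set.
from sklearn.svm import SVR
import numpy as np

def result_features(rank, title, summary, url, query):
    q = set(query.lower().split())
    return [
        1.0 / rank,                                             # reciprocal server rank
        len(q & set(title.lower().split())) / max(len(q), 1),   # query-term overlap, title
        len(q & set(summary.lower().split())) / max(len(q), 1), # query-term overlap, summary
        url.count("/"),                                         # crude URL depth
    ]

def train_merger(feature_rows, relevance_scores):
    # relevance_scores could come from a centralised reference ranking.
    model = SVR(kernel="rbf")
    model.fit(np.asarray(feature_rows), np.asarray(relevance_scores))
    return model

def merge(model, server_result_lists, query):
    # server_result_lists: per server, a list of (rank, title, summary, url).
    scored = []
    for results in server_result_lists:
        for rank, title, summary, url in results:
            score = model.predict([result_features(rank, title, summary, url, query)])[0]
            scored.append((score, title, url))
    return sorted(scored, reverse=True)
```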

    Affective feedback: an investigation into the role of emotions in the information seeking process

    User feedback is considered a critical element in the information seeking process, especially in relation to relevance assessment. Current feedback techniques determine content relevance with respect to the cognitive and situational levels of interaction between the user and the retrieval system. However, beyond real-life problems and information objects, users also bring intentions, motivations and feelings to the interaction, and these can be seen as critical aspects of cognition and decision-making. The study presented in this paper serves as a starting point for exploring the role of emotions in the information seeking process. Results show that emotions not only interweave with different physiological, psychological and cognitive processes, but also form distinctive patterns according to the specific task and the specific user.

    Improving the evaluation of web search systems

    Linkage analysis as an aid to web search has been assumed to be of significant benefit, and we know that it is implemented by many major search engines. Why, then, have few TREC participants been able to scientifically prove the benefits of linkage analysis over the past three years? In this paper we put forward reasons for these disappointing results and identify the linkage density a dataset requires to faithfully support experiments in linkage analysis. We also report a series of linkage-based retrieval experiments on a more densely linked dataset culled from the TREC web documents.
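    The kind of linkage-density property at issue can be estimated as the average number of off-site (cross-host) links per document, since within-site links contribute little to linkage-analysis experiments. The sketch below assumes a hypothetical input of (url, outgoing_links) pairs.

```python
# Rough linkage-density estimate for a web dataset; the `documents`
# input format is an assumption for illustration.
from urllib.parse import urlparse

def linkage_density(documents):
    """documents: iterable of (url, [outgoing_link_urls]) pairs."""
    offsite, total_docs = 0, 0
    for url, links in documents:
        host = urlparse(url).netloc
        offsite += sum(1 for link in links if urlparse(link).netloc != host)
        total_docs += 1
    return offsite / total_docs if total_docs else 0.0
```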

    An Information Retrieval Model Based On Word Concept

    PACLIC 20 / Wuhan, China / 1-3 November 2006

    User performance versus precision measures for simple search tasks

    Several recent studies have demonstrated that the kinds of improvements in information retrieval system effectiveness reported in forums such as SIGIR and TREC do not translate into a benefit for users. Two of the studies used an instance recall task, and a third used a question answering task, so perhaps it is unsurprising that precision-based measures of IR system effectiveness on one-shot query evaluation do not correlate with user performance on these tasks. In this study, we evaluate two different information retrieval tasks on TREC Web-track data: a precision-based user task, measured by the length of time users need to find a single document relevant to a TREC topic; and a simple recall-based task, measured by the total number of relevant documents users can identify within five minutes. Users employ search engines with controlled mean average precision (MAP) of between 55% and 95%. Our results show that there is no significant relationship between system effectiveness measured by MAP and the precision-based task. A significant but weak relationship is present for the precision-at-one-document metric. A weak relationship is present between MAP and the simple recall-based task.
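    For reference, the two system-side measures used here can be computed from binary relevance judgements of ranked result lists, as in this minimal sketch (the example run data is made up):

```python
# MAP and precision-at-one from per-topic binary relevance judgements.
def average_precision(ranked_relevance, num_relevant):
    """ranked_relevance: 0/1 judgements in rank order; num_relevant: total
    relevant documents for the topic (the AP denominator)."""
    hits, precision_sum = 0, 0.0
    for i, rel in enumerate(ranked_relevance, start=1):
        if rel:
            hits += 1
            precision_sum += hits / i
    return precision_sum / num_relevant if num_relevant else 0.0

def mean_average_precision(runs):
    # runs: list of (ranked_relevance, num_relevant) pairs, one per topic.
    return sum(average_precision(r, n) for r, n in runs) / len(runs)

def precision_at_one(runs):
    # Fraction of topics whose top-ranked document is relevant.
    return sum(r[0] for r, _ in runs) / len(runs)

runs = [([1, 0, 1, 0, 0], 3), ([0, 1, 0, 0, 1], 2)]  # invented example data
print(mean_average_precision(runs), precision_at_one(runs))
```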

    A comparative study of probabilistic and language models for information retrieval

    Language models for information retrieval have received much attention in recent years, with many claims being made about their performance. However, previous studies evaluating the language modelling approach used different query sets and heterogeneous collections, which makes reported results difficult to compare. This research is a broad-based study that evaluates language models against a variety of search tasks: topic finding, named-page finding and topic distillation. The standard Text REtrieval Conference (TREC) methodology is used to compare language models to the probabilistic Okapi BM25 system. Using consistent parameter choices, we compare results of different language models on three different search tasks, multiple query sets and three different text collections. For ad hoc retrieval, the Dirichlet smoothing method was found to be significantly better than Okapi BM25, but for named-page finding Okapi BM25 was more effective than the language modelling methods. Optimal smoothing parameters for each method were found to depend on the collection and the query set. For longer queries, the language modelling approaches required more aggressive smoothing, but they were found to be more effective than with shorter queries. The choice of smoothing method was also found to have a significant effect on the performance of language models for information retrieval.
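    For concreteness, here is a sketch of the Dirichlet-smoothed query-likelihood score that the study found strongest for ad hoc retrieval. The statistics dictionaries and the value of the smoothing parameter mu are assumed inputs; the paper's finding is precisely that the best mu varies with collection and query set.

```python
# Dirichlet-prior-smoothed query likelihood:
#   P(w|d) = (tf(w,d) + mu * P(w|C)) / (|d| + mu)
import math

def dirichlet_score(query_terms, doc_tf, doc_len, coll_prob, mu=2000):
    """log P(query | document) with Dirichlet prior smoothing.

    doc_tf: term -> frequency in the document
    doc_len: document length in tokens
    coll_prob: term -> P(term | collection), assumed precomputed
    """
    score = 0.0
    for term in query_terms:
        p_coll = coll_prob.get(term, 1e-10)  # floor avoids log(0) for unseen terms
        p_doc = (doc_tf.get(term, 0) + mu * p_coll) / (doc_len + mu)
        score += math.log(p_doc)
    return score
```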

    Scale-free network clustering in hyperbolic and other random graphs

    Random graphs with power-law degrees can model scale-free networks as sparse topologies with strong degree heterogeneity. Mathematical analysis of such random graphs has proved successful in explaining scale-free network properties such as resilience, navigability and small distances. We introduce a variational principle to explain how vertices tend to cluster in triangles as a function of their degrees. We apply the variational principle to the hyperbolic model, which is quickly gaining popularity as a model for scale-free networks with latent geometries and clustering. We show that clustering in the hyperbolic model is non-vanishing and self-averaging, so that a single random graph sample is a good representation in the large-network limit. We also demonstrate the variational principle for some classical random graphs, including the preferential attachment model and the configuration model.
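    The quantity at the heart of this analysis, the average local clustering coefficient of vertices of a given degree, can be measured empirically on a generated graph. The sketch below uses networkx and a Barabasi-Albert (preferential attachment) graph as a stand-in for the paper's models.

```python
# Empirical clustering-by-degree curve c(k) on a preferential attachment
# graph; the generator and parameters are illustrative choices.
import networkx as nx
from collections import defaultdict

def clustering_by_degree(graph):
    per_degree = defaultdict(list)
    local = nx.clustering(graph)  # local clustering coefficient per vertex
    for node, c in local.items():
        per_degree[graph.degree(node)].append(c)
    return {k: sum(v) / len(v) for k, v in sorted(per_degree.items())}

g = nx.barabasi_albert_graph(n=10_000, m=2, seed=42)
for k, c in list(clustering_by_degree(g).items())[:10]:
    print(f"degree {k}: mean clustering {c:.4f}")
```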

    Peer to Peer Information Retrieval: An Overview

    Peer-to-peer technology is widely used for file sharing. In the past decade a number of prototype peer-to-peer information retrieval systems have been developed. Unfortunately, none of these has seen widespread real-world adoption and thus, in contrast with file sharing, information retrieval is still dominated by centralised solutions. In this paper we provide an overview of the key challenges for peer-to-peer information retrieval and the work done so far. We want to stimulate and inspire further research to overcome these challenges. This will open the door to the development and large-scale deployment of real-world peer-to-peer information retrieval systems that rival existing centralised client-server solutions in terms of scalability, performance, user satisfaction and freedom.