Efficient Diversification of Web Search Results

Author: Capannini Gabriele
Nardini Franco Maria
Perego Raffaele
Silvestri Fabrizio
Publication date: 01/01/2011

In this paper we analyze the efficiency of various search result diversification methods. While the efficacy of diversification approaches has been deeply investigated in the past, response time and scalability issues have rarely been addressed. A unified framework for studying the performance and feasibility of result diversification solutions is thus proposed. First, we define a new methodology for detecting when, and how, query results need to be diversified. To this end, we rely on the concept of "query refinement" to estimate the probability that a query is ambiguous. Then, relying on this novel ambiguity detection method, we deploy and compare, on a standard test set, three different diversification methods: IASelect, xQuAD, and OptSelect. While the first two are recent state-of-the-art proposals, the last is an original algorithm introduced in this paper. We evaluate both the efficiency and the effectiveness of our approach against its competitors using the standard TREC Web diversification track testbed. Results show that OptSelect runs two orders of magnitude faster than the two other state-of-the-art approaches while obtaining comparable diversification effectiveness. Comment: VLDB201

arXiv.org e-Print Archive

CiteSeerX
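
To make the idea of result diversification in the abstract above concrete, the following is a minimal sketch of a greedy re-ranking in the style of maximal marginal relevance (MMR). It is not IASelect, xQuAD, or OptSelect, whose details are not given in this listing. The function names, the Jaccard similarity over term sets, and the trade-off parameter `lam` are all illustrative assumptions.

```python
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity between two term sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def mmr_diversify(
    relevance: Dict[str, float],   # hypothetical input: doc id -> relevance score
    terms: Dict[str, Set[str]],    # hypothetical input: doc id -> set of terms
    k: int,
    lam: float = 0.5,              # assumed trade-off: 1.0 = relevance only
) -> List[str]:
    """Greedily pick k documents, penalizing similarity to those already picked."""
    selected: List[str] = []
    candidates = set(relevance)
    while candidates and len(selected) < k:

        def score(doc: str) -> float:
            # Highest similarity to any already selected document.
            max_sim = max(
                (jaccard(terms[doc], terms[s]) for s in selected), default=0.0
            )
            return lam * relevance[doc] - (1.0 - lam) * max_sim

        # sorted() makes tie-breaking deterministic (alphabetical by doc id).
        best = max(sorted(candidates), key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


if __name__ == "__main__":
    rel = {"d1": 0.9, "d2": 0.85, "d3": 0.5}
    doc_terms = {
        "d1": {"jaguar", "car"},
        "d2": {"jaguar", "car"},
        "d3": {"jaguar", "cat"},
    }
    print(mmr_diversify(rel, doc_terms, k=2))
```

With `lam=0.5` the near-duplicate `d2` is skipped in favour of the less relevant but different `d3`. With `lam=1.0` the ranking falls back to pure relevance order.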

Datamining for Web-Enabled Electronic Business Applications

Author: Nayak Richi
Publication venue: Idea Group
Publication date: 01/01/2003

Web-enabled electronic business is generating massive amounts of data on customer purchases, browsing patterns, usage times, and preferences at an increasing rate. Data mining techniques can be applied to the collected data to obtain useful information. This chapter attempts to present issues associated with data mining for web-enabled electronic business.

Queensland University of Technology ePrints Archive

Towards Affordable Disclosure of Spoken Word Archives

Author: Heeren W.F.L.
Hiemstra D.
Huijbregts M.A.H.
Jong F.M.G. de
Ordelman R.J.F.
Publication venue: ILPS, University of Amsterdam
Publication date: 01/01/2008

This paper presents and discusses ongoing work aiming at affordable disclosure of real-world spoken word archives in general, and in particular of a collection of recorded interviews with Dutch survivors of the World War II concentration camp Buchenwald. Given such collections, the least we want to be able to provide is search at different levels and a flexible way of presenting results. Strategies for automatic annotation based on speech recognition (supporting, e.g., within-document search) are outlined and discussed with respect to the Buchenwald interview collection. In addition, usability aspects of spoken word search are discussed on the basis of our experiences with the online Buchenwald web portal. It is concluded that, although user feedback is generally fairly positive, automatic annotation performance is still far from satisfactory and requires additional research.

CiteSeerX

Radboud Repository

University of Twente Research Information

Synchronous collaborative information retrieval: techniques and evaluation

Author: A.F. Smeaton
I.J. Aalbersberg
J. Pickens
M.R. Morris
N. Craswell
R.W. White
S.E. Robertson
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2009

Synchronous Collaborative Information Retrieval (SCIR) refers to systems that support multiple users searching together at the same time in order to satisfy a shared information need. To date, most SCIR systems have focussed on providing various awareness tools to enable collaborating users to coordinate the search task. However, requiring users both to search and to coordinate the group activity may prove too demanding. On the other hand, without effective coordination policies the group search may not be effective. In this paper we propose and evaluate novel system-mediated techniques for coordinating a group search. These techniques allow for an effective division of labour across the group, whereby each group member can explore a subset of the search space. We also propose and evaluate techniques to support automated sharing of knowledge across searchers in SCIR, through novel collaborative and complementary relevance feedback techniques. In order to evaluate these techniques, we propose a framework for SCIR evaluation based on simulations. To populate these simulations we extract data from TREC interactive search logs. This work represents the first simulations of SCIR to date and the first such use of this TREC data.

CiteSeerX

Crossref

Irish Universities

DCU Online Research Access Service
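
The division of labour mentioned in the abstract above, where each group member explores a subset of the search space, can be illustrated with a very simple policy: dealing a shared ranked list out to the searchers in round-robin order. This is only a sketch under that assumption, not the system-mediated technique the paper proposes. The function name and inputs are hypothetical.

```python
from typing import Dict, List


def divide_results(ranked: List[str], users: List[str]) -> Dict[str, List[str]]:
    """Split one ranked list into disjoint per-user lists, round-robin by rank."""
    if not users:
        raise ValueError("at least one user is required")
    assignment: Dict[str, List[str]] = {user: [] for user in users}
    for rank, doc in enumerate(ranked):
        # The document at rank i goes to user i mod n, so each user still
        # receives documents in descending rank order.
        assignment[users[rank % len(users)]].append(doc)
    return assignment


if __name__ == "__main__":
    print(divide_results(["a", "b", "c", "d", "e"], ["u1", "u2"]))
```

Round-robin keeps the per-user lists disjoint and spreads the top-ranked documents evenly across searchers. A real system would also need to handle reformulated queries and shared relevance judgments.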

Start Time and Duration Distribution Estimation in Semi-Structured Processes

Author: Iacob Maria-Eugenia
Wombacher Andreas
Publication venue: Centre for Telematics and Information Technology, University of Twente
Publication date: 01/01/2012

Semi-structured processes are business workflows in which the execution of the workflow is not completely controlled by a workflow engine, i.e., an implementation of a formal workflow model. Examples are workflows in which actors may interact with customers, reporting the result of the interaction in a process-aware information system. Building a performance model for resource management in these processes is difficult, since the required information is only partially recorded. In this paper we propose a systematic approach for creating an event log that is suitable for available process mining tools. This event log is created by incrementally cleansing the data. The proposed approach is evaluated in an experiment.

University of Twente Research Information