Search CORE

287 research outputs found

Learning to merge search results for efficient Distributed Information Retrieval

Author: Hiemstra Djoerd
Tjin-Kam-Jet Kien-Tsoi T.E.
Publication venue: Radboud University
Publication date: 01/01/2010

Merging search results from different servers is a major problem in Distributed Information Retrieval. We used Regression-SVM and Ranking-SVM to learn a function that merges results based on readily available information, i.e. the ranks, titles, summaries and URLs contained in the results pages. By not downloading additional information, such as the full document, we decrease bandwidth usage. CORI and Round Robin merging were used as our baselines; surprisingly, our results show that the SVM methods do not improve over those baselines.
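As an illustration of the Round Robin baseline named in the abstract above, the following is a minimal sketch, not the authors' implementation. It assumes each server returns a ranked list of URLs and that duplicates are dropped on first sight; the function name `round_robin_merge` and the sample lists are hypothetical.

```python
from itertools import zip_longest


def round_robin_merge(result_lists):
    """Interleave ranked lists: one result per server per round.

    Assumption for this sketch: results are identified by URL strings,
    and a URL already emitted by an earlier server is skipped.
    """
    merged = []
    seen = set()
    # zip_longest pads shorter lists with None so longer lists are not cut off.
    for round_items in zip_longest(*result_lists):
        for url in round_items:
            if url is not None and url not in seen:
                seen.add(url)
                merged.append(url)
    return merged


# Hypothetical ranked results from three servers.
server_a = ["a1", "a2", "a3"]
server_b = ["b1", "b2"]
server_c = ["c1", "a2", "c3"]

merged = round_robin_merge([server_a, server_b, server_c])
print(merged)
```

Round Robin uses only the rank positions, which is why it needs no downloaded documents and serves as a cheap baseline for learned merging functions.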

CiteSeerX

Radboud Repository

University of Twente Research Information

Reconsideration of the simulated work task situation: A context instrument for evaluation of information retrieval interaction

Author: Borlund Pia
Schneider Jesper Wiborg
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2010

Copenhagen University Research Information System

A Decade of Shared Tasks in Digital Text Forensics at PAN

Author: B Stein
E Amigó
E Stamatatos
H Asghari
J Holmes
JW Pennebaker
M Koppel
M Potthast
O Halvani
P Rosso
S Argamon
T Gollub
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2019

Digital text forensics aims at examining the originality and credibility of information in electronic documents and, in this regard, at extracting and analyzing information about the authors of these documents. The research field has developed substantially during the last decade. PAN is a series of shared tasks that started in 2009 and has contributed significantly to attracting the attention of the research community to well-defined digital text forensics tasks. Several benchmark datasets have been developed to assess the state-of-the-art performance in a wide range of tasks. In this paper, we present the evolution of both the examined tasks and the developed datasets during the last decade. We also briefly introduce the upcoming PAN 2019 shared tasks.

We are indebted to many colleagues and friends who contributed greatly to PAN's tasks: Maik Anderka, Shlomo Argamon, Alberto Barrón-Cedeño, Fabio Celli, Fabio Crestani, Walter Daelemans, Andreas Eiselt, Tim Gollub, Parth Gupta, Matthias Hagen, Teresa Holfeld, Patrick Juola, Giacomo Inches, Mike Kestemont, Moshe Koppel, Manuel Montes-y-Gómez, Aurelio Lopez-Lopez, Francisco Rangel, Miguel Angel Sánchez-Pérez, Günther Specht, Michael Tschuggnall, and Ben Verhoeven. Our special thanks go to PAN's sponsors throughout the years and not least to the hundreds of participants.

Potthast, M.; Rosso, P.; Stamatatos, E.; Stein, B. (2019). A Decade of Shared Tasks in Digital Text Forensics at PAN. Lecture Notes in Computer Science. 11438:291-300. https://doi.org/10.1007/978-3-030-15719-7_39

Crossref

RiuNet

Time-based Microblog Distillation

Author: Amati G
Angelini S
Bianchi M
Gambosi G
Rossi G
Publication date: 08/04/2014

This paper presents a simple approach for identifying relevant and reliable news in the Twitter stream as soon as it emerges. The approach is based on a near-real-time system for sentiment analysis on Twitter, implemented by Fondazione Ugo Bordoni and modified to detect the most representative tweets in a specified time slot. This work is a first step towards a prototype that supports journalists in discovering and finding news on Twitter.

ART

CNIPA, FUB and University of Rome "Tor Vergata" at TREC 2008 Legal Track

Author: Amati G
Bianchi M
Celi A
Draoli M
Gambosi G
Stilo G
Publication venue: NIST (Nat. Inst. Standards and Technology)
Publication date: 01/01/2008

ART

Archivio della ricerca- Università di Roma La Sapienza

Web document summarisation: a task-oriented evaluation

Author: Jose J.M.
Ruthven I.
White R.
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2001

We present a query-biased summarisation interface for Web searching. The summarisation system has been specifically developed to act as a component in existing Web search interfaces. The summaries allow the user to assess the content of Web pages more effectively. We also present an experimental investigation of this approach. Our experimental results show that the system appears to be more useful and effective in helping users gauge document relevance than the traditional ranked titles/abstracts approach.

CiteSeerX

University of Strathclyde Institutional Repository

Enlighten

A Comparative Analysis of Retrievability and PageRank Measures

Author: Mall Priyanshu Raj
Roy Dwaipayan
Sinha Aman
Publication date: 17/11/2023

The accessibility of documents within a collection plays a pivotal role in Information Retrieval, signifying the ease of locating specific content in a collection of documents. This accessibility can be achieved via two distinct avenues. The first is through a retrieval model using a keyword or other feature-based search; the other is by navigating to a document through the links associated with it, if available. Metrics such as PageRank, Hub, and Authority illuminate the pathways through which documents can be discovered within the network of content, while the concept of Retrievability is used to quantify the ease with which a document can be found by a retrieval model. In this paper, we compare these two perspectives, PageRank and retrievability, as they quantify the importance and discoverability of content in a corpus. Through empirical experimentation on benchmark datasets, we demonstrate a subtle similarity between retrievability and PageRank, particularly distinguishable for larger datasets.

Comment: Accepted at FIRE 202

arXiv.org e-Print Archive

A study on evaluation on opinion retrieval systems

Author: Amati G
Amodeo G
Capozio V
Gaibisso G
Gambosi G

ART

Search of spoken documents retrieves well recognized transcripts

Author: Sanderson M.
Shou X.M.
Publication venue: Springer Science and Business Media LLC
Publication date: 01/06/2007

This paper presents a series of analyses and experiments on spoken document retrieval systems: search engines that retrieve transcripts produced by speech recognizers. Results show that transcripts that match queries well tend to be recognized more accurately than transcripts that match a query less well. This result was described in past literature; however, no study or explanation of the effect has been provided until now. This paper provides such an analysis, showing a relationship between word error rate and query length. The paper expands on past research by increasing the number of recognition systems that are tested as well as showing the effect in an operational speech retrieval system. Potential future lines of enquiry are also described.
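The abstract above relies on word error rate as its measure of recognition accuracy. As a minimal sketch of the standard definition (word-level edit distance divided by the number of reference words), and not the paper's own evaluation code, the metric can be computed as follows. The function name `word_error_rate` and the sample sentences are hypothetical, and whitespace tokenisation is an assumption.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length.

    Assumption for this sketch: both strings are tokenised on whitespace.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical reference transcript and recognizer output (one word dropped).
wer = word_error_rate("the cat sat on the mat", "the cat sat on mat")
print(wer)
```

Because the denominator is the reference length, the value can exceed 1.0 when the recognizer inserts many extra words.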

White Rose Research Online