4,187 research outputs found
EveTAR: Building a Large-Scale Multi-Task Test Collection over Arabic Tweets
This article introduces a new language-independent approach for creating a
large-scale high-quality test collection of tweets that supports multiple
information retrieval (IR) tasks without running a shared-task campaign. The
adopted approach (demonstrated over Arabic tweets) designs the collection
around significant (i.e., popular) events, which enables the development of
topics that represent frequent information needs of Twitter users for which
rich content exists. That inherently facilitates the support of multiple tasks
that generally revolve around events, namely event detection, ad-hoc search,
timeline generation, and real-time summarization. The key highlights of the
approach include diversifying the judgment pool via interactive search and
multiple manually-crafted queries per topic, collecting high-quality
annotations via crowd-workers for relevancy and in-house annotators for
novelty, filtering out low-agreement topics and inaccessible tweets, and
providing multiple subsets of the collection for better availability. Applying
our methodology on Arabic tweets resulted in EveTAR , the first
freely-available tweet test collection for multiple IR tasks. EveTAR includes a
crawl of 355M Arabic tweets and covers 50 significant events for which about
62K tweets were judged with substantial average inter-annotator agreement
(Kappa value of 0.71). We demonstrate the usability of EveTAR by evaluating
existing algorithms in the respective tasks. Results indicate that the new
collection can support reliable ranking of IR systems that is comparable to
similar TREC collections, while providing strong baseline results for future
studies over Arabic tweets
The TREC-2002 video track report
TREC-2002 saw the second running of the Video Track, the goal of which was to promote progress in content-based retrieval from digital video via open, metrics-based evaluation. The track used 73.3 hours of publicly available digital video (in MPEG-1/VCD format) downloaded by the participants directly from the Internet Archive (Prelinger Archives) (internetarchive, 2002) and some from the Open
Video Project (Marchionini, 2001). The material comprised advertising, educational, industrial, and amateur films produced between the 1930's and the 1970's by corporations, nonprofit organizations, trade associations, community and interest groups, educational institutions, and individuals. 17 teams representing 5 companies and 12 universities - 4 from Asia, 9 from Europe, and 4 from the US - participated in one or more of three tasks in the 2001 video track: shot boundary determination, feature extraction, and search (manual or interactive). Results were scored by NIST using manually created truth data for shot boundary determination and manual assessment of feature extraction and search results. This paper is an introduction to, and an overview
of, the track framework - the tasks, data, and measures - the approaches taken by the participating groups, the results, and issues regrading the evaluation. For detailed information about the approaches and results, the reader should see the various site reports in the final workshop proceedings
Improving the evaluation of web search systems
Linkage analysis as an aid to web search has been assumed to be of significant benefit and we know that it is being implemented by many major Search Engines. Why then have few TREC participants been able to scientifically prove the benefits of linkage analysis over the past three years? In this paper we put forward reasons why disappointing results have been found and we identify the linkage density requirements of a dataset to faithfully support experiments into linkage analysis. We also report a series of linkage-based retrieval experiments on a more densely linked dataset culled from the TREC web documents
Concept-based Interactive Query Expansion Support Tool (CIQUEST)
This report describes a three-year project (2000-03) undertaken in the Information Studies
Department at The University of Sheffield and funded by Resource, The Council for
Museums, Archives and Libraries. The overall aim of the research was to provide user
support for query formulation and reformulation in searching large-scale textual resources
including those of the World Wide Web. More specifically the objectives were: to investigate
and evaluate methods for the automatic generation and organisation of concepts derived from
retrieved document sets, based on statistical methods for term weighting; and to conduct
user-based evaluations on the understanding, presentation and retrieval effectiveness of
concept structures in selecting candidate terms for interactive query expansion.
The TREC test collection formed the basis for the seven evaluative experiments conducted in
the course of the project. These formed four distinct phases in the project plan. In the first
phase, a series of experiments was conducted to investigate further techniques for concept
derivation and hierarchical organisation and structure. The second phase was concerned with
user-based validation of the concept structures. Results of phases 1 and 2 informed on the
design of the test system and the user interface was developed in phase 3. The final phase
entailed a user-based summative evaluation of the CiQuest system.
The main findings demonstrate that concept hierarchies can effectively be generated from
sets of retrieved documents and displayed to searchers in a meaningful way. The approach
provides the searcher with an overview of the contents of the retrieved documents, which in
turn facilitates the viewing of documents and selection of the most relevant ones. Concept
hierarchies are a good source of terms for query expansion and can improve precision. The
extraction of descriptive phrases as an alternative source of terms was also effective. With
respect to presentation, cascading menus were easy to browse for selecting terms and for
viewing documents. In conclusion the project dissemination programme and future work are
outlined
DCU@TRECMed 2012: Using ad-hoc baselines for domain-specific retrieval
This paper describes the first participation of DCU in the TREC Medical Records Track (TRECMed). We performed some initial experiments on the 2011 TRECMed data based on the BM25 retrieval model. Surprisingly, we found that the standard BM25 model with default parameters, performs comparable to the best automatic runs submitted to TRECMed 2011 and would have resulted in rank four out of 29 participating groups. We expected that some form of domain adaptation would increase performance. However, results on the 2011 data proved otherwise: concept-based query expansion decreased performance, and filtering and reranking by term proximity also decreased performance slightly. We submitted four runs based on the BM25 retrieval model to TRECMed 2012 using standard BM25, standard query expansion, result filtering, and concept-based query expansion. Official results for 2012 confirm that domain-specific knowledge does not increase performance compared to the BM25 baseline as applied by us
Modelling User Preferences using Word Embeddings for Context-Aware Venue Recommendation
Venue recommendation aims to assist users by making personalised
suggestions of venues to visit, building upon data available from
location-based social networks (LBSNs) such as Foursquare. A
particular challenge for this task is context-aware venue recommendation
(CAVR), which additionally takes the surrounding context of
the user (e.g. the userās location and the time of day) into account
in order to provide more relevant venue suggestions. To address the
challenges of CAVR, we describe two approaches that exploit word
embedding techniques to infer the vector-space representations of
venues, usersā existing preferences, and usersā contextual preferences.
Our evaluation upon the test collection of the TREC 2015
Contextual Suggestion track demonstrates that we can significantly
enhance the effectiveness of a state-of-the-art venue recommendation
approach, as well as produce context-aware recommendations
that are at least as effective as the top TREC 2015 systems
- ā¦