ComQA: A Community-sourced Dataset for Complex Factoid Question Answering with Paraphrase Clusters
To bridge the gap between the capabilities of the state-of-the-art in factoid
question answering (QA) and what users ask, we need large datasets of real user
questions that capture the various question phenomena users are interested in,
and the diverse ways in which these questions are formulated. We introduce
ComQA, a large dataset of real user questions that exhibit different
challenging aspects such as compositionality, temporal reasoning, and
comparisons. ComQA questions come from the WikiAnswers community QA platform,
which typically contains questions that are not satisfactorily answerable by
existing search engine technology. Through a large crowdsourcing effort, we
clean the question dataset, group questions into paraphrase clusters, and
annotate clusters with their answers. ComQA contains 11,214 questions grouped
into 4,834 paraphrase clusters. We detail the process of constructing ComQA,
including the measures taken to ensure its high quality while making effective
use of crowdsourcing. We also present an extensive analysis of the dataset and
the results achieved by state-of-the-art systems on ComQA, demonstrating that
our dataset can be a driver of future research on QA.
Comment: 11 pages, NAACL 2019
Ranking Medical Subject Headings using a factor graph model.
Automatically assigning MeSH (Medical Subject Headings) to articles is an active research topic. Recent work demonstrated the feasibility of improving the existing automated Medical Text Indexer (MTI) system, developed at the National Library of Medicine (NLM). Encouraged by this work, we propose a novel data-driven approach that uses semantic distances in the MeSH ontology for automated MeSH assignment. Specifically, we developed a graphical model to propagate belief through a citation network to provide robust MeSH main heading (MH) recommendations. Our preliminary results indicate that this approach can reach high Mean Average Precision (MAP) in some scenarios.
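The core idea of propagating recommendation scores through a citation network can be illustrated with a minimal sketch. This is a generic iterative score-propagation scheme, not the paper's actual factor-graph model; the function name, the data layout, and the mixing parameter `alpha` are all illustrative assumptions:

```python
from collections import defaultdict

def propagate_mesh_scores(citations, seed_scores, alpha=0.5, iters=10):
    """Spread MeSH heading scores across a citation graph.

    citations:   dict mapping an article id to the articles it cites
                 (treated as undirected links here, for simplicity).
    seed_scores: dict mapping an article id to {heading: initial score},
                 e.g. scores from an initial indexer such as MTI.
    alpha:       weight kept on an article's own seed scores vs. its
                 neighbours' scores (illustrative choice, not from the paper).
    """
    # Build an undirected adjacency structure from the citation links.
    adj = defaultdict(set)
    for article, refs in citations.items():
        for ref in refs:
            adj[article].add(ref)
            adj[ref].add(article)

    scores = {n: dict(seed_scores.get(n, {})) for n in adj}
    for _ in range(iters):
        new_scores = {}
        for node in adj:
            # Average the neighbours' heading scores, normalised by degree.
            aggregated = defaultdict(float)
            for nb in adj[node]:
                for heading, s in scores[nb].items():
                    aggregated[heading] += s / len(adj[nb])
            # Mix the node's own seed scores with the propagated evidence.
            base = seed_scores.get(node, {})
            headings = set(aggregated) | set(base)
            new_scores[node] = {
                h: alpha * base.get(h, 0.0) + (1 - alpha) * aggregated.get(h, 0.0)
                for h in headings
            }
        scores = new_scores
    return scores
```

For example, an article with no seed headings of its own picks up nonzero scores for the headings assigned to the articles it cites, which is the intuition behind using the citation network for recommendation.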
An evaluation resource for geographic information retrieval
In this paper we present an evaluation resource for geographic information retrieval developed within the Cross Language Evaluation
Forum (CLEF). The GeoCLEF track is dedicated to the evaluation of geographic information retrieval systems. The resource
encompasses more than 600,000 documents, 75 topics so far, and more than 100,000 relevance judgments for these topics. Geographic
information retrieval requires an evaluation resource which represents realistic information needs and which is geographically
challenging. Some experimental results and analysis are reported.
Analysing definition questions by two machine learning approaches
In automatic question answering, the identification of the correct target term (i.e. the term to define) in a definition question is critical: if the target term is not correctly identified, all subsequent modules have no chance of providing relevant nuggets. In this paper, we present a method to tag a question sentence, experimenting with two learning approaches: QTag and Hidden Markov Models. We tested the methods on five collections of questions: PILOT, TREC 2003, TREC 2004, CLEF 2004 and CLEF 2005. We performed ten-fold cross validation for each collection and also tested with all questions together. The best accuracy rates for each collection were obtained using QTag, but with all questions together the best accuracy rate was obtained using the HMM.
IFIP International Conference on Artificial Intelligence in Theory and Practice - Speech and Natural Language. Red de Universidades con Carreras en Informática (RedUNCI)
iCLEF 2006 Overview: Searching the Flickr WWW photo-sharing repository
This paper summarizes the task design for iCLEF 2006 (the CLEF interactive track).
Compared to previous years, we have proposed a radically new task: searching images
in a naturally multilingual database, Flickr, which has millions of photographs shared
by people all over the planet, tagged and described in a wide variety of languages.
Participants are expected to build a multilingual search front-end to Flickr (using
Flickr’s search API) and study the behaviour of the users for a given set of searching
tasks. The emphasis is on studying the search process, rather than evaluating its outcome.
Evaluation campaigns and TRECVid
The TREC Video Retrieval Evaluation (TRECVid) is an
international benchmarking activity to encourage research
in video information retrieval by providing a large test collection, uniform scoring procedures, and a forum for organizations interested in comparing their results. TRECVid completed its fifth annual cycle at the end of 2005 and in 2006 TRECVid will involve almost 70 research organizations, universities and other consortia. Throughout its existence, TRECVid has benchmarked both interactive and automatic/manual searching for shots from within a video
corpus, automatic detection of a variety of semantic and
low-level video features, shot boundary detection and the
detection of story boundaries in broadcast TV news. This
paper gives an introduction to information retrieval (IR) evaluation from both a user and a system perspective, highlighting that system evaluation is by far the most prevalent type of evaluation carried out. We also include a summary of TRECVid as an example of a system-evaluation benchmarking campaign, which allows us to discuss the merits and drawbacks of such campaigns. We present arguments both for and against these campaigns, concluding that on balance they have had a very positive impact on research progress.