689 research outputs found

Combining Wikipedia and Newswire Texts for Question Answering in Spanish

Author: Al-Jumaily Harith T.
González-Ledesma Ana
Martínez Fernández José Luis
Martínez Paloma
Moreno-Sandoval Antonio
Pablo-Sánchez César de
Samy Doaa
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2008

4 pages, 1 figure. Contributed to: Advances in Multilingual and Multimodal Information Retrieval: 8th Workshop of the Cross-Language Evaluation Forum (CLEF 2007, Budapest, Hungary, Sep 19-21, 2007). This paper describes the adaptations of the MIRACLE group QA system in order to participate in the Spanish monolingual question answering task at QA@CLEF 2007. A system, initially developed for the EFE collection, was reused for Wikipedia. Answers from both collections were combined using temporal information extracted from questions and collections. Reusing the EFE subsystem has proven not feasible, and questions with answers only in Wikipedia have obtained low accuracy. In addition, a heuristics-based co-reference module was introduced for processing topic-related questions. This module achieves good coverage in different situations, but it is hindered by the moderate accuracy of the base system and the chaining of incorrect answers. This work has been partially supported by the Regional Government of Madrid under the Research Network MAVIR (S-0505/TIC-0267) and projects by the Spanish Ministry of Education and Science (TIN2004/07083, TIN2004-07588-C03-02, TIN2007-67407-C03-01).

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Universidad Carlos III de Madrid e-Archivo

An evaluation resource for geographic information retrieval

Author: Di Nunzio G.
Ferro N.
Gey F.
Mandl T.
Sanderson M.
Santos D.
Womser-Hacker C.
Publication date: 01/01/2008

In this paper we present an evaluation resource for geographic information retrieval developed within the Cross Language Evaluation Forum (CLEF). The GeoCLEF track is dedicated to the evaluation of geographic information retrieval systems. The resource encompasses more than 600,000 documents, 75 topics so far, and more than 100,000 relevance judgments for these topics. Geographic information retrieval requires an evaluation resource which represents realistic information needs and which is geographically challenging. Some experimental results and analysis are reported

White Rose Research Online

Archivio istituzionale della ricerca - Università di Padova

Overview of the ImageCLEFphoto 2008 photographic retrieval task

Author: Arni T.
Clough P.
Grubinger M.
Sanderson M.
Publication date: 01/01/2008

ImageCLEFphoto 2008 is an ad-hoc photo retrieval task and part of the ImageCLEF evaluation campaign. This task provides both the resources and the framework necessary to perform comparative laboratory-style evaluation of visual information retrieval systems. In 2008, the evaluation task concentrated on promoting diversity within the top 20 results from a multilingual image collection. This new challenge attracted a record number of submissions: a total of 24 participating groups submitting 1,042 system runs. The findings include that the choice of annotation language has an almost negligible effect and that the best runs combine concept-based and content-based retrieval methods.

White Rose Research Online

An evaluation resource for Geographical Information Retrieval

Author: Ferro Nicola
Gey Fredric
Mandl Thomas
Nunzio Giorgio di
Sanderson Mark
Santos Diana
Womser-Hacker Christa
Publication venue: European Language Resources Association (ELRA)
Publication date: 01/01/2008

CiteSeerX

Repositório Comum

MIRACLE at ImageCLEFannot 2008: Classification of Image Features for Medical Image Annotation

Author: González Cristóbal José Carlos
Goñi Menoyo José Miguel
Lana Serrano Sara
Villena Román Julio
Publication venue: E.U.I.T. Telecomunicación (UPM)
Publication date: 01/01/2008

This paper describes the participation of the MIRACLE research consortium at the ImageCLEF Medical Image Annotation task of ImageCLEF 2008. A lot of effort was invested this year to develop our own image analysis system, based on MATLAB, to be used in our experiments. This system extracts a variety of global and local features including histogram, image statistics, Gabor features, fractal dimension, DCT and DWT coefficients, Tamura features and co-occurrence matrix statistics. Then a k-Nearest Neighbour algorithm analyzes the extracted image feature vectors to determine the IRMA code associated with a given image. The focus of our experiments is mainly to test and evaluate this system in depth and to compare diverse configuration parameters, such as the number of images for the relevance feedback to use in the classification module.

Archivo Digital UPM

MIRACLE at ImageCLEFanot 2007: Machine Learning Experiments on Medical Image Annotation

Author: González Cristóbal José Carlos
Goñi Menoyo José Miguel
Lana Serrano Sara
Villena Román Julio
Publication venue: E.U.I.T. Telecomunicación (UPM)
Publication date: 01/01/2007

This paper describes the participation of MIRACLE research consortium at the ImageCLEF Medical Image Annotation task of ImageCLEF 2007. Our areas of expertise do not include image analysis, thus we approach this task as a machine-learning problem, regardless of the domain. FIRE is used as a black-box algorithm to extract different groups of image features that are later used for training different classifiers in order to predict the IRMA code. Three types of classifiers are built. The first type is a single classifier that predicts the complete IRMA code. The second type is a two level classifier composed of four classifiers that individually predict each axis of the IRMA code. The third type is similar to the second one but predicts a combined pair of axes. The main idea behind the definition of our experiments is to evaluate whether an axis-by-axis prediction is better than a prediction by pairs of axes or the complete code, or vice versa. We submitted 30 experiments to be evaluated and results are disappointing compared to other groups. However, the main conclusion that can be drawn from the experiments is that, irrespective of the selected image features, the axis-by-axis prediction achieves more accurate results not only than the prediction of a combined pair of axes but also, in turn, than the prediction of the complete IRMA code. In addition, data normalization seems to improve the predictions and vector-based features are preferred over histogram-based ones

Archivo Digital UPM

An evaluation of Bradfordizing effects

Author: Bates Marcia J.
Bonitz Manfred.
Brookes B C.
Buckland M. K.
Garfield Eugene.
Harman Donna K.
Hood William W.
Lockett M. W.
Mayr Philipp
Mutschke Peter.
Nicolaisen Jeppe
Peritz Bluma C.
Petras Vivien
Pontigo J.
Tenopir Carol.
Umstätter Walther
Vickery Brian C.
Wagner-Döbler R.
White Howard D.
Wilson Concepción S.
Worthen D. B.
Publication date: 01/01/2008

The purpose of this paper is to apply and evaluate the bibliometric method Bradfordizing for information retrieval (IR) experiments. Bradfordizing is used for generating core document sets for subject-specific questions and to reorder result sets from distributed searches. The method will be applied and tested in a controlled scenario of scientific literature databases from social and political sciences, economics, psychology and medical science (SOLIS, SoLit, USB Köln Opac, CSA Sociological Abstracts, World Affairs Online, Psyndex and Medline) and 164 standardized topics. An evaluation of the method and its effects is carried out in two laboratory-based information retrieval experiments (CLEF and KoMoHe) using a controlled document corpus and human relevance assessments. The results show that Bradfordizing is a very robust method for re-ranking the main document types (journal articles and monographs) in today’s digital libraries (DL). The IR tests show that relevance distributions after re-ranking improve at a significant level if articles in the core are compared with articles in the succeeding zones. The items in the core are significantly more often assessed as relevant than items in zone 2 (z2) or zone 3 (z3). The improvements between the zones are statistically significant based on the Wilcoxon signed-rank test and the paired t-test.

arXiv.org e-Print Archive

CiteSeerX

E-LIS

Crossref

SSOAR - Social Science Open Access Repository

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

DCU and UTA at ImageCLEFPhoto 2007

Author: Adamek Tomasz
Airio Eija
Jones Gareth J.F.
Järvelin Anni
Wilkins Peter
Publication date: 01/09/2007

Dublin City University (DCU) and University of Tampere (UTA) participated in the ImageCLEF 2007 photographic ad-hoc retrieval task with several monolingual and bilingual runs. Our approach was language independent: text retrieval based on fuzzy s-gram query translation was combined with visual retrieval. Data fusion between text and image content was performed using unsupervised query-time weight generation approaches. Our baseline was a combination of dictionary-based query translation and visual retrieval, which achieved the best result. The best mixed modality runs using fuzzy s-gram translation achieved on average around 83% of the performance of the baseline. Performance was more similar when only top rank precision levels of P10 and P20 were considered. This suggests that fuzzy s-gram query translation combined with visual retrieval is a cheap alternative for cross-lingual image retrieval where only a small number of relevant items are required. Both sets of results emphasize the merit of our query-time weight generation schemes for data fusion, with the fused runs exhibiting marked performance increases over single modalities; this is achieved without the use of any prior training data.

Irish Universities

DCU Online Research Access Service

Domain-specific query translation for multilingual access to digital libraries

Author: Fantino Fabio
Fuller Marguerite
Jones Gareth J.F.
Newman Eamonn
Zhang Ying
Publication date: 15/06/2009

Accurate high-coverage translation is a vital component of reliable cross language information access (CLIR) systems. This is particularly true of access to archives such as Digital Libraries which are often specific to certain domains. While general machine translation (MT) has been shown to be effective for CLIR tasks in information retrieval evaluation workshops, it is not well suited to specialized tasks where domain specific translations are required. We demonstrate that effective query translation in the domain of cultural heritage (CH) can be achieved by augmenting a standard MT system with domain-specific phrase dictionaries automatically mined from the online Wikipedia. Experiments using our hybrid translation system with sample query logs from users of CH websites demonstrate a large improvement in the accuracy of domain specific phrase detection and translation

Irish Universities

DCU Online Research Access Service

CLEF NewsREEL 2016: Comparing Multi-Dimensional Offline and Online Evaluation of News Recommender Systems

Author: Brodt Torben
Hopfgartner Frank
Kille Benjamin
Larson Martha
Lommatzsch Andreas
Malagoli Davide
Seiler Jonas
Sereny Andras
Publication venue: CEUR workshop proceedings
Publication date: 01/01/2016

Running in its third year at CLEF, NewsREEL challenged participants to develop news recommendation algorithms and have them benchmarked in an online (Task 1) and offline setting (Task 2), respectively. This paper provides an overview of the NewsREEL scenario, outlines this year’s campaign, presents results of both tasks, and discusses the approaches of participating teams. Moreover, it overviews ideas on living lab evaluation that have been presented as part of a “New Ideas” track at the conference in Portugal. Presented results illustrate potentials for multi-dimensional evaluation of recommendation algorithms in a living lab and simulation based evaluation setting

Enlighten