49 research outputs found

DCU@TRECMed 2012: Using ad-hoc baselines for domain-specific retrieval

Author: Goeuriot Lorraine
Jones Gareth J.F.
Kelly Liadh
Leveling Johannes
Publication date: 09/11/2012

This paper describes the first participation of DCU in the TREC Medical Records Track (TRECMed). We performed some initial experiments on the 2011 TRECMed data based on the BM25 retrieval model. Surprisingly, we found that the standard BM25 model with default parameters performs comparably to the best automatic runs submitted to TRECMed 2011 and would have ranked fourth out of 29 participating groups. We expected that some form of domain adaptation would increase performance. However, results on the 2011 data proved otherwise: concept-based query expansion decreased performance, and filtering and reranking by term proximity also decreased performance slightly. We submitted four runs based on the BM25 retrieval model to TRECMed 2012 using standard BM25, standard query expansion, result filtering, and concept-based query expansion. Official results for 2012 confirm that domain-specific knowledge does not increase performance compared to the BM25 baseline as applied by us.

Irish Universities

DCU Online Research Access Service

Método híbrido para categorización de texto basado en aprendizaje y reglas (Hybrid method for text categorization based on learning and rules)

Author: Collada Pérez Sonia
González Cristóbal José Carlos
Lana Serrano Sara
Villena Román Julio
Publication venue: E.U.I.T. Telecomunicación (UPM)
Publication date: 01/01/2011

This article presents a new hybrid method for automatic text categorization that combines a machine learning algorithm, which makes it possible to build a base classification model with little effort from a labeled corpus, with a cascaded rule-based system used to filter and rerank the results of that base model. The model can be fine-tuned by adding specific rules for difficult categories that have not been trained satisfactorily. An implementation is described that uses the kNN algorithm and a basic rule language based on lists of terms appearing in the text to be classified. The system has been evaluated in different scenarios, including the Reuters-21578 news corpus for comparison with other approaches, and the IPTC and EUROVOC models. The results show that the system achieves precision and recall comparable to those of the best state-of-the-art methods.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Archivo Digital UPM

Examining the validity of cross-lingual word sense disambiguation

Author: Hoste Veronique
Lefever Els
Publication date: 01/01/2011

Ghent University Academic Bibliography

How geographical was GikiCLEF? A GIR-critical review

Author: Cabral Luís Miguel
Cardoso Nuno
Santos Diana
Publication date: 01/01/2010

Repositório Comum

GikiP: Evaluating geographical answers from Wikipedia

Author: Cardoso Nuno
Santos Diana
Publication date: 01/01/2008

Repositório Comum

Universidade de Lisboa: Repositório.UL

Fusion of Retrieval Models at CLEF 2008 Ad Hoc Persian Track

Author: Aghazade Zahra
AleAhmad Abolfazel
Amiri Hadi
Dehghani Nazanin
Farzinvash Leili
Oroumchian Farhad
Rahimi Razieh
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2009

Metasearch engines submit the user query to several underlying search engines and then merge their retrieved results to generate a single list that is more effective for the user's information needs. According to the idea behind metasearch engines, it seems that merging the results retrieved from different retrieval models will improve the search coverage and precision. In this study, we have investigated the effect of fusion of different retrieval techniques on the performance of Persian retrieval. We use an extension of the Ordered Weighted Average (OWA) operator called IOWA and a weighting schema, NOWA, for merging the results. Our experimental results show that merging by OWA operators produces better MAP

Crossref

Research Online

DAEDALUS at ImageCLEF Medical Retrieval 2011: Textual, Visual and Multimodal Experiments

Author: González Cristóbal José Carlos
Lana Serrano Sara
Villena Román Julio
Publication venue: E.U.I.T. Telecomunicación (UPM)
Publication date: 01/01/2011

This paper describes the participation of DAEDALUS in the ImageCLEF 2011 Medical Retrieval task. We have focused on multimodal (or mixed) experiments that combine textual and visual retrieval. The main objective of our research has been to evaluate the effect on the medical retrieval process of an extended corpus that is annotated with the image type, associated with both the image itself and its textual description. For this purpose, an image classifier has been developed to tag each document with its class (1st level of the hierarchy: Radiology, Microscopy, Photograph, Graphic, Other) and subclass (2nd level: AN, CT, MR, etc.). For the text-based experiments, several runs using different semantic expansion techniques have been performed. For the visual retrieval, different runs are defined by the corpus used in the retrieval process and the strategy for obtaining the class and/or subclass. The best results are achieved in runs that make use of the image subclass based on the classification of the sample images. Although different multimodal strategies have been submitted, none of them has proved able to provide results that are at least comparable to the ones achieved by textual retrieval alone. We believe that we have been unable to find a metric for the assessment of the relevance of the results provided by the visual and textual processe

Archivo Digital UPM

Resumo da actividade da Linguateca de 1 de Janeiro de 2009 a 31 de Dezembro de 2009 (Summary of Linguateca's activity from 1 January 2009 to 31 December 2009)

Author: Cabral Luís Miguel
Costa Luís
Santos Diana
Publication venue: Linguateca
Publication date: 01/01/2009

Repositório Comum