
    Report of MIRACLE team for the Ad-Hoc track in CLEF 2006

    This paper presents the MIRACLE team's approach to the Ad-Hoc Information Retrieval track at CLEF 2006. The experiments for this campaign continue to test our IR approach. First, a baseline set of runs is obtained using standard components: stemming, transformation, filtering, entity detection and extraction, and others. Then, an extended set of runs is obtained from several types of combinations of these baseline runs (a fusion sketch follows this abstract). The improvements introduced for this campaign were few: we integrated a prototype entity recognition and indexing tool into our tokenization scheme, and we ran more combination experiments for the robust multilingual case than in previous campaigns. However, no significant improvements were achieved. For this campaign, runs were submitted for the following languages and tracks:
    - Monolingual: Bulgarian, French, Hungarian, and Portuguese.
    - Bilingual: English to Bulgarian, French, Hungarian, and Portuguese; Spanish to French and Portuguese; and French to Portuguese.
    - Robust monolingual: German, English, Spanish, French, Italian, and Dutch.
    - Robust bilingual: English to German, Italian to Spanish, and French to Dutch.
    - Robust multilingual: English to the robust monolingual languages.
    We still need to work harder to improve some aspects of our processing scheme, the most important being, to our knowledge, entity recognition and normalization.
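
    The abstract does not specify the exact combination formulas, so the sketch below is a hypothetical illustration of one common way to fuse baseline runs: summing min-max-normalized scores per document (CombSUM-style fusion). It is a toy, not the MIRACLE code.

```python
# Minimal sketch of score-level run fusion (hypothetical, not the MIRACLE
# implementation): each "run" maps document ids to retrieval scores, and a
# combined run is produced by summing normalized scores (CombSUM).

def normalize(run):
    """Scale scores in a run to [0, 1] so different runs are comparable."""
    hi, lo = max(run.values()), min(run.values())
    span = (hi - lo) or 1.0
    return {doc: (s - lo) / span for doc, s in run.items()}

def combsum(*runs):
    """Fuse baseline runs by summing their normalized scores per document."""
    fused = {}
    for run in map(normalize, runs):
        for doc, score in run.items():
            fused[doc] = fused.get(doc, 0.0) + score
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)

# Two toy baseline runs (e.g., a stemmed run and an entity-indexed run):
stem_run = {"d1": 2.3, "d2": 1.1, "d3": 0.4}
entity_run = {"d2": 5.0, "d3": 4.2, "d4": 1.0}
print(combsum(stem_run, entity_run))
```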

    Report of MIRACLE team for the Ad-Hoc track in CLEF 2007

    This paper presents the MIRACLE team's approach to the Ad-Hoc Information Retrieval track at CLEF 2007. The work carried out for this campaign was limited to monolingual experiments, in both the standard and the robust tracks. No new approaches were attempted in this campaign; we followed the procedures established in our previous participations. For this campaign, runs were submitted for the following languages and tracks:
    - Monolingual: Bulgarian, Hungarian, and Czech.
    - Robust monolingual: French, English, and Portuguese.
    There is still some room for improvement around multilingual named-entity recognition.

    Re-ranking of Yahoo snippets with the JIRS passage retrieval system

    Communication presented at: Workshop on Cross Lingual Information Access, CLIA-2007, 20th International Joint Conference on Artificial Intelligence, IJCAI-07, Hyderabad, India, January 6-12, 2007.
    Passage Retrieval (PR) systems are used as the first step of current Question Answering (QA) systems. Usually, PR systems are traditional information retrieval systems that are not oriented to the specific problem of QA; in fact, these systems only search for the question keywords. The JIRS Distance Density n-gram system is a QA-oriented PR system which has given good results in QA tasks when applied to static document collections. JIRS is able to search for the question structure in the document collection in order to find the passages with the greatest probability of containing the answer. It is a language-independent PR system which has already been adapted to a few non-agglutinative European languages (such as Spanish, Italian, English, and French) as well as to Arabic; a first attempt to adapt it to Urdu was also made. In this paper, we investigate the possibility of basing JIRS passage retrieval on the web. The experiments we carried out show that JIRS improves the coverage of correct answers by re-ranking the snippets obtained with the Yahoo search engine.
    Funding: ICT EU-India; TEXT-MESS CICY
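
    To make the re-ranking idea concrete, here is a hedged sketch in the spirit of JIRS's n-gram matching: a snippet scores higher the longer the question n-grams it reproduces. The scoring function below is an illustrative simplification, not the actual JIRS Distance Density formula.

```python
# Hedged sketch of n-gram based snippet re-ranking in the spirit of JIRS.
# This is an illustrative simplification, not the real JIRS weighting.

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def ngram_score(question, snippet):
    """Sum of squared lengths of question n-grams found in the snippet,
    so one long structural match outweighs scattered single-word hits."""
    q = question.lower().split()
    s = snippet.lower().split()
    score = 0.0
    for n in range(len(q), 0, -1):
        for gram in ngrams(q, n):
            if gram in ngrams(s, n):
                score += n * n
    return score / (len(q) * len(q))

def rerank(question, snippets):
    """Order search-engine snippets by their n-gram score for the question."""
    return sorted(snippets, key=lambda s: ngram_score(question, s), reverse=True)

snippets = [
    "The Nile is the longest river in Africa.",
    "Africa has many rivers and lakes.",
]
print(rerank("Which is the longest river in Africa?", snippets))
```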

    MIRACLE Progress in Monolingual Information Retrieval at Ad-Hoc CLEF 2007

    This paper presents the MIRACLE team's approach to the Ad-Hoc Information Retrieval track at CLEF 2007. The main work carried out for this campaign was on monolingual experiments, in both the standard and the robust tracks. The most important contributions were the general introduction of automatic named-entity extraction and the use of Wikipedia resources. For the 2007 campaign, runs were submitted for the following languages and tracks: a) Monolingual: Bulgarian, Hungarian, and Czech. b) Robust monolingual: French, English, and Portuguese.
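
    As an illustration of how named-entity extraction can be folded into tokenization, the sketch below emits multiword entities found in a gazetteer (for instance, one built from Wikipedia titles) as single index terms alongside ordinary word tokens. It is a hypothetical simplification, not the MIRACLE implementation.

```python
# Illustrative sketch (not the MIRACLE code) of entity-aware tokenization:
# multiword entities from a gazetteer become single index terms.

GAZETTEER = {("european", "union"), ("czech", "republic")}  # toy example

def tokenize_with_entities(text, max_len=3):
    words = text.lower().split()
    tokens, i = [], 0
    while i < len(words):
        for n in range(max_len, 1, -1):          # prefer the longest match
            cand = tuple(words[i:i + n])
            if cand in GAZETTEER:
                tokens.append("_".join(cand))    # one token for the entity
                i += n
                break
        else:
            tokens.append(words[i])
            i += 1
    return tokens

print(tokenize_with_entities("The Czech Republic joined the European Union"))
# ['the', 'czech_republic', 'joined', 'the', 'european_union']
```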

    GeoCLEF 2007: the CLEF 2007 cross-language geographic information retrieval track overview

    GeoCLEF ran as a regular track for the second time within the Cross Language Evaluation Forum (CLEF) 2007. The purpose of GeoCLEF is to test and evaluate cross-language geographic information retrieval (GIR): retrieval for topics with a geographic specification. GeoCLEF 2007 consisted of two subtasks: a search task, which ran for the third time, and a query classification task, which was organized for the first time. For the GeoCLEF 2007 search task, twenty-five search topics were defined by the organizing groups for searching English, German, Portuguese, and Spanish document collections. All topics were translated into English, Indonesian, Portuguese, Spanish, and German. Several topics in 2007 were geographically challenging. Thirteen groups submitted 108 runs, using a variety of approaches. For the classification task, a query log from a search engine was provided, and the groups needed to identify the queries with a geographic scope and the geographic components within those local queries.
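
    A minimal sketch of the classification subtask might look as follows: flag queries with a geographic scope and split them into a thematic part and a location. The trigger-word list and the gazetteer here are toy assumptions, not the track's actual method or data.

```python
# Hedged sketch of geographic query classification: detect a geographic
# scope via a trigger word followed by a gazetteer place name, then split
# the query into topic and location. Both word lists are toy assumptions.

PLACES = {"hyderabad", "germany", "lisbon", "amazon"}
TRIGGERS = {"in", "near", "at", "from"}

def classify(query):
    words = query.lower().split()
    for i, w in enumerate(words):
        if w in TRIGGERS and i + 1 < len(words) and words[i + 1] in PLACES:
            return {"local": True,
                    "topic": " ".join(words[:i]),
                    "where": words[i + 1]}
    return {"local": False, "topic": query.lower(), "where": None}

print(classify("hotels near Hyderabad"))   # geographic query
print(classify("open source licenses"))    # non-geographic query
```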

    Esfinge at CLEF 2008: Experimenting with answer retrieval patterns. Can they help?

    Esfinge is a general domain Portuguese question answering system which has been participating in QA@CLEF since 2004. It uses the information available in the "official" document collections used in QA@CLEF (newspaper text and Wikipedia), but it also uses information from the Web as an additional resource when searching for answers. As regards external tools, Esfinge uses a syntactic analyzer, a morphological analyzer, and a named entity recognizer. This year an alternative approach to retrieving answers was tested: whereas in previous years search patterns were used only to retrieve relevant documents, this year a new type of search pattern was also used to extract the answers themselves. Besides that, we took advantage of the main novelty introduced this year by the QA@CLEF organization: systems could return up to three answers per question, instead of the single answer allowed in previous editions. This enabled an investigation of how good the second and third answers returned by Esfinge are (when the first answer is not correct). The experiments revealed that the answer retrieval patterns created for this participation improve the results, but only for definition questions. Regarding the study of the three answers returned by Esfinge, the conclusion was that when Esfinge answers a question correctly, it usually does so with its first answer.
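
    To illustrate what an answer retrieval pattern for definition questions might look like, the sketch below extracts "X is a Y" phrases with a regular expression. The pattern is an illustrative guess, not one of Esfinge's actual patterns (which operate on Portuguese text).

```python
import re

# Hedged sketch of an answer retrieval pattern for definition questions
# ("Who/What is X?"): a regex anchored on the question target pulls the
# candidate answer directly from retrieved text. Illustrative only; this
# is not one of Esfinge's real (Portuguese) patterns.

def definition_answers(target, passages):
    # Matches "<target> is a/an/the <answer phrase>" up to a clause break.
    pattern = re.compile(
        re.escape(target) + r"\s+is\s+(?:a|an|the)\s+([^,.;]+)",
        re.IGNORECASE)
    answers = []
    for text in passages:
        answers += [m.group(1).strip() for m in pattern.finditer(text)]
    return answers

passages = ["Esfinge is a general domain Portuguese question answering system."]
print(definition_answers("Esfinge", passages))
# ['general domain Portuguese question answering system']
```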