    Report of MIRACLE team for the Ad-Hoc track in CLEF 2006

    This paper presents the MIRACLE team's approach to the 2006 Ad-Hoc Information Retrieval track. The experiments for this campaign continue to test our IR approach. First, a baseline set of runs is obtained using standard components: stemming, transformation, filtering, entity detection and extraction, and others. Then, an extended set of runs is obtained using several types of combinations of these baseline runs. The improvements introduced for this campaign are few: we have integrated a prototype entity recognition and indexing tool into our tokenizing scheme, and we have run more combination experiments for the robust multilingual case than in previous campaigns. However, no significant improvements have been achieved. For this campaign, runs were submitted for the following languages and tracks:
    - Monolingual: Bulgarian, French, Hungarian, and Portuguese.
    - Bilingual: English to Bulgarian, French, Hungarian, and Portuguese; Spanish to French and Portuguese; and French to Portuguese.
    - Robust monolingual: German, English, Spanish, French, Italian, and Dutch.
    - Robust bilingual: English to German, Italian to Spanish, and French to Dutch.
    - Robust multilingual: English to the robust monolingual languages.
    We still need to work harder to improve some aspects of our processing scheme, the most important of which, to our knowledge, is entity recognition and normalization.

    Combining visual and textual systems within the context of user feedback

    It has been shown experimentally that a combination of textual and visual representations can improve retrieval performance ([20], [23]). This is because the textual and visual feature spaces often represent complementary yet correlated aspects of the same image, thus forming a composite system. In this paper, we present a model for combining visual and textual sub-systems within the user feedback context. The model was inspired by the notion of measurement in quantum mechanics (QM) and by the tensor product of co-occurrence (density) matrices, which represents the density matrix of a composite system in QM. It provides a sound and natural framework for seamlessly integrating multiple feature spaces by treating them as a composite system, as well as a new way of measuring the relevance of an image with respect to a context. The proposed approach takes into account both intra-space relationships (via co-occurrence matrices) and inter-space relationships (via the tensor operator) between feature dimensions. It is also computationally cheap and scalable to large data collections. We test our approach on the ImageCLEF2007photo data collection and present interesting findings.
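
    The abstract does not give the model's formulas, so the following is only a minimal sketch of the general idea under stated assumptions: each feature space gets a trace-normalised co-occurrence (density) matrix built from the feature vectors of feedback images, the composite system is their tensor (Kronecker) product, and an image is scored by the QM-style expectation value of that composite matrix. All function names, vectors, and the scoring rule itself are hypothetical illustrations, not the authors' implementation.

```python
def outer(v):
    """Outer product v v^T of a real vector with itself."""
    return [[x * y for y in v] for x in v]


def density_matrix(vectors):
    """Trace-normalised sum of outer products (assumed co-occurrence matrix)."""
    n = len(vectors[0])
    total = [[0.0] * n for _ in range(n)]
    for v in vectors:
        o = outer(v)
        for i in range(n):
            for j in range(n):
                total[i][j] += o[i][j]
    trace = sum(total[i][i] for i in range(n))
    return [[x / trace for x in row] for row in total]


def kron(a, b):
    """Kronecker (tensor) product of two square matrices."""
    nb = len(b)
    size = len(a) * nb
    return [
        [a[i // nb][j // nb] * b[i % nb][j % nb] for j in range(size)]
        for i in range(size)
    ]


def kron_vec(u, v):
    """Tensor product of two vectors, ordered consistently with kron()."""
    return [x * y for x in u for y in v]


def expectation(rho, state):
    """Expectation value <state| rho |state> for a real state vector."""
    n = len(state)
    return sum(state[i] * rho[i][j] * state[j] for i in range(n) for j in range(n))


# Hypothetical unit-length feature vectors of images the user marked relevant.
visual_feedback = [[1.0, 0.0], [0.6, 0.8]]
textual_feedback = [[0.0, 1.0]]

rho_visual = density_matrix(visual_feedback)
rho_textual = density_matrix(textual_feedback)
rho_composite = kron(rho_visual, rho_textual)

# Hypothetical candidate image, described in both feature spaces.
candidate_visual = [1.0, 0.0]
candidate_textual = [0.0, 1.0]
score = expectation(rho_composite, kron_vec(candidate_visual, candidate_textual))
```

    For a product state such as this one, the composite score factorises into the product of the two per-space expectation values, so in this sketch the full Kronecker product never needs to be materialised for large feature spaces.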

    Experiences in evaluating multilingual and text-image information retrieval

    23 pages, 8 figures. One important step in the development of information retrieval (IR) processes is evaluating the output against the information needs of the user. The "high quality" of the output is related to the integration of the different methods applied in the IR process and to the information included in the retrieved documents, but how can "quality" be measured? Although some of these methods can be tested in a stand-alone way, it is not always clear what will happen when several methods are integrated. For this reason, much effort has been put into establishing a good combination of several methods or into correctly tuning some of the algorithms involved. The current approach is to measure the precision and recall figures obtained when different combinations of methods are included in an IR process. In this article, a short description of the techniques and methods currently included in an IR system is given, paying special attention to the multilingual aspect of the problem. Their influence on the final performance of the IR process is also discussed, drawing on previous experience with the evaluation process followed in two projects related to multilingual information retrieval (MIRACLE and OmniPaper). This work has been partially supported by the projects OmniPaper (European Union, 5th Framework Programme for Research and Technological Development, IST-2001-32174), NEDINE (E-Content project Ref.: 22225), and GPS Project (Software Process Management Platform: modeling, reuse, and measurement; National Research Plan, TIN2004-07083).
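
    As a small illustration of the precision and recall figures mentioned above, the following sketch computes both for a single query using their standard set-based definitions. The function name and the document identifiers are hypothetical and are not taken from the MIRACLE or OmniPaper evaluations.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = relevant documents retrieved / documents retrieved
    recall    = relevant documents retrieved / relevant documents
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = len(retrieved_set & relevant_set)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Hypothetical identifiers, for illustration only.
retrieved = ["d1", "d2", "d3", "d4"]   # what one system configuration returned
relevant = ["d2", "d4", "d7"]          # what the assessors judged relevant

p, r = precision_recall(retrieved, relevant)
```

    Running the same function over the output of each combination of methods gives comparable figures per configuration, which is the kind of comparison the abstract describes.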