
    Sheffield University CLEF 2000 submission - bilingual track: German to English

    We investigated dictionary-based cross-language information retrieval using lexical triangulation. Lexical triangulation combines the results of different transitive translations. Transitive translation uses a pivot language to translate between two languages when no direct translation resource is available. We took German queries and translated them via Spanish or Dutch into English. We compared the results of retrieval experiments using these queries with other versions created by combining the transitive translations or by direct translation. Direct dictionary translation of a query introduces considerable ambiguity that damages retrieval: average precision was 79% below monolingual in this research. Transitive translation introduces more ambiguity, giving results worse than 88% below direct translation. We have shown that lexical triangulation between two transitive translations can eliminate much of the additional ambiguity introduced by transitive translation.
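
The idea of lexical triangulation described in this abstract can be illustrated with a minimal sketch. It assumes that "combining" two transitive translations means intersecting the English candidate sets obtained through each pivot language; the abstract does not spell out the exact combination rule. The dictionaries, their entries, and the function names `transitive` and `triangulate` are hypothetical toy values, not the resources used in the paper.

```python
from typing import Dict, List, Set, Tuple

Lexicon = Dict[str, Set[str]]

# Hypothetical toy bilingual dictionaries (illustrative entries only).
DE_ES: Lexicon = {"schloss": {"castillo", "cerradura"}}
ES_EN: Lexicon = {"castillo": {"castle", "fortress"}, "cerradura": {"lock"}}
DE_NL: Lexicon = {"schloss": {"kasteel", "slot"}}
NL_EN: Lexicon = {"kasteel": {"castle"}, "slot": {"lock", "castle", "conclusion"}}


def transitive(term: str, src_to_pivot: Lexicon, pivot_to_tgt: Lexicon) -> Set[str]:
    """Translate a source term into the target language through one pivot language."""
    candidates: Set[str] = set()
    for pivot_word in src_to_pivot.get(term, set()):
        candidates |= pivot_to_tgt.get(pivot_word, set())
    return candidates


def triangulate(term: str, routes: List[Tuple[Lexicon, Lexicon]]) -> Set[str]:
    """Keep only the target words that every pivot route agrees on (assumed rule)."""
    per_route = [transitive(term, first, second) for first, second in routes]
    if not per_route:
        return set()
    return set.intersection(*per_route)


via_spanish = transitive("schloss", DE_ES, ES_EN)
via_dutch = transitive("schloss", DE_NL, NL_EN)
combined = triangulate("schloss", [(DE_ES, ES_EN), (DE_NL, NL_EN)])
print(sorted(via_spanish), sorted(via_dutch), sorted(combined))
```

In this toy case each pivot route contributes one spurious candidate ("fortress" via Spanish, "conclusion" via Dutch), and the intersection drops both while keeping the translations the two routes share.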

    Cross-lingual document retrieval categorisation and navigation based on distributed services

    The widespread use of the Internet across countries has increased the need for access to document collections that are often written in languages different from a user’s native language. In this paper we describe Clarity, a Cross Language Information Retrieval (CLIR) system for English, Finnish, Swedish, Latvian and Lithuanian. Clarity is a fully-fledged retrieval system that supports the user during the whole process of query formulation, text retrieval and document browsing. We address four of the major aspects of Clarity: (i) the user-driven methodology that formed the basis for the iterative design cycle and framework in the project, (ii) the system architecture that was developed to support the interaction and coordination of Clarity’s distributed services, (iii) the data resources and methods for query translation, and (iv) the support for Baltic languages. Clarity is an example of a distributed CLIR system built with minimal translation resources and, to our knowledge, the only such system that currently supports Baltic languages.

    Keep It Simple Sheffield – a KISS approach to the Arabic track

    Sheffield’s participation in the inaugural Arabic cross-language track is described here. Our goal was to examine how well one could retrieve Arabic text with the minimum of resources and adaptation of existing retrieval systems. To this end, the public translators used for query translation and the minimal changes to our retrieval system are described. While the effectiveness of our resulting system is not as high as one might desire, it nevertheless provides reasonable performance, particularly in the monolingual track: on average, just under four relevant documents were found in the 10 top-ranked documents.

    Transitive probabilistic CLIR models

    Transitive translation could be a useful technique to enlarge the number of supported language pairs for a cross-language information retrieval (CLIR) system in a cost-effective manner. The paper describes several setups for transitive translation based on probabilistic translation models. The transitive CLIR models were evaluated on the CLEF test collection and yielded a retrieval effectiveness up to 83% of monolingual performance, which is significantly better than a baseline using the synonym operator.
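
One plausible way to chain two probabilistic translation models through a pivot language is to marginalise over the pivot words, P(t | s) = Σ_p P(p | s) · P(t | p). The sketch below assumes that the target word depends on the source word only through the pivot word; the abstract does not say which of its "several setups" work this way, so this is an illustration rather than the paper's method. The probability tables and the function name `compose` are hypothetical.

```python
from collections import defaultdict
from typing import Dict

ProbTable = Dict[str, Dict[str, float]]

# Hypothetical toy translation probabilities P(pivot | source) and P(target | pivot).
DE_NL: ProbTable = {"haus": {"huis": 0.9, "woning": 0.1}}
NL_EN: ProbTable = {
    "huis": {"house": 0.8, "home": 0.2},
    "woning": {"home": 0.5, "dwelling": 0.5},
}


def compose(src_to_pivot: ProbTable, pivot_to_tgt: ProbTable) -> ProbTable:
    """Chain two translation models by summing over all pivot words."""
    composed: ProbTable = {}
    for source_word, pivots in src_to_pivot.items():
        scores: Dict[str, float] = defaultdict(float)
        for pivot_word, p_pivot in pivots.items():
            for target_word, p_target in pivot_to_tgt.get(pivot_word, {}).items():
                scores[target_word] += p_pivot * p_target
        composed[source_word] = dict(scores)
    return composed


DE_EN = compose(DE_NL, NL_EN)
for target_word, prob in sorted(DE_EN["haus"].items(), key=lambda kv: -kv[1]):
    print(f"{target_word}: {prob:.2f}")
```

Because each toy table is normalised, the composed distribution for "haus" also sums to one. With real, incomplete dictionaries some pivot words would have no onward translation, and the resulting probability mass would need renormalising.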

    Beyond English text: Multilingual and multimedia information retrieval


    Embedding Web-based Statistical Translation Models in Cross-Language Information Retrieval

    Although more and more language pairs are covered by machine translation services, there are still many pairs that lack translation resources. Cross-language information retrieval (CLIR) is an application which needs translation functionality of a relatively low level of sophistication, since current models for information retrieval (IR) are still based on a bag-of-words. The Web provides a vast resource for the automatic construction of parallel corpora which can be used to train statistical translation models automatically. The resulting translation models can be embedded in several ways in a retrieval model. In this paper, we will investigate the problem of automatically mining parallel texts from the Web and different ways of integrating the translation models within the retrieval process. Our experiments on standard test collections for CLIR show that the Web-based translation models can surpass commercial MT systems in CLIR tasks. These results open the perspective of constructing a fully automatic query translation device for CLIR at a very low cost.

    Adaptation of the Wound Healing Questionnaire universal-reporter outcome measure for use in global surgery trials (TALON-1 study): mixed-methods study and Rasch analysis

    Background: The Bluebelle Wound Healing Questionnaire (WHQ) is a universal-reporter outcome measure developed in the UK for remote detection of surgical-site infection after abdominal surgery. This study aimed to explore cross-cultural equivalence, acceptability, and content validity of the WHQ for use across low- and middle-income countries, and to make recommendations for its adaptation. Methods: This was a mixed-methods study within a trial (SWAT) embedded in an international randomized trial, conducted according to best practice guidelines, and co-produced with community and patient partners (TALON-1). Structured interviews and focus groups were used to gather data regarding cross-cultural, cross-contextual equivalence of the individual items and scale, and to conduct a translatability assessment. Translation was completed into five languages in accordance with Mapi recommendations. Next, data from a prospective cohort (SWAT) were interpreted using Rasch analysis to explore scaling and measurement properties of the WHQ. Finally, qualitative and quantitative data were triangulated using a modified, exploratory, instrumental design model. Results: In the qualitative phase, 10 structured interviews and six focus groups took place with a total of 47 investigators across six countries. Themes related to comprehension, response mapping, retrieval, and judgement were identified with rich cross-cultural insights. In the quantitative phase, an exploratory Rasch model was fitted to data from 537 patients (369 excluding extremes). Owing to the number of extreme (floor) values, the overall level of power was low. The single WHQ scale satisfied tests of unidimensionality, indicating validity of the ordinal total WHQ score. There was significant overall model misfit of five items (5, 9, 14, 15, 16) and local dependency in 11 item pairs. The person separation index was estimated as 0.48, suggesting weak discrimination between classes, whereas Cronbach’s α was high at 0.86. Triangulation of qualitative data with the Rasch analysis supported recommendations for cross-cultural adaptation of the WHQ items 1 (redness), 3 (clear fluid), 7 (deep wound opening), 10 (pain), 11 (fever), 15 (antibiotics), 16 (debridement), 18 (drainage), and 19 (reoperation). Changes to three item response categories (1, not at all; 2, a little; 3, a lot) were adopted for symptom items 1 to 10, and two categories (0, no; 1, yes) for item 11 (fever). Conclusion: This study made recommendations for cross-cultural adaptation of the WHQ for use in global surgical research and practice, using co-produced mixed-methods data from three continents. Translations are now available for implementation into remote wound assessment pathways.