41,740 research outputs found
User experiments with the Eurovision cross-language image retrieval system
In this paper we present Eurovision, a text-based system for cross-language (CL) image retrieval.
The system is evaluated by multilingual users for two search tasks with the system configured in
English and five other languages. To our knowledge this is the first published set of user
experiments for CL image retrieval. We show that: (1) it is possible to create a usable multilingual
search engine using little knowledge of any language other than English, (2) categorizing images
assists the user's search, and (3) there are differences in the way users search between the proposed
search tasks. Based on the two search tasks and user feedback, we describe important aspects of
any CL image retrieval system
Embedding Web-based Statistical Translation Models in Cross-Language Information Retrieval
Although more and more language pairs are covered by machine translation
services, there are still many pairs that lack translation resources.
Cross-language information retrieval (CLIR) is an application which needs
translation functionality of a relatively low level of sophistication since
current models for information retrieval (IR) are still based on a
bag-of-words. The Web provides a vast resource for the automatic construction
of parallel corpora which can be used to train statistical translation models
automatically. The resulting translation models can be embedded in several ways
in a retrieval model. In this paper, we will investigate the problem of
automatically mining parallel texts from the Web and different ways of
integrating the translation models within the retrieval process. Our
experiments on standard test collections for CLIR show that the Web-based
translation models can surpass commercial MT systems in CLIR tasks. These
results open the perspective of constructing a fully automatic query
translation device for CLIR at a very low cost.Comment: 37 page
Disambiguation strategies for cross-language information retrieval
This paper gives an overview of tools and methods for Cross-Language Information Retrieval (CLIR) that are developed within the Twenty-One project. The tools and methods are evaluated with the TREC CLIR task document collection using Dutch queries on the English document base. The main issue addressed here is an evaluation of two approaches to disambiguation. The underlying question is whether a lot of effort should be put in finding the correct translation for each query term before searching, or whether searching with more than one possible translation leads to better results? The experimental study suggests that the quality of search methods is more important than the quality of disambiguation methods. Good retrieval methods are able to disambiguate translated queries implicitly during searching
User requirement elicitation for cross-language information retrieval
Who are the users of a cross-language retrieval system? Under what circumstances do they need to perform such multi-language searches? How will the task and the context
of use affect successful interaction with the system? Answers to these questions were explored in a user study performed as part of the design stages of Clarity, a EU
founded project on cross-language information retrieval. The findings resulted in a rethink of the planned user interface and a consequent expansion of the set of services
offered. This paper reports on the methodology and techniques used for the elicitation of user requirements as well as how these were in turn transformed into new design
solutions
Language Models
Contains fulltext :
227630.pdf (preprint version ) (Open Access
Dublin City University at CLEF 2004: experiments with the ImageCLEF St Andrew's collection
For the CLEF 2004 ImageCLEF St Andrew's Collection task
the Dublin City University group carried out three sets of experiments: standard cross-language information retrieval (CLIR) runs using topic translation via machine translation (MT), combination of this run with image matching results from the VIPER system, and a novel document rescoring approach based on automatic MT evaluation metrics. Our standard MT-based CLIR works well on this task. Encouragingly combination with image matching lists is also observed to produce small positive changes in the retrieval output. However, rescoring using the MT evaluation metrics in their current form significantly reduced retrieval
effectiveness
Multilingual search for cultural heritage archives via combining multiple translation resources
The linguistic features of material in Cultural Heritage (CH) archives may be in various languages requiring a facility for effective multilingual search. The specialised
language often associated with CH content introduces problems for automatic translation to support search applications. The MultiMatch project is focused on enabling
users to interact with CH content across different media types and languages. We present results from a MultiMatch study exploring various translation techniques for
the CH domain. Our experiments examine translation techniques for the English language CLEF 2006 Cross-Language
Speech Retrieval (CL-SR) task using Spanish, French and German queries. Results compare effectiveness of our query
translation against a monolingual baseline and show improvement when combining a domain-specific translation lexicon with a standard machine translation system
- …