
    Dublin City University at CLEF 2007: Cross-Language Speech Retrieval Experiments

    The Dublin City University participation in the CLEF 2007 CL-SR English task concentrated primarily on issues of topic translation. Our retrieval system used the BM25F model and pseudo relevance feedback. Topics were translated into English using the Yahoo! BabelFish free online service combined with domain-specific translation lexicons gathered automatically from Wikipedia. We explored alternative topic translation methods using these resources. Our results indicate that extending machine translation tools using automatically generated domain-specific translation lexicons can provide improved CLIR effectiveness for this task
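The general approach described above, preferring a domain-specific lexicon over a general-purpose MT service when translating topic terms, can be sketched as follows. This is a minimal illustrative sketch, not DCU's actual system: the function names, the toy lexicon, and the stand-in MT function are all invented for demonstration.

```python
# Sketch: augment general-purpose MT with a domain-specific lexicon
# (e.g. term pairs mined from Wikipedia cross-language links).
# All names and data here are hypothetical.

def translate_topic(topic_terms, domain_lexicon, mt_translate):
    """Translate each topic term, preferring the domain lexicon."""
    translated = []
    for term in topic_terms:
        if term in domain_lexicon:
            # Trusted domain-specific translation wins over generic MT.
            translated.append(domain_lexicon[term])
        else:
            # Fall back to the general MT service.
            translated.append(mt_translate(term))
    return " ".join(translated)

# Toy stand-ins for demonstration:
lexicon = {"Schoah": "Holocaust"}                      # mined pair (illustrative)
mt = lambda w: {"Zeugen": "witnesses"}.get(w, w)       # fake MT lookup

print(translate_topic(["Schoah", "Zeugen"], lexicon, mt))
# -> Holocaust witnesses
```

In practice the merge can also be done at the lexicon level (adding mined pairs to the MT system's dictionary) rather than as a pre-translation substitution; the abstract leaves the exact combination method open.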

    Efficient data retrieval for refining machine translations, using the Web as a knowledge base

    82 p.
    Refining machine translations aims to correct certain ambiguities produced by the algorithms currently used in translation engines, so that the resulting texts are more understandable to the end user. Our proposal takes the Web as its knowledge source, an environment where new information (magazines, books, theses, papers, research, news, and so on) is constantly being published in different parts of the world. In this study the minimum unit of translation is taken to be a phrase. The machinery required to carry out the refining process comprises three stages: (1) capturing textual information from the Web (this stage produces a phrase database that grows constantly); (2) indexing the collected information; and (3) similarity search algorithms that find the phrase best suited to the content (context) of the translated text. While the wider project studies and proposes techniques for refining automatic translations previously produced by translation engines (for instance, Google Translate, Yahoo! Babel Fish, Systran) into Spanish, this work focuses primarily on the efficient retrieval of textual information, reflected in one module of the application, as a means of supporting the refinement process: indexing and structuring the information obtained from the Web, using different data structures and algorithms that support the choice of the most appropriate phrases in order to improve translation quality.
    Keywords: Information Retrieval, Indexing, Inverted indexes, Query languages
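Stages (2) and (3), indexing the collected phrases and retrieving the most similar one, can be sketched with a minimal inverted index and a bag-of-words overlap score. This is an illustrative toy, assuming simple whitespace tokenisation; the thesis's actual structures and similarity algorithms are more sophisticated.

```python
from collections import defaultdict

def build_index(phrases):
    """Inverted index: word -> set of phrase ids containing it."""
    index = defaultdict(set)
    for pid, phrase in enumerate(phrases):
        for word in phrase.lower().split():
            index[word].add(pid)
    return index

def most_similar(query, phrases, index):
    """Return the phrase sharing the most words with the query, or None."""
    scores = defaultdict(int)
    for word in query.lower().split():
        for pid in index.get(word, ()):
            scores[pid] += 1
    if not scores:
        return None
    best = max(scores, key=scores.get)
    return phrases[best]

corpus = ["the bank of the river",
          "the central bank raised rates",
          "rivers flow to the sea"]
idx = build_index(corpus)
print(most_similar("bank interest rates", corpus, idx))
# -> the central bank raised rates
```

The inverted index makes the candidate lookup proportional to the number of phrases actually containing a query word, rather than to the whole collection, which is the efficiency concern the abstract emphasises.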

    "Revolution? What Revolution?" Successes and limits of computing technologies in philosophy and religion

    Computing technologies, like other technological innovations in the modern West, are inevitably introduced with the rhetoric of "revolution". Especially during the 1980s (the PC revolution) and 1990s (the Internet and Web revolutions), enthusiasts insistently celebrated radical changes, changes ostensibly inevitable and certainly as radical as those brought about by the invention of the printing press, if not the discovery of fire.
    These enthusiasms now seem very "1990s", in part as the revolution stumbled with the dot-com failures and the devastating impacts of 9/11. Moreover, as I will sketch out below, the patterns of diffusion and impact in philosophy and religion show both tremendous success, as certain revolutionary promises are indeed kept, and (sometimes spectacular) failures. Perhaps we use revolutionary rhetoric less frequently because the revolution has indeed succeeded: computing technologies, and many of the powers and potentials they bring us as scholars and religionists, have become so ubiquitous and normal that they no longer seem "revolutionary" at all. At the same time, many of the early hopes and promises instantiated in such specific projects as Artificial Intelligence and anticipations of virtual religious communities have been dashed against the apparently intractable limits of even these most remarkable technologies. While these failures are usually forgotten, they leave in their wake a clearer sense of what these new technologies can, and cannot, do

    Towards a Universal Wordnet by Learning from Combined Evidence

    Lexical databases are invaluable sources of knowledge about words and their meanings, with numerous applications in areas like NLP, IR, and AI. We propose a methodology for the automatic construction of a large-scale multilingual lexical database where words of many languages are hierarchically organized in terms of their meanings and their semantic relations to other words. This resource is bootstrapped from WordNet, a well-known English-language resource. Our approach extends WordNet with around 1.5 million meaning links for 800,000 words in over 200 languages, drawing on evidence extracted from a variety of resources including existing (monolingual) wordnets, (mostly bilingual) translation dictionaries, and parallel corpora. Graph-based scoring functions and statistical learning techniques are used to iteratively integrate this information and build an output graph. Experiments show that this wordnet has a high level of precision and coverage, and that it can be useful in applied tasks such as cross-lingual text classification
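The core idea of combining evidence from heterogeneous resources into a confidence score per candidate (word, synset) link can be illustrated with a weighted sum and a threshold. This is a deliberately simplified sketch: the weights, data, and acceptance rule are invented, and the paper's actual method uses graph-based scoring functions and iterative statistical learning rather than a single fixed linear combination.

```python
# Hypothetical evidence weights for three resource types.
EVIDENCE_WEIGHTS = {"wordnet": 1.0, "dictionary": 0.6, "parallel_corpus": 0.4}

def score_links(candidate_links):
    """candidate_links: {(word, synset): {resource_type: count}}.
    Returns a combined evidence score per candidate link."""
    scores = {}
    for link, evidence in candidate_links.items():
        scores[link] = sum(EVIDENCE_WEIGHTS[r] * n for r, n in evidence.items())
    return scores

# Toy candidates: German "Bank" could link to the financial sense or the seat sense.
candidates = {
    ("Bank_de", "bank.n.01"): {"dictionary": 2, "parallel_corpus": 3},
    ("Bank_de", "bench.n.01"): {"dictionary": 1},
}
scores = score_links(candidates)
accepted = {link for link, s in scores.items() if s >= 1.0}
print(accepted)
# -> {('Bank_de', 'bank.n.01')}
```

In the full method, accepted links feed back into the graph so that agreement among a word's other translations can promote or demote remaining candidates across iterations.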

    Translation and human-computer interaction

    This paper seeks to characterise translation as a form of human-computer interaction. The evolution of translator-computer interaction is explored and the challenges and benefits are enunciated. The concept of cognitive ergonomics is drawn on to argue for a more caring and inclusive approach towards the translator by developers of translation technology. A case is also made for wider acceptance by the translation community of the benefits of the technology at their disposal and for more humanistic research on the impact of technology on the translator, the translation profession and the translation process

    Digital libraries and minority languages

    Digital libraries have a pivotal role to play in the preservation and maintenance of international cultures in general and minority languages in particular. This paper outlines a software tool for building digital libraries that is well adapted for creating and distributing local information collections in minority languages, and describes some contexts in which it is used. The system can make multilingual documents available in structured collections and allows them to be accessed via multilingual interfaces. It is issued under a free open-source licence, which encourages participatory design of the software, and an end-user interface allows community-based localization of the various language interfaces - of which there are many

    Observing Users, Designing Clarity: a case study on the user-centred design of a cross-language information retrieval system

    This paper presents a case study of the development of an interface to a novel and complex form of document retrieval: searching for texts written in foreign languages based on native language queries. Although the underlying technology for achieving such a search is relatively well understood, the appropriate interface design is not. A study involving users (with such searching needs) from the start of the design process is described, covering initial examination of user needs and tasks; preliminary design and testing of interface components; and building, testing, and further refining an interface, before finally conducting usability tests of the system. Lessons are learned at every stage of the process, leading to a much more informed view of how such an interface should be built