Search CORE

66 research outputs found

Proceedings of the 6th Dutch-Belgian Information Retrieval Workshop

Publication venue: Neslia Paniculata
Publication date: 01/03/2006

University of Twente Research Information

Language technology in multimedia information retrieval: Proceedings of the fourteenth International Twente Workshop on Language Technology

Publication venue: 'University Library/University of Twente'
Publication date: 01/12/1998

University of Twente Research Information

Word-to-Word Models of Translational Equivalence

Author: Melamed I. Dan
Publication date: 01/01/1997

Parallel texts (bitexts) have properties that distinguish them from other kinds of parallel data. First, most words translate to only one other word. Second, bitext correspondence is noisy. This article presents methods for biasing statistical translation models to reflect these properties. Analysis of the expected behavior of these biases in the presence of sparse data predicts that they will result in more accurate models. The prediction is confirmed by evaluation with respect to a gold standard: translation models that are biased in this fashion are significantly more accurate than a baseline knowledge-poor model. This article also shows how a statistical translation model can take advantage of various kinds of pre-existing knowledge that might be available about particular language pairs. Even the simplest kinds of language-specific knowledge, such as the distinction between content words and function words, are shown to reliably boost translation model performance on some tasks. Statistical models that are informed by pre-existing knowledge about the model domain combine the best of both the rationalist and empiricist traditions.

arXiv.org e-Print Archive

CiteSeerX

Multilingual modelling of cross-lingual spelling variants

Author: Linden Krister
Publication date: 01/01/2006

Peer reviewed

CiteSeerX

Helsingin yliopiston digitaalinen arkisto

Quantifying Cross-lingual Semantic Similarity for Natural Language Processing Applications

Author: Wäschle Katharina
Publication date: 01/01/2015

Translation and cross-lingual access to information are key technologies in a global economy. Even though the quality of machine translation (MT) output is still far from the level of human translations, many real-world applications have emerged for which MT can be employed. Machine translation supports human translators in computer-assisted translation (CAT), providing the opportunity to improve translation systems based on human interaction and feedback. Besides, many tasks that involve natural language processing operate in a cross-lingual setting, where there is no need for perfectly fluent translations and the transfer of meaning can be modeled by employing MT technology. This thesis describes cumulative work in the field of cross-lingual natural language processing in a user-oriented setting. A common denominator of the presented approaches is their anchoring in an alignment between texts in two different languages to quantify the similarity of their content.

Heidelberger Dokumentenserver

Chinese-English Cross-Lingual Information Retrieval in Biomedicine Using Ontology-Based Query Expansion

Author: Wang Xinkai
Publication date: 01/08/2012

The University of Manchester - Institutional Repository

Toponym Disambiguation in Information Retrieval

Author: Buscaldi Davide
Publication venue: 'Universitat Politecnica de Valencia'
Publication date: 12/11/2010

In recent years, geography has acquired great importance in the context of Information Retrieval (IR) and, in general, of the automated processing of information in text. Mobile devices that can surf the web and at the same time report their position are now a common reality, together with applications that can exploit this data to provide users with locally customised information, such as directions or advertisements. Therefore, it is important to deal properly with the geographic information that is included in electronic texts. Most of this information takes the form of place names, or toponyms. Toponym ambiguity represents an important issue in Geographical Information Retrieval (GIR), because queries are geographically constrained. There has been a struggle to find specific geographical IR methods that actually outperform traditional IR techniques. Toponym ambiguity may constitute a relevant factor in the inability of current GIR systems to take advantage of geographical knowledge. Recently, some Ph.D. theses have dealt with Toponym Disambiguation (TD) from different perspectives, from the development of resources for the evaluation of Toponym Disambiguation (Leidner (2007)) to the use of TD to improve geographical scope resolution (Andogah (2010)). The Ph.D. thesis presented here introduces a TD method based on WordNet and carries out a detailed study of the relationship of Toponym Disambiguation to some IR applications, such as GIR, Question Answering (QA) and Web retrieval. The work presented in this thesis starts with an introduction to the applications in which TD may prove useful, together with an analysis of the ambiguity of toponyms in news collections. It would not be possible to study the ambiguity of toponyms without studying the resources that are used as placename repositories; these resources are the equivalent of language dictionaries, which provide the different meanings of a given word.

Buscaldi, D. (2010). Toponym Disambiguation in Information Retrieval [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/8912 Palanci

RiuNet

Proceedings of the Third Dutch-Belgian Information Retrieval Workshop (DIR 2002)

Author: Moens M.F.
Publication venue: Katholieke Universiteit Leuven
Publication date: 06/12/2002

University of Twente Research Information