Translation prediction using word cooccurrence graphs

By Marianna Apidianaki

Abstract: Word sense disambiguation (WSD) is a thorny subject in natural language processing. It is implicated in many NLP tasks to varying degrees, where it usually constitutes an intermediate stage of processing rather than a goal in itself. Applications related to translation (machine translation, bilingual lexicon building, etc.) are highly concerned with WSD. The polysemy of source language (SL) and target language (TL) lexical items influences the strategies adopted during the translation process and the final translation choices. It also complicates the detection of relations between polysemous items and their equivalents in texts, which are rarely one-to-one. The strategies imposed on the translator by SL lexical items are described in Salkie (2002) as a continuum, which ranges from those imposed by items that are always translated in the same way in the TL, and are thus translationally systematic, to those imposed by translationally asystematic items, which are translated differently every time they occur in texts. These last cases, as well as intermediate ones, are quite demanding for human or automatic treatment, and they often require the resolution of lexical polysemy. Observations drawn from bilingual or multilingual translation corpora (consisting of original texts in one language and their translations in one or more other languages) can provide new insights into these questions. Corpus work offers the possibility of empirically testing the validity of well-established assumptions about language, providing a better understanding of actual phenomena. The greater availability of corpora of this type makes it possible to extend this kind of analysis to the field of translation as well as to that of contrastive language studies. In this paper, we will propose a method of translation prediction for polysemous lexical items.
The first step in this process will be the disambiguation of SL items, which will subsequently permit the definition of fine-grained translation correspondences. Thus, when there are multiple translation candidates for new occurrences of polysemous items, the most suitable translation will be found by a combination of monolingual and bilingual information. But let us first take a look at the way polysemy is considered in a translation context.

Topics: disambiguation, translation prediction, polysemy, cooccurrence, [SHS.LANGUE] Humanities and Social Sciences/Linguistics, [SCCO.LING] Cognitive science/Linguistics, [SCCO.COMP] Cognitive science/Computer science, [INFO.INFO-CL] Computer Science [cs]/Computation and Language [cs.CL]
Publisher: HAL CCSD
Year: 2005
OAI identifier: oai:HAL:halshs-00010277v1
Provided by: Hal-Diderot
Download PDF:
Sorry, we are unable to provide the full text but you may find it at the following location(s):
  • https://halshs.archives-ouvert... (external link)