
    A Pattern Matching method for finding Noun and Proper Noun Translations from Noisy Parallel Corpora

    We present a pattern matching method for compiling a bilingual lexicon of nouns and proper nouns from unaligned, noisy parallel texts of Asian/Indo-European language pairs. Tagging information from one language is used. Word frequency and position information for high- and low-frequency words are represented in two different vector forms for pattern matching. New anchor point finding and noise elimination techniques are introduced. We obtained a precision of 73.1%. We also show how the results can be used in the compilation of domain-specific noun phrases. Comment: 8 pages, uuencoded compressed postscript file. To appear in the Proceedings of the 33rd AC

    MT and Proper Nouns : how a German Model Became a Boat Operator

    Writers and translators have difficulty treating proper nouns correctly. These designations represent concepts that are very likely not common knowledge. While humans can research a concept, machines can only apply the data they are given. It is therefore important that proper nouns are documented in term bases and made available to MT engines.

    Properhood

    A history of the notion of PROPERHOOD in philosophy and linguistics is given. Two long-standing ideas, (i) that proper names have no sense, and (ii) that they are expressions whose purpose is to refer to individuals, cannot be made to work comprehensively while PROPER is understood as a subcategory of linguistic units, whether of lexemes or phrases. Phrases of the type "the old vicarage", which are potentially ambiguous with regard to properhood, encourage the suggestion that PROPER is best understood as a mode of reference contrasting with SEMANTIC reference; in the former, the intension/sense of any lexical items within the referring expression, and any entailments they give rise to, are canceled. PROPER NAMES are all those expressions that refer nonintensionally. Linguistic evidence is given that this opposition can be grammaticalized, speculation is made about its neurological basis, and psycholinguistic evidence is adduced in support. The PROPER NOUN, as a lexical category, is argued to be epiphenomenal on proper names as newly defined. Some consequences of the view that proper names have no sense in the act of reference are explored; they are not debarred from having senses (better: synchronic etymologies) accessible during other (meta)linguistic activities.

    Annotation guidelines for labeling English-Dutch cognate pairs (version 1.0)


    Recognition and translation Arabic-French of Named Entities: case of the Sport places

    The recognition of Arabic Named Entities (NE) is a problem in several areas of Natural Language Processing (NLP), such as automatic translation. NE translation gives access to multilingual information, but it does not always produce the expected result, especially when the NE contains a person name. For this reason, and in order to improve translation, some parts of an NE can be transliterated. In this context, we propose a method that integrates translation and transliteration. We used the linguistic platform NooJ, which is based on local grammars and transducers. In this paper, we focus on the sport domain. We first suggest a refinement of the typological model presented at the MUC conferences, and then describe the integration of an Arabic transliteration module into a translation system. Finally, we detail our method and give the results of the evaluation.

    On the Similarities Between Native, Non-native and Translated Texts

    We present a computational analysis of three language varieties: native, advanced non-native, and translation. Our goal is to investigate the similarities and differences between non-native language productions and translations, contrasting both with native language. Using a collection of computational methods, we establish three main results: (1) the three types of texts are easily distinguishable; (2) non-native language and translations are closer to each other than each of them is to native language; and (3) some of these characteristics depend on the source or native language, while others do not, reflecting, perhaps, unified principles that similarly affect translations and non-native language. Comment: ACL2016, 12 pages