    Definiteness and number in Japanese to German machine translation

    One of the significant problems in machine translation from Japanese into German is the missing information on number and definiteness in the Japanese analysis output. An efficient solution to this problem is to integrate the search for the relevant information into the transfer process. Transfer rules are combined with preference rules and default rules. This makes information about lexical restrictions of the target language, about the domain, and about the discourse accessible.

    Preferences and defaults for definiteness and number in Japanese to German machine translation

    A significant problem when translating Japanese dialogues into German is the missing information on number and definiteness in the Japanese analysis output. Integrating the search for such information into the transfer process provides an efficient solution. General transfer includes conditions that make it possible to consider external knowledge. Grammatical and lexical knowledge of the source language, knowledge of lexical restrictions on the target language, domain knowledge, and discourse knowledge thereby become accessible.

    Example-based machine translation of the Basque language

    Basque is both a minority language and a highly inflected language with free order of sentence constituents. Machine translation of Basque is thus both a real need and a test bed for MT techniques. In this paper, we present a modular data-driven MT system which includes different chunkers as well as chunk aligners that can deal with the free order of sentence constituents in Basque. We conducted Basque to English translation experiments, evaluated on a large corpus (270,000 sentence pairs). The experimental results show that our system significantly outperforms state-of-the-art approaches according to several common automatic evaluation metrics.

    A Joint Matrix Factorization Analysis of Multilingual Representations

    We present an analysis tool based on joint matrix factorization for comparing latent representations of multilingual and monolingual models. An alternative to probing, this tool allows us to analyze multiple sets of representations in a joint manner. Using this tool, we study to what extent and how morphosyntactic features are reflected in the representations learned by multilingual pre-trained models. We conduct a large-scale empirical study of over 33 languages and 17 morphosyntactic categories. Our findings demonstrate variations in the encoding of morphosyntactic information across upper and lower layers, with category-specific differences influenced by language properties. Hierarchical clustering of the factorization outputs yields a tree structure that is related to phylogenetic trees manually crafted by linguists. Moreover, we find the factorization outputs exhibit strong associations with performance observed across different cross-lingual tasks. We release our code to facilitate future research.
    Comment: Accepted to Findings of EMNLP 202

    Cross-linguistic differences and similarities in image descriptions

    Automatic image description systems are commonly trained and evaluated on large image description datasets. Recently, researchers have started to collect such datasets for languages other than English. An unexplored question is how different these datasets are from English and, if there are any differences, what causes them to differ. This paper provides a cross-linguistic comparison of Dutch, English, and German image descriptions. We find that these descriptions are similar in many respects, but the familiarity of crowd workers with the subjects of the images has a noticeable influence on description specificity.
    Comment: Accepted for INLG 2017, Santiago de Compostela, Spain, 4-7 September, 2017. Camera-ready version. See the ACL anthology for full bibliographic information.

    Definiteness across languages

    Definiteness has been a central topic in theoretical semantics since its modern foundation. However, despite its significance, there has been surprisingly little research on its cross-linguistic expression. To help fill this gap, the present volume gathers thirteen studies exploiting insights from formal semantics and syntax, typological and language-specific studies, and, crucially, semantic fieldwork and cross-linguistic semantics, in order to address the expression and interpretation of definiteness in a diverse group of languages, most of them understudied. The papers presented in this volume aim to establish a dialogue between theory and data in order to answer the following questions: What formal strategies do natural languages employ to encode definiteness? What are the possible meanings associated with this notion across languages? Are there different types of definite reference? Which other functions (besides marking definite reference) are associated with definite descriptions? Each of the papers contained in this volume addresses at least one of these questions and, in doing so, aims to enrich our understanding of definiteness.
