725 research outputs found

    Multilingual domain modeling in Twenty-One: automatic creation of a bi-directional translation lexicon from a parallel corpus

    Within the project Twenty-One, which aims at the effective dissemination of information on ecology and sustainable development, a system is being developed that supports cross-language information retrieval in any of the four languages Dutch, English, French and German. Knowledge of this application domain is needed to enhance existing translation resources for the purpose of lexical disambiguation. This paper describes an algorithm for the automated acquisition of a translation lexicon from a parallel corpus. What is new in the presented algorithm is the statistical language model it uses. Because the algorithm is based on a symmetric translation model, it becomes possible to identify one-to-many and many-to-one relations between words of a language pair. We claim that the presented method has two advantages over previously published algorithms. Firstly, because the translation model is more powerful, the resulting bilingual lexicon will be more accurate. Secondly, the resulting bilingual lexicon can be used to translate in both directions between a language pair. Different versions of the algorithm were evaluated on the Dutch and English versions of the Agenda 21 corpus, which is a UN document on the application domain of sustainable development.

    Integrated Use of Internal and External Evidence in the Alignment of Multi-Word Named Entities

    This paper proposes a method of extracting English multi-word named entities and their Japanese equivalents from a parallel corpus. The aim of our research is to extract multi-word named entities which are not listed in the dictionary of an English-to-Japanese MT system and which appear infrequently in a parallel corpus. Our method makes its alignment on the basis of two kinds of external evidence provided by the context in which a bilingual pair appears, as well as two kinds of internal evidence within the pair. Each kind of evidence is accompanied by a score, and the aggregate score is computed as a weighted sum of the scores. The appropriate weights are estimated by logistic regression analysis. An experiment using a parallel corpus of Yomiuri Shimbun and The Daily Yomiuri gave satisfactory results: 86.36% of the extracted bilingual pairs with the highest scores were judged to be correct.
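    The scoring scheme described in this abstract (a weighted sum of four evidence scores, with weights estimated by logistic regression) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the plain gradient-descent fit, and the toy training data are all hypothetical, and the abstract does not say how the individual evidence scores are computed.

```python
import math

def logistic(z):
    """Standard logistic (sigmoid) function."""
    return 1.0 / (1.0 + math.exp(-z))

def aggregate_score(evidence, weights, bias=0.0):
    """Aggregate score: weighted sum of the evidence scores plus a bias."""
    return bias + sum(w * s for w, s in zip(weights, evidence))

def fit_weights(samples, labels, lr=0.5, epochs=2000):
    """Estimate weights by logistic regression (batch gradient descent).

    Assumption: a simple hand-written fit stands in for whatever
    statistical package the original study used.
    """
    n = len(samples[0])
    m = len(samples)
    weights = [0.0] * n
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n
        grad_b = 0.0
        for x, y in zip(samples, labels):
            # Prediction error for this candidate pair.
            err = logistic(aggregate_score(x, weights, bias)) - y
            for i in range(n):
                grad_w[i] += err * x[i]
            grad_b += err
        weights = [w - lr * g / m for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / m
    return weights, bias

# Hypothetical toy data: each row holds four evidence scores
# (two "external", two "internal") for one candidate bilingual pair;
# label 1 means the pair was judged correct, 0 incorrect.
samples = [
    [0.9, 0.8, 0.7, 0.9],
    [0.8, 0.6, 0.9, 0.7],
    [0.2, 0.1, 0.3, 0.2],
    [0.1, 0.3, 0.2, 0.1],
]
labels = [1, 1, 0, 0]

weights, bias = fit_weights(samples, labels)

# Rank candidate pairs by aggregate score, highest first.
ranked = sorted(samples,
                key=lambda x: aggregate_score(x, weights, bias),
                reverse=True)
```

    Ranking by the aggregate score mirrors the abstract's evaluation of "the extracted bilingual pairs with the highest scores"; passing the score through `logistic` gives a value that can be read as an estimated probability that a pair is correct.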

    Corpus language input, corpus processes in learning, learner corpus product. Introduction


    Developing Communicative and Textual Competence through Genres

    In recent years the concept of translation competence has steadily gained acceptance up to the point where it has now become the most widely discussed issue in relation to translator training. Proof of this can be seen, for example, in the work carried out by Hurtado in the PACTE group (2001) or that of Kelly (2002, 2005, 2006). Translation competence is a complex, multifaceted concept that takes in a number of different aspects. Many researchers have adapted the literary studies tradition focused on text genres to both the field of linguistics and language teaching (Swales, 1990, and Bhatia, 1993, among others) and to translation (Hatim and Mason, 1990; or, for example, the work of the GENTT team, and more especially García Izquierdo, ed. 2005). In this article we reconsider the value of the concept of text genre in translator training (and, therefore, in the make-up of translation competence), as well as in research on translation. Here, text genre is understood to be a conventionalised, and at the same time dynamic and hybrid, text form (Kress, 1985) that represents an interface between text and context, and between the source text and the target text (Montalt, 2003; GENTT, 2005). The aim of this study is to go a step further in this line of thinking and explore the relation between genre and translation competence, on the one hand, and the communicative and textual sub-competence, on the other (Kelly, 2005). Indeed, the value of the concept of text genre in the acquisition of translation competence has already been addressed in previous works (Montalt, 2003; Montalt, Ezpeleta and García de Toro, 2005; Ezpeleta, 2005; or García Izquierdo, 2005a). Now, as we have said above, translation competence is a multifaceted concept that is made up of a number of sub-competencies and we believe it is possible to define in greater detail exactly which particular translation sub-competencies could be acquired by using text genre as a teaching aid. 
    More specifically, the main hypothesis we will attempt to illustrate here is that this concept would be especially useful for acquiring what is known as communicative and textual sub-competence. The acquisition of translation competence is a gradual process that is strongly influenced by the degree of complexity of the texts/genres the translator is working with. The greater the complexity of the text, the higher the level of competence required of the translator. This explains why the relation between text genres and the communicative and textual sub-competence is also affected by the level of complexity and/or specialisation of the texts that the translator has to deal with. Thus, following the line taken by the GENTT research team (www.gentt.uji.es), we will focus on the analysis of some genres from specialised fields (mainly medical/health care and technical genres) in an attempt to show that the relation between text genre and communicative and textual sub-competence, among others, can be very fruitful.

    Automatic Acquisition of Class-based Rules for Word Alignment


    Introducing a new lexicographical model: AlphaConceptual+ (and how it could be applied to dictionaries for Luganda)

    In this article we explore the possibility of amalgamating the semasiological (i.e. alphabetical), onomasiological (i.e. conceptual) and visual approaches to dictionary compilation, here termed an alphaconceptual+ (i.e. alphaconceptual 'plus') dictionary, using Luganda as a brief case study. Such a dictionary would combine the strong points of alphabetical and conceptual lexicography, with all entries also linked to relevant picture plates. In Section 1 we expound on the history of Luganda lexicography, highlighting the different types of dictionaries in the language since the early 1900s. Section 2 is an exposition of semasiological and onomasiological lexicography. In Sections 3 and 4 we study the actual dictionary market and scholarly lexicographic literature, in Africa and the rest of the world respectively. In Section 5 a case for language-independent alphaconceptual+ lexicography is argued, and its proposed compilation approach is sketched out in Section 6, followed by the conclusion in Section 7.

    Lexical typology: a programmatic sketch

    The present paper is an attempt to lay the foundation for Lexical Typology as a new kind of linguistic typology. The goal of Lexical Typology is to investigate crosslinguistically significant patterns of interaction between lexicon and grammar.