    Towards a rule-based Spanish to Spanish sign language translation: from written forms to phonological representations

    Unpublished doctoral thesis defended at the Universidad Autónoma de Madrid, Escuela Politécnica Superior, Departamento de Tecnología Electrónica y de las Comunicaciones. Date of defense: November 2014.
    This thesis addresses several aspects of automatic translation from Castilian Spanish to Spanish Sign Language (LSE), two typologically distant languages that lack the linguistic resources needed for statistical approaches to translation. For this reason, a rule-based approach grounded in contrastive grammatical studies of both languages is used. The architecture follows the analysis, transfer and generation model. Transfer is performed at the level of grammatical functions, which are delivered by a Spanish dependency parser, avoiding the complexities of a deeper linguistic analysis. The base bilingual lexicon is obtained from the Diccionario normativo de la lengua de signos española (DILSE-III), which contains the correspondences between Spanish lemmas and the SEA (Sistema de escritura alfabética) representation of signs. The lexicon is extended in two ways: by taking advantage of the difference in flexibility between the part-of-speech systems of Spanish and LSE, and by exploiting several lexical semantic relations, such as synonymy, hyponymy/hypernymy and meronymy. During the structural transfer phase, some nodes of the dependency analysis are transformed, others are removed, and new nodes are inserted. Some classifier predicates are generated in this phase. The surface order of signs is obtained by topologically ordering the graph of precedence relations between signs. Pairs of signs that stand in a head-dependent relation or share the same head are examined to determine whether their relative order is marked. The system is evaluated at this point on a corpus, and the results are compared with those obtained with several statistical translation models.
    On a control corpus of glosses, the best results are obtained with the rule-based approach, with a BLEU (Bilingual Evaluation Understudy) score of 0.30 and a TER (Translation Error Rate) of 42%. A linguistically oriented analysis of the errors is provided. Finally, in the morphological generation phase, glosses with morphological annotations are replaced by the HamNoSys (Hamburg Sign Language Notation System) phonological representations produced by a computational morphology. These representations can be used for animation synthesis with avatars. The implemented computational morphology uses inflection, introflection and suppletion to model a significant fragment of LSE morphology. The phenomena covered include deixis, the realization of the different types of nominal plural, aspect marking, verbal argument agreement, adjectival modification and degree.
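
    The surface-ordering step described in the abstract above (a topological ordering of the graph of precedence relations between signs) can be illustrated with a minimal sketch. It is not the thesis's implementation: the function name `order_signs`, the gloss labels and the precedence pairs are all hypothetical, and the standard-library `graphlib` module (Python 3.9+) stands in for whatever ordering procedure the system actually uses.

```python
from graphlib import TopologicalSorter


def order_signs(signs, precedes):
    """Return one linear order of `signs` consistent with `precedes`.

    signs    -- iterable of gloss labels (hypothetical examples below)
    precedes -- iterable of (earlier, later) pairs, meaning that
                `earlier` must be signed before `later`
    """
    # Register every sign first, so unconstrained signs still appear.
    sorter = TopologicalSorter({sign: set() for sign in signs})
    for earlier, later in precedes:
        # graphlib expects each node together with its predecessors.
        sorter.add(later, earlier)
    # Raises graphlib.CycleError if the constraints contradict each other.
    return list(sorter.static_order())


# Invented glosses and constraints, for illustration only.
signs = ["READ", "BOOK", "I", "YESTERDAY"]
precedes = [("YESTERDAY", "I"), ("I", "BOOK"), ("BOOK", "READ")]
order = order_signs(signs, precedes)
print(order)
```

    Because these invented constraints form a single chain, only one order satisfies them. With fewer constraints several orders are valid and `static_order()` returns just one of them, so a real system would need an additional criterion to choose among them.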

    Lexicography of coronavirus-related neologisms

    This volume brings together contributions by international experts reflecting on Covid-19-related neologisms and their lexicographic processing and representation. The papers analyze new words, new meanings of existing words, and new multiword units: where they come from, how they are transmitted (or differ) across languages, and how their use and meaning are reflected in dictionaries of all sorts. Recent trends in as many as ten languages are considered, covering general and specialized language, monolingual and bilingual dictionaries, and print and online dictionaries.

    Native and non-native processing of morphologically complex words in Italian

    The present work focuses on the organization of the mental lexicon in native and non-native speakers and investigates whether words are connected in the mind by morphological criteria, i.e., through a network of associations established whenever a co-occurrence of form and meaning is found. Psycholinguistic research on native lexical access has demonstrated that morphology indeed underlies the organization of the mental lexicon, even though controversies about the locus of this level of organization remain. Research in second language acquisition, on the other hand, has only recently turned to such issues, and its findings so far have been controversial. Specifically, the debate centers on whether native and non-native speakers share the same processing systems. According to recent proposals (Heyer & Clahsen 2015), they do not, and L2 processing is more affected by formal than by morphological criteria. In this light, the present work examines the impact of formal characteristics on native and non-native lexical access, focusing on the processing of formally transparent versus non-transparent words in Italian. Two morphological phenomena are investigated by means of four psycholinguistic experiments involving a lexical decision task combined with the masked priming paradigm. Experiments 1 and 2 compare the processing of allomorphic and non-allomorphic derivatives, to investigate whether formal alterations impair the appreciation of the relationship between two morphologically related words. Experiments 3 and 4 focus on the lack of base autonomy found in so-called bound stems, i.e., stems that cannot occur in isolation, and aim to determine whether the processing of free and bound stems differs.
    The results of Experiments 1 and 2 indicate that allomorphic variation does not influence the associations established among related words in native speakers, in line with the predictions that can be formulated within usage-based perspectives on language. Non-native speakers, on the other hand, seem to be more pervasively affected by the phonological and orthographic properties of words, but not to the point that transparent morphological relations can be reduced to mere form overlap between morphological relatives. Likewise, stem autonomy was not found to affect the way words containing bound and free stems are processed by native speakers, at least under certain conditions, suggesting that boundedness does not influence the establishment of morphological relationships among words. Non-native speakers, however, were found to be sensitive to the isolability of the stem, in a way that suggests that free bases may be more salient morphological units for them than bound stems, which are seemingly more closely associated with orthographic strings resembling each other. Taken together, the findings suggest a model of the native mental lexicon based on words and on morphological schemas emerging from the relationships established among them, despite phonological variation and stem boundedness. While it is unclear whether such a system of connections and schemas is equally strong in the non-native lexicon, morphological relationships still appear to drive lexical organization. Crucially, however, such organization is modulated by form, as demonstrated by the effects of phonological variation and lack of base autonomy.

    The concept of 'Genetic Modification' in a Descriptive Translation Study (DTS) of an English-Spanish corpus of Popular Science Books on Genetic Engineering: Denominative Variation, Semantic Prosody and Ideological Aspects of Translation Strategies

    The overall aim is to examine the concept of 'genetic modification' through three linguistic phenomena: denominative variation, semantic prosody, and the ideological aspects of the main translation strategies. To study denominative variation, two technical terms, 'DNA' and 'gene/s', and two subtechnical terms, 'food/s' and 'crop/s', were selected. To study semantic prosody, the concordances of 'genetic' + N and 'genetically' + Adj were analyzed. Comparing the denominative variants and the semantic prosodies in an English-Spanish parallel corpus on genetic engineering yields findings on the ideological aspects of the main translation strategies found in the corpus.
    Departamento de Filología Ingles

    The lexeme in descriptive and theoretical morphology

    After dominating the field for about a century since its introduction by Baudouin de Courtenay at the end of the nineteenth century, the morpheme is increasingly being replaced by the lexeme in contemporary descriptive and theoretical morphology. The notion of a lexeme is usually associated with the work of P. H. Matthews (1972, 1974), who characterizes it as a lexical entity abstracting over individual inflected words. Over the last three decades, the lexeme has become a cornerstone of much work in both inflectional morphology and word formation (or, as it is increasingly called, lexeme formation). The papers in the present volume take stock of the descriptive and theoretical usefulness of the lexeme, but also address many of the challenges faced by classical lexeme-based theories of morphology.
