82 research outputs found

    Neural Techniques for German Dependency Parsing

    Get PDF
    Syntactic parsing is the task of analyzing the structure of a sentence based on some predefined formal assumption. It is a key component in many natural language processing (NLP) pipelines and is of great benefit for natural language understanding (NLU) tasks such as information retrieval or sentiment analysis. Despite achieving very high results with neural network techniques, most syntactic parsing research pays attention to only a few prominent languages (such as English or Chinese) or language-agnostic settings. Thus, we still lack studies that focus on just one language and design specific parsing strategies for that language with regards to its linguistic properties. In this thesis, we take German as the language of interest and develop more accurate methods for German dependency parsing by combining state-of-the-art neural network methods with techniques that address the specific challenges posed by the language-specific properties of German. Compared to English, German has richer morphology, semi-free word order, and case syncretism. It is the combination of those characteristics that makes parsing German an interesting and challenging task. Because syntactic parsing is a task that requires many levels of language understanding, we propose to study and improve the knowledge of parsing models at each level in order to improve syntactic parsing for German. These levels are: (sub)word level, syntactic level, semantic level, and sentence level. At the (sub)word level, we look into a surge in out-of-vocabulary words in German data caused by compounding. We propose a new type of embeddings for compounds that is a compositional model of the embeddings of individual components. Our experiments show that character-based embeddings are superior to word and compound embeddings in dependency parsing, and compound embeddings only outperform word embeddings when the part-of-speech (POS) information is unavailable. Thus, we conclude that it is the morpho-syntactic information of unknown compounds, not the semantic one, that is crucial for parsing German. At the syntax level, we investigate challenges for local grammatical function labeler that are caused by case syncretism. In detail, we augment the grammatical function labeling component in a neural dependency parser that labels each head-dependent pair independently with a new labeler that includes a decision history, using Long Short-Term Memory networks (LSTMs). All our proposed models significantly outperformed the baseline on three languages: English, German and Czech. However, the impact of the new models is not the same for all languages: the improvement for English is smaller than for the non-configurational languages (German and Czech). Our analysis suggests that the success of the history-based models is not due to better handling of long dependencies but that they are better in dealing with the uncertainty in head direction. We study the interaction of syntactic parsing with the semantic level via the problem of PP attachment disambiguation. Our motivation is to provide a realistic evaluation of the task where gold information is not available and compare the results of disambiguation systems against the output of a strong neural parser. To our best knowledge, this is the first time that PP attachment disambiguation is evaluated and compared against neural dependency parsing on predicted information. In addition, we present a novel approach for PP attachment disambiguation that uses biaffine attention and utilizes pre-trained contextualized word embeddings as semantic knowledge. Our end-to-end system outperformed the previous pipeline approach on German by a large margin simply by avoiding error propagation caused by predicted information. In the end, we show that parsing systems (with the same semantic knowledge) are in general superior to systems specialized for PP attachment disambiguation. Lastly, we improve dependency parsing at the sentence level using reranking techniques. So far, previous work on neural reranking has been evaluated on English and Chinese only, both languages with a configurational word order and poor morphology. We re-assess the potential of successful neural reranking models from the literature on English and on two morphologically rich(er) languages, German and Czech. In addition, we introduce a new variation of a discriminative reranker based on graph convolutional networks (GCNs). Our proposed reranker not only outperforms previous models on English but is the only model that is able to improve results over the baselines on German and Czech. Our analysis points out that the failure is due to the lower quality of the k-best lists, where the gold tree ratio and the diversity of the list play an important role

    Viability of Sequence Labeling Encodings for Dependency Parsing

    Get PDF
    Programa Oficial de Doutoramento en Computación . 5009V01[Abstract] This thesis presents new methods for recasting dependency parsing as a sequence labeling task yielding a viable alternative to the traditional transition- and graph-based approaches. It is shown that sequence labeling parsers provide several advantages for dependency parsing, such as: (i) a good trade-off between accuracy and parsing speed, (ii) genericity which enables running a parser in generic sequence labeling software and (iii) pluggability which allows using full parse trees as features to downstream tasks. The backbone of dependency parsing as sequence labeling are the encodings which serve as linearization methods for mapping dependency trees into discrete labels, such that each token in a sentence is associated with a label. We introduce three encoding families comprising: (i) head selection, (ii) bracketing-based and (iii) transition-based encodings which are differentiated by the way they represent a dependency tree as a sequence of labels. We empirically examine the viability of the encodings and provide an analysis of their facets. Furthermore, we explore the feasibility of leveraging external complementary data in order to enhance parsing performance. Our sequence labeling parser is endowed with two kinds of representations. First, we exploit the complementary nature of dependency and constituency parsing paradigms and enrich the parser with representations from both syntactic abstractions. Secondly, we use human language processing data to guide our parser with representations from eye movements. Overall, the results show that recasting dependency parsing as sequence labeling is a viable approach that is fast and accurate and provides a practical alternative for integrating syntax in NLP tasks.[Resumen] Esta tesis presenta nuevos métodos para reformular el análisis sintáctico de dependencias como una tarea de etiquetado secuencial, lo que supone una alternativa viable a los enfoques tradicionales basados en transiciones y grafos. Se demuestra que los analizadores de etiquetado secuencial ofrecen varias ventajas para el análisis sintáctico de dependencias, como por ejemplo (i) un buen equilibrio entre la precisión y la velocidad de análisis, (ii) la genericidad que permite ejecutar un analizador en un software genérico de etiquetado secuencial y (iii) la conectividad que permite utilizar el árbol de análisis completo como características para las tareas posteriores. El pilar del análisis sintáctico de dependencias como etiquetado secuencial son las codificaciones que sirven como métodos de linealización para transformar los árboles de dependencias en etiquetas discretas, de forma que cada token de una frase se asocia con una etiqueta. Introducimos tres familias de codificación que comprenden: (i) selección de núcleos, (ii) codificaciones basadas en corchetes y (iii) codificaciones basadas en transiciones que se diferencian por la forma en que representan un árbol de dependencias como una secuencia de etiquetas. Examinamos empíricamente la viabilidad de las codificaciones y ofrecemos un análisis de sus facetas. Además, exploramos la viabilidad de aprovechar datos complementarios externos para mejorar el rendimiento del análisis sintáctico. Dotamos a nuestro analizador sintáctico de dos tipos de representaciones. En primer lugar, explotamos la naturaleza complementaria de los paradigmas de análisis sintáctico de dependencias y constituyentes, enriqueciendo el analizador sintáctico con representaciones de ambas abstracciones sintácticas. En segundo lugar, utilizamos datos de procesamiento del lenguaje humano para guiar nuestro analizador con representaciones de los movimientos oculares. En general, los resultados muestran que la reformulación del análisis sintáctico de dependencias como etiquetado de secuencias es un enfoque viable, rápido y preciso, y ofrece una alternativa práctica para integrar la sintaxis en las tareas de PLN.[Resumo] Esta tese presenta novos métodos para reformular a análise sintáctica de dependencias como unha tarefa de etiquetaxe secuencial, o que supón unha alternativa viable aos enfoques tradicionais baseados en transicións e grafos. Demóstrase que os analizadores de etiquetaxe secuencial ofrecen varias vantaxes para a análise sintáctica de dependencias, por exemplo (i) un bo equilibrio entre a precisión e a velocidade de análise, (ii) a xenericidade que permite executar un analizador nun software xenérico de etiquetaxe secuencial e (iii) a conectividade que permite empregar a árbore de análise completa como características para as tarefas posteriores. O piar da análise sintáctica de dependencias como etiquetaxe secuencial son as codificacións que serven como métodos de linealización para transformar as árbores de dependencias en etiquetas discretas, de forma que cada token dunha frase se asocia cunha etiqueta. Introducimos tres familias de codificación que comprenden: (i) selección de núcleos, (ii) codificacións baseadas en corchetes e (iii) codificacións baseadas en transicións que se diferencian pola forma en que representan unha árbore de dependencia como unha secuencia de etiquetas. Examinamos empíricamente a viabilidade das codificacións e ofrecemos unha análise das súas facetas. Ademais, exploramos a viabilidade de aproveitar datos complementarios externos para mellorar o rendemento da análise sintáctica. O noso analizador sintáctico de etiquetaxe secuencial está dotado de dous tipos de representacións. En primeiro lugar, explotamos a natureza complementaria dos paradigmas de análise sintáctica de dependencias e constituíntes e enriquecemos o analizador sintáctico con representacións de ambas abstraccións sintácticas. En segundo lugar, empregamos datos de procesamento da linguaxe humana para guiar o noso analizador con representacións dos movementos oculares. En xeral, os resultados mostran que a reformulación da análise sintáctico de dependencias como etiquetaxe de secuencias é un enfoque viable, rápido e preciso, e ofrece unha alternativa práctica para integrar a sintaxe nas tarefas de PLN.This work has been carried out thanks to the funding from the European Research Council (ERC), under the European Union’s Horizon 2020 research and innovation programme (FASTPARSE, grant agreement No 714150)

    Deep learning applications for transition-based dependency parsing

    Get PDF
    Dependency Parsing is a method that builds dependency trees consisting of binary relations that describe the syntactic role of words in sentences. Recently, dependency parsing has seen large improvements due to deep learning, which enabled richer feature representations and flexible architectures. In this thesis we focus on the application of these methods to Transition-based parsing, which is a faster variant. We explore current architectures and examine ways to improve their representation capabilities and final accuracies. Our first contribution is an improvement on the basic architecture at the heart of many current parsers. We show that using Recurrent Neural Network hidden layers, initialised with pretrained weights from a feed forward network, provides significant accuracy improvements. Second, we examine the best parser architecture. We show that separate classifiers for dependency parsing and labelling, with a shared input layer provides the best accuracy. We also show that a parser and labeller can be successfully trained separately. Finally, we propose Recursive LSTM Trees, which can represent an entire tree as a single dense vector, and achieve competitive accuracy with minimal features. The parsers that we develop in this thesis cover many aspects of this task, and are easy to integrate with current methods

    Neural Combinatory Constituency Parsing

    Get PDF
    東京都立大学Tokyo Metropolitan University博士(情報科学)doctoral thesi

    An Unsolicited Soliloquy on Dependency Parsing

    Get PDF
    Programa Oficial de Doutoramento en Computación . 5009V01[Abstract] This thesis presents work on dependency parsing covering two distinct lines of research. The first aims to develop efficient parsers so that they can be fast enough to parse large amounts of data while still maintaining decent accuracy. We investigate two techniques to achieve this. The first is a cognitively-inspired method and the second uses a model distillation method. The first technique proved to be utterly dismal, while the second was somewhat of a success. The second line of research presented in this thesis evaluates parsers. This is also done in two ways. We aim to evaluate what causes variation in parsing performance for different algorithms and also different treebanks. This evaluation is grounded in dependency displacements (the directed distance between a dependent and its head) and the subsequent distributions associated with algorithms and the distributions found in treebanks. This work sheds some light on the variation in performance for both different algorithms and different treebanks. And the second part of this area focuses on the utility of part-of-speech tags when used with parsing systems and questions the standard position of assuming that they might help but they certainly won’t hurt.[Resumen] Esta tesis presenta trabajo sobre análisis de dependencias que cubre dos líneas de investigación distintas. La primera tiene como objetivo desarrollar analizadores eficientes, de modo que sean suficientemente rápidos como para analizar grandes volúmenes de datos y, al mismo tiempo, sean suficientemente precisos. Investigamos dos métodos. El primero se basa en teorías cognitivas y el segundo usa una técnica de destilación. La primera técnica resultó un enorme fracaso, mientras que la segunda fue en cierto modo un ´éxito. La otra línea evalúa los analizadores sintácticos. Esto también se hace de dos maneras. Evaluamos la causa de la variación en el rendimiento de los analizadores para distintos algoritmos y corpus. Esta evaluación utiliza la diferencia entre las distribuciones del desplazamiento de arista (la distancia dirigida de las aristas) correspondientes a cada algoritmo y corpus. También evalúa la diferencia entre las distribuciones del desplazamiento de arista en los datos de entrenamiento y prueba. Este trabajo esclarece las variaciones en el rendimiento para algoritmos y corpus diferentes. La segunda parte de esta línea investiga la utilidad de las etiquetas gramaticales para los analizadores sintácticos.[Resumo] Esta tese presenta traballo sobre análise sintáctica, cubrindo dúas liñas de investigación. A primeira aspira a desenvolver analizadores eficientes, de maneira que sexan suficientemente rápidos para procesar grandes volumes de datos e á vez sexan precisos. Investigamos dous métodos. O primeiro baséase nunha teoría cognitiva, e o segundo usa unha técnica de destilación. O primeiro método foi un enorme fracaso, mentres que o segundo foi en certo modo un éxito. A outra liña avalúa os analizadores sintácticos. Esto tamén se fai de dúas maneiras. Avaliamos a causa da variación no rendemento dos analizadores para distintos algoritmos e corpus. Esta avaliaci´on usa a diferencia entre as distribucións do desprazamento de arista (a distancia dirixida das aristas) correspondentes aos algoritmos e aos corpus. Tamén avalía a diferencia entre as distribucións do desprazamento de arista nos datos de adestramento e proba. Este traballo esclarece as variacións no rendemento para algoritmos e corpus diferentes. A segunda parte desta liña investiga a utilidade das etiquetas gramaticais para os analizadores sintácticos.This work has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (FASTPARSE, grant agreement No 714150) and from the Centro de Investigación de Galicia (CITIC) which is funded by the Xunta de Galicia and the European Union (ERDF - Galicia 2014-2020 Program) by grant ED431G 2019/01.Xunta de Galicia; ED431G 2019/0

    The DCU-EPFL enhanced dependency parser at the IWPT 2021 shared task

    Get PDF
    We describe the DCU-EPFL submission to the IWPT 2021 Parsing Shared Task: From Raw Text to Enhanced Universal Dependencies. The task involves parsing Enhanced UD graphs, which are an extension of the basic dependency trees designed to be more facilitative towards representing semantic structure. Evaluation is carried out on 29 treebanks in 17 languages and participants are required to parse the data from each language starting from raw strings. Our approach uses the Stanza pipeline to preprocess the text files, XLM-RoBERTa to obtain contextualized token representations, and an edge-scoring and labeling model to predict the enhanced graph. Finally, we run a postprocessing script to ensure all of our outputs are valid Enhanced UD graphs. Our system places 6th out of 9 participants with a coarse Enhanced Labeled Attachment Score (ELAS) of 83.57. We carry out additional post-deadline experiments which include using Trankit for pre-processing, XLM-RoBERTa LARGE, treebank concatenation, and multitask learning between a basic and an enhanced dependency parser. All of these modifications improve our initial score and our final system has a coarse ELAS of 88.04
    corecore