Parser Adaptation for Social Media by Integrating Normalization

Author: van der Goot Rob, van Noord Gerardus
Publication date: 01/01/2017

This work explores normalization for parser adaptation. Traditionally, normalization is used as a separate pre-processing step. We show that integrating the normalization model into the parsing algorithm is beneficial. This way, multiple normalization candidates can be leveraged, which improves parsing performance on social media. We test this hypothesis by modifying the Berkeley parser; out-of-the-box it achieves an F1 score of 66.52. Our integrated approach reaches a significant improvement with an F1 score of 67.36, while using the best normalization sequence results in an F1 score of only 66.94.

Crossref

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

Dissertations of the University of Groningen

Modeling the interface between morphology and syntax in data-driven dependency parsing

Author: Seeker Wolfgang
Publication date: 01/01/2016

When people formulate sentences in a language, they follow a set of rules specific to that language that defines how words must be put together in order to express the intended meaning. These rules are called the grammar of the language. Languages have essentially two ways of encoding grammatical information: word order or word form. English uses primarily word order to encode different meanings, but many other languages change the form of the words themselves to express their grammatical function in the sentence. These languages are commonly subsumed under the term morphologically rich languages. Parsing is the automatic process for predicting the grammatical structure of a sentence. Since grammatical structure guides the way we understand sentences, parsing is a key component in computer programs that try to automatically understand what people say and write. This dissertation is about parsing and specifically about parsing languages with a rich morphology, which encode grammatical information in the form of words. Today’s models for automatic parsing were developed for English and achieve good results on this language. However, when applied to other languages, a significant drop in performance is usually observed. The standard model for parsing is a pipeline model that separates the parsing process into different steps; in particular, it separates the morphological analysis, i.e. the analysis of word forms, from the actual parsing step. This dissertation argues that this separation is one of the reasons for the performance drop of standard parsers when applied to languages other than English. An analysis is presented that exposes the connection between the morphological system of a language and the errors of a standard parsing model. In a second series of experiments, we show that knowledge about the syntactic structure of a sentence can support the prediction of morphological information.
We then argue for an alternative approach that models morphological analysis and syntactic analysis jointly instead of separating them. We support this argument with empirical evidence by implementing two parsers that model the relationship between morphology and syntax in two different but complementary ways.

Proceedings of the Second International Workshop on Computational Linguistics for Uralic Languages

Publication venue: Szegedi Tudományegyetem
Publication date: 01/01/2016

Repository of the Academy's Library

Normalization and parsing algorithms for uncertain input

Author: van der Goot Rob Matthijs
Publication venue: University of Groningen Press
Publication date: 01/01/2019

ARTS repository - University of Groningen