Tagging the Teleman Corpus

Authors: Brants, Thorsten; Samuelsson, Christer
Publication date: 01/01/1995

Experiments were carried out comparing the Swedish Teleman and the English Susanne corpora using an HMM-based and a novel reductionistic statistical part-of-speech tagger. They indicate that tagging the Teleman corpus is the more difficult task, and that the performance of the two different taggers is comparable. Comment: 14 pages, LaTeX, to appear in Proceedings of the 10th Nordic Conference of Computational Linguistics, Helsinki, Finland, 199

arXiv.org e-Print Archive

CiteSeerX

Chunk Tagger - Statistical Recognition of Noun Phrases

Authors: Brants, Thorsten; Skut, Wojciech
Publication date: 01/01/1998

We describe a stochastic approach to partial parsing, i.e., the recognition of syntactic structures of limited depth. The technique utilises Markov Models, but goes beyond usual bracketing approaches, since it is capable of recognising not only the boundaries, but also the internal structure and syntactic category of simple as well as complex NP's, PP's, AP's and adverbials. We compare tagging accuracy for different applications and encoding schemes. Comment: 7 pages, LaTe

arXiv.org e-Print Archive

CiteSeerX

A Maximum-Entropy Partial Parser for Unrestricted Text

Authors: Brants, Thorsten; Skut, Wojciech
Publication date: 01/01/1998

This paper describes a partial parser that assigns syntactic structures to sequences of part-of-speech tags. The program uses the maximum entropy parameter estimation method, which allows a flexible combination of different knowledge sources: the hierarchical structure, parts of speech and phrasal categories. In effect, the parser goes beyond simple bracketing and recognises even fairly complex structures. We give accuracy figures for different applications of the parser. Comment: 9 pages, LaTe

arXiv.org e-Print Archive

CiteSeerX

Tagging and parsing with cascaded Markov models : automation of corpus annotation

Author: Brants Thorsten
Publication venue: Walter de Gruyter GmbH
Publication date: 30/10/2007

This thesis presents new techniques for parsing natural language. They are based on Markov Models, which are commonly used in part-of-speech tagging for sequential processing on the word level. We show that Markov Models can be successfully applied to other levels of syntactic processing. First, two classification tasks are handled: the assignment of grammatical functions and the labeling of non-terminal nodes. Then, Markov Models are used to recognize hierarchical syntactic structures. Each layer of a structure is represented by a separate Markov Model. The output of a lower layer is passed as input to a higher layer, hence the name: Cascaded Markov Models. Instead of simple symbols, the states emit partial context-free structures. The new techniques are applied to corpus annotation and partial parsing and are evaluated using corpora of different languages and domains.

Universaar

Tagging and parsing with cascaded Markov models : automation of corpus annotation

Author: Brants Thorsten
Publication venue: Fakultät 6 - Naturwissenschaftlich-Technische Fakultät I. Fachrichtung 6.2 - Informatik
Publication date: 01/01/1999