56 research outputs found
Tagging the Teleman Corpus
Experiments were carried out comparing the Swedish Teleman and the English
Susanne corpora using an HMM-based and a novel reductionistic statistical
part-of-speech tagger. They indicate that tagging the Teleman corpus is the
more difficult task, and that the performance of the two different taggers is
comparable.Comment: 14 pages, LaTeX, to appear in Proceedings of the 10th Nordic
Conference of Computational Linguistics, Helsinki, Finland, 199
Chunk Tagger - Statistical Recognition of Noun Phrases
We describe a stochastic approach to partial parsing, i.e., the recognition
of syntactic structures of limited depth. The technique utilises Markov Models,
but goes beyond usual bracketing approaches, since it is capable of recognising
not only the boundaries, but also the internal structure and syntactic category
of simple as well as complex NP's, PP's, AP's and adverbials. We compare
tagging accuracy for different applications and encoding schemes.Comment: 7 pages, LaTe
A Maximum-Entropy Partial Parser for Unrestricted Text
This paper describes a partial parser that assigns syntactic structures to
sequences of part-of-speech tags. The program uses the maximum entropy
parameter estimation method, which allows a flexible combination of different
knowledge sources: the hierarchical structure, parts of speech and phrasal
categories. In effect, the parser goes beyond simple bracketing and recognises
even fairly complex structures. We give accuracy figures for different
applications of the parser.Comment: 9 pages, LaTe
Tagging and parsing with cascaded Markov models : automation of corpus annotation
This thesis presents new techniques for parsing natural language. They are based on Markov Models, which are commonly used in part-of-speech tagging for sequential processing on the world level. We show that Markov Models can be successfully applied to other levels of syntactic processing. first two classification task are handled: the assignment of grammatical functions and the labeling of non-terminal nodes. Then, Markov Models are used to recognize hierarchical syntactic structures. Each layer of a structure is represented by a separate Markov Model. The output of a lower layer is passed as input to a higher layer, hence the name: Cascaded Markov Models. Instead of simple symbols, the states emit partial context-free structures. The new techniques are applied to corpus annotation and partial parsing and are evaluated using corpora of different languages and domains.Ausgehend von Markov-Modellen, die für das Part-of-Speech-Tagging eingesetzt werden, stellt diese Arbeit Verfahren vor, die Markov-Modelle auch auf weiteren Ebenen der syntaktischen Verarbeitung erfolgreich nutzen. Dies betrifft zum einen Klassifikationen wie die Zuweisung grammatischer Funktionen und die Bestimmung von Kategorien nichtterminaler Knoten, zum anderen die Zuweisung hierarchischer, syntaktischer Strukturen durch Markov-Modelle. Letzteres geschieht durch die Repräsentation jeder Ebene einer syntaktischen Struktur durch ein eigenes Markov-Modell, was den Namen des Verfahrens prägt: Kaskadierte Markov-Modelle. Deren Zustände geben anstelle atomarer Symbole partielle kontextfreie Strukturen aus. Diese Verfahren kommen in der Korpusannotation und dem partiellen Parsing zum Einsatz und werden anhand mehrerer Korpora evaluiert
Tagging and parsing with cascaded Markov models : automation of corpus annotation
This thesis presents new techniques for parsing natural language. They are based on Markov Models, which are commonly used in part-of-speech tagging for sequential processing on the world level. We show that Markov Models can be successfully applied to other levels of syntactic processing. first two classification task are handled: the assignment of grammatical functions and the labeling of non-terminal nodes. Then, Markov Models are used to recognize hierarchical syntactic structures. Each layer of a structure is represented by a separate Markov Model. The output of a lower layer is passed as input to a higher layer, hence the name: Cascaded Markov Models. Instead of simple symbols, the states emit partial context-free structures. The new techniques are applied to corpus annotation and partial parsing and are evaluated using corpora of different languages and domains.Ausgehend von Markov-Modellen, die für das Part-of-Speech-Tagging eingesetzt werden, stellt diese Arbeit Verfahren vor, die Markov-Modelle auch auf weiteren Ebenen der syntaktischen Verarbeitung erfolgreich nutzen. Dies betrifft zum einen Klassifikationen wie die Zuweisung grammatischer Funktionen und die Bestimmung von Kategorien nichtterminaler Knoten, zum anderen die Zuweisung hierarchischer, syntaktischer Strukturen durch Markov-Modelle. Letzteres geschieht durch die Repräsentation jeder Ebene einer syntaktischen Struktur durch ein eigenes Markov-Modell, was den Namen des Verfahrens prägt: Kaskadierte Markov-Modelle. Deren Zustände geben anstelle atomarer Symbole partielle kontextfreie Strukturen aus. Diese Verfahren kommen in der Korpusannotation und dem partiellen Parsing zum Einsatz und werden anhand mehrerer Korpora evaluiert
- …