A Divide-and-Conquer Strategy for Shallow Parsing of German Free Texts

By Günter Neumann, Christian Braun and Jakub Piskorski

Abstract

We present a divide-and-conquer strategy based on finite-state technology for shallow parsing of real-world German texts. In a first phase, only the topological structure of a sentence (i.e., verb groups and subclauses) is determined. In a second phase, the phrasal grammars are applied to the contents of the different fields of the main and subclauses. Shallow parsing is supported by suitably configured preprocessing, including morphological and on-line compound analysis, efficient POS filtering, and named entity recognition. The whole approach proved to be very useful for processing free word order languages like German. For the divide-and-conquer parsing strategy in particular, we obtained an f-measure of 87.14% on unseen data.

1 Introduction

Current information extraction (IE) systems are quite successful in efficient processing of large free text collections due to the fact that they can provide a partial understanding of specific types of text with a certain degree of partial accuracy usi..
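
The two-phase strategy summarized in the abstract can be illustrated with a minimal toy sketch. This is not the authors' implementation: the tag set (DET, ADJ, N, V, ADV), the function names, and the single noun-phrase pattern are all assumptions made for illustration, and the real system relies on finite-state grammars plus the preprocessing steps listed above. Phase 1 here simply cuts the sentence at verb tokens to obtain fields; phase 2 runs a small regular-expression "phrasal grammar" inside each field.

```python
import re

# Hypothetical mapping from toy POS tags to one-letter codes, so that a
# regular expression can be run over the tag sequence of a field.
CODE = {"DET": "D", "ADJ": "A", "N": "N"}

# Toy phrasal grammar (an assumption): optional determiner, any adjectives, a noun.
NP_PATTERN = re.compile(r"D?A*N")


def split_fields(tokens):
    """Phase 1: cut the sentence at verb tokens and return the fields between them."""
    fields, current = [], []
    for word, tag in tokens:
        if tag == "V":
            if current:
                fields.append(current)
            current = []
        else:
            current.append((word, tag))
    if current:
        fields.append(current)
    return fields


def chunk_nps(field):
    """Phase 2: find noun phrases inside a single field."""
    codes = "".join(CODE.get(tag, "x") for _, tag in field)
    phrases = []
    for match in NP_PATTERN.finditer(codes):
        words = [word for word, _ in field[match.start():match.end()]]
        phrases.append(" ".join(words))
    return phrases


def shallow_parse(tokens):
    """Apply the phrasal grammar to each field found in phase 1."""
    return [chunk_nps(field) for field in split_fields(tokens)]


sentence = [
    ("Der", "DET"), ("Mann", "N"), ("hat", "V"), ("gestern", "ADV"),
    ("das", "DET"), ("neue", "ADJ"), ("Buch", "N"), ("gelesen", "V"),
]
print(shallow_parse(sentence))
```

Because the phrasal pattern only ever sees the tokens of one field, a phrase cannot be matched across a verb boundary, which is the point of fixing the topological structure first.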

Year: 2000
OAI identifier: oai:CiteSeerX.psu:10.1.1.41.836
Provided by: CiteSeerX