We present a divide-and-conquer strategy based on finite-state technology for shallow parsing of real-world German texts. In a first phase, only the topological structure of a sentence (i.e., verb groups, subclauses) is determined. In a second phase, the phrasal grammars are applied to the contents of the different fields of the main and subclauses. Shallow parsing is supported by suitably configured preprocessing, including morphological and on-line compound analysis, efficient POS filtering, and named entity recognition. The whole approach proved very useful for processing free-word-order languages such as German. For the divide-and-conquer parsing strategy in particular, we obtained an f-measure of 87.14% on unseen data.

1 Introduction

Current information extraction (IE) systems are quite successful in efficient processing of large free text collections due to the fact that they can provide a partial understanding of specific types of text with a certain degree of partial accuracy usi..
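The two-phase strategy summarized in the abstract can be illustrated with a small sketch. It is not the system described in the paper: the function names (`split_clauses`, `split_fields`, `chunk_nps`, `parse`), the toy sentence, the STTS-like tags, and the very simple rules are all assumptions made for illustration. Phase one isolates subclauses and verb groups; phase two applies a tiny noun-phrase pattern only inside the remaining fields.

```python
from typing import Dict, List, Tuple

Token = Tuple[str, str]  # (word, POS tag); tags loosely follow STTS (assumption)

FINITE = {"VVFIN", "VAFIN", "VMFIN"}          # finite verb tags
VERB_TAGS = FINITE | {"VVPP", "VVINF"}        # all verb tags used in this toy

# Hypothetical pre-tagged input: "Der Mann hat gesagt, dass die Frau das Buch liest."
SENTENCE: List[Token] = [
    ("Der", "ART"), ("Mann", "NN"), ("hat", "VAFIN"), ("gesagt", "VVPP"),
    (",", "$,"), ("dass", "KOUS"), ("die", "ART"), ("Frau", "NN"),
    ("das", "ART"), ("Buch", "NN"), ("liest", "VVFIN"), (".", "$."),
]


def split_clauses(tokens: List[Token]) -> Tuple[List[Token], List[List[Token]]]:
    """Phase 1a: cut out subclauses that start at a subordinating
    conjunction (KOUS) and end at the next finite verb."""
    main: List[Token] = []
    subs: List[List[Token]] = []
    i = 0
    while i < len(tokens):
        if tokens[i][1] == "KOUS":
            j = i + 1
            while j < len(tokens) and tokens[j][1] not in FINITE:
                j += 1
            subs.append(tokens[i:j + 1])
            i = j + 1
        else:
            main.append(tokens[i])
            i += 1
    return main, subs


def split_fields(clause: List[Token]) -> List[Tuple[str, List[Token]]]:
    """Phase 1b: segment a clause into verb groups and the fields between them."""
    segments: List[Tuple[str, List[Token]]] = []
    current: List[Token] = []
    for tok in clause:
        if tok[1] in VERB_TAGS:
            if current:
                segments.append(("FIELD", current))
                current = []
            if segments and segments[-1][0] == "VERB":
                segments[-1][1].append(tok)   # adjacent verbs form one group
            else:
                segments.append(("VERB", [tok]))
        else:
            current.append(tok)
    if current:
        segments.append(("FIELD", current))
    return segments


def chunk_nps(field: List[Token]) -> List[str]:
    """Phase 2: match the pattern ART? ADJA* NN inside a single field."""
    chunks: List[str] = []
    i = 0
    while i < len(field):
        j = i
        if field[j][1] == "ART":
            j += 1
        while j < len(field) and field[j][1] == "ADJA":
            j += 1
        if j < len(field) and field[j][1] == "NN":
            chunks.append(" ".join(word for word, _ in field[i:j + 1]))
            i = j + 1
        else:
            i += 1
    return chunks


def parse(tokens: List[Token]) -> List[Dict[str, object]]:
    """Run phase 1 (topology) and then phase 2 (phrases) per clause."""
    main, subs = split_clauses(tokens)
    result: List[Dict[str, object]] = []
    for label, clause in [("MAIN", main)] + [("SUB", s) for s in subs]:
        fields = split_fields(clause)
        verbs = [w for kind, seg in fields if kind == "VERB" for w, _ in seg]
        nps = [np for kind, seg in fields if kind == "FIELD" for np in chunk_nps(seg)]
        result.append({"clause": label, "verbs": verbs, "nps": nps})
    return result


if __name__ == "__main__":
    for clause in parse(SENTENCE):
        print(clause)
```

Because the phrasal pattern only ever sees one field at a time, it cannot build a chunk that crosses a verb group or a clause boundary, which is the point of fixing the topological structure first.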