
    Local context templates for Chinese constituent boundary prediction

    In this paper, we propose a shallow syntactic knowledge description, the constituent boundary representation, together with a simple and efficient prediction algorithm based on different local context templates learned from an annotated corpus. An open test on 2780 sentences of real Chinese text showed satisfactory results: 94% (92%) precision for words with multiple (single) boundary tag output.

    1. Introduction

    Research on syntactic parsing has long been a focus of natural language processing. With the development of corpus linguistics, many statistics-based parsers have been proposed, such as Magerman's (1995) statistical decision tree parser, Collins's (1996) bigram dependency model parser, and Ratnaparkhi's (1997) maximum entropy model parser. All of them try to produce complete parse trees for the input sentences, based on statistical data extracted from an annotated corpus. The best parsing accuracy of these parsers was about 87%. Realizing the difficulties of complete ..