A Chinese Parser Based on Probabilistic Context Free Grammar

Abstract

This paper studies the limitations of the independence assumptions in probabilistic context-free grammar (PCFG) and, to address them, proposes the concept of syntactic structure co-occurrence as a way to bring context information into the model, together with a method for computing it. To work around the small size of the Chinese Treebank, the parameters of the syntactic rules are estimated iteratively with the Inside-Outside algorithm. Finally, we present a statistical top-down Chinese parser. In a closed test, it achieves a label precision of 88.1% and a label recall of 86.8%. The experimental results show that this approach does improve label precision and recall and deserves further research.

Supported by the National High Technology Research and Development Program of China (863 Program), grant 2002AA117010.
