    Augmented Parsing of Unknown Word by Graph-based Semi-supervised Learning

    This paper presents a novel method that uses graph-based semi-supervised learning (SSL) to improve the syntactic parsing of unknown words. Unlike conventional approaches that use hand-crafted rules, rich morphological features, or a character-based model to handle unknown words, this method is based on a graph-based label propagation technique. It gives greater improvement for grammars trained on a smaller amount of labeled data and a large amount of unlabeled data. A transductive graph-based SSL method is employed to propagate POS labels and derive emission distributions from the labeled data to the unlabeled data. The derived distributions are incorporated into the parsing process. The proposed method effectively augments the original supervised parsing model, yielding absolute improvements of 2.28% and 1.72% in the accuracy of POS tagging and syntactic parsing, respectively, on the Penn Chinese Treebank.
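The label propagation idea in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not say how the graph is built or how the distributions are smoothed, so the function name `propagate_labels`, the node names, the two-tag set, and the edge weights below are all hypothetical. The sketch assumes a simple iterative scheme in which labeled nodes stay clamped to their known POS distribution and each unlabeled node repeatedly takes the similarity-weighted average of its neighbors' distributions.

```python
def propagate_labels(weights, seed_labels, num_tags, iterations=50):
    """Propagate tag distributions over a similarity graph (illustrative only).

    weights:     dict mapping node -> {neighbor: similarity}; every neighbor
                 must also appear as a key (hypothetical graph format).
    seed_labels: dict mapping labeled node -> list of tag probabilities.
    """
    uniform = [1.0 / num_tags] * num_tags
    # Labeled nodes start from their known distribution, others from uniform.
    dist = {node: list(seed_labels.get(node, uniform)) for node in weights}

    for _ in range(iterations):
        updated = {}
        for node, neighbors in weights.items():
            if node in seed_labels:
                # Clamp labeled nodes so their distribution never drifts.
                updated[node] = list(seed_labels[node])
                continue
            total = sum(neighbors.values())
            if total == 0:
                # Isolated node: nothing to propagate, keep current estimate.
                updated[node] = dist[node]
                continue
            # Similarity-weighted average of the neighbors' distributions.
            updated[node] = [
                sum(w * dist[nb][t] for nb, w in neighbors.items()) / total
                for t in range(num_tags)
            ]
        dist = updated
    return dist


# Hypothetical toy graph: tag index 0 = noun, tag index 1 = verb.
weights = {
    "known_noun": {"unknown_word": 0.9},
    "known_verb": {"unknown_word": 0.1},
    "unknown_word": {"known_noun": 0.9, "known_verb": 0.1},
}
seed_labels = {"known_noun": [1.0, 0.0], "known_verb": [0.0, 1.0]}

result = propagate_labels(weights, seed_labels, num_tags=2)
```

In this toy graph the unknown word ends up with a distribution weighted toward the noun tag, because it is more similar to the labeled noun than to the labeled verb. A parser could then use such a derived distribution as an emission estimate for a word it never saw in the labeled data, which is the role the abstract describes.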