The paper presents ETAP-3, a large-coverage rule-based dependency parser for Russian, and the results of its evaluation according to several criteria. The parser takes as input the morphological structure of the sentence being processed and builds a dependency tree for this sentence using a set of syntactic rules. Each rule establishes one labeled, directed link between two words of the sentence that form a specific syntactic construction. The parser uses about 65 different syntactic links. The rules are applied by an algorithm that first builds all possible hypothetical links and then uses a variety of filters to delete superfluous links so that the remaining ones form a dependency tree. Several types of data, collected either empirically or from SynTagRus, a syntactically tagged corpus of Russian, are used at this filtering stage to improve the parser's performance. The parser relies on a highly structured Russian dictionary of 120,000 entries, which contain detailed descriptions of the syntactic, semantic and other properties of words. A notable proportion of the links in the output trees are non-projective. An important feature of the parser is its ability to produce multiple parses for the same sentence: in a special mode of operation, it may be instructed to produce further parses in addition to the first one, either automatically or interactively. In the evaluation, SynTagRus is treated as the gold standard. The results are 0.900 for unlabeled attachment score, 0.860 for labeled attachment score, and 0.492 for unlabeled structure correctness.
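For readers unfamiliar with the three evaluation figures, the following minimal Python sketch shows how such scores are commonly computed against a gold-standard treebank. It is an illustration under stated assumptions, not the evaluation code used for ETAP-3: the function name `evaluate`, the `(head, label)` tuple representation, and the sample labels are hypothetical, and unlabeled structure correctness is assumed here to mean the proportion of sentences in which every word receives the correct head.

```python
from typing import List, Tuple

# One token is (head_index, label); head_index 0 denotes the root.
# This representation is an assumption made for the sketch.
Token = Tuple[int, str]
Sentence = List[Token]


def evaluate(gold: List[Sentence], parsed: List[Sentence]) -> Tuple[float, float, float]:
    """Return (UAS, LAS, unlabeled structure correctness)."""
    if len(gold) != len(parsed):
        raise ValueError("gold and parsed corpora differ in sentence count")

    tokens = head_hits = labeled_hits = perfect_sentences = 0
    for gold_sent, parsed_sent in zip(gold, parsed):
        if len(gold_sent) != len(parsed_sent):
            raise ValueError("sentence lengths differ")
        all_heads_correct = True
        for (g_head, g_label), (p_head, p_label) in zip(gold_sent, parsed_sent):
            tokens += 1
            if g_head == p_head:
                head_hits += 1                 # counts toward UAS
                if g_label == p_label:
                    labeled_hits += 1          # counts toward LAS
            else:
                all_heads_correct = False
        if all_heads_correct:
            perfect_sentences += 1             # whole tree has correct heads

    if tokens == 0:
        raise ValueError("empty corpus")
    return head_hits / tokens, labeled_hits / tokens, perfect_sentences / len(gold)


# Toy data with illustrative labels (not the ETAP-3 label inventory).
gold = [
    [(2, "subj"), (0, "root"), (2, "obj")],
    [(0, "root"), (1, "obj")],
]
parsed = [
    [(2, "subj"), (0, "root"), (2, "adv")],   # one wrong label, all heads right
    [(0, "root"), (0, "obj")],                # one wrong head
]
uas, las, usc = evaluate(gold, parsed)        # UAS 0.8, LAS 0.6, structure correctness 0.5
```

Under these assumptions, the gap between an unlabeled attachment score of 0.900 and a structure correctness of 0.492 is unsurprising: a single wrong head anywhere in a sentence makes the whole tree count as incorrect.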