4,793 research outputs found

    Improving a Strong Neural Parser with Conjunction-Specific Features

    Full text link
    While dependency parsers reach very high overall accuracy, some dependency relations are much harder than others. In particular, dependency parsers perform poorly in coordination construction (i.e., correctly attaching the "conj" relation). We extend a state-of-the-art dependency parser with conjunction-specific features, focusing on the similarity between the conjuncts head words. Training the extended parser yields an improvement in "conj" attachment as well as in overall dependency parsing accuracy on the Stanford dependency conversion of the Penn TreeBank

    Strong domain variation and treebank-induced LFG resources

    Get PDF
    In this paper we present a number of experiments to test the portability of existing treebank induced LFG resources. We test the LFG parsing resources of Cahill et al. (2004) on the ATIS corpus which represents a considerably different domain to the Penn-II Treebank Wall Street Journal sections, from which the resources were induced. This testing shows an under-performance at both c- and f-structure level as a result of the domain variation. We show that in order to adapt the LFG resources of Cahill et al. (2004) to this new domain, all that is necessary is to retrain the c-structure parser on data from the new domain

    Statistical parsing of morphologically rich languages (SPMRL): what, how and whither

    Get PDF
    The term Morphologically Rich Languages (MRLs) refers to languages in which significant information concerning syntactic units and relations is expressed at word-level. There is ample evidence that the application of readily available statistical parsing models to such languages is susceptible to serious performance degradation. The first workshop on statistical parsing of MRLs hosts a variety of contributions which show that despite language-specific idiosyncrasies, the problems associated with parsing MRLs cut across languages and parsing frameworks. In this paper we review the current state-of-affairs with respect to parsing MRLs and point out central challenges. We synthesize the contributions of researchers working on parsing Arabic, Basque, French, German, Hebrew, Hindi and Korean to point out shared solutions across languages. The overarching analysis suggests itself as a source of directions for future investigations

    Reverse engineering to achieve maintainable WWW sites

    Get PDF
    The growth of the World Wide Web and the accelerated development of web sites and associated web technologies has resulted in a variety of maintenance problems. The maintenance problems associated with web sites and the WWW are examined. It is argued that currently web sites and the WWW lack both data abstractions and structures that could facilitate maintenance. A system to analyse existing web sites and extract duplicated content and style is described here. In designing the system, existing Reverse Engineering techniques have been applied, and a case for further application of these techniques is made in order to prepare sites for their inevitable evolution in futur
    corecore