4,539 research outputs found

    An integrated architecture for shallow and deep processing

    Get PDF
    We present an architecture for the integration of shallow and deep NLP components which is aimed at flexible combination of different language technologies for a range of practical current and future applications. In particular, we describe the integration of a high-level HPSG parsing system with different high-performance shallow components, ranging from named entity recognition to chunk parsing and shallow clause recognition. The NLP components enrich a representation of natural language text with layers of new XML meta-information using a single shared data structure, called the text chart. We describe details of the integration methods, and show how information extraction and language checking applications for realworld German text benefit from a deep grammatical analysis

    Treebank annotation schemes and parser evaluation for German

    Get PDF
    Recent studies focussed on the question whether less-congurational languages like German are harder to parse than English, or whether the lower parsing scores are an artefact of treebank encoding schemes and data structures, as claimed by K¨ubler et al. (2006). This claim is based on the assumption that PARSEVAL metrics fully reflect parse quality across treebank encoding schemes. In this paper we present new experiments to test this claim. We use the PARSEVAL metric, the Leaf-Ancestor metric as well as a dependency-based evaluation, and present novel approaches measuring the effect of controlled error insertion on treebank trees and parser output. We also provide extensive past-parsing crosstreebank conversion. The results of the experiments show that, contrary to K¨ubler et al. (2006), the question whether or not German is harder to parse than English remains undecided

    Automatic acquisition of LFG resources for German - as good as it gets

    Get PDF
    We present data-driven methods for the acquisition of LFG resources from two German treebanks. We discuss problems specific to semi-free word order languages as well as problems arising fromthe data structures determined by the design of the different treebanks. We compare two ways of encoding semi-free word order, as done in the two German treebanks, and argue that the design of the TiGer treebank is more adequate for the acquisition of LFG resources. Furthermore, we describe an architecture for LFG grammar acquisition for German, based on the two German treebanks, and compare our results with a hand-crafted German LFG grammar

    Evaluating evaluation measures

    Get PDF
    This paper presents a thorough examination of the validity of three evaluation measures on parser output. We assess parser performance of an unlexicalised probabilistic parser trained on two German treebanks with different annotation schemes and evaluate parsing results using the PARSEVAL metric, the Leaf-Ancestor metric and a dependency-based evaluation. We reject the claim that the T¨uBa-D/Z annotation scheme is more adequate then the TIGER scheme for PCFG parsing and show that PARSEVAL should not be used to compare parser performance for parsers trained on treebanks with different annotation schemes. An analysis of specific error types indicates that the dependency-based evaluation is most appropriate to reflect parse quality

    Probabilistic parsing

    Get PDF
    Postprin

    Dependency Mapping Software for Jira, Project Management Tool

    Get PDF
    Efficiently managing a software development project is extremely important in industry and is often overlooked by the software developers on a project. Pieces of development work are identified by developers and are then handed off to project managers, who are left to organize this information. Project managers must organize this to set expectations for the client, and ensure the project stays on track and on budget. The main block in this process are dependency chains between tasks. Dependency chains can cause a project to take much longer than anticipated or result in the under utilization of developers on a project. While project managers do have access to project management tools, few have capabilities to effectively visualize dependencies. The goal of this research was to interact with a project management tool\u27s API, pull down dependency information for a project, and build out possible timelines for a set of tasks. We visualize this problem with a directed graph, where each node is a task and edges in the graph indicate dependencies. The relationships between this problem and more well-known problems in graph theory are used to inform the development of the algorithms. Two algorithms are explored to handle the problem and are then run under different conditions. Analysis of the results provide insight to what structures of dependency chains can be handled by the algorithms. The resulting software could be used to save companies both time and money when planning software development projects
    corecore