3,893 research outputs found

    An improved parser for data-oriented lexical-functional analysis

    We present an LFG-DOP parser which uses fragments from LFG-annotated sentences to parse new sentences. Experiments with the Verbmobil and Homecentre corpora show that (1) Viterbi n-best search performs about 100 times faster than Monte Carlo search while both achieve the same accuracy; (2) the DOP hypothesis, which states that parse accuracy increases with increasing fragment size, is confirmed for LFG-DOP; (3) LFG-DOP's relative frequency estimator performs worse than a discounted frequency estimator; and (4) LFG-DOP significantly outperforms Tree-DOP if evaluated on tree structures only. Comment: 8 pages

    Data parsing for optimized molecular geometry calculations

    The purpose of this project is to optimize and streamline the process of using ADF and ReaxFF. There is currently no efficient way to add constraints to a compound and run it through ADF, take the ADF output and create a file that can be run through ReaxFF, and then draw conclusions from the ReaxFF output. To streamline this process, scripts were developed in Python to parse information out of the data generated by ADF.
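    The abstract above describes Python scripts that extract quantities from ADF text output so they can feed a follow-on ReaxFF step. A minimal sketch of that kind of parsing script is below; the sample output, field names, and regular expressions are invented for illustration and do not reflect the real ADF file format.

    ```python
    # Hypothetical parser pulling atom coordinates and an energy value
    # out of a quantum-chemistry text output. The format shown here is
    # a stand-in, not actual ADF output.
    import re

    sample_output = """\
    Geometry optimization converged
      Atom  1  C   0.000  0.000  0.000
      Atom  2  O   1.128  0.000  0.000
    Total Bonding Energy: -123.456 kcal/mol
    """

    # One regex per record type: atom lines and the final energy line.
    atom_re = re.compile(r"Atom\s+\d+\s+(\w+)\s+([-\d.]+)\s+([-\d.]+)\s+([-\d.]+)")
    energy_re = re.compile(r"Total Bonding Energy:\s+([-\d.]+)")

    # Collect (element, (x, y, z)) tuples for every atom line found.
    atoms = [(m.group(1), tuple(map(float, m.group(2, 3, 4))))
             for m in atom_re.finditer(sample_output)]
    energy = float(energy_re.search(sample_output).group(1))
    ```

    The parsed `atoms` list could then be written out in whatever geometry format a subsequent ReaxFF run expects.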

    Combining semantic and syntactic structure for language modeling

    Structured language models for speech recognition have been shown to remedy the weaknesses of n-gram models. All current structured language models are, however, limited in that they do not take into account dependencies between non-headwords. We show that non-headword dependencies contribute to a significantly improved word error rate, and that a data-oriented parsing model trained on semantically and syntactically annotated data can exploit these dependencies. This paper also contains the first DOP model trained by means of a maximum likelihood reestimation procedure, which solves some of the theoretical shortcomings of previous DOP models. Comment: 4 pages

    How important is the intensive margin of labor adjustment? Discussion of "Aggregate hours worked in OECD countries: new measurement and implications for business cycles" by Lee Ohanian and Andrea Raffo

    Using new quarterly data for hours worked in OECD countries, Ohanian and Raffo (2011) argue that in many OECD countries, particularly in Europe, hours per worker are quantitatively important as an intensive margin of labor adjustment, possibly because labor market frictions are higher than in the US. I argue that this conclusion is not supported by the data. Using the same data on hours worked, I find evidence that labor market frictions are higher in Europe than in the US, like Ohanian and Raffo, but also that these frictions seem to affect the intensive margin at least as much as the extensive margin of labor adjustment.

    Data-Oriented Language Processing. An Overview

    During the last few years, a new approach to language processing has started to emerge, which has become known under various labels such as "data-oriented parsing", "corpus-based interpretation", and "tree-bank grammar" (cf. van den Berg et al. 1994; Bod 1992-96; Bod et al. 1996a/b; Bonnema 1996; Charniak 1996a/b; Goodman 1996; Kaplan 1996; Rajman 1995a/b; Scha 1990-92; Sekine & Grishman 1995; Sima'an et al. 1994; Sima'an 1995-96; Tugwell 1995). This approach, which we will call "data-oriented processing" or "DOP", embodies the assumption that human language perception and production works with representations of concrete past language experiences, rather than with abstract linguistic rules. The models that instantiate this approach therefore maintain large corpora of linguistic representations of previously occurring utterances. When processing a new input utterance, analyses of this utterance are constructed by combining fragments from the corpus; the occurrence frequencies of the fragments are used to estimate which analysis is the most probable one. In this paper we give an in-depth discussion of a data-oriented processing model which employs a corpus of labelled phrase-structure trees. Then we review some other models that instantiate the DOP approach. Many of these models also employ labelled phrase-structure trees, but use different criteria for extracting fragments from the corpus or employ different disambiguation strategies (Bod 1996b; Charniak 1996a/b; Goodman 1996; Rajman 1995a/b; Sekine & Grishman 1995; Sima'an 1995-96); other models use richer formalisms for their corpus annotations (van den Berg et al. 1994; Bod et al. 1996a/b; Bonnema 1996; Kaplan 1996; Tugwell 1995). Comment: 34 pages, Postscript
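    The mechanism the abstract describes, combining corpus fragments and ranking analyses by fragment occurrence frequencies, can be sketched in a few lines. This is a toy illustration of the general DOP idea only, not the paper's model: the fragments are flat strings standing in for subtrees, the corpus is invented, and probabilities are simple relative frequencies per root label.

    ```python
    # Toy DOP-style ranking: each past utterance contributes fragments;
    # a candidate analysis is scored by the product of its fragments'
    # relative frequencies among fragments with the same root label.
    from collections import Counter

    # Invented "corpus" of fragments extracted from two past analyses.
    corpus_fragments = [
        "S->NP VP", "NP->she", "VP->V NP", "V->saw", "NP->the dog",
        "S->NP VP", "NP->she", "VP->V", "V->left",
    ]

    counts = Counter(corpus_fragments)

    def root(fragment):
        # Root label of a fragment, e.g. "VP" for "VP->V NP".
        return fragment.split("->")[0]

    root_totals = Counter(root(f) for f in corpus_fragments)

    def fragment_prob(fragment):
        # Relative frequency among fragments sharing the same root.
        return counts[fragment] / root_totals[root(fragment)]

    def analysis_prob(fragments):
        # Probability of one derivation = product of fragment probabilities.
        p = 1.0
        for f in fragments:
            p *= fragment_prob(f)
        return p

    cand_a = ["S->NP VP", "NP->she", "VP->V NP", "V->saw", "NP->the dog"]
    cand_b = ["S->NP VP", "NP->she", "VP->V", "V->left"]
    best = max([cand_a, cand_b], key=analysis_prob)
    ```

    A full DOP model would extract all subtrees of the corpus trees, sum over the many derivations that yield the same parse, and apply a disambiguation strategy; the sketch keeps only the frequency-based ranking step.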

    Comparative analysis of copyright assignment and licence formalities for Open Source Contributor Agreements

    This article discusses formal requirements in open source software contributor copyright assignment and licensing agreements. Contributor agreements are contracts by which software developers transfer or license their work on behalf of an open source project. This is done for convenience and enforcement purposes, and usually takes the form of a formal contract. This work conducts a comparative analysis of how several jurisdictions regard those agreements. We specifically look at the formal requirements across those countries to ascertain whether formalities are constitutive or probative. We then look at the consequences of the lack of formalities for the validity of those contributor agreements.