463 research outputs found

    Towards the Development of a Hybrid Parser for Natural Languages

    In order to understand natural languages, we have to be able to determine the relations between words; in other words, we have to be able to 'parse' the input text. This is a difficult task, especially for Arabic, which has a number of properties that make it particularly difficult to handle. There are two approaches to parsing natural languages: grammar-driven and data-driven. Each of these approaches poses its own set of problems, which we discuss in this paper. The goal of our work is to produce a hybrid parser, which retains the advantages of the data-driven approach but is guided by grammar rules in order to produce more accurate output. This work consists of two stages: the first stage is to develop a baseline data-driven parser, which is guided by a machine learning algorithm for establishing dependency relations between words. The second stage is to integrate grammar rules into the baseline parser. In this paper, we describe the first stage of our work, which is now implemented, and a number of experiments that have been conducted on this parser. We also discuss the results of these experiments and highlight the different factors that affect parsing speed and the correctness of the parser results

    Gaps and Resumptive Pronouns in Modern Standard Arabic

    Unbounded dependencies in Modern Standard Arabic often involve not a gap but a null resumptive pronoun. The facts are quite complex, but it is not too difficult to extend the SLASH mechanism of HPSG to handle dependencies with a null resumptive pronoun. It is also not too difficult to restrict the distribution of gaps appropriately

    DCU 250 Arabic dependency bank: an LFG gold standard resource for the Arabic Penn treebank

    This paper describes the construction of a dependency bank gold standard for Arabic, DCU 250 Arabic Dependency Bank (DCU 250), based on the Arabic Penn Treebank Corpus (ATB) (Bies and Maamouri, 2003; Maamouri and Bies, 2004) within the theoretical framework of Lexical Functional Grammar (LFG). For parsing and automatically extracting grammatical and lexical resources from treebanks, it is necessary to evaluate against established gold standard resources. Gold standards for various languages have been developed, but to our knowledge, such a resource has not yet been constructed for Arabic. The construction of the DCU 250 marks the first step towards the creation of an automatic LFG f-structure annotation algorithm for the ATB, and for the extraction of Arabic grammatical and lexical resources

    Simpler semantics for computational and cognitive linguistics

    Certain consequences are considered regarding a simpler, more cognitively plausible treatment of semantics in Sign-Based Construction Grammar, a cognitive, unification-based theory of language. It is proposed that a construction grammar may be able to improve its coverage of core linguistic phenomena in line with minimalist goals (Chomsky 1993). Suggestions are offered regarding relative clauses and wh-expressions to show that a more straightforward account is available, one that allows a unified treatment of scope for quantifiers and wh-expressions

    Head-Driven Phrase Structure Grammar

    Head-Driven Phrase Structure Grammar (HPSG) is a constraint-based or declarative approach to linguistic knowledge, which analyses all descriptive levels (phonology, morphology, syntax, semantics, pragmatics) with feature value pairs, structure sharing, and relational constraints. In syntax it assumes that expressions have a single relatively simple constituent structure. This volume provides a state-of-the-art introduction to the framework. Various chapters discuss basic assumptions and formal foundations, describe the evolution of the framework, and go into the details of the main syntactic phenomena. Further chapters are devoted to non-syntactic levels of description. The book also considers related fields and research areas (gesture, sign languages, computational linguistics) and includes chapters comparing HPSG with other frameworks (Lexical Functional Grammar, Categorial Grammar, Construction Grammar, Dependency Grammar, and Minimalism)

    Modeling information structure in a cross-linguistic perspective

    This study makes substantial contributions to both the theoretical and computational treatment of information structure, with a specific focus on creating natural language processing applications such as multilingual machine translation systems. The present study first provides cross-linguistic findings with regard to information structure meanings and markings. Building upon such findings, the current model represents information structure within the HPSG/MRS framework using Individual Constraints. The primary goal of the present study is to create a multilingual grammar model of information structure for the LinGO Grammar Matrix system. The present study explores the construction of a grammar library for creating customized grammars incorporating information structure and illustrates how the information structure-based model improves the performance of transfer-based machine translation