    Main clause external constituents and the derivation of subject-initial verb Second

    This paper discusses V3-patterns with a sentence-initial adverbial clause in Standard Dutch (StD) and West-Flemish (WF), which appear to violate the V2 restriction that normally regulates word order in these languages. V3-patterns occur in both languages; they can be interpreted as complying with the V2 constraint provided they are analyzed as the result of merging a regular root clause with V2 order with an extra-sentential adverbial clause. The paper shows that the distribution of V3-patterns is slightly wider in WF than in StD: StD requires that the root clause which combines with the extra-sentential constituent either exhibits subject-verb inversion (XP-V-S) or, in the case of subject-initial V2 clauses, that the subject has a distinguished information-structural status such as contrastive focus/topic; WF allows V3-structures more freely, regardless of whether they display subject-verb inversion and regardless of the information-structural status of the subject. The analysis takes as its point of departure the earlier claim that V2-languages can be either symmetric or asymmetric. In symmetric languages, the finite verb always leaves the TP domain and occupies the highest head position in the root clause, the complementizer position C in the traditional generative analysis. In asymmetric languages, the position of the finite verb varies: it occupies the C-position when the root clause exhibits subject-verb inversion, and a lower TP-internal tense position (T) in root clauses without inversion. The hypothesis is that V3-patterns with a sentence-initial adverbial clause are only possible if the initial adverbial clause attains a local relation with the finite verb, and that this requires the finite verb to be in the (higher) C-position. By assuming that StD is an asymmetric V2-language while WF is a symmetric V2-language, the variation with respect to the distribution of V3-patterns in these languages can be captured.

    From treebank resources to LFG F-structures

    We present two methods for automatically annotating treebank resources with functional structures. Both methods define systematic patterns of correspondence between partial PS configurations and functional structures. These are applied to PS rules extracted from treebanks, or directly to constraint set encodings of treebank PS trees.

    Treebank-based acquisition of LFG resources for Chinese

    This paper presents a method to automatically acquire wide-coverage, robust, probabilistic Lexical-Functional Grammar resources for Chinese from the Penn Chinese Treebank (CTB). Our starting point is the earlier, proof-of-concept work of Burke et al. (2004) on automatic f-structure annotation, LFG grammar acquisition and parsing for Chinese using the CTB version 2 (CTB2). We substantially extend and improve on this earlier research as regards coverage, robustness, quality and fine-grainedness of the resulting LFG resources. We achieve this through (i) improved LFG analyses for a number of core Chinese phenomena; (ii) a new automatic f-structure annotation architecture which involves an intermediate dependency representation; (iii) scaling the approach from 4.1K trees in CTB2 to 18.8K trees in CTB version 5.1 (CTB5.1); and (iv) developing a novel treebank-based approach to recovering non-local dependencies (NLDs) for Chinese parser output. Against a new 200-sentence gold standard of manually constructed f-structures, the method achieves 96.00% f-score for f-structures automatically generated for the original CTB trees and 80.01% for NLD-recovered f-structures generated for the trees output by Bikel’s parser.

    Structural aspects of local adjunct languages

    Several open problems concerning local adjunct languages are considered and solved. One of the most interesting (from a linguistic point of view) and difficult (mathematically) open problems was whether or not null symbols can be dispensed with without sacrificing the weak generative capacity. This problem is solved and the answer is negative. Also considered are some problems concerning one-sided grammars, homomorphisms of languages (it is shown that local adjunct languages are not closed under homomorphism), β-linear languages and mixed adjunct grammars.

    Tree adjunct grammars

    In this paper, a tree generating system called a tree adjunct grammar is described and its formal properties are studied, relating them to the tree generating systems of Brainerd (Information and Control 14 (1969), 217–231) and Rounds (Mathematical Systems Theory 4 (1970), 257–287) and to the recognizable sets and local sets discussed by Thatcher (Journal of Computer and System Sciences 1 (1967), 317–322; 4 (1970), 339–367) and Rounds. The linguistic relevance of these systems is also briefly discussed.

    Automatic annotation of the Penn-treebank with LFG f-structure information

    Lexical-Functional Grammar f-structures are abstract syntactic representations approximating basic predicate-argument structure. Treebanks annotated with f-structure information are required as training resources for stochastic versions of unification and constraint-based grammars and for the automatic extraction of such resources. A number of papers (Frank, 2000; Sadler, van Genabith and Way, 2000) have developed methods for automatically annotating treebank resources with f-structure information. However, to date, these methods have only been applied to treebank fragments of the order of a few hundred trees. In the present paper we present a new method that scales and has been applied to a complete treebank, in our case the WSJ section of Penn-II (Marcus et al., 1994), with more than 1,000,000 words in about 50,000 sentences.

    An Abstract Machine for Unification Grammars

    This work describes the design and implementation of an abstract machine, Amalia, for the linguistic formalism ALE, which is based on typed feature structures. This formalism is one of the most widely accepted in computational linguistics and has been used for designing grammars in various linguistic theories, most notably HPSG. Amalia is composed of data structures and a set of instructions, augmented by a compiler from the grammatical formalism to the abstract instructions, and a (portable) interpreter of the abstract instructions. The effect of each instruction is defined using a low-level language that can be executed on ordinary hardware. The advantages of the abstract machine approach are twofold. From a theoretical point of view, the abstract machine gives a well-defined operational semantics to the grammatical formalism. This ensures that grammars specified using our system are endowed with well-defined meaning. It makes it possible, for example, to formally verify the correctness of a compiler for HPSG, given an independent definition. From a practical point of view, Amalia is the first system that employs a direct compilation scheme for unification grammars that are based on typed feature structures. The use of Amalia results in much improved performance over existing systems. In order to test the machine on a realistic application, we have developed a small-scale, HPSG-based grammar for a fragment of the Hebrew language, using Amalia as the development platform. This is the first application of HPSG to a Semitic language. Comment: Doctoral Thesis, 96 pages, many postscript figures, uses pstricks, pst-node, psfig, fullname and a macros fil

    Book Reviews

    Automatic acquisition of LFG resources for German - as good as it gets

    We present data-driven methods for the acquisition of LFG resources from two German treebanks. We discuss problems specific to semi-free word order languages as well as problems arising from the data structures determined by the design of the different treebanks. We compare two ways of encoding semi-free word order, as done in the two German treebanks, and argue that the design of the TiGer treebank is more adequate for the acquisition of LFG resources. Furthermore, we describe an architecture for LFG grammar acquisition for German, based on the two German treebanks, and compare our results with a hand-crafted German LFG grammar.