    Tactical Generation in a Free Constituent Order Language

    This paper describes tactical generation in Turkish, a free constituent order language, in which the order of the constituents may change according to the information structure of the sentences to be generated. In the absence of any information regarding the information structure of a sentence (i.e., topic, focus, background, etc.), the constituents of the sentence obey a default order, but the order is almost freely changeable, depending on the constraints of the text flow or discourse. We have used a recursively structured finite state machine for handling the changes in constituent order, implemented as a right-linear grammar backbone. Our implementation environment is the GenKit system, developed at the Center for Machine Translation, Carnegie Mellon University. Morphological realization has been implemented using an external morphological analysis/generation component which performs concrete morpheme selection and handles morphographemic processes.

    Wh-questions in Japanese: scrambling, reconstruction, and wh-movement

    In this article, I discuss some important properties of wh-questions and wh-scrambling in Japanese. The questions I will address are (i) which instances of (wh-) scrambling involve reconstruction and (ii) how the undoing effects of scrambling can be derived. First I will discuss the claim that (wh-) scrambling is semantically vacuous and is therefore undone at LF (Saito 1989, 1992). Then I consider the data that led Takahashi (1993) to the conclusion that at least some instances of wh-scrambling have to be analyzed as instances of "full wh-movement", i.e., overt movement of the wh-phrase to its scopal position. It will be argued that these examples are not instances of full wh-movement in Japanese, but that they also represent semantically vacuous scrambling. Those instances of scrambling that apparently cannot be undone are best explained with recourse to parsing effects. I conclude that wh-scrambling in Japanese is always triggered by a ([-wh]-) scrambling feature. In addition, long distance scrambling (scrambling out of finite CPs) is analyzed as adjunction movement, whereas short distance scrambling is movement to a specifier position of IP. Turning to the mechanisms of undoing, I will argue that only long distance scrambling is undone. This is shown to follow from Chomsky's (1995) bare phrase structure analysis, according to which multi-segmental categories derived by adjunction movement are not licensed at LF. The article is organized as follows. In section 2, the wh-scrambling phenomenon is described. In section 3, I discuss the reconstruction properties of scrambling. In addition, this section provides some basic assumptions about my analysis of Japanese scrambling in general. In section 4, I turn to the analysis of wh-scrambling as an instance of full wh-movement in Japanese. Section 5 provides discussion of multiple wh-questions in Japanese, and section 6 gives the conclusion.

    Treebank-based acquisition of LFG resources for Chinese

    This paper presents a method to automatically acquire wide-coverage, robust, probabilistic Lexical-Functional Grammar resources for Chinese from the Penn Chinese Treebank (CTB). Our starting point is the earlier, proof-of-concept work of Burke et al. (2004) on automatic f-structure annotation, LFG grammar acquisition and parsing for Chinese using the CTB version 2 (CTB2). We substantially extend and improve on this earlier research as regards coverage, robustness, quality and fine-grainedness of the resulting LFG resources. We achieve this through (i) improved LFG analyses for a number of core Chinese phenomena; (ii) a new automatic f-structure annotation architecture which involves an intermediate dependency representation; (iii) scaling the approach from 4.1K trees in CTB2 to 18.8K trees in CTB version 5.1 (CTB5.1); and (iv) developing a novel treebank-based approach to recovering non-local dependencies (NLDs) for Chinese parser output. Against a new 200-sentence gold standard of manually constructed f-structures, the method achieves 96.00% f-score for f-structures automatically generated for the original CTB trees and 80.01% for NLD-recovered f-structures generated for the trees output by Bikel’s parser.

    Scrambling in German and Japanese: adjunction versus multiple specifiers

    This paper argues that short (clause-internal) scrambling to a pre-subject position has A-properties in Japanese but A'-properties in German, while long scrambling (scrambling across sentence boundaries) from finite clauses, which is possible in Japanese but not in German, has A'-properties throughout. It is shown that these differences between German and Japanese can be traced back to parametric variation of phrase structure and the parameterized properties of functional heads. Due to the properties of Agreement, sentences in Japanese may contain multiple (Agro- and Agrs-) specifiers whereas German does not allow for this. In Japanese, a scrambled element may be located in a Spec AgrP, i.e. an A- or L-related position, whereas scrambled NPs in German can only appear in an AgrP-adjoined (broadly-L-related) position, which only has A'-properties. Given our assumption that successive cyclic adjunction is generally impossible, elements in German may not be long scrambled because a scrambled element that is moved to an adjunction site inside an embedded clause may not move further. In Japanese, long distance scrambling out of finite CPs is possible since scrambling may proceed in a successive cyclic manner via embedded Spec- (AgrP) positions. Our analysis of the differences between German and Japanese scrambling provides us with an account of further contrasts between the two languages such as the existence of surprising asymmetries between German and Japanese remnant-movement phenomena, and the fact that unlike German, Japanese freely allows wh-scrambling. Investigation of the properties of Japanese wh-movement also leads us to the formulation of the "Wh-cluster Hypothesis", which implies that Japanese is an LF multiple wh-fronting language.

    Gathering Statistics to Aspectually Classify Sentences with a Genetic Algorithm

    This paper presents a method for large corpus analysis to semantically classify an entire clause. In particular, we use cooccurrence statistics among similar clauses to determine the aspectual class of an input clause. The process examines linguistic features of clauses that are relevant to aspectual classification. A genetic algorithm determines what combinations of linguistic features to use for this task. Comment: postscript, 9 pages, Proceedings of the Second International Conference on New Methods in Language Processing, Oflazer and Somers ed

    Factoring Predicate Argument and Scope Semantics: Underspecified Semantics with LTAG

    In this paper we propose a compositional semantics for lexicalized tree-adjoining grammar (LTAG). Tree-local multicomponent derivations allow separation of the semantic contribution of a lexical item into one component contributing to the predicate argument structure and a second component contributing to scope semantics. Based on this idea, a syntax-semantics interface is presented where the compositional semantics depends only on the derivation structure. It is shown that the derivation structure (and indirectly the locality of derivations) allows an appropriate amount of underspecification. This is illustrated by investigating underspecified representations for quantifier scope ambiguities and related phenomena such as adjunct scope and island constraints.

    Left dislocation in Zulu

    This paper examines left dislocation constructions in Zulu, a Southern Bantu language belonging to the Nguni group (Zone S 40). In Zulu left dislocation configurations, a topic phrase at the beginning of the sentence is linked to a resumptive element within the associated clause. Typically, the resumptive element is an incorporated pronoun (cf. Bresnan & Mchombo 1987), as illustrated by the examples in (1) and (2). In these examples, the object pronoun (in italics) is part of the verbal morphology and agrees with the noun class (gender) of the dislocate. This situation is schematically illustrated in (3), where co-indexation represents agreement: ..