Search CORE

1,069 research outputs found

Rule-restricted Automaton-grammar transducers: Power and Linguistic Applications

Author: Horáček Petr, Meduna Alexander, Čermák Martin
Publication venue: 'Brno University of Technology'
Publication date: 01/01/2012

This paper introduces the notion of a new transducer as a two-component system, which consists of a finite automaton and a context-free grammar. In essence, while the automaton reads its input string, the grammar produces its output string, and their cooperation is controlled by a set that restricts the usage of their rules. From a theoretical viewpoint, the present paper discusses the power of this system working in an ordinary way as well as in a leftmost way. In addition, the paper introduces appearance checking, which allows us to check whether some symbols are present in the rewritten string, and studies its effect on the power. It achieves the following three main results. First, the system generates and accepts languages defined by matrix grammars and partially blind multi-counter automata, respectively. Second, if we place a leftmost restriction on derivation in the context-free grammar, both the accepting and generating power of the system are equal to the generative power of context-free grammars. Third, the system with appearance checking can accept and generate all recursively enumerable languages. From a more pragmatic viewpoint, this paper describes several linguistic applications. Special attention is paid to the Japanese-Czech translation

Digital library of Brno University of Technology

Bounded-Depth High-Coverage Search Space for Noncrossing Parses

Author: Yli-Jyrä Anssi Mikael
Publication venue: The Association for Computational Linguistics
Publication date: 01/01/2017

Volume: Proceeding volume: 13

A recently proposed encoding for noncrossing digraphs can be used to implement generic inference over families of these digraphs and to carry out first-order factored dependency parsing. It is now shown that the recent proposal can be substantially streamlined without information loss. The improved encoding is less dependent on hierarchical processing and it gives rise to a high-coverage bounded-depth approximation of the space of noncrossing digraphs. This subset is presented elegantly by a finite-state machine that recognises an infinite set of encoded graphs. The set includes more than 99.99% of the 0.6 million noncrossing graphs obtained from the UDv2 treebanks through planarisation.
Rather than taking the low probability of the residual as a flat rate, it can be modelled with a joint probability distribution that is factorised into two underlying stochastic processes – the sentence length distribution and the related conditional distribution for deep nesting. This model points out that deep nesting in the streamlined code requires extreme sentence lengths. High depth is categorically out in common sentence lengths but emerges slowly at infrequent lengths that prompt further inquiry.

Peer reviewed

Crossref

Helsingin yliopiston digitaalinen arkisto

Binding Phenomena Within A Reductionist Theory of Grammatical Dependencies

Author: Drummond Alex
Publication venue
Publication date: 01/01/2011

This thesis investigates the implications of binding phenomena for the development of a reductionist theory of grammatical dependencies. The starting point is the analysis of binding and control in Hornstein (2001, 2009). A number of revisions are made to this framework in order to develop a simpler and empirically more successful account of binding phenomena. The major development is the rejection of economy-based accounts of Condition B effects. It is argued that Condition B effects derive directly from an anti-locality constraint on A-movement. Competition between different dependency types is crucial to the analysis, but is formulated in terms of a heavily revised version of Reinhart's (2006) "No Sneaking" principle, rather than in terms of a simple economy preference for local over non-local dependencies. In contrast to Reinhart's No Sneaking, the condition presented here ("Keeping Up Appearances") has a phonologically rather than semantically specified comparison set. A key claim of the thesis is that the morphology of pronouns and reflexives is of little direct grammatical import. It is argued that much of the complexity of the contemporary binding literature derives from the attempt to capture the distribution of pronouns and reflexives in largely, or purely, syntactic and semantic terms. The analysis presented in this dissertation assigns a larger role to language-specific "spellout" rules, and to general pragmatic/interpretative principles governing the choice between competing morphemes. Thus, a core assumption of binding theory from LGB onwards is rejected: there is no syntactic theory which accounts for the distribution of pronouns and reflexives. Rather, there is a core theory of grammatical dependencies which must be conjoined with phonological, morphological and pragmatic principles to yield the distributional facts in any given language.
In this respect, the approach of the thesis is strictly non-lexicalist: there are no special lexical items which trigger certain kinds of grammatical dependency. All non-strictly-local grammatical dependencies are formed via A- or A′-chains, and copies in these chains are pronounced according to a mix of universal principles and language-specific rules. The broader goal of the thesis is to further the prospects for a "reductionist" approach to grammatical dependencies along these lines. The most detailed empirical component of the thesis is an investigation of the problem posed by binding out of prepositional phrases. Even in a framework incorporating sideward movement, the apparent lack of c-command in this configuration poses a problem. Chapter 3 attempts to revive a variant of the traditional "reanalysis" account of binding out of PP. This segues into an investigation of certain properties of pseudopassivization and preposition stranding. The analyses in this thesis are stated within an informal syntactic framework. However, in order to investigate the precise implications of a particular economy condition, Merge over Move, a partial formalization of this framework is developed in chapter 4. This permits the economy condition to be stated precisely, and in a manner which does not have adverse implications for computational complexity

CiteSeerX

Digital Repository at the University of Maryland

ProQuest OAI Repository

Applying dynamic Bayesian networks in transliteration detection and generation

Author: Nabende Peter
Publication venue: s.n.
Publication date: 01/01/2011

Proceedings - University of Groningen