18 research outputs found

    Un algorithme d'analyse de type Earley pour grammaires à concaténation d'intervalles

    We present several parsing algorithms for Range Concatenation Grammar (RCG), including an entirely novel Earley-style algorithm, using the deductive parsing framework. Our work is motivated by recent interest in range concatenation grammar in general and fills a gap in the existing literature.

    Forgotten Islands of Regularity in Phonology

    Open access publication of this volume supported by National Research, Development and Innovation Office grant NKFIH #120145 `Deep Learning of Morphological Structure'.
    The birth of Finite State Phonology is classically attributed to Johnson (1972) and Kaplan and Kay (1994). However, there is an earlier discovery that came very close to this achievement. In 1965, Hennie presented a very general sufficient condition for regularity of Turing machines. Although this discovery predates Generative Phonology (Chomsky and Halle, 1968), it is a mystery why its relevance was not realized until recently (Yli-Jyrä, 2017). Hennie’s early work provides enough generality to advance even today’s frontier of finite-state phonology. First, it lets us construct a finite-state transducer from any grammar implemented by a tightly bounded one-tape Turing machine. If the machine runs in o(n log n) time, the construction is possible, and this case is reasonably decidable. Second, it can be used to model the regularity in context-sensitive derivations. For example, the suffixation in hunspell dictionaries (Németh et al., 2004) corresponds to time-bounded two-way computations performed by a Hennie machine. Third, it challenges us to look for new forgotten islands of regularity where Hennie’s condition does not necessarily hold.
    Hennie presented a very general sufficient condition for regularity of Turing machines. This predates Generative Phonology (Chomsky & Halle 1968) and the related finite-state research (Johnson 1972; Kaplan & Kay 1994). Hennie’s condition lets us (1) construct a finite-state transducer from any grammar implemented by a linear-time Turing machine, and (2) model the regularity in context-sensitive derivations. For example, the suffixation in hunspell dictionaries (Németh et al. 2004) corresponds to time-bounded two-way computations performed by a Hennie machine. Furthermore, it challenges us to look for new forgotten islands of regularity where Hennie’s condition does not necessarily hold.

    Mild context-sensitivity and tuple-based generalizations of context-free grammar

    This paper classifies a family of grammar formalisms that extend context-free grammar by talking about tuples of terminal strings, rather than independently combining single terminal words into larger single phrases. These include a number of well-known formalisms, such as head grammar and linear context-free rewriting systems, but also a new formalism, (simple) literal movement grammar, which strictly extends the previously known formalisms while preserving polynomial-time recognizability. The descriptive capacity of simple literal movement grammars is illustrated both formally, through a weak generative capacity argument, and in a more practical sense, by the description of conjunctive cross-serial relative clauses in Dutch. After sketching a complexity result and drawing a number of conclusions from the illustrations, the paper suggests that the notion of mild context-sensitivity currently in use, which depends on the rather loosely defined concept of constant growth, needs a modification to apply sensibly to the illustrated facts; an attempt at such a revision is proposed.

    A Corpus Investigation of Syntactic Embedding in Piraha

    The Pirahã language has been at the center of recent debates in linguistics, in large part because it is claimed not to exhibit recursion, a purported universal of human language. Here, we present an analysis of a novel corpus of natural Pirahã speech that was originally collected by Dan Everett and Steve Sheldon. We make the corpus freely available for further research. In the corpus, Pirahã sentences have been shallowly parsed and given morpheme-aligned English translations. We use the corpus to investigate the formal complexity of Pirahã syntax by searching for evidence of syntactic embedding. In particular, we search for sentences that could be analyzed as containing center-embedding, sentential complements, adverbials, complementizers, embedded possessors, conjunction or disjunction. We do not find unambiguous evidence for recursive embedding of sentences or noun phrases in the corpus. We find that the corpus is plausibly consistent with an analysis of Pirahã as a regular language, although this is not the only plausible analysis.

    El lenguaje natural como lenguaje formal

    Formal language theory is useful for the study of natural language. In particular, it is of interest to study the adequacy of grammatical formalisms for expressing the syntactic phenomena present in natural language. First, this helps to draw hypotheses about the nature and complexity of the speaker-hearer's linguistic competence, a fundamental question in linguistics and other cognitive sciences. Moreover, from an engineering point of view, it reveals the practical limitations of applications based on those formalisms. This article introduces the problem of the adequacy of grammatical formalisms for natural language, along with some formal language theory concepts required for the discussion. It then reviews the formalisms that have been proposed through history, and the arguments that have been given to support or reject their adequacy.

    The IO and OI hierarchies revisited

    We study languages of λ-terms generated by IO and OI unsafe grammars. These languages can be used to model meaning representations in the formal semantics of natural languages following the tradition of Montague [25]. Using techniques pertaining to the denotational semantics of the simply typed λ-calculus, we show that the emptiness and membership problems for both types of grammars are decidable. In the course of the proof of the decidability results for OI, we identify a decidable variant of the λ-definability problem, and prove a stronger form of Statman's finite completeness theorem [35].

    Teoría de lenguajes formales : una introducción para lingüistas

    This work is a brief introduction to formal language theory and computational complexity theory, with a special emphasis on the application of these theories to linguistics and cognitive science. After an introduction presenting some basic mathematical notions, the book offers a detailed presentation of regular and context-free systems, and then turns to the systems classified nowadays as mildly context-sensitive, paying special attention to Linear Context-Free Rewriting Systems. The last part of the work is devoted to computability and Turing machines, in order to introduce the reader to the notion of TIME and SPACE complexity classes. This part closes with some considerations concerning the conjecture of whether the classes P and NP are equal.

    Tree Description Grammars and Underspecified Representations

    In this thesis, a new grammar formalism called (local) Tree Description Grammar (TDG) is presented that generates tree descriptions. This grammar formalism brings together some of the central ideas in the context of Tree Adjoining Grammars (TAG) on the one hand, and approaches to underspecified semantics for scope ambiguities on the other hand. First a general definition of TDGs is presented, and afterwards a restricted variant called local TDGs is proposed. Since the elements of a local TDG are tree descriptions, an extended domain of locality as in TAGs is provided by this formalism. Consequently, local TDGs can be lexicalized, and local dependencies such as filler-gap dependencies can be expressed in the descriptions occurring in the grammar. The tree descriptions generated by local TDGs are such that the dominance relation (i.e. the reflexive and transitive closure of the parent relation) need not be fully specified. Therefore the generation of suitable underspecified representations for scope ambiguities is possible. The generative capacity of local TDGs is greater than that of TAGs. Local TDGs are even more powerful than set-local multicomponent TAGs (MC-TAG). However, the generative capacity of local TDGs is restricted in such a way that only semilinear languages are generated. Therefore these languages are of constant growth, a property generally ascribed to natural languages. Local TDGs of different rank can be distinguished depending on the form of derivation steps that are possible in these grammars. This leads to a hierarchy of local TDGs. For the string languages generated by local TDGs of a certain rank, a pumping lemma is proven which shows that local TDGs of rank n can generate a language L_i := {a_1^k ··· a_i^k | k ≥ 0} iff i ≤ 2n holds. In order to describe the relation between two languages, synchronous local TDGs are introduced. The synchronization with a second local TDG does not increase the generative power of the grammar, in the sense that each language generated by a local TDG that is part of a synchronous pair of local TDGs can also be generated by a single local TDG. This formalism of synchronous local TDGs is used to describe a syntax-semantics interface for a fragment of French which illustrates the derivation of underspecified representations for scope ambiguities with local TDGs.
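
To make the counting languages in the last abstract concrete, here is a minimal Python sketch of a membership test for a language of i equal-length blocks, one block per symbol. It is only an illustration and is not taken from the thesis: the function name `in_counting_language` and the encoding of the symbols as the strings "a1", "a2", ... are assumptions made for the example.

```python
from itertools import groupby


def in_counting_language(word, i):
    """Check whether `word` consists of i blocks a1^k a2^k ... ai^k, k >= 0.

    `word` is a list of symbol names such as ["a1", "a1", "a2", "a2"].
    The symbol naming scheme is a hypothetical encoding for this sketch.
    """
    if not word:
        return True  # k = 0 yields the empty word
    # Collapse the word into (symbol, run length) pairs.
    runs = [(sym, len(list(grp))) for sym, grp in groupby(word)]
    symbols = [sym for sym, _ in runs]
    lengths = {n for _, n in runs}
    expected = [f"a{j}" for j in range(1, i + 1)]
    # Blocks must appear in order a1..ai and all have the same length k.
    return symbols == expected and len(lengths) == 1
```

For instance, `in_counting_language(["a1", "a1", "a2", "a2"], 2)` accepts, while a word whose blocks have different lengths is rejected. The sketch only tests membership directly; it says nothing about which grammar rank is needed to generate the language.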