5,757 research outputs found

    Abstract Interpretation of Indexed Grammars.

    Indexed grammars are a generalization of context-free grammars and recognize a proper subset of the context-sensitive languages. The class of languages recognized by indexed grammars is called the indexed languages, and it corresponds to the languages recognized by nested stack automata. For example, indexed grammars can recognize the language {a^n b^n c^n | n >= 1}, which is not context-free, but they cannot recognize {(ab^n)^n | n >= 1}, which is context-sensitive. Indexed grammars identify a set of languages that are more expressive than context-free languages, while having decidability results that lie in between those of context-free and context-sensitive languages. In this work we study indexed grammars in order to formalize the relation between indexed languages and the other classes of languages in the Chomsky hierarchy. To this end, we provide a fixpoint characterization of the languages recognized by an indexed grammar and we study possible ways to abstract, in the abstract interpretation sense, these languages and their grammars into context-free and regular languages.
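
As an illustration of how an index stack lets a grammar keep three counts in step, the following is a minimal sketch that hand-simulates one derivation of a^n b^n c^n. The grammar it follows is a textbook-style assumption, not taken from the paper: S[σ] → S[fσ], S[σ] → A[σ] B[σ] C[σ], and each of A, B, C pops one index per emitted letter. The function names `derive` and `expand` are hypothetical.

```python
def expand(letter, stack):
    """Rewrite X[f sigma] -> letter X[sigma] until the stack is empty (X[] -> epsilon)."""
    out = []
    while stack:
        stack = stack[1:]   # pop one index symbol
        out.append(letter)  # emit one terminal per popped index
    return "".join(out)


def derive(n):
    """Simulate a derivation of a^n b^n c^n in the assumed indexed grammar."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # S[] => S[f] => ... => S[f^n]: push n copies of the index symbol f.
    stack = ("f",) * n
    # S[sigma] -> A[sigma] B[sigma] C[sigma]: the whole stack is copied
    # to each nonterminal, which is what a context-free rule cannot do.
    return expand("a", stack) + expand("b", stack) + expand("c", stack)


if __name__ == "__main__":
    for n in range(1, 4):
        print(n, derive(n))
```

The copying step is the point of the sketch: all three nonterminals inherit the same stack, so the three blocks necessarily have equal length.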

    Partial (In)Completeness in Abstract Interpretation

    In the abstract interpretation framework, completeness represents an optimal simulation by the abstract operators of the behavior of the concrete operators. This corresponds to an ideal (often rare) feature where there is no loss of information accumulated in abstract computations with respect to the properties encoded by the underlying abstract domains. In this thesis, we deal with the opposite notion of completeness in abstract interpretation, that is, incompleteness, applied to two different contexts: static program analysis and formal languages in the Chomsky hierarchy. In static program analysis, completeness is a very rare condition to be satisfied in practice and only straightforward abstractions are complete for all programs; thus, we usually deal with incompleteness. For this reason, we introduce the notion of partial completeness. Partial completeness is a weaker notion of completeness which requires the imprecision of the analysis to be limited. A partially complete abstract interpretation allows some false alarms to be reported, but their number is bounded by a constant. We collect in partial completeness classes all the programs whose abstract interpretations share the same upper bound of imprecision. We then focus on the investigation of the computational limits of the class of partially complete programs with respect to a given abstract domain. Moreover, we show that the class of all partially complete programs is not recursively enumerable, and its complement is productive whenever we allow an unlimited imprecision in the abstract domain. Finally, we formalize the local partial completeness class, within which we require partial completeness only on some specific inputs.
    We prove that this last class of programs is a recursively enumerable set under a structural hypothesis on the underlying abstract domain, by showing an algorithm capable of proving the local partial completeness of a program with respect to a given abstract domain and an upper bound of imprecision. In formal language theory, we want to study a possible reformulation, by abstract interpretation, of classes of languages in the Chomsky hierarchy, and, by exploiting the incompleteness of language abstractions, we want to define separation results between classes of languages. To this end, we take a first step in this direction by studying the relation between indexed languages (recognized by indexed grammars) and context-free languages. Indexed grammars are a generalization of context-free grammars which recognize a proper subset of context-sensitive languages, the so-called indexed languages. For example, indexed grammars can recognize the language {a^n b^n c^n | n >= 1}, which is not context-free, but they cannot recognize {(ab^n)^n | n >= 1}, which is context-sensitive. Indexed grammars identify a set of languages that are more expressive than context-free languages, while having decidability results that lie in between those of context-free and context-sensitive languages. We provide a fixpoint characterization of the languages recognized by an indexed grammar and we study possible ways to abstract, in the abstract interpretation sense, these languages and their grammars into context-free and regular languages. We formalize the separation class between indexed and context-free languages, i.e., all the languages that cannot be generated by a context-free grammar, as an instance of incompleteness of stack elimination abstraction over indexed grammars.
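
To give a feel for what a fixpoint characterization of a grammar's language looks like, here is a minimal sketch for the simpler context-free case only; it is not the construction for indexed grammars described in the thesis. It assumes the grammar S → a S b | ab and computes the least solution of X = {ab} ∪ {a w b | w ∈ X} by Kleene iteration, truncated at a maximum word length so that the iteration terminates. The name `kleene_language` and the length bound are illustrative assumptions.

```python
def kleene_language(max_len):
    """Iterate X -> {ab} | {a w b : w in X} from the empty set, keeping words up to max_len."""
    lang = set()
    while True:
        # One application of the grammar's equation to the current approximation.
        step = {"ab"} | {"a" + w + "b" for w in lang}
        # Truncate so the ascending chain stabilizes after finitely many steps.
        step = {w for w in step if len(w) <= max_len}
        if step == lang:
            return lang  # fixpoint reached (up to the length bound)
        lang = step


if __name__ == "__main__":
    print(sorted(kleene_language(6), key=len))
```

Each iteration adds the words derivable with one more rule application, so the approximations form an increasing chain whose limit, within the bound, is the language of the assumed grammar.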

    Graph-Based Shape Analysis Beyond Context-Freeness

    We develop a shape analysis for reasoning about relational properties of data structures. Both the concrete and the abstract domain are represented by hypergraphs. The analysis is parameterized by user-supplied indexed graph grammars to guide concretization and abstraction. This novel extension of context-free graph grammars is powerful enough to model complex data structures such as balanced binary trees with parent pointers, while preserving most desirable properties of context-free graph grammars. One strength of our analysis is that no artifacts apart from grammars are required from the user; it thus offers a high degree of automation. We implemented our analysis and successfully applied it to various programs manipulating AVL trees, (doubly-linked) lists, and combinations of both.

    An Alternative Conception of Tree-Adjoining Derivation

    The precise formulation of derivation for tree-adjoining grammars has important ramifications for a wide variety of uses of the formalism, from syntactic analysis to semantic interpretation and statistical language modeling. We argue that the definition of tree-adjoining derivation must be reformulated in order to manifest the proper linguistic dependencies in derivations. The particular proposal is both precisely characterizable through a definition of TAG derivations as equivalence classes of ordered derivation trees, and computationally operational, by virtue of a compilation to linear indexed grammars together with an efficient algorithm for recognition and parsing according to the compiled grammar. Comment: 33 pages.

    Calibrating Generative Models: The Probabilistic Chomsky–Schützenberger Hierarchy

    A probabilistic Chomsky–Schützenberger hierarchy of grammars is introduced and studied, with the aim of understanding the expressive power of generative models. We offer characterizations of the distributions definable at each level of the hierarchy, including probabilistic regular, context-free, (linear) indexed, context-sensitive, and unrestricted grammars, each corresponding to familiar probabilistic machine classes. Special attention is given to distributions on (unary notations for) positive integers. Unlike in the classical case, where the "semi-linear" languages all collapse into the regular languages, we show, using analytic tools adapted from the classical setting, that there is no collapse in the probabilistic hierarchy: more distributions become definable at each level. We also address related issues such as closure under probabilistic conditioning.

    The Computational Complexity of Symbolic Dynamics at the Onset of Chaos

    In a variety of studies of dynamical systems, the edge of order and chaos has been singled out as a region of complexity. It was suggested by Wolfram, on the basis of qualitative behaviour of cellular automata, that the computational basis for modelling this region is the Universal Turing Machine. In this paper, following a suggestion of Crutchfield, we try to show that the Turing machine model may often be too powerful as a computational model to describe the boundary of order and chaos. In particular we study the region of the first accumulation of period doubling in unimodal and bimodal maps of the interval, from the point of view of language theory. We show that in relation to the "extended" Chomsky hierarchy, the relevant computational model in the unimodal case is the nested stack automaton or the related indexed languages, while the bimodal case is modeled by the linear bounded automaton or the related context-sensitive languages. Comment: 1 reference corrected, 1 reference added, minor changes in body of manuscript.

    CHR Grammars

    A grammar formalism based upon CHR is proposed analogously to the way Definite Clause Grammars are defined and implemented on top of Prolog. These grammars execute as robust bottom-up parsers with an inherent treatment of ambiguity and a high flexibility to model various linguistic phenomena. The formalism extends previous logic programming based grammars with a form of context-sensitive rules and the possibility to include extra-grammatical hypotheses in both head and body of grammar rules. Among the applications are straightforward implementations of Assumption Grammars and abduction under integrity constraints for language analysis. CHR grammars appear as a powerful tool for specification and implementation of language processors and may be proposed as a new standard for bottom-up grammars in logic programming. To appear in Theory and Practice of Logic Programming (TPLP), 2005. Comment: 36 pp. To appear in TPLP, 2005.

    On Descriptive Complexity, Language Complexity, and GB

    We introduce $L^2_{K,P}$, a monadic second-order language for reasoning about trees which characterizes the strongly Context-Free Languages in the sense that a set of finite trees is definable in $L^2_{K,P}$ iff it is (modulo a projection) a Local Set, that is, the set of derivation trees generated by a CFG. This provides a flexible approach to establishing language-theoretic complexity results for formalisms that are based on systems of well-formedness constraints on trees. We demonstrate this technique by sketching two such results for Government and Binding Theory. First, we show that free-indexation, the mechanism assumed to mediate a variety of agreement and binding relationships in GB, is not definable in $L^2_{K,P}$ and therefore not enforceable by CFGs. Second, we show how, in spite of this limitation, a reasonably complete GB account of English can be defined in $L^2_{K,P}$. Consequently, the language licensed by that account is strongly context-free. We illustrate some of the issues involved in establishing this result by looking at the definition, in $L^2_{K,P}$, of chains. The limitations of this definition provide some insight into the types of natural linguistic principles that correspond to higher levels of language complexity. We close with some speculation on the possible significance of these results for generative linguistics. Comment: To appear in Specifying Syntactic Structures, papers from the Logic, Structures, and Syntax workshop, Amsterdam, Sept. 1994. LaTeX source with nine included postscript figures.