Double Greibach operator grammars
Every context-free grammar can be transformed into one in double Greibach operator form, which satisfies both double Greibach form and operator form. Examination of the expressive power of various well-known subclasses of context-free grammars in double Greibach and/or operator form yields an extended hierarchy of language classes. Basic decision properties such as equivalence can be stated in stronger forms via new classes of languages in this hierarchy.
Generating All Permutations by Context-Free Grammars in Greibach Normal Form
We consider context-free grammars $G_n$ in Greibach normal form and, particularly, in Greibach $m$-form ($m = 1, 2$) which generate the finite language $L_n$ of all $n!$ strings that are permutations of $n$ different symbols ($n \geq 1$). These grammars are investigated with respect to their descriptional complexity, i.e., we determine the number of nonterminal symbols and the number of production rules of $G_n$ as functions of $n$. As in the case of Chomsky normal form, these descriptional complexity measures grow faster than any polynomial function.
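To make the size measures concrete, here is a minimal sketch of a naive grammar for all permutations. It is an assumption for illustration, not the construction studied in the paper: each nonterminal is the set of symbols still to be emitted, and every rule starts with a terminal, so the grammar is in Greibach normal form. The names `permutation_grammar` and `generate` are hypothetical.

```python
from itertools import combinations


def permutation_grammar(symbols):
    """Build a naive Greibach-form grammar for all permutations of `symbols`.

    Assumption: `symbols` holds distinct single characters. A nonterminal is
    the frozenset of symbols not yet emitted; a rule (a, rest) stands for
    "emit terminal a, then continue with nonterminal rest". An empty `rest`
    means the rule consists of the terminal alone.
    """
    rules = {}
    for size in range(1, len(symbols) + 1):
        for subset in combinations(symbols, size):
            remaining = frozenset(subset)
            rules[remaining] = [(a, remaining - {a}) for a in sorted(remaining)]
    return rules


def generate(rules, nonterminal):
    """Yield every string derivable from `nonterminal`."""
    if not nonterminal:
        yield ""
        return
    for terminal, rest in rules[nonterminal]:
        for tail in generate(rules, rest):
            yield terminal + tail


rules = permutation_grammar("abc")
nonterminal_count = len(rules)
production_count = sum(len(body) for body in rules.values())
words = sorted(generate(rules, frozenset("abc")))
```

For $n$ symbols this naive grammar uses $2^n - 1$ nonterminals and $n \cdot 2^{n-1}$ productions, so it only gives an upper bound; the abstract's claim is the stronger statement that the measures determined in the paper also grow faster than any polynomial.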
An Alternative Formulation of Cocke-Younger-Kasami's Algorithm
We provide a reformulation of Cocke-Younger-Kasami's algorithm for recognizing context-free languages in which there are no references either to indices of table entries or to the length of the input string. Some top-down analogues of this functional approach are discussed as well.
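As a rough illustration of a table-free, functional style of recognition, the sketch below memoizes on substrings instead of filling an indexed table. It is an assumption, not the paper's formulation: it still uses split positions and `len`, and the grammar, `derives`, and `recognizes` are hypothetical names chosen for the example.

```python
from functools import lru_cache

# Hypothetical grammar in Chomsky normal form for { a^n b^n | n >= 1 }:
#   S -> A B | A T,   T -> S B,   A -> a,   B -> b
TERMINAL_RULES = {"a": {"A"}, "b": {"B"}}
BINARY_RULES = {("A", "B"): {"S"}, ("A", "T"): {"S"}, ("S", "B"): {"T"}}


@lru_cache(maxsize=None)
def derives(word: str) -> frozenset:
    """Return the set of nonterminals that derive `word`."""
    if len(word) == 1:
        return frozenset(TERMINAL_RULES.get(word, set()))
    found = set()
    # Try every way of cutting the word into a nonempty left and right part.
    for split in range(1, len(word)):
        left = derives(word[:split])
        right = derives(word[split:])
        for b in left:
            for c in right:
                found |= BINARY_RULES.get((b, c), set())
    return frozenset(found)


def recognizes(word: str, start: str = "S") -> bool:
    """True if the start symbol derives the (nonempty) word."""
    return bool(word) and start in derives(word)
```

The memoization cache plays the role of the usual CYK table, keyed by the substring itself rather than by a pair of positions.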
Formal Languages and Compilation
This textbook describes the essential principles and methods used for defining the syntax of artificial languages, and for designing efficient parsing algorithms and syntax-directed translators with semantic attributes. A comprehensive selection of topics is presented within a rigorous, unified framework, illustrated by numerous practical examples. Features and topics: presents a novel conceptual approach to parsing algorithms that applies to extended BNF grammars, together with a parallel parsing algorithm; supplies supplementary teaching tools, including course slides and exercises with solutions, at an associated website; unifies the concepts and notations used in different approaches, enabling an extended coverage of methods with a reduced number of definitions; systematically discusses ambiguous forms, allowing readers to avoid pitfalls when designing grammars; describes all algorithms in pseudocode, so that detailed knowledge of a specific programming language is not necessary; makes extensive use of theoretical models of automata, transducers and formal grammars; includes concise coverage of algorithms for processing regular expressions and finite automata; and introduces static program analysis based on flow equations. This clearly written, classroom-tested textbook is an ideal guide to the fundamentals of this field for advanced undergraduate and graduate students in computer science and computer engineering. Some background in programming is required, and readers should also be familiar with basic set theory, algebra and logic.
Generalizing input-driven languages: theoretical and practical benefits
Regular languages (RL) are the simplest family in Chomsky's hierarchy. Thanks
to their simplicity they enjoy various nice algebraic and logic properties that
have been successfully exploited in many application fields. Practically all of
their related problems are decidable, so that they support automatic
verification algorithms. Also, they can be recognized in real-time.
Context-free languages (CFL) are another major family well-suited to
formalize programming, natural, and many other classes of languages; their
increased generative power w.r.t. RL, however, causes the loss of several
closure properties and of the decidability of important problems; furthermore
they need complex parsing algorithms. Thus, various subclasses thereof have
been defined with different goals, spanning from efficient, deterministic
parsing to closure properties, logic characterization and automatic
verification techniques.
Among CFL subclasses, so-called structured ones, i.e., those where the
typical tree-structure is visible in the sentences, exhibit many of the
algebraic and logic properties of RL, whereas deterministic CFL have been
thoroughly exploited in compiler construction and other application fields.
After surveying and comparing the main properties of those various language
families, we go back to operator precedence languages (OPL), an old family
through which R. Floyd pioneered deterministic parsing, and we show that they
offer unexpected properties in two fields so far investigated in totally
independent ways: they enable parsing parallelization in a more effective way
than traditional sequential parsers, and exhibit the same algebraic and logic
properties so far obtained only for less expressive language families.
flap: A Deterministic Parser with Fused Lexing
Lexers and parsers are typically defined separately and connected by a token
stream. This separate definition is important for modularity and reduces the
potential for parsing ambiguity. However, materializing tokens as data
structures and case-switching on tokens comes with a cost. We show how to fuse
separately-defined lexers and parsers, drastically improving performance
without compromising modularity or increasing ambiguity. We propose a
deterministic variant of Greibach Normal Form that ensures deterministic
parsing with a single token of lookahead and makes fusion strikingly simple,
and prove that normalizing context-free expressions into the deterministic
normal form is semantics-preserving. Our staged parser combinator library,
flap, provides a standard interface, but generates specialized token-free code
that runs two to six times faster than ocamlyacc on a range of benchmarks.
Comment: PLDI 2023 with appendi
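The idea of deterministic parsing from a Greibach-style grammar can be sketched as follows. This is one plausible reading and an assumption, not flap's actual normal form or generated code: every non-empty production starts with a terminal, productions of the same nonterminal start with different terminals, and some nonterminals may also derive the empty string. The grammar, `NULLABLE`, and `parse` are hypothetical.

```python
# Hypothetical grammar for balanced parentheses:
#   S -> "(" S R S | epsilon,   R -> ")"
# Each nonterminal maps a leading terminal to the nonterminals that follow it.
GRAMMAR = {
    "S": {"(": ["S", "R", "S"]},
    "R": {")": []},
}
NULLABLE = {"S"}  # nonterminals that may derive the empty string


def parse(tokens) -> bool:
    """Recognize `tokens` with one token of lookahead and no backtracking."""
    stack = ["S"]  # nonterminals still to be matched; top is the last item
    for tok in tokens:
        while stack:
            top = stack.pop()
            if tok in GRAMMAR[top]:
                # The token selects exactly one production for `top`.
                stack.extend(reversed(GRAMMAR[top][tok]))
                break
            if top not in NULLABLE:
                return False  # `top` cannot start with tok nor vanish
        else:
            return False  # stack ran out before the token was consumed
    # Anything left on the stack must be able to derive the empty string.
    return all(n in NULLABLE for n in stack)
```

Because the current token alone picks the production, the parser never needs to materialize alternatives, which is the property that makes a single token of lookahead sufficient in this sketch.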
Commutative Languages and their Composition by Consensual Methods
Commutative languages with the semilinear property (SLIP) can be naturally
recognized by real-time NLOG-SPACE multi-counter machines. We show that unions
and concatenations of such languages can be similarly recognized, relying on,
and further developing, our recent results on the family of consensually
regular (CREG) languages. A CREG language is defined by a regular language on
the alphabet that includes the terminal alphabet and its marked copy. New
conditions ensuring that the union or concatenation of CREG languages is again
a CREG language are presented and applied to the commutative SLIP languages. The paper
contributes to the knowledge of the CREG family, and introduces novel
techniques for language composition, based on arithmetic congruences that act
as language signatures. Open problems are listed.
Comment: In Proceedings AFL 2014, arXiv:1405.527