43 research outputs found
Aperiodicity, Star-freeness, and First-order Definability of Structured Context-Free Languages
A classic result in formal language theory is the equivalence among
noncounting, or aperiodic, regular languages, and languages defined through
star-free regular expressions, or first-order logic. Together with first-order
completeness of linear temporal logic these results constitute a theoretical
foundation for model-checking algorithms. Extending these results to structured
subclasses of context-free languages, such as tree languages, did not work as
smoothly: for instance, W. Thomas showed that there are star-free tree languages
that are counting. We show, instead, that investigating the same properties
within the family of operator precedence languages leads to equivalences that
perfectly match those on regular languages. The study of this old family of
context-free languages has recently been resumed, not only to enhance parsing
(the original motivation of its inventor, R. Floyd) but also to exploit its
algebraic and logical properties. We have been able to reproduce the classic
results of regular languages for this much larger class by going back to string
languages rather than tree languages. Since operator precedence languages
strictly include other classes of structured languages such as visibly pushdown
languages, the same results given in this paper hold as a trivial corollary for
that family too.
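For reference, the noncounting (aperiodic) property mentioned in this abstract can be written as follows for string languages. This is the standard textbook formulation for a language L over an alphabet Σ, given here as an assumption; the paper's own definition for operator precedence languages may differ in how it accounts for syntactic structure.

```latex
L \subseteq \Sigma^* \text{ is noncounting} \iff
\exists n > 0 \;\; \forall x, y, z \in \Sigma^* :\;
\bigl( x\, y^{n} z \in L \iff x\, y^{n+1} z \in L \bigr)
```

For example, (aa)* is counting, since membership depends on the parity of the number of a's, whereas a* is noncounting.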
Algebraic properties of structured context-free languages: old approaches and novel developments
The historical research line on the algebraic properties of structured CF
languages initiated by McNaughton's Parenthesis Languages has recently
attracted much renewed interest with the Balanced Languages, the Visibly
Pushdown Automata languages (VPDA), the Synchronized Languages, and the
Height-deterministic ones. Such families preserve to a varying degree the basic
algebraic properties of Regular languages: boolean closure, closure under
reversal, under concatenation, and Kleene star. We prove that the VPDA family
is strictly contained within the Floyd Grammars (FG) family, historically known
as operator precedence. Languages over the same precedence matrix are known to
be closed under boolean operations, and are recognized by a machine whose pop
or push operations on the stack are purely determined by terminal letters. We
characterize VPDAs as the subclass of FG having a peculiarly structured set of
precedence relations, and balanced grammars as a further restricted case. The
non-counting invariance property of FG has a direct implication for VPDA too.
Comment: Extended version of paper presented at WORDS2009, Salerno, Italy,
September 200
Languages convex with respect to binary relations, and their closure properties
A language is prefix-convex if it satisfies the condition that, if a word w and its prefix u are in the language, then so is every prefix of w that has u as a prefix. Prefix-convex languages include prefix-closed languages at one end of the spectrum, and prefix-free languages, which include prefix codes, at the other. In a similar way, we define suffix-, bifix-, factor-, and subword-convex languages and their closed and free counterparts. This provides a common framework for diverse languages such as codes, factorial languages and ideals. We examine the relationships among these languages. We generalize these notions to arbitrary binary relations on the set of all words over a given alphabet, and study the closure properties of such languages.
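The definition of prefix-convexity above can be illustrated with a minimal Python sketch for finite languages given as sets of strings. The function names (prefixes, is_prefix_convex, is_prefix_closed, is_prefix_free) are hypothetical helpers introduced only for this illustration; they are not taken from the paper, which deals with languages in general rather than finite sets.

```python
def prefixes(word):
    """Return all prefixes of word, from the empty word up to word itself."""
    return [word[:i] for i in range(len(word) + 1)]


def is_prefix_convex(language):
    """Check prefix-convexity of a finite language (illustrative helper).

    If w and a prefix u of w are both in the language, every prefix v of w
    that has u as a prefix must be in the language too.
    """
    lang = set(language)
    for w in lang:
        all_prefixes = prefixes(w)
        for u in all_prefixes:
            if u not in lang:
                continue
            # Prefixes of w with length >= len(u) are exactly those having u as a prefix.
            for v in all_prefixes[len(u):]:
                if v not in lang:
                    return False
    return True


def is_prefix_closed(language):
    """Every prefix of every word in the language is in the language."""
    lang = set(language)
    return all(p in lang for w in lang for p in prefixes(w))


def is_prefix_free(language):
    """No word of the language is a proper prefix of another word of the language."""
    lang = set(language)
    return not any(u != w and w.startswith(u) for u in lang for w in lang)
```

Under these assumptions, {"a", "ab", "abc"} is prefix-convex, while {"a", "abc"} is not, because the intermediate prefix "ab" is missing. Prefix-closed and prefix-free finite languages both pass the convexity check, matching the two ends of the spectrum described in the abstract.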
Higher-Order Operator Precedence Languages
Floyd's Operator Precedence (OP) languages are a deterministic context-free
family having many desirable properties. They can be parsed locally and in
parallel, and languages having a compatible structure are closed under Boolean
operations, concatenation and star; they properly include the family of Visibly
Pushdown (or Input Driven) languages. OP languages are based on three relations
between any two consecutive terminal symbols, which assign syntax structure to
words. We extend such relations to k-tuples of consecutive terminal symbols, by
using the model of strictly locally testable regular languages of order k at
least 3. The new corresponding class of Higher-order Operator Precedence
languages (HOP) properly includes the OP languages, and it is still included in
the deterministic (also in reverse) context-free family. We prove Boolean
closure for each subfamily of structurally compatible HOP languages. In each
subfamily, the top language is called max-language. We show that such languages
are defined by a simple cancellation rule and we prove several properties, in
particular that max-languages make an infinite hierarchy ordered by parameter
k. HOP languages are a candidate for replacing OP languages in the various
applications where they have been successful though sometimes too
restrictive.
Comment: In Proceedings AFL 2017, arXiv:1708.0622
Formal Languages and Compilation
This textbook describes the essential principles and methods used for defining the syntax of artificial languages, and for designing efficient parsing algorithms and syntax-directed translators with semantic attributes. A comprehensive selection of topics is presented within a rigorous, unified framework, illustrated by numerous practical examples. Features and topics: presents a novel conceptual approach to parsing algorithms that applies to extended BNF grammars, together with a parallel parsing algorithm; supplies supplementary teaching tools, including course slides and exercises with solutions, at an associated website; unifies the concepts and notations used in different approaches, enabling an extended coverage of methods with a reduced number of definitions; systematically discusses ambiguous forms, allowing readers to avoid pitfalls when designing grammars; describes all algorithms in pseudocode, so that detailed knowledge of a specific programming language is not necessary; makes extensive use of theoretical models of automata, transducers and formal grammars; includes concise coverage of algorithms for processing regular expressions and finite automata; and introduces static program analysis based on flow equations. This clearly written, classroom-tested textbook is an ideal guide to the fundamentals of this field for advanced undergraduate and graduate students in computer science and computer engineering. Some background in programming is required, and readers should also be familiar with basic set theory, algebra and logic.
Inferring pure context-free languages from positive data
We study the possibility of inferring pure context-free languages from positive data. We show that, while the whole class of pure context-free languages is not inferable from positive data, it has interesting subclasses that have the desired inference property. We study uniform pure languages, i.e., languages generated by pure grammars obeying restrictions on the length of the right-hand sides of their productions, and pure languages generated by deterministic pure grammars.