Some applications of the formalization of the pumping lemma for context-free languages
Context-free languages are highly important in computer language processing technology as well as in formal language theory. The Pumping Lemma for Context-Free Languages states a property that holds for all context-free languages, which makes it a tool for showing the existence of non-context-free languages. This paper presents a formalization, extending the previously formalized Lemma, of the fact that several well-known languages are not context-free. Moreover, we build on those results to construct a formal proof of the well-known property that context-free languages are not closed under intersection. The entire formalization has been mechanized in the Coq proof assistant.
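As an illustration of the non-closure property mentioned in this abstract, here is a minimal Python sketch of the classic textbook witness. It assumes the languages L1 = { a^n b^n c^m } and L2 = { a^m b^n c^n }; the abstract does not say which languages the paper uses. The function names are hypothetical. Both L1 and L2 are context-free, yet their intersection is { a^n b^n c^n }, which the pumping lemma shows is not context-free.

```python
import re

def split_blocks(w):
    """Return (i, j, k) if w has the shape a^i b^j c^k, otherwise None."""
    m = re.fullmatch(r"(a*)(b*)(c*)", w)
    if m is None:
        return None
    return tuple(len(g) for g in m.groups())

def in_l1(w):
    """Membership in L1 = { a^n b^n c^m } (context-free)."""
    blocks = split_blocks(w)
    return blocks is not None and blocks[0] == blocks[1]

def in_l2(w):
    """Membership in L2 = { a^m b^n c^n } (context-free)."""
    blocks = split_blocks(w)
    return blocks is not None and blocks[1] == blocks[2]

def in_intersection(w):
    """Membership in L1 ∩ L2, which equals { a^n b^n c^n }."""
    return in_l1(w) and in_l2(w)
```

The membership tests only show what the intersection looks like. The proof that it is not context-free is the part that the pumping lemma supplies.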
On the formalization of some results of context-free language theory
This work describes a formalization effort, using the Coq proof assistant, of fundamental results related to the classical theory of context-free grammars and languages. These include closure properties (union, concatenation and Kleene star), grammar simplification (elimination of useless symbols, inaccessible symbols, empty rules and unit rules), the existence of a Chomsky Normal Form for context-free grammars and the Pumping Lemma for context-free languages. The result is an important set of libraries covering the main results of context-free language theory, with more than 500 lemmas and theorems fully proved and checked. This is probably the most comprehensive formalization of classical context-free language theory in the Coq proof assistant to date, and it includes the important formalization of the Pumping Lemma for context-free languages.
Axiomatizing proof tree concepts in Bounded Arithmetic
We construct theories of Cook-Nguyen style two-sort bounded arithmetic
whose provably total functions are exactly those in LOGCFL and LOGDCFL.
Axiomatizations of both theories are based on the proof tree size
characterizations of these classes. We also show that our theory for LOGCFL proves a certain formulation of the pumping lemma for context-free languages.
Pumping lemmas for classes of languages generated by folding systems
Geometric folding processes are ubiquitous in natural systems ranging from
protein biochemistry to patterns of insect wings and leaves. In a previous
study, a folding operation between strings of formal languages was introduced
as a model of such processes. The operation was then used to define a folding
system (F-system) as a construct consisting of a core language, containing the
strings to be folded, and a folding procedure language, which defines how the
folding is done. This paper reviews the main definitions associated with F-systems
and then determines necessary conditions for a language to belong to classes
generated by such systems. The conditions are stated in the form of pumping
lemmas and four classes are considered, in which the core and folding procedure
languages are both regular, one of them is regular and the other context-free,
or both are context-free. Full demonstrations of the lemmas are provided, and
the analysis is illustrated with examples. Comment: 12 pages, 6 figures. This is a preprint (pre-refereeing) version of a
manuscript accepted for publication in Natural Computing
Higher-Order Operator Precedence Languages
Floyd's Operator Precedence (OP) languages are a deterministic context-free
family having many desirable properties. They can be parsed locally and in
parallel, and languages having a compatible structure are closed under Boolean
operations, concatenation and star; they properly include the family of Visibly
Pushdown (or Input Driven) languages. OP languages are based on three relations
between any two consecutive terminal symbols, which assign syntax structure to
words. We extend such relations to k-tuples of consecutive terminal symbols, by
using the model of strictly locally testable regular languages of order k at
least 3. The new corresponding class of Higher-order Operator Precedence
languages (HOP) properly includes the OP languages, and it is still included in
the deterministic (also in reverse) context free family. We prove Boolean
closure for each subfamily of structurally compatible HOP languages. In each
subfamily, the top language is called max-language. We show that such languages
are defined by a simple cancellation rule and we prove several properties, in
particular that max-languages make an infinite hierarchy ordered by parameter
k. HOP languages are a candidate for replacing OP languages in the various
applications where they have been successful though sometimes too
restrictive. Comment: In Proceedings AFL 2017, arXiv:1708.0622
Formal Properties of XML Grammars and Languages
XML documents are described by a document type definition (DTD). An
XML-grammar is a formal grammar that captures the syntactic features of a DTD.
We investigate properties of this family of grammars. We show that every
XML-language basically has a unique XML-grammar. We give two characterizations
of languages generated by XML-grammars, one is set-theoretic, the other is by a
kind of saturation property. We investigate decidability problems and prove
that some properties that are undecidable for general context-free languages
become decidable for XML-languages. We also characterize those XML-grammars
that generate regular XML-languages. Comment: 24 pages
On the Expressive Power of Regular Expressions with Backreferences
A rewb is a regular expression extended with a feature called backreference. It is broadly known that backreference is a practical extension of regular expressions, and is supported by most modern regular expression engines, such as those in the standard libraries of Java, Python, and more. Meanwhile, indexed languages are the languages generated by indexed grammars, a formal grammar class proposed by A. V. Aho. We show that these two models' expressive powers are related in the following way: every language described by a rewb is an indexed language. As the smallest formal grammar class previously known to contain rewbs is the class of context-sensitive languages, our result strictly improves the known upper bound. Moreover, we prove the following two claims: there exists a rewb whose language does not belong to the class of stack languages, which is a proper subclass of indexed languages, and the language described by a rewb without a captured reference is in the class of nonerasing stack languages, which is a proper subclass of stack languages. Finally, we show that the hierarchy investigated in a prior study, which separates the expressive power of rewbs by the notion of nested levels, is within the class of nonerasing stack languages.
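To make the backreference feature concrete, here is a minimal sketch using Python's standard `re` module, one of the engines the abstract names. The pattern and the helper name `is_square` are illustrative assumptions and are not taken from the paper. The pattern `([ab]*)\1` captures a string over {a, b} and then requires the same string again, so it describes the copy language { ww }, a standard example of a language that is not context-free.

```python
import re

# Group 1 captures some string over {a, b}; \1 must then repeat it exactly.
COPY_PATTERN = re.compile(r"([ab]*)\1")

def is_square(w):
    """Return True when w has the form uu for some u over {a, b}."""
    return COPY_PATTERN.fullmatch(w) is not None

if __name__ == "__main__":
    for word in ["", "aa", "abab", "aba", "abba"]:
        print(repr(word), is_square(word))
```

This shows a single backreference already taking the engine beyond the context-free class, which is why an upper bound such as the indexed languages is the relevant question.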