1,105 research outputs found
Higher-Order Operator Precedence Languages
Floyd's Operator Precedence (OP) languages are a deterministic context-free
family having many desirable properties. They can be parsed locally and in
parallel, and languages having a compatible structure are closed under Boolean
operations, concatenation and star; they properly include the family of Visibly
Pushdown (or Input Driven) languages. OP languages are based on three relations
between any two consecutive terminal symbols, which assign syntax structure to
words. We extend such relations to k-tuples of consecutive terminal symbols, by
using the model of strictly locally testable regular languages of order k at
least 3. The new corresponding class of Higher-order Operator Precedence
languages (HOP) properly includes the OP languages, and it is still included in
the deterministic (also in reverse) context free family. We prove Boolean
closure for each subfamily of structurally compatible HOP languages. In each
subfamily, the top language is called max-language. We show that such languages
are defined by a simple cancellation rule and we prove several properties, in
particular that max-languages make an infinite hierarchy ordered by parameter
k. HOP languages are a candidate for replacing OP languages in the various
applications where they have been successful though sometimes too
restrictive. Comment: In Proceedings AFL 2017, arXiv:1708.0622
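The abstract above builds on strictly locally testable languages of order k, where membership depends only on the k-symbol windows of a word. The following is a minimal sketch of that windowing idea only, not of the HOP construction itself. The function names `k_factors` and `in_slt`, and the use of a single `#` delimiter on each side, are assumptions made for illustration; the paper's own formalization may differ.

```python
def k_factors(word, k, delim="#"):
    """Return the set of length-k windows of the delimited word.

    Assumption: one delimiter is added on each side, so the first and
    last windows also record how the word begins and ends.
    """
    padded = delim + word + delim
    return {padded[i:i + k] for i in range(len(padded) - k + 1)}


def in_slt(word, allowed, k, delim="#"):
    """Accept the word if every one of its k-windows is in `allowed`.

    Words too short to contain any window are accepted vacuously in
    this sketch; a full definition would list them explicitly.
    """
    return k_factors(word, k, delim) <= allowed


# Hypothetical example: take the windows of "abab" as the allowed set.
allowed = k_factors("abab", 3)
print(in_slt("ababab", allowed, 3))  # every window already occurs in "abab"
print(in_slt("abba", allowed, 3))    # the window "abb" is not allowed
```

Because the test looks at each window independently, it can be evaluated on any segment of the input without knowing the rest, which is the locality property the abstract relies on.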
On the Hierarchy of Block Deterministic Languages
A regular language is k-lookahead deterministic (resp. k-block
deterministic) if it is specified by a k-lookahead deterministic (resp.
k-block deterministic) regular expression. These two subclasses of regular
languages have been respectively introduced by Han and Wood (k-lookahead
determinism) and by Giammarresi et al. (k-block determinism) as a possible
extension of one-unambiguous languages defined and characterized by
Brüggemann-Klein and Wood. In this paper, we study the hierarchy and the
inclusion links of these families. We first show that each k-block
deterministic language is the alphabetic image of some one-unambiguous
language. Moreover, we show that the conversion from a minimal DFA of a
k-block deterministic regular language to a k-block deterministic automaton
not only requires state elimination, and that the proof given by Han and Wood
of a proper hierarchy in k-block deterministic languages based on this result
is erroneous. Despite these results, we show by giving a parameterized family
that there is a proper hierarchy in k-block deterministic regular languages.
We also prove that there is a proper hierarchy in k-lookahead deterministic
regular languages by studying particular properties of unary regular
expressions. Finally, using our valid results, we confirm that the family of
k-block deterministic regular languages is strictly included in that of
k-lookahead deterministic regular languages by showing that any k-block
deterministic unary language is one-unambiguous.
Continuity of Functional Transducers: A Profinite Study of Rational Functions
A word-to-word function is continuous for a class of languages V
if its inverse maps V-languages to V. This notion
provides a basis for an algebraic study of transducers, and was integral to the
characterization of the sequential transducers computable in some circuit
complexity classes.
Here, we report on the decidability of continuity for functional transducers
and some standard classes of regular languages. To this end, we develop a
robust theory rooted in the standard profinite analysis of regular languages.
Since previous algebraic studies of transducers have focused on the sole
structure of the underlying input automaton, we also compare the two algebraic
approaches. We focus on two questions: When are the automaton structure and the
continuity properties related, and when does continuity propagate to
superclasses?
Learning of Structurally Unambiguous Probabilistic Grammars
The problem of identifying a probabilistic context free grammar has two
aspects: the first is determining the grammar's topology (the rules of the
grammar) and the second is estimating probabilistic weights for each rule.
Given the hardness results for learning context-free grammars in general, and
probabilistic grammars in particular, most of the literature has concentrated
on the second problem. In this work we address the first problem. We restrict
attention to structurally unambiguous weighted context-free grammars (SUWCFG)
and provide a query learning algorithm for structurally unambiguous
probabilistic context-free grammars (SUPCFG). We show that SUWCFG can be
represented using co-linear multiplicity tree automata (CMTA), and provide a
polynomial learning algorithm that learns CMTAs. We show that the learned CMTA
can be converted into a probabilistic grammar, thus providing a complete
algorithm for learning a structurally unambiguous probabilistic context free
grammar (both the grammar topology and the probabilistic weights) using
structured membership queries and structured equivalence queries. We
demonstrate the usefulness of our algorithm in learning PCFGs over genomic
data.
Toward a theory of input-driven locally parsable languages
If a context-free language enjoys the local parsability property then, no matter how the source string is segmented, each segment can be parsed independently, and an efficient parallel parsing algorithm becomes possible. The new class of locally chain parsable languages (LCPLs), included in the deterministic context-free language family, is here defined by means of the chain-driven automaton and characterized by decidable properties of grammar derivations. This automaton decides whether or not to reduce a substring in a way purely driven by the terminal characters, thus extending the well-known concept of input-driven (ID), also known as visibly pushdown, machines. The LCPL family extends and improves the practically relevant Floyd's operator-precedence (OP) languages, which are known to strictly include the ID languages, and for which a parallel-parser generator exists.
Locally Chain-Parsable Languages
If a context-free language enjoys the local parsability property then, no matter how the source string is segmented, each segment can be parsed independently, and an efficient parallel parsing algorithm becomes possible. The new class of locally chain-parsable languages (LCPL), included in deterministic context-free languages, is here defined by means of the chain-driven automaton and characterized by decidable properties of grammar derivations. Such automaton decides to reduce or not a factor in a way purely driven by the terminal characters, thus extending the well-known concept of Input-Driven (ID) (visibly) pushdown machines. LCPL extend and improve the practically relevant operator-precedence languages (Floyd), which are known to strictly include the ID languages, and for which a parallel-parser generator exists. Consistently with the classical results for ID, chain-compatible LCPL are closed under reversal and Boolean operations, and language inclusion is decidable.
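The input-driven (visibly pushdown) idea that the abstract takes as its starting point can be sketched as follows: the kind of stack move is fixed by the input symbol alone. This is only an illustration of that baseline concept, not of the chain-driven automaton. The alphabet partition (`CALLS`, `RETURNS`, everything else internal) and the function name `accepts` are assumptions chosen for the example.

```python
# Hypothetical alphabet partition: each call symbol maps to the return
# symbol that must close it; all other symbols are internal.
CALLS = {"(": ")", "[": "]"}
RETURNS = set(CALLS.values())


def accepts(word):
    """Recognize well-matched words with an input-driven stack discipline.

    The symbol class alone decides the stack move: push on a call,
    pop on a return, leave the stack untouched on an internal symbol.
    """
    stack = []
    for ch in word:
        if ch in CALLS:
            stack.append(CALLS[ch])   # remember the expected return
        elif ch in RETURNS:
            if not stack or stack.pop() != ch:
                return False          # unmatched or mismatched return
        # internal symbol: no stack change
    return not stack                  # every call must be closed


print(accepts("a(b[c]d)"))
print(accepts("(]"))
```

In this sketch the decision to push or pop never depends on the state or the stack contents, only on the current terminal, which is the sense in which such machines are input driven.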