Parsing Unary Boolean Grammars Using Online Convolution

Author: Okhotin Alexander
Publication venue: Dagstuhl Seminar Proceedings. 10501 - Advances and Applications of Automata on Words and Trees
Publication date: 01/01/2011

Unlike context-free grammars, conjunctive grammars, their extension by an explicit conjunction operation, can generate quite complicated non-regular languages over a single-letter alphabet (DLT 2007). Given these expressibility results, we study the parsability of Boolean grammars, an extension of context-free grammars by conjunction and negation, over a unary alphabet, and show that they can be parsed in time O(|G| log^2(n) M(n)), where M(n) is the time to multiply two n-bit integers. The multiplication algorithm is transformed into a convolution algorithm, which in turn is converted into an online convolution algorithm that is used for the parsing.
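
The link between unary languages and convolution mentioned in the abstract can be illustrated with a small sketch. Over a one-letter alphabet a string is determined by its length, so concatenating two languages amounts to a Boolean convolution of their membership vectors. The function name `unary_concat` and the list-of-booleans representation are assumptions made for illustration; this is the naive quadratic version, not the online convolution algorithm of the paper.

```python
def unary_concat(a, b, n):
    """Membership vector, up to length n, of the concatenation of two unary languages.

    a[i] is True iff the string of length i is in the first language
    (likewise b). Since a^i followed by a^j is a^(i+j), the result is the
    Boolean convolution of the two vectors, computed here in O(n^2) time.
    """
    c = [False] * (n + 1)
    for i in range(min(len(a), n + 1)):
        if not a[i]:
            continue
        # Only combine lengths whose sum stays within the bound n.
        for j in range(min(len(b), n + 1 - i)):
            if b[j]:
                c[i + j] = True
    return c


# Hypothetical example: strings of even length, and strings whose length is a multiple of 3.
evens = [k % 2 == 0 for k in range(8)]
threes = [k % 3 == 0 for k in range(8)]
lengths = [k for k, member in enumerate(unary_concat(evens, threes, 7)) if member]
```

Here `lengths` collects the lengths of all strings in the concatenation up to the bound 7. A fast method would replace the double loop with integer or FFT-based multiplication, which is where the M(n) factor in the stated bound comes from.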

Dagstuhl Research Online Publication Server

Numbers and Languages

Author: Lehtinen Tommi J.M.
Publication venue: Turku Centre for Computer Science
Publication date: 15/03/2013

The thesis presents results obtained during the author's PhD studies. First, systems of language equations of a simple form, consisting of just two equations, are proved to be computationally universal. These are systems over a unary alphabet, which are seen as systems of equations over sets of natural numbers. The systems contain only an equation X+A=B and an equation X+X+C=X+X+D, where A, B, C and D are eventually periodic constants. It is proved that for every recursive set S there exist natural numbers p and d, and eventually periodic sets A, B, C and D, such that a number n is in S if and only if np+d is in the unique solution of the above-mentioned system of two equations, so all recursive sets can be represented in an encoded form. It is also proved that not all recursive sets can be represented as they are, so the encoding is really needed. Furthermore, it is proved that the family of languages generated by Boolean grammars is closed under injective gsm-mappings and inverse gsm-mappings. The arguments also apply to the families of unambiguous Boolean languages, conjunctive languages and unambiguous languages. Finally, characterizations of morphisms preserving subfamilies of context-free languages are presented. It is shown that the families of deterministic and LL context-free languages are closed under codes if and only if they are of bounded deciphering delay. These families are also closed under non-codes, if they map every letter into a submonoid generated by a single word. The family of unambiguous context-free languages is closed under all codes and under the same non-codes as the families of deterministic and LL context-free languages.

UTUPub

Constant-Delay Enumeration for SLP-Compressed Documents

Author: Riveros Cristian
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 26th International Conference on Database Theory (ICDT 2023)
Publication date: 01/01/2023

Dagstuhl Research Online Publication Server

Toward a theory of input-driven locally parsable languages

Author: Crespi Reghizzi Stefano
Lonati Violetta
Mandrioli Dino
Pradella Matteo
Publication venue: Elsevier BV
Publication date: 01/01/2017

If a context-free language enjoys the local parsability property then, no matter how the source string is segmented, each segment can be parsed independently, and an efficient parallel parsing algorithm becomes possible. The new class of locally chain parsable languages (LCPLs), included in the deterministic context-free language family, is defined here by means of the chain-driven automaton and characterized by decidable properties of grammar derivations. This automaton decides whether or not to reduce a substring in a way purely driven by the terminal characters, thus extending the well-known concept of input-driven (ID) machines, also known as visibly pushdown machines. The LCPL family extends and improves the practically relevant Floyd's operator-precedence (OP) languages, which are known to strictly include the ID languages, and for which a parallel-parser generator exists.

Archivio istituzionale della ricerca - Politecnico di Milano

Crossref

Ambiguity Detection Methods for Context-Free Grammars

Author: Basten H.J.S. (Bas)
Publication venue: Universite Catholique de Louvain
Publication date: 17/08/2007

The Meta-Environment enables the creation of grammars using the SDF formalism. From these grammars, an SGLR parser can be generated. One of the advantages of these parsers is that they can handle the entire class of context-free grammars (CFGs). The grammar developer does not have to squeeze his grammar into a specific subclass of CFGs that is deterministically parsable. Instead, he can now design his grammar to best describe the structure of his language. The downside of allowing the entire class of CFGs is the danger of ambiguities. An ambiguous grammar prevents some sentences from having a unique meaning, depending on the semantics of the language used. It is best to remove all ambiguities from a grammar before it is used. Unfortunately, the detection of ambiguities in a grammar is an undecidable problem. For a recursive grammar, the number of possibilities that have to be checked might be infinite. Various ambiguity detection methods (ADMs) exist, but none can always correctly identify the (un)ambiguity of a grammar. They all try to attack the problem from different angles, which results in different characteristics like termination, accuracy and performance. The goal of this project was to find out which method has the best practical usability. In particu

CWI's Institutional Repository

Using Contextual Representations to Efficiently Learn Context-Free Languages

Author: Clark Alexander
Eyraud Rémi
Habrard Amaury
Publication venue: Microtome Publishing
Publication date: 01/01/2010

We present a polynomial update time algorithm for the inductive inference of a large class of context-free languages using the paradigm of positive data and a membership oracle. We achieve this result by moving to a novel representation, called Contextual Binary Feature Grammars (CBFGs), which are capable of representing richly structured context-free languages as well as some context-sensitive languages. These representations explicitly model the lattice structure of the distribution of a set of substrings and can be inferred using a generalisation of distributional learning. This formalism is an attempt to bridge the gap between simple learnable classes and the sorts of highly expressive representations necessary for linguistic representation: it allows the learnability of a large class of context-free languages that includes all regular languages and those context-free languages that satisfy two simple constraints. The formalism and the algorithm seem well suited to natural language and in particular to the modeling of first language acquisition. Preliminary experimental results confirm the effectiveness of this approach.

CiteSeerX

HAL AMU

King's Research Portal