Search CORE

13 research outputs found

Higher-Order Operator Precedence Languages

Author: Pradella Matteo
Reghizzi Stefano Crespi
Publication venue: 'Open Publishing Association'
Publication date: 01/01/2017
Field of study

Floyd's Operator Precedence (OP) languages are a deterministic context-free family having many desirable properties. They are locally and parallely parsable, and languages having a compatible structure are closed under Boolean operations, concatenation and star; they properly include the family of Visibly Pushdown (or Input Driven) languages. OP languages are based on three relations between any two consecutive terminal symbols, which assign syntax structure to words. We extend such relations to k-tuples of consecutive terminal symbols, by using the model of strictly locally testable regular languages of order k at least 3. The new corresponding class of Higher-order Operator Precedence languages (HOP) properly includes the OP languages, and it is still included in the deterministic (also in reverse) context free family. We prove Boolean closure for each subfamily of structurally compatible HOP languages. In each subfamily, the top language is called max-language. We show that such languages are defined by a simple cancellation rule and we prove several properties, in particular that max-languages make an infinite hierarchy ordered by parameter k. HOP languages are a candidate for replacing OP languages in the various applications where they have have been successful though sometimes too restrictive.Comment: In Proceedings AFL 2017, arXiv:1708.0622

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Politecnico di Milano

Crossref

Maintaining regularity and generalization in data using the minimum description length principle and genetic algorithm: case of grammatical inference

Author: Angluin
Angluin
Angluin
Bagchi
Bhalse
Choubey
Choubey
Choubey
Clark
Clark
Cleeremans
De La Higuera
de la Higuera
Delgado
Dupont
D’Ulizia
Elman
Fu
Gallager
Gold
Graves
Grünwald
Hansen
Harrison
Higuera
Holland
Hrnčič
Hrnčič
Iuspa
Jonyer
Li
Michalewicz
Pandey
Pandey
Pandey
Petasis
Rissanen
Roy
Saers
Sakakibara
Sakakibara
Sivaraj
Solomonoff
Stevenson
Stevenson
Theeramunkongy
Valiant
Yang
Yoshinaka
Črepinšek
Črepinšek
Publication venue: 'Elsevier BV'
Publication date: 17/05/2016
Field of study

In this paper, a genetic algorithm with minimum description length (GAWMDL) is proposed for grammatical inference. The primary challenge of identifying a language of infinite cardinality from a finite set of examples should know when to generalize and specialize the training data. The minimum description length principle that has been incorporated addresses this issue is discussed in this paper. Previously, the e-GRIDS learning model was proposed, which enjoyed the merits of the minimum description length principle, but it is limited to positive examples only. The proposed GAWMDL, which incorporates a traditional genetic algorithm and has a powerful global exploration capability that can exploit an optimum offspring. This is an effective approach to handle a problem which has a large search space such the grammatical inference problem. The computational capability, the genetic algorithm poses is not questionable, but it still suffers from premature convergence mainly arising due to lack of population diversity. The proposed GAWMDL incorporates a bit mask oriented data structure that performs the reproduction operations, creating the mask, then Boolean based procedure is applied to create an offspring in a generative manner. The Boolean based procedure is capable of introducing diversity into the population, hence alleviating premature convergence. The proposed GAWMDL is applied in the context free as well as regular languages of varying complexities. The computational experiments show that the GAWMDL finds an optimal or close-to-optimal grammar. Two fold performance analysis have been performed. First, the GAWMDL has been evaluated against the elite mating pool genetic algorithm which was proposed to introduce diversity and to address premature convergence. GAWMDL is also tested against the improved tabular representation algorithm. In addition, the authors evaluate the performance of the GAWMDL against a genetic algorithm not using the minimum description length principle. Statistical tests demonstrate the superiority of the proposed algorithm. Overall, the proposed GAWMDL algorithm greatly improves the performance in three main aspects: maintains regularity of the data, alleviates premature convergence and is capable in grammatical inference from both positive and negative corpora

Nottingham ePrints

Nottingham eTheses

Crossref

Edge Hill University Research Information Repository

Middlesex University Research Repository

University of Missouri, St. Louis

Maintaining regularity and generalization in data using the minimum description length principle and genetic algorithm: case of grammatical inference

Author: Chaudhary A.
Chaudhary A.
Kendall G.
Kendall G.
Mehrotra D.
Mehrotra D.
Pandey H.
Pandey H.
Publication venue: Elsevier
Publication date: 01/01/2016
Field of study

Middlesex University Research Repository

Maintaining regularity and generalization in data using the minimum description length principle and genetic algorithm: Case of grammatical inference

Author: Chaudhary Ankit
Kendall Graham
Mehrotra Deepti
Pandey Hari Mohan
Publication venue: IRL @ UMSL
Publication date: 01/12/2016
Field of study

University of Missouri, St. Louis

Learning local substitutable context-free languages from positive examples in polynomial time and data by reduction

Author: Coste François
Nicolas Jacques
Publication venue: HAL CCSD
Publication date: 05/09/2018
Field of study

International audienceTo study more formally the approach by reduction initiated by ReGLiS, we propose a formal characterization of the grammars in reduced normal form (RNF) which can be learned by this approach. A modification of the core of ReGLiS is then proposed to ensure returning RNF grammars in polynomial time. This enables us to show that local substitutable languages represented by RNF context-free grammars are identifiable in polynomial time and thick data (IPTtD) from positive examples

HAL-CentraleSupelec

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot

HAL-Rennes 1

PCFG Induction for Unsupervised Parsing and Language Modelling

Author
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2014
Field of study

Crossref

Learning local substitutable context-free languages from positive examples in polynomial time and data by reduction

Author: Coste François
Nicolas Jacques
Publication venue: HAL CCSD
Publication date: 05/09/2018
Field of study

INRIA a CCSD electronic archive server

Using Contextual Representations to Efficiently Learn Context-Free Languages

Author: Clark Alexander
Eyraud Rémi
Habrard Amaury
Publication venue: Microtome Publishing
Publication date: 01/01/2010
Field of study

International audienceWe present a polynomial update time algorithm for the inductive inference of a large class of context-free languages using the paradigm of positive data and a membership oracle. We achieve this result by moving to a novel representation, called Contextual Binary Feature Grammars (CBFGs), which are capable of representing richly structured context-free languages as well as some context sensitive languages. These representations explicitly model the lattice structure of the distribution of a set of substrings and can be inferred using a generalisation of distributional learning. This formalism is an attempt to bridge the gap between simple learnable classes and the sorts of highly expressive representations necessary for linguistic representation: it allows the learnability of a large class of context-free languages, that includes all regular languages and those context-free languages that satisfy two simple constraints. The formalism and the algorithm seem well suited to natural language and in particular to the modeling of first language acquisition. Preliminary experimental results confirm the effectiveness of this approach

CiteSeerX

HAL AMU

King's Research Portal

Beyond operator-precedence grammars and languages

Author: Crespi Reghizzi S.
Pradella M.
Publication venue: 'Elsevier BV'
Publication date: 01/01/2020
Field of study

Operator Precedence Languages (OPL) are deterministic context-free and have desirable properties. OPL are parallely parsable, and, when structurally compatible, are closed under Boolean operations, concatenation and star; they include the Input Driven languages. OPL use three relations between two terminal symbols, to assign syntax structure to words. We extend such relations to k-tuples of consecutive symbols, in agreement with strictly locally testable regular languages. For each k, the new corresponding class of Higher-order Operator Precedence languages properly includes the OPL and enjoy many of their properties. OPL are a strict hierarchy based on k, which contains maximal languages

Archivio istituzionale della ricerca - Politecnico di Milano