Partially ordered multiset context-free grammars and ID/LP parsing

Authors: Nederhof Mark-Jan; Satta Giorgio; Shieber Stuart
Publication venue: Association for Computational Linguistics (ACL)
Publication date: 01/01/2003

We present a new formalism, partially ordered multiset context-free grammars (poms-CFG), along with an Earley-style parsing algorithm. The formalism, which can be thought of as a generalization of context-free grammars with partially ordered right-hand sides, is of interest in its own right, and also as infrastructure for obtaining tighter complexity bounds for more expressive context-free formalisms intended to express free or multiple word-order, such as ID/LP grammars. We reduce ID/LP grammars to poms-grammars, thereby getting finer-grained bounds on the parsing complexity of ID/LP grammars. We argue that in practice, the width of attested ID/LP grammars is small, yielding effectively polynomial time complexity for ID/LP grammar parsing.

Engineering and Applied Science
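To make the ID/LP idea in this abstract concrete, here is a minimal sketch, not taken from the paper: an immediate-dominance (ID) rule gives an unordered multiset of right-hand-side symbols, and linear-precedence (LP) constraints restrict which orderings are allowed. The function name `linearizations` and the toy symbols are assumptions made for illustration. The sketch enumerates orderings by brute force, which is the naive expansion, not the Earley-style poms-CFG algorithm the authors present.

```python
from itertools import permutations


def linearizations(rhs, lp):
    """Return every ordering of the multiset `rhs` that respects `lp`.

    `rhs` is a list of right-hand-side symbols from one ID rule.
    `lp` is a set of (a, b) pairs meaning: every `a` must precede every `b`.
    Hypothetical helper for illustration only; brute force over permutations.
    """
    allowed = set()
    for perm in permutations(rhs):
        # An ordering is rejected if some b appears before some a.
        violates = any(
            perm[i] == b and perm[j] == a
            for i in range(len(perm))
            for j in range(i + 1, len(perm))
            for (a, b) in lp
        )
        if not violates:
            allowed.add(perm)
    return sorted(allowed)


# Toy ID rule with right-hand side {NP, V, PP} and the LP constraint NP < V.
for ordering in linearizations(["NP", "V", "PP"], {("NP", "V")}):
    print(" ".join(ordering))
```

With no LP constraints, a right-hand side of n distinct symbols yields n! orderings, which is why expanding an ID/LP grammar into plain context-free rules quickly becomes impractical.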

Harvard University - DASH

Coalition formation in a virtual buying cooperative: a case for formal grammars

Author: Raborife Mpho Ivy
Publication date: 01/01/2016

A thesis submitted to the Faculty of Science, University of the Witwatersrand, Johannesburg, in fulfilment of the requirements for the degree of Doctor of Philosophy. Johannesburg, March 2016.

We report on a study that investigates the applicability of formal grammars in modelling coalition formation. This particular coalition formation is amongst a group of physically distributed enterprises intending to purchase items from a supplier as a single entity, termed a virtual buying cooperative (VBC). We investigate several grammars with regard to their appropriateness in modelling the interaction strategy amongst the enterprises during the formation of a VBC. A regular grammar, context-free grammars, a random permitting context grammar, random forbidding context grammars, and random context grammars are used to model the formation of a VBC in this study. The adequacy and limitations of these grammars in modelling the formation of a VBC are explored. The results demonstrate that random context grammars are adequate in modelling a VBC environment. In addition to generating the specified languages representing a formed coalition, the production rules of all three random context grammars investigated in this study, at every derivation step, adhere to the interaction strategy of a VBC during its formation. The strategy excludes enterprises that have not been invited to join the coalition from participating in the coalition. Furthermore, if an enterprise has been invited to join the coalition by multiple enterprises, it can only accept one invitation. This study aims to bridge the gap between formal grammars and technological applications.
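As a rough illustration of the grammar class named in this abstract, the sketch below applies a single random context rule under the usual textbook definition: a rule rewrites a nonterminal only if every symbol in its permitting set occurs elsewhere in the sentential form and no symbol in its forbidding set does. The function `apply_rule`, the tuple layout of a rule, and the symbols `A`, `B`, `a`, `b` are assumptions for illustration; they are not the grammars from the thesis.

```python
def apply_rule(form, rule):
    """Apply a random context rule to the first eligible occurrence.

    `form` is a list of symbols (the sentential form).
    `rule` is a hypothetical tuple (lhs, rhs, permit, forbid), where
    `permit` and `forbid` are sets of symbols checked against the context,
    that is, the form with the rewritten occurrence removed.
    Returns the new form, or None if the rule is not applicable.
    """
    lhs, rhs, permit, forbid = rule
    for i, symbol in enumerate(form):
        if symbol != lhs:
            continue
        context = set(form[:i] + form[i + 1:])
        if permit <= context and not (forbid & context):
            return form[:i] + list(rhs) + form[i + 1:]
    return None


# Permitting rule: B -> b is allowed only while A is present.
permitting = ("B", ["b"], {"A"}, set())
print(apply_rule(["A", "B"], permitting))
print(apply_rule(["B"], permitting))

# Forbidding rule: A -> a is blocked while B is present.
forbidding = ("A", ["a"], set(), {"B"})
print(apply_rule(["A", "B"], forbidding))
print(apply_rule(["A", "b"], forbidding))
```

A random permitting context grammar is the special case where every forbidding set is empty, and a random forbidding context grammar is the case where every permitting set is empty.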

Wits Institutional Repository on DSPACE

XRate: a fast prototyping, training and annotation tool for phylo-grammars

Authors: Bendaña Yuri R; Bradley Robert K; Chao Sharon; Goldman Nick; Holmes Ian; Klosterman Peter S; Kosiol Carolin; Uzilov Andrew V
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: Recent years have seen the emergence of genome annotation methods based on the phylo-grammar, a probabilistic model combining continuous-time Markov chains and stochastic grammars. Previously, phylo-grammars have required considerable effort to implement, limiting their adoption by computational biologists.

RESULTS: We have developed an open source software tool, xrate, for working with reversible, irreversible or parametric substitution models combined with stochastic context-free grammars. xrate efficiently estimates maximum-likelihood parameters and phylogenetic trees using a novel "phylo-EM" algorithm that we describe. The grammar is specified in an external configuration file, allowing users to design new grammars, estimate rate parameters from training data and annotate multiple sequence alignments without the need to recompile code from source. We have used xrate to measure codon substitution rates and predict protein and RNA secondary structures.

CONCLUSION: Our results demonstrate that xrate estimates biologically meaningful rates and makes predictions whose accuracy is comparable to that of more specialized tools.

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Faster scannerless GLR parsing

Authors: Economopoulos G.R. (Giorgos Robert); Klint P. (Paul); Vinju J.J. (Jurgen)
Publication venue: Springer Fachmedien Wiesbaden GmbH
Publication date: 01/01/2009

Analysis and renovation of large software portfolios require syntax analysis of multiple, usually embedded, languages, and this is beyond the capabilities of many standard parsing techniques. The traditional separation between lexer and parser falls short due to the limitations of tokenization based on regular expressions when handling multiple lexical grammars. In such cases scannerless parsing provides a viable solution. It uses the power of context-free grammars to deal with a wide variety of issues in parsing lexical syntax. However, it comes at the price of lower efficiency. The structure of tokens is obtained using a more powerful but more time- and memory-intensive parsing algorithm. Scannerless grammars are also more non-deterministic than their tokenized counterparts, increasing the burden on the parsing algorithm even further. In this paper we investigate the application of the Right-Nulled Generalized LR parsing algorithm (RNGLR) to scannerless parsing. We adapt the Scannerless Generalized LR parsing and filtering algorithm (SGLR) to implement the optimizations of RNGLR. We present an updated parsing and filtering algorithm, called SRNGLR, and analyze its performance in comparison to SGLR on ambiguous grammars for the programming languages C, Java, Python, SASL, and C++. Measurements show that SRNGLR is on average 33% faster than SGLR, but is 95% faster on the highly ambiguous SASL grammar. For the mainstream languages C, C++, Java and Python the average speedup is 16%.

CWI's Institutional Repository

Multiple Context-Free Tree Grammars: Lexicalization and Characterization

Authors: Engelfriet Joost; Maletti Andreas; Maneth Sebastian
Publication date: 11/07/2017

Multiple (simple) context-free tree grammars are investigated, where "simple" means "linear and nondeleting". Every multiple context-free tree grammar that is finitely ambiguous can be lexicalized; i.e., it can be transformed into an equivalent one (generating the same tree language) in which each rule of the grammar contains a lexical symbol. Due to this transformation, the rank of the nonterminals increases at most by 1, and the multiplicity (or fan-out) of the grammar increases at most by the maximal rank of the lexical symbols; in particular, the multiplicity does not increase when all lexical symbols have rank 0. Multiple context-free tree grammars have the same tree generating power as multi-component tree adjoining grammars (provided the latter can use a root-marker). Moreover, every multi-component tree adjoining grammar that is finitely ambiguous can be lexicalized. Multiple context-free tree grammars have the same string generating power as multiple context-free (string) grammars and polynomial time parsing algorithms. A tree language can be generated by a multiple context-free tree grammar if and only if it is the image of a regular tree language under a deterministic finite-copying macro tree transducer. Multiple context-free tree grammars can be used as a synchronous translation device.

Comment: 78 pages, 13 figure

arXiv.org e-Print Archive

Leiden University Scholarly Publications