On the Complexity and Performance of Parsing with Derivatives
Current algorithms for context-free parsing inflict a trade-off between ease
of understanding, ease of implementation, theoretical complexity, and practical
performance. No algorithm achieves all of these properties simultaneously.
Might et al. (2011) introduced parsing with derivatives, which handles
arbitrary context-free grammars while being both easy to understand and simple
to implement. Despite much initial enthusiasm and a multitude of independent
implementations, its worst-case complexity has never been proven to be better
than exponential. In fact, high-level arguments claiming it is fundamentally
exponential have been advanced and even accepted as part of the folklore.
Performance ended up being sluggish in practice, and this sluggishness was
taken as informal evidence of exponentiality.
In this paper, we reexamine the performance of parsing with derivatives. We
have discovered that it is not exponential but, in fact, cubic. Moreover,
simple (though perhaps not obvious) modifications to the implementation by
Might et al. (2011) lead to an implementation that is not only easy to
understand but also highly performant in practice.
Comment: 13 pages; 12 figures; implementation at
http://bitbucket.org/ucombinator/parsing-with-derivatives/ ; published in
PLDI '16, Proceedings of the 37th ACM SIGPLAN Conference on Programming
Language Design and Implementation, June 13 - 17, 2016, Santa Barbara, CA,
US
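To make the idea of a derivative concrete, here is a minimal sketch restricted to regular languages (Brzozowski derivatives). It is not the algorithm of Might et al. (2011) or the improved implementation described in the abstract: handling context-free grammars additionally needs laziness, memoization, and fixed points for recursive grammars, all of which are omitted here. The class and function names are illustrative assumptions.

```python
from dataclasses import dataclass

class Lang:
    """Base class for language descriptions (illustrative, not from the paper)."""

@dataclass(frozen=True)
class Empty(Lang):      # the empty language: matches nothing
    pass

@dataclass(frozen=True)
class Eps(Lang):        # matches only the empty string
    pass

@dataclass(frozen=True)
class Char(Lang):       # matches exactly one given character
    c: str

@dataclass(frozen=True)
class Alt(Lang):        # union of two languages
    left: Lang
    right: Lang

@dataclass(frozen=True)
class Seq(Lang):        # concatenation of two languages
    first: Lang
    second: Lang

@dataclass(frozen=True)
class Star(Lang):       # zero or more repetitions
    inner: Lang

def nullable(l: Lang) -> bool:
    """True if the language contains the empty string."""
    if isinstance(l, (Eps, Star)):
        return True
    if isinstance(l, (Empty, Char)):
        return False
    if isinstance(l, Alt):
        return nullable(l.left) or nullable(l.right)
    if isinstance(l, Seq):
        return nullable(l.first) and nullable(l.second)
    raise TypeError(f"unknown language node: {l!r}")

def derive(l: Lang, c: str) -> Lang:
    """Language of the suffixes of strings in l that start with c."""
    if isinstance(l, (Empty, Eps)):
        return Empty()
    if isinstance(l, Char):
        return Eps() if l.c == c else Empty()
    if isinstance(l, Alt):
        return Alt(derive(l.left, c), derive(l.right, c))
    if isinstance(l, Seq):
        d = Seq(derive(l.first, c), l.second)
        # If the first part can be empty, c may also be consumed by the second.
        return Alt(d, derive(l.second, c)) if nullable(l.first) else d
    if isinstance(l, Star):
        return Seq(derive(l.inner, c), l)
    raise TypeError(f"unknown language node: {l!r}")

def matches(l: Lang, s: str) -> bool:
    """Derive by each character in turn, then test for the empty string."""
    for c in s:
        l = derive(l, c)
    return nullable(l)
```

For example, with `ab_star = Star(Seq(Char("a"), Char("b")))`, calling `matches(ab_star, "abab")` accepts while `matches(ab_star, "aba")` rejects. The sketch performs no simplification of intermediate terms, so the derived expressions grow with the input.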
Analysing symbolic music with probabilistic grammars
Recent developments in computational linguistics offer ways to approach the analysis of musical structure by inducing probabilistic models (in the form of grammars) over a corpus of music. These can produce idiomatic sentences from a probabilistic model of the musical language and thus offer explanations of the musical structures they model. This chapter surveys historical and current work in musical analysis using grammars, based on computational linguistic approaches. We outline the theory of probabilistic grammars and illustrate their implementation in Prolog using PRISM. Our experiments on learning the probabilities for simple grammars from pitch sequences in two kinds of symbolic musical corpora are summarized. The results support our claim that probabilistic grammars are a promising framework for computational music analysis, but also indicate that further work is required to establish their superiority over Markov models
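As a small illustration of what a probabilistic grammar assigns to a derivation, the sketch below multiplies rule probabilities along a parse tree. It is written in Python rather than the Prolog/PRISM setup the chapter describes, and the toy grammar over two pitch symbols, its probabilities, and the function name are assumptions made up for this example, not taken from the chapter.

```python
from math import prod

# Hypothetical toy grammar: each left-hand side's rule probabilities sum to 1.
RULES = {
    ("Phrase", ("Note", "Phrase")): 0.6,
    ("Phrase", ("Note",)): 0.4,
    ("Note", ("C",)): 0.5,
    ("Note", ("G",)): 0.5,
}

def tree_probability(tree):
    """Probability of a derivation tree.

    A tree is either a terminal (a plain string) or a pair
    (label, [children]). The probability is the product of the
    probabilities of all rules used in the tree.
    """
    if isinstance(tree, str):
        return 1.0  # terminals contribute no rule of their own
    label, children = tree
    # Right-hand side of the rule applied at this node.
    rhs = tuple(ch if isinstance(ch, str) else ch[0] for ch in children)
    return RULES[(label, rhs)] * prod(tree_probability(ch) for ch in children)

# Derivation of the pitch sequence "C G".
tree = ("Phrase", [("Note", ["C"]), ("Phrase", [("Note", ["G"])])])
p = tree_probability(tree)  # 0.6 * 0.5 * 0.4 * 0.5
```

Learning, as summarized in the abstract, would estimate the numbers in `RULES` from a corpus; here they are simply fixed by hand.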
Saggitarius: A DSL for Specifying Grammatical Domains
Common data types like dates, addresses, phone numbers and tables can have
multiple textual representations, and many heavily-used languages, such as SQL,
come in several dialects. These variations can cause data to be misinterpreted,
leading to silent data corruption, failure of data processing systems, or even
security vulnerabilities. Saggitarius is a new language and system designed to
help programmers reason about the format of data, by describing grammatical
domains -- that is, sets of context-free grammars that describe the many
possible representations of a datatype. We describe the design of Saggitarius
via example and provide a relational semantics. We show how Saggitarius may be
used to analyze a data set: given example data, it uses an algorithm based on
semi-ring parsing and MaxSAT to infer which grammar in a given domain best
matches that data. We evaluate the effectiveness of the algorithm on a
benchmark suite of 110 example problems, and we demonstrate that our system
typically returns a satisfying grammar within a few seconds with only a small
number of examples. We also delve deeper into a more extensive case study on
using Saggitarius for CSV dialect detection. Despite being general-purpose, we
find that Saggitarius offers comparable results to hand-tuned, specialized
tools; in the case of CSV, it infers grammars for 84% of benchmarks within 60
seconds, and has comparable accuracy to custom-built dialect detection tools.
Comment: OOPSLA 202
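The abstract mentions semi-ring parsing. As a rough sketch of that general technique (not Saggitarius's actual algorithm, and without the MaxSAT component), the CYK-style recognizer below is parameterized by a semiring: the same chart computation answers "is the string derivable?" under the Boolean semiring and "how many parses are there?" under the counting semiring. The function name, the grammar encoding, and the example grammar are assumptions for illustration; the grammar must be in Chomsky normal form.

```python
from collections import defaultdict
import operator

def semiring_cyk(words, lexical, binary, start, plus, times, zero):
    """CYK over an arbitrary semiring (plus, times, zero).

    lexical: {(A, terminal): weight} for rules A -> terminal
    binary:  {(A, B, C): weight}     for rules A -> B C
    Returns the semiring value of `start` spanning the whole input.
    """
    n = len(words)
    chart = defaultdict(lambda: zero)  # (i, j, A) -> semiring value

    # Spans of length 1: apply lexical rules.
    for i, w in enumerate(words):
        for (a, term), weight in lexical.items():
            if term == w:
                chart[i, i + 1, a] = plus(chart[i, i + 1, a], weight)

    # Longer spans: combine two adjacent sub-spans with a binary rule.
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for (a, b, c), weight in binary.items():
                    value = times(weight, times(chart[i, k, b], chart[k, j, c]))
                    chart[i, j, a] = plus(chart[i, j, a], value)

    return chart[0, n, start]

# Hypothetical ambiguous grammar: S -> S S | 'a'
words = ["a", "a", "a"]

# Counting semiring: number of distinct parse trees.
n_parses = semiring_cyk(
    words, {("S", "a"): 1}, {("S", "S", "S"): 1}, "S",
    plus=operator.add, times=operator.mul, zero=0,
)

# Boolean semiring: plain recognition.
recognized = semiring_cyk(
    words, {("S", "a"): True}, {("S", "S", "S"): True}, "S",
    plus=operator.or_, times=operator.and_, zero=False,
)
```

Swapping the semiring changes what is computed without touching the parsing loop, which is what makes the approach attractive for comparing candidate grammars against example data.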
Symbol–Relation Grammars: A Formalism for Graphical Languages
A common approach to the formal description of pictorial and visual languages makes use of formal grammars and rewriting mechanisms. The present paper is concerned with the formalism of Symbol–Relation Grammars (SR grammars, for short). Each sentence in an SR language is composed of a set of symbol occurrences representing visual elementary objects, which are related through a set of binary relational items. The main feature of SR grammars is the uniform way they use context-free productions to rewrite symbol occurrences as well as relation items. The clearness and uniformity of the derivation process for SR grammars allow the extension of well-established techniques of syntactic and semantic analysis to the case of SR grammars. The paper provides an accurate analysis of the derivation mechanism and the expressive power of the SR formalism. This is necessary to fully exploit the capabilities of the model. The most meaningful features of SR grammars as well as their generative power are compared with those of well-known graph grammar families. In spite of their structural simplicity, variations of SR grammars have a generative power comparable with that of expressive classes of graph grammars, such as the edNCE and the N-edNCE classes.
Object-oriented engineering of visual languages
Visual languages are notations that employ graphics (icons, diagrams) to present information in a two or more dimensional space. This work focuses on diagrammatic visual languages, as found in software engineering, and their computer implementations. Implementation means the development of processors to automatically analyze diagrams and the development of graphical editors for constructing the diagrams. We propose a rigorous implementation technique that uses a formal grammar to specify the syntax of a visual language and that uses parsing to automatically analyze the visual sentences generated by the grammar. The theoretical contributions of our work are an original treatment of error handling (error detection, reporting, and recovery) in off-line visual language parsing, and the source-to-source translation of visual languages. We have also substantially extended an existing grammatical model for multidimensional languages, called atomic relational grammars. We have added support for meta-language expressions that denote optional and repetitive right-hand-side elements. We hav
A Graphical User Interface for Designing Graph Grammars
Graph grammars have been widely applied in many scientific areas. However, designing a graph grammar is very challenging for users without a strong computer science background. This paper presents a graphical user interface (GUI) for designing graph grammars following an edge-based context-sensitive graph grammar formalism, EGG. This GUI significantly eases graph grammar design, especially for users unfamiliar with the grammar format.