Exercises in Free Syntax. Syntax Definition, Parsing, and Assimilation of Language Conglomerates

By M. Bravenboer


In modern software development the use of multiple software languages to constitute a single application is ubiquitous. Despite the omnipresent use of combinations of languages, the principles and techniques for using languages together are ad hoc, unfriendly to programmers, and result in a poor level of integration. We work towards a principled and generic solution to language extension by studying the applicability of modular syntax definition, scannerless parsing, generalized parsing algorithms, and program transformations.

We describe MetaBorg, a method for providing concrete syntax for domain abstractions to application programmers. Since object-oriented languages are designed for extensibility and reuse, the language constructs are often sufficient for expressing domain abstractions at the semantic level. However, they do not provide the right abstractions at the syntactic level. The MetaBorg method consists of embedding domain-specific languages in a general-purpose host language and assimilating the embedded domain code into the surrounding host code. Instead of extending the implementation of the host language, the assimilation phase implements domain abstractions in terms of existing APIs, leaving the host language undisturbed.

We present a solution to injection vulnerabilities. Software written in one language often needs to construct sentences in another language, such as SQL queries, XML output, or shell command invocations. This is almost always done using unhygienic string manipulation. A client can then supply specially crafted input that causes the constructed sentence to be interpreted in an unintended way, leading to an injection attack. We describe a more natural style of programming that yields code that is impervious to injections by construction. Our approach embeds the grammars of the guest languages into that of the host language and automatically generates code that maps the embedded language to constructs in the host language that reconstruct the embedded sentences, adding escaping functions where appropriate.

We study AspectJ as a typical example of a language conglomerate, i.e., a language composed of a number of separate languages with different syntactic styles. We show that the combination of the lexical syntax leads to considerable complexity in the lexical states to be processed. We show how scannerless parsing elegantly addresses this. We present the design of a modular, extensible, and formal definition of the lexical and context-free aspects of the AspectJ syntax. We introduce grammar mixins, which allow the declarative definition of keyword policies and combination of extensions.

We introduce separate compilation of grammars to enable deployment of languages as plugins to a compiler. Current extensible compilers focus on source-level extensibility, which requires users to compile the compiler with a specific configuration of extensions. A compound parser needs to be generated for every combination. We introduce an algorithm for parse table composition to support separate compilation of grammars to parse table components. Parse table components can be composed (linked) efficiently at runtime, i.e., just before parsing. For realistic language combination scenarios involving grammars for real languages, our parse table composition algorithm is an order of magnitude faster than computation of the parse table for the combined grammars, making online language composition feasible.
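
The injection problem described in the abstract can be illustrated with a small sketch. It is not the code the thesis generates; it only contrasts verbatim string concatenation with a structured representation in which untrusted values are escaped when the sentence is reconstructed. The names `Raw`, `StrLit`, `render`, `escape_sql_string`, `unsafe_query`, and `safe_query` are hypothetical, and the escaping rule assumes standard SQL string literals, where a single quote is doubled (some database dialects need additional rules).

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class Raw:
    """Fixed query text written by the programmer (trusted)."""
    text: str


@dataclass(frozen=True)
class StrLit:
    """A string value supplied at run time (untrusted)."""
    value: str


Part = Union[Raw, StrLit]


def escape_sql_string(value: str) -> str:
    # Assumption: standard SQL, where a quote inside a literal is doubled.
    return "'" + value.replace("'", "''") + "'"


def render(parts: List[Part]) -> str:
    """Reconstruct the query text, escaping every untrusted literal."""
    return "".join(
        p.text if isinstance(p, Raw) else escape_sql_string(p.value)
        for p in parts
    )


def unsafe_query(name: str) -> str:
    # Unhygienic string manipulation: the input is spliced in verbatim.
    return "SELECT * FROM users WHERE name = '" + name + "'"


def safe_query(name: str) -> List[Part]:
    # Hypothetical shape of what an embedded query could be mapped to:
    # structure first, text only at the very end via render().
    return [Raw("SELECT * FROM users WHERE name = "), StrLit(name)]


attack = "x' OR '1'='1"
print(unsafe_query(attack))        # the quote in the input ends the literal early
print(render(safe_query(attack)))  # the input stays inside one string literal
```

In the structured version the programmer never decides where escaping is needed; the distinction between trusted query text and untrusted values is carried by the types, which is the spirit of the "impervious by construction" claim.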

Topics: Mathematics and Computer Science (Wiskunde en Informatica), MetaBorg, Syntax embedding, Parse table composition, Scannerless parsing, Generalized LR parsing, Concrete object syntax, Extensible syntax, AspectJ, Precedence rules
Publisher: Utrecht University
Year: 2008