12 research outputs found

    A Pattern-based Foundation for Language-Driven Software Engineering

    Get PDF
    This work brings together two fundamental ideas for modelling, programming and analysing software systems. The first idea is of a methodological nature: engineering software by systematically creating and relating languages. The second idea is of a technical nature: using patterns as a practical foundation for computing. The goal is to show that the systematic creation and layering of languages can be reduced to the elementary operations of pattern matching and instantiation, and that this pattern-based approach provides a formal and practical foundation for language-driven modelling, programming and analysis. The underpinning of the work is a novel formalism for recognising, deconstructing, creating, searching, transforming and generally manipulating data structures. The formalism is based on typed sequences, a generic structure for representing trees. It defines basic pattern expressions for matching and instantiating atomic values and variables. Horizontal, vertical, diagonal and hierarchical operators are different ways of combining patterns. Transformations combine matching and instantiating patterns and are themselves patterns. A quasiquotation mechanism allows arbitrary levels of meta-pattern functionality and forms the basis of pattern abstraction. Path-polymorphic operators are used to specify fine-grained search of structures. A range of core concepts such as layering, parsing and pattern-based computing can naturally be defined through pattern expressions. Three language-driven tools that utilise the pattern formalism showcase the applicability of the pattern approach. Concat is a self-sustaining (meta-)programming system in which all computations are expressed by matching and instantiation, including parsing, executing and optimising programs. By applying its language engineering tools to its own meta-language, Concat can extend itself from within. XMF (XML Modeling Framework) is a browser-based modelling and meta-modelling framework that provides flexible means to create and relate modelling languages and to query and validate models. The pattern functionality that makes this possible is partly exposed as a schema language and partly as a JavaScript library. CFR (Channel Filter Rule Language) implements a language-driven approach for layered analysis of communication in complex networked systems. The communication on each layer is visible in the language of an "abstract protocol" that is defined by communication patterns.
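    As a rough illustration of the matching/instantiation idea (this is not the thesis's typed-sequence formalism; all names below are hypothetical), a toy matcher and instantiator over tagged trees might look like this:

```python
# Illustrative sketch only: a toy matcher/instantiator over tagged trees,
# loosely in the spirit of "matching and instantiation" as elementary
# operations. All names here are hypothetical, not the thesis's formalism.

class Var:
    def __init__(self, name):
        self.name = name

def match(pattern, subject, bindings=None):
    """Try to match `pattern` against `subject`; return bindings or None."""
    bindings = dict(bindings or {})
    if isinstance(pattern, Var):
        if pattern.name in bindings:
            return bindings if bindings[pattern.name] == subject else None
        bindings[pattern.name] = subject
        return bindings
    if isinstance(pattern, tuple) and isinstance(subject, tuple):
        if len(pattern) != len(subject):
            return None
        for p, s in zip(pattern, subject):
            bindings = match(p, s, bindings)
            if bindings is None:
                return None
        return bindings
    return bindings if pattern == subject else None

def instantiate(pattern, bindings):
    """Build a tree from `pattern` by substituting bound variables."""
    if isinstance(pattern, Var):
        return bindings[pattern.name]
    if isinstance(pattern, tuple):
        return tuple(instantiate(p, bindings) for p in pattern)
    return pattern

# A "transformation" pairs a matching pattern with an instantiating pattern:
# rewrite add(x, 0) -> x.
lhs = ("add", Var("x"), ("num", 0))
rhs = Var("x")
env = match(lhs, ("add", ("num", 7), ("num", 0)))
assert instantiate(rhs, env) == ("num", 7)
```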

    Incremental parsing algorithms for speech-editing mathematics and computer code

    Get PDF
    The provision of speech control for editing plain language text has existed for a long time, but does not extend to structured content such as mathematics. The requirements of a user interface for a spoken mathematics editor are explored through the lens of an intuitive natural user interface (NUI) for speech control, the desired properties of which are based on a combination of existing literature on NUIs and intuitive user interfaces. An important aspect of an intuitive NUI is timely update of the displayed content in response to editing actions. This is not feasible using batch parsing alone, and the issue becomes more serious for larger documents such as computer program code. The solution is an incremental parser designed to work with operator precedence (OP) grammars. The contribution to knowledge provided by this thesis is to improve the efficiency, in terms of processing time, of the OP incremental parsing algorithm developed by Heeman, and to extend it to handle the distfix (mixfix) operators described by Attanayake to model brackets and mathematical functions. This is implemented successfully for the TalkMaths system and shows a greatly reduced response time compared with using batch scanning and parsing alone. The author is not aware of any other incremental OP parser that handles such operators. Furthermore, a proposal is made for modifications to the data structures produced by Attanayake's parser, along with appropriate adjustments to the incremental parser, that will in the future facilitate application of OP grammars to program code or other structured content by changing the definition of its content language.
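    For context, a minimal batch precedence-climbing parser over an illustrative operator table is sketched below; the thesis's contribution is an incremental operator-precedence algorithm with distfix operators, which this sketch does not reproduce:

```python
# A minimal batch precedence-climbing parser for infix expressions, shown only
# to illustrate the operator-precedence style of parsing that the thesis makes
# incremental; the incremental algorithm itself is not reproduced here.

PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}  # illustrative operator table

def parse_expression(tokens, pos=0, min_prec=1):
    """Parse tokens[pos:] into a nested tuple tree, honouring precedence."""
    left, pos = tokens[pos], pos + 1          # operand (assumed atomic here)
    while pos < len(tokens) and tokens[pos] in PRECEDENCE \
            and PRECEDENCE[tokens[pos]] >= min_prec:
        op = tokens[pos]
        right, pos = parse_expression(tokens, pos + 1, PRECEDENCE[op] + 1)
        left = (op, left, right)              # build the parse tree bottom-up
    return left, pos

tree, _ = parse_expression(["a", "+", "b", "*", "c"])
assert tree == ("+", "a", ("*", "b", "c"))
```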

    A pretty-printer for every occasion

    Get PDF
    Tool builders dealing with many different languages, as well as language designers, require sophisticated pretty-print techniques to minimize the time needed for constructing and adapting pretty-printers. We combined new and existing pretty-print techniques in a generic pretty-printer that satisfies modern pretty-print requirements. Its features include language independence, customization, and incremental pretty-printer generation. Furthermore, we emphasize that the recent acceptance of XML as an international standard for the representation of structured data demands flexible pretty-print techniques, and we demonstrate that our pretty-printer provides such technology.
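    As a loose illustration of what "language independence plus customization" can mean in practice (a hypothetical sketch, not the paper's pretty-printer or its formatting language), a generic renderer driven by a per-language rule table might look like this:

```python
# A tiny language-independent pretty-printer sketch: it renders any tagged
# tree using an indentation/separator rule table supplied per language. This
# only illustrates "generic + customizable"; it is not the paper's tool.

def pretty(tree, rules, indent=0):
    """Render a (tag, children...) tuple using per-tag layout rules."""
    if not isinstance(tree, tuple):
        return str(tree)
    tag, children = tree[0], tree[1:]
    sep, vertical = rules.get(tag, (" ", False))   # customization point per tag
    rendered = [pretty(c, rules, indent + 1) for c in children]
    if vertical:                                   # vertical layout with indent
        pad = "  " * (indent + 1)
        return tag + " {\n" + "".join(pad + r + "\n" for r in rendered) \
               + "  " * indent + "}"
    return "(" + tag + sep + sep.join(rendered) + ")"  # horizontal layout

rules = {"block": (" ", True), "assign": (" ", False)}
print(pretty(("block", ("assign", "x", 1), ("assign", "y", 2)), rules))
```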

    An illumination of the template enigma : software code generation with templates

    Get PDF
    Creating software is a process of refining a concept to an implementation. This process consists of several stages represented by documents, models and plans at several levels of abstraction. Mostly, the refinement process requires creativity of the programmers, but sometimes the task is boring and repetitive. This repetitive work is an indication that the program is not written at the most suitable level of abstraction. The level of abstraction offered by the programming language in use might be too low to remove the recurring code. Code generators can be used to raise the level of abstraction of program specifications and to automate the repetitive work. This thesis focuses on code generators based on templates. Templates are one technique for implementing a code generator. Templates allow extension of the syntax of a programming language, enabling generative programming without modifying the underlying compiler. Four artifacts are involved in a template-based generator: templates, input data, a template evaluator and output code. The templates we consider are a concrete (incomplete) representation of the output document, i.e. the object code, that contains holes, i.e. the meta code. These holes are filled by the template evaluator using information from the input data to obtain the output code. Templates are widely used to generate HTML code in web applications, and they can be used for generating all kinds of text, like e-mails or (source) code. In this thesis we limit the scope to the generation of source code. The central research question is how the quality of template-based code generators can be improved. Quality, in general, is a broad notion, and our scope is limited to the technical quality of templates and generated code. We focused on improving the maintainability of template-based code generators and the correctness of the generated code. This is facilitated by the three main contributions provided by this thesis. First, the maintainability of template-based code generators is increased by specifying the following requirement for our metalanguage: it should not be rich enough to allow general-purpose programming in templates, yet not so restrictive that some code generators cannot be expressed. We used the theory of formal languages to specify this metalanguage. Second, we ensure correctness of the templates and generated code. Third, the presented theory and techniques are validated by case studies. These case studies show the application of templates in real-world applications, increased maintainability, and syntactical correctness of generated code. As we only consider generating programming languages, it is sufficient to support the generation of languages defined by context-free grammars. This assumption is used to derive a metalanguage that is rich enough to specify code generators able to instantiate all possible sentences of a context-free language. A specific case of a code generator, the unparser, is a program that can instantiate all sentences of a context-free language. We proved that an unparser can be implemented using a linear deterministic top-down tree-to-string transducer. We call this property unparser-completeness. Our metalanguage is based on a linear deterministic top-down tree-to-string transducer.
    Recall that the goal of specifying the requirements of the metalanguage is to increase the maintainability of template-based code generators without being too restrictive. To validate that our metalanguage is not too restrictive and leads to better maintainable templates, we compared it with four off-the-shelf text template systems by implementing an unparser in each. We observed that the industrial template evaluators provide a Turing-complete metalanguage, but that they do not contain a block scoping mechanism for the meta-variables. This results in undesired additional boilerplate meta code in their templates. The second contribution is guaranteeing the correctness of the generated code. Correctness of the generated code can be divided into two concerns: syntactical correctness and semantical correctness. We start with syntactical correctness of the generated code. The use of text templates implies that syntactical correctness of the generated code can only be detected at compilation time. This means that errors detected during compilation are reported on the level of the generated code, and the developer is required to manually trace the errors back to their origin in the template or input data. We believe that programs manipulating source code should not treat the object code as plain text, so that errors can be detected as early as possible. We present an approach where the grammars of the object language and the metalanguage can be combined in a modular way. Combining both grammars allows parsing both languages simultaneously; syntax errors in both languages of the template are found while parsing it. However, parsing a template alone is not sufficient to ensure that the generated code will be free of syntax errors: the template evaluator must be equipped with a mechanism to guarantee that its output is syntactically correct. In brief, a parse tree is constructed during the parsing of the template. This tree contains subtrees for the object code and subtrees for the meta code. While evaluating the template, subtrees of meta code are substituted by object code subtrees. The template evaluator checks whether the root nonterminal of the object code subtree is equal to the root nonterminal of the meta code subtree. When both are equal, the substitution is allowed; when the root nonterminals are distinct, an accurate error message is generated. The template evaluator terminates when all meta code subtrees have been substituted. The result is a parse tree of the object language and thus syntactically correct. We call this process syntax-safe code generation. In order to validate that the presented techniques increase maintainability and ensure syntactical correctness, we implemented our ideas in a syntax-safe template evaluator called Repleo. Repleo has been applied in four case studies. The first case is a real-world situation where a three-tier web application must be generated from a data model; it showed that multiple layers of an application, defined in different programming languages, can be generated from a single model. The second and third cases show that our metalanguage results in a better maintainable code generator: it forces the use of a two-layer code generator with separation of concerns between the layers, where the original implementations are less modular. The last case study shows that ensuring syntactical correctness prevents cross-site scripting attacks in the dynamic generation of web pages.
    Recall that one of our goals was ensuring the correctness of the generated code. We also showed that it is possible to check static semantic properties of templates. Static semantic checks are defined for the metalanguage, for the object language, and for situations where the object language depends on the metalanguage. We implemented a prototype of a static semantic checker for PicoJava templates using attribute grammars; the use of attribute grammars allowed re-use of the original PicoJava checker. Summarizing, in this thesis we have formulated the requirements for a metalanguage and discussed how to implement a syntax-safe template evaluator. This results in better maintainable template-based code generators and more reliable generated code.
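    The syntax-safe substitution check described above can be sketched as follows (hypothetical names; this is not Repleo): every parse-tree node records the nonterminal it was parsed under, and a meta-code hole may only be filled by object code rooted in the same nonterminal.

```python
# Hypothetical illustration of the syntax-safe substitution check (not Repleo):
# a meta-code hole is only replaced by an object-code subtree whose root
# nonterminal matches; otherwise an accurate error is raised early.

class Node:
    def __init__(self, nonterminal, children=(), hole=None):
        self.nonterminal = nonterminal   # e.g. "Stmt", "Expr"
        self.children = list(children)
        self.hole = hole                 # name of a meta-code hole, or None

def fill_holes(tree, values):
    """Substitute named holes; reject values parsed under the wrong nonterminal."""
    if tree.hole is not None:
        value = values[tree.hole]
        if value.nonterminal != tree.nonterminal:
            raise TypeError(f"hole '{tree.hole}' expects {tree.nonterminal}, "
                            f"got {value.nonterminal}")
        return value
    tree.children = [fill_holes(c, values) for c in tree.children]
    return tree                          # result is still an object parse tree

# Template for "x = <rhs>;" where the hole must be an Expr.
template = Node("Stmt", [Node("Id"), Node("Expr", hole="rhs")])
filled = fill_holes(template, {"rhs": Node("Expr", [Node("Num")])})
assert filled.nonterminal == "Stmt"
```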

    Machine Assisted Proofs for Generic Semantics to Compiler Transformation Correctness Theorems

    Get PDF
    This thesis investigates the issues involved in the creation of a "general theory of operational semantics" in LEGO, a type-theoretic theorem proving environment implementing a constructive logic. Such a general theory permits the ability to manipulate and reason about operational semantics both individually and as a class. The motivation for this lies in studies of semantics-directed compiler generation, in which a set of generic semantics-transforming functions can help convert arbitrary semantic definitions to abstract machines. Such transformations require correctness theorems that quantify over the class of operational semantics. In implementation terms this indicates the need to ensure that both the class of operational semantics and the means of inferring results about it remain at the theorem prover level. The endeavour of this thesis can be seen as assessing both the requirements that general theories of semantics impose on proof assistants and the efficacy of proof assistants in modelling such theories.

    Large-scale semi-automated migration of legacy C/C++ test code

    Get PDF
    This is an industrial experience report on a large semi-automated migration of legacy test code in C and C++. The migration was enabled by automating most of the maintenance steps; without automation, this large-scale migration would not have been conducted, due to the risks involved in manual maintenance (risk of introducing errors, risk of unexpected rework, and loss of productivity). We describe and evaluate the method of automation we used on this real-world case. The benefit was that, by automating the analysis, we could make sure we understood all the relevant details for the envisioned maintenance without having to manually read and check our theories. Furthermore, by automating the transformations we could iterate on and improve complex, large-scale source code updates until they were "just right." The drawbacks were that, first, we had to learn new metaprogramming skills, and second, our automation scripts are not readily reusable for other contexts; they were necessarily developed for this ad-hoc maintenance task. Our analysis shows that automated software maintenance, compared with the (hypothetical) manual alternative, seems to be better both at avoiding mistakes and at avoiding the rework such mistakes cause. It seems that necessary and beneficial source code maintenance need not be avoided if software engineers are enabled to create bespoke (and ad-hoc) analysis and transformation tools to support it.
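    A purely illustrative sketch of the "analyse first, then transform" workflow follows; the rewrite rule, macro names and paths are hypothetical, and this is not the tooling used in the report:

```python
# Purely illustrative sketch of the "analyse first, then transform" workflow
# (hypothetical rewrite rule and paths; this is not the tooling from the report).
import pathlib
import re

LEGACY_CALL = re.compile(r"\bASSERT_OK\(([^)]*)\)")   # hypothetical legacy macro

def analyse(root):
    """Count occurrences per file before changing anything."""
    hits = {}
    for path in pathlib.Path(root).rglob("*.cpp"):
        n = len(LEGACY_CALL.findall(path.read_text(errors="ignore")))
        if n:
            hits[path] = n
    return hits

def transform(root):
    """Rewrite the legacy macro to its hypothetical replacement, file by file."""
    for path in analyse(root):
        text = path.read_text(errors="ignore")
        path.write_text(LEGACY_CALL.sub(r"EXPECT_TRUE(IsOk(\1))", text))

if __name__ == "__main__":
    print(analyse("tests/"))   # inspect the scope of the change first
    # transform("tests/")      # run the rewrite only once the analysis looks right
```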

    Unparsing expressions with prefix and postfix operators

    Full text link

    On Language Processors and Software Maintenance

    Get PDF
    This work investigates declarative transformation tools in the context of software maintenance. Besides maintenance of the language specification, the evolution of a software language requires the adaptation of the software written in that language as well as the adaptation of the software that transforms software written in the evolving language. This co-evolution is studied in order to derive automatic adaptations of artefacts from adaptations of the language specification. Furthermore, AOP for Prolog is introduced to improve the maintainability of language specifications and the tools derived from them.