
    Data Validation Infrastructure for R

    Checking data quality against domain knowledge is a common activity that pervades statistical analysis from raw data to output. The R package validate facilitates this task by capturing and applying expert knowledge in the form of validation rules: logical restrictions on variables, records, or data sets that should be satisfied before they are considered valid input for further analysis. In the validate package, validation rules are objects of computation that can be manipulated, investigated, and confronted with data or versions of a data set. The results of a confrontation are then available for further investigation, summarization or visualization. Validation rules can also be endowed with metadata and documentation and they may be stored or retrieved from external sources such as text files or tabular formats. This data validation infrastructure thus allows for systematic, user-defined definition of data quality requirements that can be reused for various versions of a data set or by data correction algorithms that are parameterized by validation rules
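The idea of rules as first-class objects that are confronted with data can be sketched in a few lines. The following is a minimal illustration in Python, not the validate package's R API; the names `Rule`, `confront`, and `summarize`, as well as the sample records, are assumptions made up for this example.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]


@dataclass(frozen=True)
class Rule:
    """A validation rule as a plain object: a check plus its metadata (hypothetical type)."""
    name: str
    description: str
    check: Callable[[Record], bool]


def confront(records: List[Record], rules: List[Rule]) -> Dict[str, List[bool]]:
    """Apply every rule to every record and keep the per-record outcomes."""
    return {rule.name: [bool(rule.check(r)) for r in records] for rule in rules}


def summarize(results: Dict[str, List[bool]]) -> Dict[str, Dict[str, int]]:
    """Count passes and fails per rule."""
    return {
        name: {"passes": sum(outcomes), "fails": len(outcomes) - sum(outcomes)}
        for name, outcomes in results.items()
    }


# Illustrative rules and data (assumed, not taken from the paper).
rules = [
    Rule("age_nonneg", "Age must be non-negative", lambda r: r["age"] >= 0),
    Rule(
        "adult_if_married",
        "Married persons must be at least 16",
        lambda r: r["status"] != "married" or r["age"] >= 16,
    ),
]
records = [
    {"age": 34, "status": "married"},
    {"age": -1, "status": "single"},
    {"age": 12, "status": "married"},
]

results = confront(records, rules)
summary = summarize(results)
print(summary)
```

Because the rules are ordinary values, the same list can be reused against a later version of the data set, or filtered and inspected before it is applied.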

    Findel: Secure Derivative Contracts for Ethereum

    Blockchain-based smart contracts are considered a promising technology for handling financial agreements securely. In order to realize this vision, we need a formal language to unambiguously describe contract clauses. We introduce Findel -- a purely declarative financial domain-specific language (DSL) well suited for implementation in blockchain networks. We implement an Ethereum smart contract that acts as a marketplace for Findel contracts and measure the cost of its operation. We analyze challenges in modeling financial agreements in decentralized networks and outline directions for future work

    Spreadsheet engineering

    These tutorial notes present a methodology for spreadsheet engineering. First, we present data mining and database techniques to reason about spreadsheet data. These techniques are used to compute relationships between spreadsheet elements (cells/columns/rows). These relations are then used to infer a model defining the business logic of the spreadsheet. Such a model of spreadsheet data is a visual domain-specific language that we embed in a well-known spreadsheet system. The embedded model is the building block for defining techniques for model-driven spreadsheet development, where advanced techniques are used to guarantee model-instance synchronization. In this model-driven environment, any user data update has to follow the model-instance conformance relation, thus guiding spreadsheet users to introduce correct data. Data refinement techniques are used to synchronize models and instances after users update/evolve the model. These notes briefly describe our model-driven spreadsheet environment, the MDSheet environment, which implements the presented methodology. To evaluate both the proposed techniques and the MDSheet tool, we have conducted, in laboratory sessions, an empirical study with the summer school participants. The results of this study are presented in these notes

    Modular and type-safe definition of Attribute Grammars with AspectAG

    AspectAG is a Haskell-embedded domain-specific language (EDSL) that encodes first-class attribute grammars (AGs). AspectAG ensures the well-formedness of AGs at compile time by using extensible records and predicates encoded using old-fashioned type-level programming features, such as multi-parameter type classes and functional dependencies. AspectAG suffers the usual drawbacks of EDSLs: when type errors occur, the error messages usually refer not to domain terms but to the host language. Often, implementation details of the EDSL are leaked in those messages. The use of type-level programming techniques makes the situation worse, since type-level abstraction mechanisms are quite poor. Additionally, old-fashioned type-level programs are untyped at the type level, which is inconsistent with the general approach of strongly typed functional programming. By using modern Haskell extensions and techniques, we propose a reworked version of AspectAG that tackles those weaknesses. New AG definitions are safer, both at the level of types and at the level of kinds. Furthermore, a set of identified domain-specific errors are reported with DSL-oriented messages. To achieve this, we define and use a framework for manipulating type errors that can be used in any EDSL. We show the pragmatics of AspectAG by defining languages and extending them with both new syntax and new semantics. We use MateFun, a purely functional language used to teach mathematics, as a case study

    Modelling language for biology with applications

    Understanding the links between biological processes at multiple scales, from molecular regulation to populations and evolution, along with their interactions with the environment, is a major challenge in understanding life. Beyond understanding, this is also becoming important in attempts to engineer traits, for example in crops, starting from genetics or from genomes and under different environmental conditions (genotype x environment → trait). As systems become more complex, relying on intuition alone is not enough, and formal modelling becomes necessary for integrating data across different processes and allowing us to test hypotheses. The more complex the systems become, however, the harder the modelling process becomes and the harder the models become to read and write. In particular, intuitive formalisms such as Chemical Reaction Networks are not powerful enough to express ideas at higher levels, for example dynamic environments, dynamic state spaces, and abstraction relations between different parts of the model. Other formalisms are more powerful (for example general-purpose programming languages), but they lack the readability of more domain-specific approaches. The first contribution of this thesis is a modelling language with stochastic semantics, Chromar, which extends the visually intuitive formalism of reactions: simple objects, called agents, are extended with attributes. Dynamics are given as stochastic rules that can operate at the level of agents (removing/adding) or at the level of attributes (updating their values). Chromar further allows the seamless integration of time and state functions with the normal set of expressions, which is crucial in multi-scale plant models for describing the changing environment and abstractions between scales. This leads to models that are both formal enough for simulations and easy to read and write.
The second contribution of this thesis is a whole-life-cycle multi-model of the growth and reproduction of Arabidopsis thaliana, FM-life, expressed in a declarative way in Chromar. It combines phenology models from ecology to time developmental processes and physical development, which allows scaling to the population level and addressing ecological questions under different genotype x environment scenarios. This is a step on the path towards mechanistic links between genotype x environment and higher-level crop traits. Finally, I show a way of using optimal control techniques to engineer traits of plants by controlling the environmental conditions in which they grow. In particular, we explore (i) a direct problem where the control is temperature, assuming homogeneous growth conditions, and (ii) an indirect problem where the control is the position of the plants, assuming inhomogeneous growth conditions

    Functional programming for domain-specific languages

    Abstract. Domain-specific languages become effective only in the presence of convenient lightweight tools for defining, implementing, and optimizing new languages. Functional programming provides a promising framework for such tasks; FP and DSLs are natural partners. In these lectures we will discuss FP techniques for DSLs—especially standalone versus embedded DSLs, and shallow versus deep embeddings.
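The distinction between shallow and deep embeddings mentioned in this abstract can be illustrated with a tiny arithmetic language. The sketch below is in Python rather than a functional language, and every name in it (`lit_s`, `add_s`, `Lit`, `Add`, `evaluate`, `pretty`) is an assumption introduced for illustration, not material from the lectures.

```python
from dataclasses import dataclass
from typing import Union


# Shallow embedding: each construct is a host-language function that
# directly computes the meaning (here, an integer value).
def lit_s(n: int) -> int:
    return n


def add_s(a: int, b: int) -> int:
    return a + b


# Deep embedding: each construct builds a syntax tree, and meanings are
# supplied later by separate interpreter functions.
@dataclass(frozen=True)
class Lit:
    value: int


@dataclass(frozen=True)
class Add:
    left: "Expr"
    right: "Expr"


Expr = Union[Lit, Add]


def evaluate(e: Expr) -> int:
    """One interpretation of the tree: its integer value."""
    if isinstance(e, Lit):
        return e.value
    return evaluate(e.left) + evaluate(e.right)


def pretty(e: Expr) -> str:
    """A second interpretation of the same tree: a printable string."""
    if isinstance(e, Lit):
        return str(e.value)
    return f"({pretty(e.left)} + {pretty(e.right)})"


shallow = add_s(lit_s(1), add_s(lit_s(2), lit_s(3)))
deep = Add(Lit(1), Add(Lit(2), Lit(3)))

print(shallow, evaluate(deep), pretty(deep))
```

The shallow version fixes a single interpretation at construction time, whereas the deep version keeps the term as data, so further interpretations such as `pretty` or an optimizer can be added without touching the constructors.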