
    Building Efficient Query Engines in a High-Level Language

    Get PDF
    Abstraction without regret refers to the vision of using high-level programming languages for systems development without experiencing a negative impact on performance. A database system designed according to this vision offers both increased productivity and high performance, instead of sacrificing the former for the latter as is the case with existing, monolithic implementations that are hard to maintain and extend. In this article, we realize this vision in the domain of analytical query processing. We present LegoBase, a query engine written in the high-level language Scala. The key technique to regain efficiency is to apply generative programming: LegoBase performs source-to-source compilation and optimizes the entire query engine by converting the high-level Scala code to specialized, low-level C code. We show how generative programming makes it easy to implement a wide spectrum of optimizations, such as introducing data partitioning or switching from a row to a column data layout, which are difficult to achieve with existing low-level query compilers that handle only queries. We demonstrate that sufficiently powerful abstractions are essential for dealing with the complexity of the optimization effort, shielding developers from compiler internals and decoupling individual optimizations from each other. We evaluate our approach with the TPC-H benchmark and show that: (a) with all optimizations enabled, LegoBase significantly outperforms a commercial database and an existing query compiler; (b) programmers need to provide just a few hundred lines of high-level code to implement the optimizations, instead of the complicated low-level code required by existing query compilation approaches; and (c) the compilation overhead is low compared to the overall execution time, making our approach practical for compiling query engines.
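
    A minimal sketch of the generative-programming idea in Scala (toy operator names of our own invention, not LegoBase's actual API): rather than interpreting a plan operator by operator, each operator emits the specialized C code for the query at hand, so the composed plan "compiles" to one tight loop.

        sealed trait Op { def emitC: String }

        // Scan over an integer column; emits the loop header.
        final case class Scan(table: String, rows: Int) extends Op {
          def emitC: String =
            s"for (int i = 0; i < $rows; i++) {\n  int v = $table[i];"
        }

        // Filter specializes its predicate directly into the loop body.
        final case class Filter(child: Op, pred: String) extends Op {
          def emitC: String = s"${child.emitC}\n  if (!($pred)) continue;"
        }

        // Print closes the loop; no operator-call overhead remains in the output.
        final case class Print(child: Op) extends Op {
          def emitC: String = s"${child.emitC}\n  printf(\"%d\\n\", v);\n}"
        }

        object LegoSketch extends App {
          println(Print(Filter(Scan("lineitem", 1000), "v > 10")).emitC)
        }

    Because optimizations rewrite the high-level plan before emission, a change such as switching data layouts can remain a local, high-level rewrite rather than a change to hand-written C.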

    Verification of Imperative Programs by Constraint Logic Program Transformation

    Full text link
    We present a method for verifying partial correctness properties of imperative programs that manipulate integers and arrays by using techniques based on the transformation of constraint logic programs (CLP). We use CLP as a metalanguage for representing imperative programs, their executions, and their properties. First, we encode the correctness of an imperative program, say prog, as the negation of a predicate 'incorrect' defined by a CLP program T. By construction, 'incorrect' holds in the least model of T if and only if the execution of prog from an initial configuration eventually halts in an error configuration. Then, we apply to program T a sequence of transformations that preserve its least model semantics. These transformations are based on well-known transformation rules, such as unfolding and folding, guided by suitable transformation strategies, such as specialization and generalization. The objective of the transformations is to derive a new CLP program TransfT where the predicate 'incorrect' is defined either by (i) the fact 'incorrect.' (in which case prog is not correct), or by (ii) the empty set of clauses (in which case prog is correct). In the case where we derive a CLP program such that neither (i) nor (ii) holds, we iterate the transformation. Since the problem is undecidable, this process may not terminate. We show through examples that our method can be applied in a rather systematic way, and is amenable to automation by transferring to the field of program verification many techniques developed in the field of program transformation. (Comment: In Proceedings Festschrift for Dave Schmidt, arXiv:1309.455)
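
    To make the encoding concrete (the predicate names here are our own illustration, not the paper's notation), the CLP program T can be pictured as constrained Horn clauses over program configurations:

        \mathit{incorrect} \leftarrow \mathit{errorConf}(X) \land \mathit{reach}(X)
        \mathit{reach}(X) \leftarrow \mathit{initConf}(X)
        \mathit{reach}(X') \leftarrow \mathit{reach}(X) \land \mathit{tr}(X, X')

    Here tr encodes one step of prog's operational semantics, so 'incorrect' holds in the least model exactly when an error configuration is reachable from an initial one; unfolding and folding then specialize reach until 'incorrect' is defined by the single fact (case i) or by no clause at all (case ii).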

    CRUD-DOM: a model for bridging the gap between the object-oriented and the relational paradigms

    Get PDF
    Best Paper Award. Object-oriented programming is the most successful programming paradigm, and relational database management systems are the most successful data storage components. Despite their individual successes and the desirability of binding them tightly, they rely on different views of data, which makes their integration difficult. Several solutions have been proposed to overcome these difficulties, such as Embedded SQL, object/relational mappings (O/RM), language extensions, and Call Level Interfaces (CLI) such as JDBC and ADO.NET. In this paper we present a new model for integrating object-oriented languages and relational databases, named the CRUD Data Object Model (CRUD-DOM). CRUD-DOM relies on a CLI (JDBC) and aims not only to retain the advantages of CLIs, such as their performance and SQL expressiveness, but also to provide a typestate approach to the implementation of the ResultSet interface. The model is designed to facilitate the development of automatic code generation tools. We also present such a tool, called CRUD Manager (CRUD-M), which provides automatic code generation with complementary support for software maintenance. This paper shows that CRUD-DOM is an effective model for addressing the aforementioned objectives.
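
    As a hedged sketch of the typestate idea in Scala (the wrapper names are ours, not CRUD-DOM's), one can restrict JDBC so that column reads are only possible on a cursor that next() has successfully positioned on a row:

        import java.sql.{DriverManager, ResultSet}

        // A value readable only through a positioned cursor.
        final class Row(rs: ResultSet) {
          def int(col: String): Int       = rs.getInt(col)
          def string(col: String): String = rs.getString(col)
        }

        // The Row constructor is reachable only after a successful next(),
        // so the "read before next()" state is unrepresentable.
        final class SafeResults(rs: ResultSet) {
          def foreachRow(f: Row => Unit): Unit = while (rs.next()) f(new Row(rs))
        }

        object CrudSketch extends App {
          // Assumes an in-memory H2 database on the classpath, purely for the demo.
          val conn = DriverManager.getConnection("jdbc:h2:mem:demo")
          val st   = conn.createStatement()
          st.execute("CREATE TABLE person(id INT, name VARCHAR(20))")
          st.execute("INSERT INTO person VALUES (1, 'Ann'), (2, 'Bo')")
          new SafeResults(st.executeQuery("SELECT id, name FROM person"))
            .foreachRow(r => println(s"${r.int("id")} -> ${r.string("name")}"))
          conn.close()
        }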

    Fundamentals of object-oriented languages, systems, and methods : Seminar 9434, August 22-26, 1994

    Get PDF


    Model query transformation framework (MQT): from EMF-based model query languages to persistence-specific query languages

    Get PDF
    The memory problems of XML Metadata Interchange (XMI), the default persistence mechanism in the Eclipse Modelling Framework (EMF), when operating on large models have motivated the appearance of alternative mechanisms for persisting EMF models. Most recent approaches propose using database back-ends. These approaches support querying models with EMF-based model query languages (plain EMF, the Object Constraint Language (OCL), EMF Query, the Epsilon Object Language (EOL), etc.). However, these languages commonly require loading in memory all the model elements involved in the query; queries that traverse models (the most commonly used kind) require loading the entire model. This loading strategy causes memory problems when the queried models are large. Most database back-ends provide database-specific query languages that leverage the capabilities of the database engine (better performance) without requiring models to be loaded in memory for query execution (lower memory footprint). For example, Structured Query Language (SQL) is the query language for relational databases and Cypher for Neo4j databases. In this dissertation we present MQT-Engine, a framework that supports the execution of model query languages with the efficiency (in terms of memory and performance) of a database-specific query language. To achieve this, MQT-Engine provides a two-step query transformation mechanism: first, queries expressed in a model query language are transformed into a Query Language Independent Model (QLI Model); then, the QLI Model is transformed into a database-specific query that is executed directly over the database. This mechanism makes the framework extensible and reusable, since it facilitates the inclusion of new query languages on both sides of the transformation. A prototype of the framework is provided. It supports the transformation of EOL queries into SQL queries that are executed directly over a relational Connected Data Objects (CDO) repository. The prototype has been evaluated in two experiments. The first, based on the reverse engineering domain, compares the time and memory required by MQT-Engine and other query languages (the EMF API, OCL, and SQL) to execute a set of queries over models persisted with CDO. The second, based on the railway domain, compares the performance of MQT-Engine and other query languages (the EMF API, OCL, IncQuery, SQL, etc.) on a set of queries. The results show that MQT-Engine executes all the evaluated experiments successfully and is among the solutions with the best performance for the first execution of model queries. Among the query languages executed over CDO repositories, it is the fastest and the one requiring the least memory. For example, for the largest model in the reverse engineering case it is up to 162 times faster than a model query language executed client-side, and it requires 23 times less memory. Additionally, the query transformation overhead is constant and small (less than 2 seconds). These results validate the main goal of this dissertation: to provide a framework that gives model engineers the ability to specify queries in a model query language and then execute them with a performance and memory footprint similar to those of a persistence-specific query language.
    However, the framework has a set of limitations: the approach should be optimized for queries that are executed repeatedly; it only supports non-modifying model-traversal queries; and the prototype is specific to EOL queries over CDO repositories with DBStore. It is planned to extend the framework and address these limitations in a future version.
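
    A minimal Scala sketch of the two-step pipeline (all names invented for illustration; MQT-Engine's real QLI Model is far richer): a model query is first mapped to a language-independent representation, which a back-end-specific renderer then turns into SQL.

        // Step 0: a tiny fragment of a model query language.
        sealed trait ModelQuery
        final case class AllWithAttrAbove(tpe: String, attr: String, bound: Int)
          extends ModelQuery

        // Step 1 target: the query-language-independent representation.
        final case class QliSelect(element: String, attr: String, lowerBound: Int)

        object MqtSketch extends App {
          def toQli(q: ModelQuery): QliSelect = q match {
            case AllWithAttrAbove(t, a, b) => QliSelect(t, a, b)
          }

          // Step 2: one renderer per persistence back-end; a Cypher renderer
          // for Neo4j would be another instance of the same step.
          def toSql(q: QliSelect): String =
            s"SELECT * FROM ${q.element} WHERE ${q.attr} > ${q.lowerBound}"

          // The resulting SQL runs directly on the store, so no model
          // elements need to be loaded into client memory.
          println(toSql(toQli(AllWithAttrAbove("Station", "tracks", 2))))
        }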

    Database-Inspired Optimizations for Statistical Analysis

    Get PDF
    Computing complex statistics on large amounts of data is no longer a corner case, but a daily challenge. However, current tools such as GNU R were not built to handle large data sets efficiently. We propose to vastly improve the execution of R scripts by interpreting them as a declaration of intent rather than an imperative order set in stone. This allows us to apply optimization techniques from the columnar data management research field. We have implemented several of these optimizers in Renjin, an open-source execution environment for R scripts targeted at the Java virtual machine. A demonstration of our approach using a series of micro-benchmarks and experiments on complex survey analysis shows orders-of-magnitude improvements in analysis cost.
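
    A small Scala sketch of the "declaration of intent" idea (our toy DSL, not Renjin's internals): vector operations build a deferred expression instead of materializing results, so a later pass can fuse a filter and a map into a single pass, as a columnar engine would.

        sealed trait Vec {
          def filter(p: Double => Boolean): Vec = Filtered(this, p)
          def map(f: Double => Double): Vec     = Mapped(this, f)
          def force: Array[Double]              // only here is work performed
        }
        final case class Lit(xs: Array[Double]) extends Vec { def force = xs }
        final case class Filtered(src: Vec, p: Double => Boolean) extends Vec {
          def force = src.force.filter(p)
        }
        final case class Mapped(src: Vec, f: Double => Double) extends Vec {
          def force = src match {
            // Fusion: map-over-filter runs as one pass, no intermediate array.
            case Filtered(s, p) => s.force.collect { case x if p(x) => f(x) }
            case _              => src.force.map(f)
          }
        }

        object RenjinSketch extends App {
          val v = Lit(Array(1.0, 5.0, 9.0)).filter(_ > 2).map(_ * 10)
          println(v.force.mkString(", ")) // 50.0, 90.0
        }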

    Resource analysis of integer and abstract programs

    Get PDF
    Unpublished doctoral thesis, Universidad Complutense de Madrid, Facultad de Informática, Departamento de Sistemas Informáticos y de Computación, defended on 27-05-2022. Since the beginning of automated computing in the middle of the last century, the development of computer science has been accompanied by its growing importance in all areas of society. The inclusion of computing processes in everyday life and, in particular, in critical situations, cannot rest only on the production of hardware and software; it also requires the analysis and verification of all their components. While hardware analysis is crucial for building and maintaining the computing infrastructure, since it can detect or predict components that may misbehave, software analysis focuses on the behavior of computer programs to address properties such as security, correctness, or optimality. Depending on the type of analysis applied to the software, we can detect potentially vulnerable code, find incorrect specifications, apply optimizations based on the maximum and minimum cost of the programs, calculate the resource consumption of a program, and so on.

    IS2020 A Competency Model for Undergraduate Programs in Information Systems: The Joint ACM/AIS IS2020 Task Force

    Get PDF
    The IS2020 report is the latest in a series of model curricula recommendations and guidelines for undergraduate degrees in Information Systems (IS). The report builds on the foundations developed in previous model curricula reports to produce a major revision of the model curriculum with significant new characteristics. Specifically, the IS2020 report does not directly prescribe a degree structure that targets a specific context or environment. Rather, it provides guidance regarding the core content that should be present in any curriculum, while offering the flexibility to customize curricula according to local institutional needs.