
    Revisiting the Futamura Projections: A Diagrammatic Approach

    The advent of language implementation tools such as PyPy and Truffle/Graal has reinvigorated and broadened interest in topics related to automatic compiler generation and optimization. Given this broader interest, we revisit the Futamura Projections using a novel diagram scheme. Through these diagrams we emphasize the recurring patterns in the Futamura Projections while addressing their complexity and abstract nature. We anticipate that this approach will improve the accessibility of the Futamura Projections and help foster analysis of these new tools through the lens of partial evaluation.
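    As a worked illustration of the projections themselves (the abstract contains no code; `mix` and the toy interpreter below are hypothetical): the first projection says that partially evaluating an interpreter with respect to a known source program yields a compiled program, and self-applying the partial evaluator yields a compiler. A minimal Java sketch, with `mix` stubbed as a closure rather than a true specializer:

```java
import java.util.function.BiFunction;
import java.util.function.Function;

public class Futamura {
    // mix(p, s): specialize a two-argument program p to a static input s.
    // A real partial evaluator emits residual code; this stub merely closes
    // over s, which is enough to show the shape of the projections.
    static <S, D, R> Function<D, R> mix(BiFunction<S, D, R> p, S s) {
        return d -> p.apply(s, d);
    }

    public static void main(String[] args) {
        // A toy interpreter: interp(source, input) runs "source" on "input".
        BiFunction<String, Integer, Integer> interp =
            (source, input) -> source.equals("double") ? input * 2 : input + 1;

        // 1st projection: mix(interp, source) = a compiled program.
        Function<Integer, Integer> compiled = mix(interp, "double");
        System.out.println(compiled.apply(21)); // 42

        // 2nd projection: mix(mix, interp) = a compiler. Java cannot pass
        // the generic method mix to itself directly, so a lambda stands in.
        Function<String, Function<Integer, Integer>> compiler =
            mix((BiFunction<String, Integer, Integer> i, String src) -> mix(i, src), interp);
        System.out.println(compiler.apply("double").apply(21)); // 42

        // 3rd projection: mix(mix, mix) = a compiler generator (omitted).
    }
}
```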

    Accelerating interpreted programming languages on GPUs with just-in-time compilation and runtime optimisations

    Nowadays, most computer systems are equipped with powerful parallel devices such as Graphics Processing Units (GPUs). They are present in almost every computer system, including mobile devices, tablets, desktop computers, and servers. These parallel systems have unlocked the possibility for many scientists and companies to process significant amounts of data in a shorter time. However, using these parallel systems is challenging because of their programming complexity. The most common programming languages for GPUs, such as OpenCL and CUDA, are designed for expert programmers and require developers to know hardware details. However, many users of heterogeneous and parallel hardware, such as economists, biologists, physicists, or psychologists, are not necessarily expert GPU programmers. They need to speed up their applications, which are often written in high-level and dynamic programming languages such as Java, R, or Python. Little work has been done on generating GPU code automatically from these high-level interpreted and dynamic programming languages. This thesis presents a combination of a programming interface and a set of compiler techniques that enable automatic translation of a subset of Java and R programs into OpenCL for execution on a GPU. The goal is to reduce the programmability and usability gaps between interpreted programming languages and GPUs. The first contribution is an Application Programming Interface (API) for programming heterogeneous and multi-core systems. This API combines ideas from functional programming and algorithmic skeletons to compose and reuse parallel operations. The second contribution is a new OpenCL Just-In-Time (JIT) compiler that automatically translates a subset of Java bytecode to GPU code, combined with a new runtime system that optimises data management and avoids data transformations between Java and OpenCL. This OpenCL framework and runtime system achieve speedups of up to 645x over Java while staying within a 23% slowdown of handwritten native OpenCL code. The third contribution is a new OpenCL JIT compiler for dynamic and interpreted programming languages. While the R language is used in this thesis, the developed techniques are generic to dynamic languages. This JIT compiler uniquely combines a set of existing compiler techniques, such as specialisation and partial evaluation, for OpenCL compilation with an optimising runtime that compiles and executes R code on GPUs. This JIT compiler for the R language achieves speedups of up to 1300x over GNU-R with only a 1.8x slowdown compared to native OpenCL.
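    As a sketch of what such a functional, skeleton-based API might look like (the abstract does not list the API; `ArrayFunc`, `map`, and `then` are hypothetical names, and a parallel stream stands in for the OpenCL JIT backend):

```java
import java.util.Arrays;
import java.util.function.Function;
import java.util.function.IntFunction;

// ArrayFunc is a hypothetical composable "skeleton": a reusable parallel
// operation over arrays. The real system would JIT the lambda to OpenCL;
// here a parallel stream stands in for the GPU backend.
interface ArrayFunc<T, R> extends Function<T[], R[]> {
    static <T, R> ArrayFunc<T, R> map(Function<T, R> f, IntFunction<R[]> alloc) {
        return in -> Arrays.stream(in).parallel().map(f).toArray(alloc);
    }
    // Compose two skeletons into a new reusable skeleton.
    default <S> ArrayFunc<T, S> then(ArrayFunc<R, S> next) {
        return in -> next.apply(this.apply(in));
    }
}

public class SkeletonDemo {
    public static void main(String[] args) {
        ArrayFunc<Integer, Integer> square = ArrayFunc.map(x -> x * x, Integer[]::new);
        ArrayFunc<Integer, Integer> inc    = ArrayFunc.map(x -> x + 1, Integer[]::new);
        Integer[] out = square.then(inc).apply(new Integer[]{1, 2, 3});
        System.out.println(Arrays.toString(out)); // [2, 5, 10]
    }
}
```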

    StreamJIT: A Commensal Compiler for High-Performance Stream Programming

    There are many domain libraries, but despite the performance benefits of compilation, domain-specific languages are comparatively rare due to the high cost of implementing an optimizing compiler. We propose commensal compilation, a new strategy for compiling embedded domain-specific languages by reusing the massive investment in modern language virtual machine platforms. Commensal compilers use the host language's front end, use host platform APIs that enable back-end optimizations by the host platform JIT, and use an autotuner for optimization selection. The cost of implementing a commensal compiler is only the cost of implementing the domain-specific optimizations. We demonstrate the concept by implementing a commensal compiler for the stream programming language StreamJIT atop the Java platform. Our compiler achieves performance 2.8 times better than the StreamIt native code (via GCC) compiler with considerably less implementation effort. (Funding: United States Department of Energy, Office of Science, X-Stack Award DE-SC0008923; Intel Corporation Science and Technology Center for Big Data; SMART3 Graduate Fellowship.)
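    A minimal sketch of the "host platform APIs" ingredient, assuming nothing about StreamJIT's actual interface: composing stream filters as `java.lang.invoke` method handles gives the host JVM's JIT a single unit it can inline and optimize end to end, without a custom back end.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Two toy stream "filters" composed into one pipeline via method handles.
// The names are illustrative, not StreamJIT's real API.
public class CommensalPipeline {
    static int scale(int x) { return x * 3; }
    static int offset(int x) { return x + 1; }

    public static void main(String[] args) throws Throwable {
        MethodHandles.Lookup lk = MethodHandles.lookup();
        MethodType intToInt = MethodType.methodType(int.class, int.class);
        MethodHandle scale  = lk.findStatic(CommensalPipeline.class, "scale",  intToInt);
        MethodHandle offset = lk.findStatic(CommensalPipeline.class, "offset", intToInt);

        // offset(scale(x)): the JVM sees the composed handle as one unit
        // that its JIT is free to inline and optimize.
        MethodHandle pipeline = MethodHandles.filterReturnValue(scale, offset);
        System.out.println((int) pipeline.invokeExact(13)); // 40
    }
}
```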

    The parallel event loop model and runtime: a parallel programming model and runtime system for safe event-based parallel programming

    Recent trends in programming models for server-side development have shown an increasing popularity of event-based single-threaded programming models based on the combination of dynamic languages such as JavaScript and event-based runtime systems for asynchronous I/O management such as Node.js. Reasons for the success of such models are the simplicity of the single-threaded event-based programming model as well as the growing popularity of the Cloud as a deployment platform for Web applications. Unfortunately, the popularity of single-threaded models comes at the price of performance and scalability, as single-threaded event-based models present limitations when parallel processing is needed, and traditional approaches to concurrency such as threads and locks do not integrate well with event-based systems. This dissertation proposes a programming model and a runtime system that overcome such limitations by giving single-threaded event-based applications support for speculative parallel execution. The model, called Parallel Event Loop, has the goal of bringing parallel execution to the domain of single-threaded event-based programming without relaxing the main characteristics of the single-threaded model, therefore providing developers with the impression of a safe, single-threaded runtime. Rather than supporting only pure single-threaded programming, however, the parallel event loop can also be used to derive safe, high-level, parallel programming models characterized by strong compatibility with single-threaded runtimes. We describe three distinct implementations of speculative runtimes enabling the parallel execution of event-based applications. The first is a pessimistic runtime system that uses locks to implement speculative parallelization. The second and third are two distinct optimistic runtimes based on software transactional memory. Each implementation supports the parallelization of applications written in an asynchronous single-threaded programming style, and each enables applications to benefit from parallel execution.
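    As a deliberately simplified sketch of the pessimistic (lock-based) variant, not the dissertation's actual runtime: events are dispatched to a worker pool, but each handler runs under a global lock, so the observable behavior stays that of a single-threaded event loop.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ParallelEventLoop {
    private final ExecutorService workers = Executors.newFixedThreadPool(4);
    private final Object globalLock = new Object(); // pessimistic guard
    private int counter = 0; // shared state touched by handlers

    // Dispatch an event handler to the pool; the lock serializes handlers
    // so developers keep the illusion of a single-threaded runtime.
    void emit(Runnable handler) {
        workers.submit(() -> {
            synchronized (globalLock) {
                handler.run();
            }
        });
    }

    public static void main(String[] args) throws InterruptedException {
        ParallelEventLoop loop = new ParallelEventLoop();
        for (int i = 0; i < 100; i++) loop.emit(() -> loop.counter++);
        loop.workers.shutdown();
        loop.workers.awaitTermination(1, TimeUnit.SECONDS);
        System.out.println(loop.counter); // 100, as in a single-threaded loop
    }
}
```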

    Fully reflective execution environments

    Many programming languages run on top of a Virtual Machine (VM). VMs are complex pieces of software because they realize the language semantics and provide efficiency, portability, and security. Unfortunately, mainstream VMs are engineered as "black boxes" and provide only minimal means to expose their state and behavior at run time. In this thesis we argue that the lack of interaction between applications and VMs puts a limit on the adaptation capabilities of (running) applications. To overcome this situation we introduce the notion of a fully reflective VM: a new kind of VM providing reflection not only at the application level but also at the VM level. In other words, a fully reflective VM provides means to support its own observability and modifiability at run time and enables programming languages to interact with and adapt the underlying VM to changing requirements. We propose a reference architecture for such VMs and discuss some challenges, in terms of performance degradation, that these systems may induce. We then introduce a series of optimizations targeted specifically at this kind of platform. They are based on a key assumption: that the variability of the VM's behavior tends to be low at run time. Accordingly, we apply standard dynamic compilation techniques such as specialization, speculation, and deoptimization to the VM code itself as a means to mitigate the overheads. To validate our claims we built two reflective VMs, one featuring a method-based just-in-time (JIT) compiler and the other running on top of a trace-based optimizer. We start our evaluation by analyzing a series of case studies to understand how a reflective VM could deal with unanticipated adaptation scenarios on the fly. Furthermore, we compare our approach with the existing language-level alternatives and provide an elaborated discussion of why a reflective VM would subsume all of them. Then, we empirically show that our implementations can achieve peak performance similar to that of standard VMs when their reflective mechanisms are not activated. Moreover, they present low overheads (in comparison to existing alternatives) when the VM's reflective capabilities are used. We also analyze how the different compilation strategies (per-method vs. tracing) impact the overall results. Finally, we conduct a series of experiments to study the effects of opening up the compilation module of a reflective VM to applications. We conclude that it is a plausible approach that brings new opportunities for optimizing algorithms for which the compiler heuristics fail to give optimal results.
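    As an illustration of the kind of VM-level reflection described above (all class and hook names here are hypothetical, not the thesis's API): the interpreter routes an operation, field reads in this sketch, through a handler that applications may replace while the program runs.

```java
import java.util.Map;
import java.util.function.BiFunction;

public class ReflectiveVM {
    // The metaobject hook: given an object's field map and a field name,
    // produce the value. Applications can swap it at run time.
    interface FieldReadHandler extends BiFunction<Map<String, Object>, String, Object> {}

    // Default semantics: a plain lookup.
    private volatile FieldReadHandler readField = (obj, name) -> obj.get(name);

    // VM-level reflective API: change the semantics of field reads on the fly.
    public void setFieldReadHandler(FieldReadHandler h) { readField = h; }

    Object execReadField(Map<String, Object> obj, String name) {
        return readField.apply(obj, name); // every read goes through the hook
    }

    public static void main(String[] args) {
        ReflectiveVM vm = new ReflectiveVM();
        Map<String, Object> obj = Map.of("x", 42);
        System.out.println(vm.execReadField(obj, "x")); // 42

        // Adapt the running "VM": trace every field read without touching
        // application code.
        vm.setFieldReadHandler((o, f) -> {
            System.out.println("read " + f);
            return o.get(f);
        });
        System.out.println(vm.execReadField(obj, "x")); // prints "read x", then 42
    }
}
```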

    Efficient Implementation of Parametric Polymorphism using Reified Types

    Parametric polymorphism is a language feature that lets programmers define code that behaves independently of the types of values it operates on. Using parametric polymorphism enables code reuse and improves the maintainability of software projects. The approach that a language implementation uses to support parametric polymorphism can have important performance implications. One such approach, erasure, converts generic code to non-generic code that uses a uniform representation for generic data. Erasure is notorious for introducing primitive boxing and other indirections that harm the performance of generic code. More generally, erasure destroys type information that could be used by the language implementation to optimize generic code. This thesis presents TASTyTruffle, a new interpreter for the Scala language. Whereas the standard Scala implementation executes erased Java Virtual Machine (JVM) bytecode, TASTyTruffle interprets TASTy, a different representation that has precise type information. This thesis explores how the type information present in TASTy empowers TASTyTruffle to implement generic code more effectively. In particular, TASTy's type information allows TASTyTruffle to reify types as objects that can be passed around the interpreter. These reified types are used to support heterogeneous box-free representations of generic values. Reified types also enable TASTyTruffle to create specialized, monomorphic copies of generic code that can be easily and reliably optimized by its just-in-time (JIT) compiler. Empirically, TASTyTruffle is competitive with the standard JVM implementation. Both implementations perform similarly on monomorphic workloads, but when generic code is used with multiple types, TASTyTruffle consistently outperforms the JVM. TASTy's type information enables TASTyTruffle to find additional optimization opportunities that could not be uncovered with erased JVM bytecode alone.
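    To make the boxing contrast concrete (a generic Java illustration, not TASTyTruffle code): the erased version manipulates a uniform boxed representation, while a specialized, monomorphic copy of the same logic, the kind TASTyTruffle derives automatically from TASTy's reified types, runs box-free.

```java
public class ErasureVsSpecialization {
    // Erased style: a uniform (boxed) representation for generic data.
    static Integer sumErased(Integer[] xs) {
        Integer s = 0;
        for (Integer x : xs) s = s + x; // unbox, add, re-box on every step
        return s;
    }

    // A monomorphic, specialized copy: no boxing, easy for a JIT to optimize.
    static int sumSpecialized(int[] xs) {
        int s = 0;
        for (int x : xs) s += x; // values stay unboxed
        return s;
    }

    public static void main(String[] args) {
        System.out.println(sumErased(new Integer[]{1, 2, 3}));  // 6, with boxing
        System.out.println(sumSpecialized(new int[]{1, 2, 3})); // 6, box-free
    }
}
```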

    Fully Reflective Execution Environments: Virtual Machines for More Flexible Software

    VMs are complex pieces of software that implement programming language semantics in an efficient, portable, and secure way. Unfortunately, mainstream VMs provide applications with few mechanisms to alter execution semantics or memory management at run time. We argue that this limits the evolvability and maintainability of running systems for both the application domain (e.g., to support unforeseen requirements) and the VM domain (e.g., to modify the organization of objects in memory). This work explores the idea of incorporating reflective capabilities into the VM domain and analyzes their impact in the context of software adaptation tasks. We characterize the notion of a fully reflective VM, a kind of VM that provides means for its own observability and modifiability at run time. This enables programming languages to adapt the underlying VM to changing requirements. We propose a reference architecture for such VMs and present TruffleMATE as a prototype of this architecture. We evaluate the mechanisms TruffleMATE provides to deal with unanticipated dynamic adaptation scenarios for security, optimization, and profiling aspects. In contrast to existing alternatives, we observe that TruffleMATE is able to handle all scenarios, using fewer than 50 lines of code for each, and without interfering with the application's logic.

    Towards Self-Adaptable Languages

    Over recent years, self-adaptation has become a concern for many software systems that have to operate in complex and changing environments. At the core of self-adaptation there is a feedback loop and associated trade-off reasoning to decide on the best course of action. However, existing software languages do not abstract the development and execution of such feedback loops for self-adaptable systems. Developers have to fall back on ad-hoc solutions to implement self-adaptable systems, often with wide-ranging design implications (e.g., an explicit MAPE-K loop). Furthermore, existing software languages do not capitalize on monitored usage data of a language and its modeling environment. This hinders the continuous and automatic evolution of a software language based on feedback loops from the modeling environment and the runtime software system. To address the aforementioned issues, this paper introduces the concept of a Self-Adaptable Language (SAL) to abstract the feedback loops at both the system and language levels. We propose L-MODA (Language, Models, and Data) as a conceptual reference framework that characterizes the possible feedback loops abstracted into a SAL. To demonstrate SALs, we present emerging results on the abstraction of the system feedback loop into the language semantics. We report on the concept of Self-Adaptable Virtual Machines as an example of semantic adaptation in a language interpreter and present a roadmap for SALs.
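    For readers unfamiliar with the MAPE-K pattern mentioned above, a minimal Java sketch (the adaptation target, a work queue that adjusts its batch size, is hypothetical): the loop Monitors the system, Analyzes the readings, Plans an adaptation, and Executes it, all over shared Knowledge.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class MapeKLoop {
    // Knowledge: the shared state the feedback loop reasons over.
    static class Knowledge { int queueLength; int batchSize = 1; }

    public static void main(String[] args) {
        Knowledge k = new Knowledge();
        Deque<Runnable> queue = new ArrayDeque<>();
        for (int i = 0; i < 300; i++) queue.add(() -> {});

        for (int tick = 0; tick < 3; tick++) {
            k.queueLength = queue.size();             // Monitor
            boolean overloaded = k.queueLength > 100; // Analyze
            k.batchSize = overloaded ? 50 : 1;        // Plan
            for (int i = 0; i < k.batchSize && !queue.isEmpty(); i++)
                queue.poll().run();                   // Execute
            System.out.println("tick " + tick + ": batch=" + k.batchSize
                               + " remaining=" + queue.size());
        }
    }
}
```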