Search CORE

71 research outputs found

Speculative Staging for Interpreter Optimization

Author: Brunthaler Stefan
Publication venue
Publication date: 08/10/2013
Field of study

Interpreters have a bad reputation for having lower performance than just-in-time compilers. We present a new way of building high performance interpreters that is particularly effective for executing dynamically typed programming languages. The key idea is to combine speculative staging of optimized interpreter instructions with a novel technique of incrementally and iteratively concerting them at run-time. This paper introduces the concepts behind deriving optimized instructions from existing interpreter instructions---incrementally peeling off layers of complexity. When compiling the interpreter, these optimized derivatives will be compiled along with the original interpreter instructions. Therefore, our technique is portable by construction since it leverages the existing compiler's backend. At run-time we use instruction substitution from the interpreter's original and expensive instructions to optimized instruction derivatives to speed up execution. Our technique unites high performance with the simplicity and portability of interpreters---we report that our optimization makes the CPython interpreter up to more than four times faster, where our interpreter closes the gap between and sometimes even outperforms PyPy's just-in-time compiler.Comment: 16 pages, 4 figures, 3 tables. Uses CPython 3.2.3 and PyPy 1.

arXiv.org e-Print Archive

CiteSeerX

Optimizing prolog for small devices: A case study

Author: Carro Liñares Manuel
Hermenegildo Manuel V.
Morales Caballero José Francisco
Muller Henk L.
Puebla Sánchez Alvaro Germán
Publication venue: Facultad de Informática (UPM)
Publication date: 01/05/2006
Field of study

In this paper we present the design and implementation of a wearable application in Prolog. The application program is a "sound spatializer." Given an audio signal and real time data from a head-mounted compass, a signal is generated for stereo headphones that will appear to come from a position in space. We describe high-level and low-level optimizations and transformations that have been applied in order to fit this application on the wearable device. The end application operates comfortably in real-time on a wearable computer, and has a memory foot print that remains constant over time enabling it to run on continuous audio streams. Comparison with a version hand-written in C shows that the C version is no more than 20-40% faster; a small price to pay for a high level description

Archivo Digital UPM

High-level languages for small devices: A case study

Author: Carro Liñares Manuel
Hermenegildo Manuel V.
Morales Caballero José Francisco
Muller Henk L.
Puebla Sánchez Alvaro Germán
Publication venue: Facultad de Informática (UPM)
Publication date: 01/01/2006
Field of study

In this paper we study, through a concrete case, the feasibility of using a high-level, general-purpose logic language in the design and implementation of applications targeting wearable computers. The case study is a "sound spatializer" which, given real-time signáis for monaural audio and heading, generates stereo sound which appears to come from a position in space. The use of advanced compile-time transformations and optimizations made it possible to execute code written in a clear style without efñciency or architectural concerns on the target device, while meeting strict existing time and memory constraints. The final executable compares favorably with a similar implementation written in C. We believe that this case is representative of a wider class of common pervasive computing applications, and that the techniques we show here can be put to good use in a range of scenarios. This points to the possibility of applying high-level languages, with their associated flexibility, conciseness, ability to be automatically parallelized, sophisticated compile-time tools for analysis and verification, etc., to the embedded systems field without paying an unnecessary performance penalty

Archivo Digital UPM

Simple optimizing JIT compilation of higher-order dynamic programming languages

Author: Saleil Baptiste
Publication venue
Publication date: 01/05/2019
Field of study

Implémenter efficacement les langages de programmation dynamiques demande beaucoup d’effort de développement. Les compilateurs ne cessent de devenir de plus en plus complexes. Aujourd’hui, ils incluent souvent une phase d’interprétation, plusieurs phases de compilation, plusieurs représentations intermédiaires et des analyses de code. Toutes ces techniques permettent d’implémenter efficacement un langage de programmation dynamique, mais leur mise en oeuvre est difficile dans un contexte où les ressources de développement sont limitées. Nous proposons une nouvelle approche et de nouvelles techniques dynamiques permettant de développer des compilateurs performants pour les langages dynamiques avec de relativement bonnes performances et un faible effort de développement. Nous présentons une approche simple de compilation à la volée qui permet d’implémenter un langage en une seule phase de compilation, sans transformation vers des représentations intermédiaires. Nous expliquons comment le versionnement de blocs de base, une technique de compilation existante, peut être étendue, sans effort de développement significatif, pour fonctionner interprocéduralement avec les langages de programmation d’ordre supérieur, permettant d’appliquer des optimisations interprocédurales sur ces langages. Nous expliquons également comment le versionnement de blocs de base permet de supprimer certaines opérations utilisées pour implémenter les langages dynamiques et qui impactent les performances comme les vérifications de type. Nous expliquons aussi comment les compilateurs peuvent exploiter les représentations dynamiques des valeurs par Tagging et NaN-boxing pour optimiser le code généré avec peu d’effort de développement. Nous présentons également notre expérience de développement d’un compilateur à la volée pour le langage de programmation Scheme, pour montrer que ces techniques permettent effectivement de construire un compilateur avec un effort moins important que les compilateurs actuels et qu’elles permettent de générer du code efficace, qui rivalise avec les meilleures implémentations du langage Scheme.Efficiently implementing dynamic programming languages requires a significant development effort. Over the years, compilers have become more complex. Today, they typically include an interpretation phase, several compilation phases, several intermediate representations and code analyses. These techniques allow efficiently implementing these programming languages but are difficult to implement in contexts in which development resources are limited. We propose a new approach and new techniques to build optimizing just-in-time compilers for dynamic languages with relatively good performance and low development effort. We present a simple just-in-time compilation approach to implement a language with a single compilation phase, without the need to use code transformations to intermediate representations. We explain how basic block versioning, an existing compilation technique, can be extended without significant development effort, to work interprocedurally with higherorder programming languages allowing interprocedural optimizations on these languages. We also explain how basic block versioning allows removing operations used to implement dynamic languages that degrade performance, such as type checks, and how compilers can use Tagging and NaN-boxing to optimize the generated code with low development effort. We present our experience of building a JIT compiler using these techniques for the Scheme programming language to show that they indeed allow building compilers with less development effort than other implementations and that they allow generating efficient code that competes with current mature implementations of the Scheme language

Dépôt Institutionnel Numérique

Dynamic Compilation for Functional Programs

Author: Grabmüller Martin
Publication venue
Publication date: 19/06/2009
Field of study

Diese Arbeit behandelt die dynamische, zur Laufzeit stattfindende Übersetzung und Optimierung funktionaler Programme. Ziel der Optimierung ist die erhöhte Laufzeiteffizient der Programme, die durch die compilergesteuerte Eliminierung von Abstraktionen der Programmiersprache erreicht wird. Bei der Implementierung objekt-orientierter Programmiersprachen werden bereits seit mehreren Jahrzehnten Compiler-Techniken zur Laufzeit eingesetzt, um objekt-orientierte Programme effizient ausführen zu können. Spätestens seit der Einführung der Programmiersprache Java und ihres auf einer abstrakten Maschine basierenden Ausführungsmodells hat sich die Praktikabilität dieser Implementierungstechnik gezeigt. Viele Eigenschaften moderner Programmiersprachen konnten erst durch den Einsatz dynamischer Transformationstechniken effizient realisiert werden, wie zum Beispiel das dynamische Nachladen von Programmteilen (auch über Netzwerke), Reflection sowie verschiedene Sicherheitslösungen (z.B. Sandboxing). Ziel dieser Arbeit ist zu zeigen, dass rein funktionale Programmiersprachen auf ähnliche Weise effizient implementiert werden können, und sogar Vorteile gegenüber den allgemein eingesetzten objekt-orientierten Sprachen bieten, was die Effizienz, Sicherheit und Korrektheit von Programmen angeht. Um dieses Ziel zu erreichen, werden in dieser Arbeit Implementierungstechniken entworfen bzw. aus bestehenden Lösungen weiterentwickelt, welche die dynamische Kompilierung und Optimierung funktionaler Programme erlauben: zum einen präsentieren wir eine Programmzwischendarstellung (getypte dynamische Continuation-Passing-Style-Darstellung), welche sich zur dynamischen Kompilierung und Optimierung eignet. Basierend auf dieser Darstellung haben wir eine Erweiterung zur verzögerten und selektiven Codeerzeugung von Programmteilen entwickelt. Der wichtigste Beitrag dieser Arbeit ist die dynamische Spezialisierung zur Eliminierung polymorpher Funktionen und Datenstrukturen, welche die Effizienz funktionaler Programme deutlich steigern kann. Die präsentierten Ergebnisse experimenteller Messungen eines prototypischen Ausführungssystems belegen, dass funktionale Programme effizient dynamisch kompiliert werden können.This thesis is about dynamic translation and optimization of functional programs. The goal of the optimization is increased run-time efficiency, which is obtained by compiler-directed elimination of programming language abstractions. Object-oriented programming languages have been implemented for several decades using run-time compilation techniques. With the introduction of the Java programming language and its virtual machine-based execution model, the practicability of this implementation method for real-world applications has been proved. Many aspects of modern programming languages, such as dynamic loading and linking of code (even across networks), reflection and security solutions (e.g., sandboxing) can be realized efficiently only by using dynamic transformation techniques. The goal of this work is to show that functional programming languages can be efficiently implemented in a similar way, and that these languages even offer advantages when compared to more common object-oriented languages. Efficiency, security and correctness of programs is easier to ensure in the functional setting. Towards this goal, we design and develop implementation techniques to enable dynamic compilation and optimization of functional programming languages: we describe an intermediate representation for functional programs (typed dynamic continuation-passing style), which is well suited for dynamic compilation. Based on this representation, we have developed an extension for incremental and selective code generation. The main contribution of this work shows how dynamic specialization of polymorphic functions and data structures can increase the run-time efficiency of functional programs considerably. We present the results of experimental measurements for a prototypical implementation, which prove that functional programs can efficiently be dynamically compiled

DepositOnce

Detection and optimization of suspension-free logic programs

Author: Bigot Peter
Debray Saumya
Gudeman David
Publication venue: Published by Elsevier Inc.
Publication date
Field of study

AbstractIn recent years, language mechanisms to suspend, or delay, the execution of goals until certain variables become bound have become increasingly popular in logic programming languages. While convenient, such mechanisms can make control flow within a program difficult to predict at compile time, and therefore render many traditional compiler optimizations inapplicable. Unfortunately, this performance cost is also incurred by programs that do not use any delay primitives. In this paper, we describe a simple dataflow analysis for detecting computations where suspension effects can be ignored, and discuss several low-level optimizations that rely on this information. Our algorithm has been implemented in the jc system. Optimizations based on information it produces result in significant performance improvements, demonstrating speed comparable to or exceeding that of optimized C programs

Elsevier - Publisher Connector

Reusable semantics for implementation of Python optimizing compilers

Author: Melançon Olivier
Publication venue
Publication date: 01/08/2021
Field of study

Le langage de programmation Python est aujourd'hui parmi les plus populaires au monde grâce à son accessibilité ainsi que l'existence d'un grand nombre de librairies standards. Paradoxalement, Python est également reconnu pour ses performances médiocres lors de l'exécution de nombreuses tâches. Ainsi, l'écriture d’implémentations efficaces du langage est nécessaire. Elle est toutefois freinée par la sémantique complexe de Python, ainsi que par l’absence de sémantique formelle officielle. Pour régler ce problème, nous présentons une sémantique formelle pour Python axée sur l’implémentation de compilateurs optimisants. Cette sémantique est écrite de manière à pouvoir être intégrée et analysée aisément par des compilateurs déjà existants. Nous introduisons également semPy, un évaluateur partiel de notre sémantique formelle. Celui-ci permet d'identifier et de retirer automatiquement certaines opérations redondantes dans la sémantique de Python. Ce faisant, semPy génère une sémantique naturellement plus performante lorsqu'exécutée. Nous terminons en présentant Zipi, un compilateur optimisant pour le langage Python développé avec l'assistance de semPy. Sur certaines tâches, Zipi offre des performances compétitionnant avec celle de PyPy, un compilateur Python reconnu pour ses bonnes performances. Ces résultats ouvrent la porte à des optimisations basées sur une évaluation partielle générant une implémentation spécialisée pour les cas d'usage fréquent du langage.Python is among the most popular programming language in the world due to its accessibility and extensive standard library. Paradoxically, Python is also known for its poor performance on many tasks. Hence, more efficient implementations of the language are required. The development of such optimized implementations is nevertheless hampered by the complex semantics of Python and the lack of an official formal semantics. We address this issue by presenting a formal semantics for Python focussed on the development of optimizing compilers. This semantics is written as to be easily reusable by existing compilers. We also introduce semPy, a partial evaluator of our formal semantics. This tool allows to automatically target and remove redundant operations from the semantics of Python. As such, semPy generates a semantics which naturally executes more efficiently. Finally, we present Zipi, a Python optimizing compiler developped with the aid of semPy. On some tasks, Zipi displays performance competing with those of PyPy, a Python compiler known for its good performance. These results open the door to optimizations based on a partial evaluation technique which generates specialized implementations for frequent use cases

Dépôt Institutionnel Numérique

Supporting high-level, high-performance parallel programming with library-driven optimization

Author: Rodrigues Christopher
Publication venue
Publication date: 01/05/2014
Field of study

Parallel programming is a demanding task for developers partly because achieving scalable parallel speedup requires drawing upon a repertoire of complex, algorithm-specific, architecture-aware programming techniques. Ideally, developers of programming tools would be able to build algorithm-specific, high-level programming interfaces that hide the complex architecture-aware details. However, it is a monumental undertaking to develop such tools from scratch, and it is challenging to provide reusable functionality for developing such tools without sacrificing the hosted interface’s performance or ease of use. In particular, to get high performance on a cluster of multicore computers without requiring developers to manually place data and computation onto processors, it is necessary to combine prior methods for shared memory parallelism with new methods for algorithm-aware distribution of computation and data across the cluster. This dissertation presents Triolet, a programming language and compiler for high-level programming of parallel loops for high-performance execution on clusters of multicore computers. Triolet adopts a simple, familiar programming interface based on traversing collections of data. By incorporating semantic knowledge of how traversals behave, Triolet achieves efficient parallel execution and communication. Moreover, Triolet’s performance on sequential loops is comparable to that of low-level C code, ranging from seven percent slower to 2.8× slower on tested benchmarks. Triolet’s design demonstrates that it is possible to decouple the design of a compiler from the implementation of parallelism without sacrificing performance or ease of use: parallel and sequential loops are implemented as library code and compiled to efficient code by an optimizing compiler that is unaware of parallelism beyond the scope of a single thread. All handling of parallel work partitioning, data partitioning, and scheduling is embodied in library code. During compilation, library code is inlined into a program and specialized to yield customized parallel loops. Experimental results from a 128-core cluster (with 8 nodes and 16 cores per node) show that loops in Triolet outperform loops in Eden, a similar high-level language. Triolet achieves significant parallel speedup over sequential C code, with performance ranging from slightly faster to 4.3× slower than manually parallelized C code on compute-intensive loops. Thus, Triolet demonstrates that a library of container traversal functions can deliver cluster-parallel performance comparable to manually parallelized C code without requiring programmers to manage parallelism. This programming approach opens the potential for future research into parallel programming frameworks

Illinois Digital Environment for Access to Learning and Scholarship Repository