10 research outputs found

    Simple and Effective Type Check Removal through Lazy Basic Block Versioning

    Dynamically typed programming languages such as JavaScript and Python defer type checking to run time. In order to maximize performance, dynamic language VM implementations must attempt to eliminate redundant dynamic type checks. However, type inference analyses are often costly and involve tradeoffs between compilation time and resulting precision. This has led to the creation of increasingly complex multi-tiered VM architectures. This paper introduces lazy basic block versioning, a simple JIT compilation technique which effectively removes redundant type checks from critical code paths. This novel approach lazily generates type-specialized versions of basic blocks on-the-fly while propagating context-dependent type information. This does not require the use of costly program analyses, is not restricted by the precision limitations of traditional type analyses and avoids the implementation complexity of speculative optimization techniques. We have implemented intraprocedural lazy basic block versioning in a JavaScript JIT compiler. This approach is compared with a classical flow-based type analysis. Lazy basic block versioning performs as well or better on all benchmarks. On average, 71% of type tests are eliminated, yielding speedups of up to 50%. We also show that our implementation generates more efficient machine code than TraceMonkey, a tracing JIT compiler for JavaScript, on several benchmarks. The combination of implementation simplicity, low algorithmic complexity and good run time performance makes basic block versioning attractive for baseline JIT compilers.
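
    A minimal sketch, with invented names and Python closures standing in for generated machine code, of the core mechanism the abstract describes: block versions are generated lazily per incoming type context, and the version reached after a successful type test never repeats that test. This is an illustration of the idea only, not the paper's implementation.

        # Illustrative-only sketch of lazy basic block versioning (invented names).
        # Block versions are created on demand, keyed by the incoming type context,
        # and a type test is kept only when the context does not already prove it.

        versions = {}  # (block_id, frozen type context) -> specialized version

        def get_version(block_id, type_context):
            key = (block_id, frozenset(type_context.items()))
            if key not in versions:
                versions[key] = compile_block(block_id, type_context)
            return versions[key]

        def compile_block(block_id, type_context):
            # "Compilation" is simulated with closures; a real JIT emits machine code.
            assert block_id == "add_entry"
            if type_context.get("x") == "int" and type_context.get("y") == "int":
                return lambda x, y: x + y        # both operands proven int: no type test
            def generic(x, y):
                if isinstance(x, int) and isinstance(y, int):
                    # The test outcome is recorded in the successor's type context,
                    # so the specialized version created here never re-tests.
                    return get_version("add_entry", {"x": "int", "y": "int"})(x, y)
                return str(x) + str(y)           # fallback for non-integer operands
            return generic

        entry = get_version("add_entry", {})     # nothing known yet: generic version
        print(entry(1, 2))                       # 3; lazily creates the int/int version
        print(get_version("add_entry", {"x": "int", "y": "int"})(3, 4))   # 7, no test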

    Future value based single assignment program representations and optimizations

    An optimizing compiler's internal representation fundamentally affects the clarity, efficiency and feasibility of the optimization algorithms employed by the compiler. Static Single Assignment (SSA), the state-of-the-art program representation, has great advantages but can still be improved. This dissertation explores the domain of single assignment beyond SSA and presents two novel program representations: Future Gated Single Assignment (FGSA) and Recursive Future Predicated Form (RFPF). Both FGSA and RFPF embed control flow and data flow information, enabling efficient traversal of program information and thus leading to better and simpler optimizations. We introduce the future value concept, the design basis of both FGSA and RFPF, which permits a consumer instruction to be encountered before the producer of its source operand(s) in a control flow setting. We show that FGSA is efficiently computable by using a series of T1/T2/TR transformations, yielding an expected linear time algorithm that combines the construction of the pruned single assignment form with liveness analysis for both reducible and irreducible graphs. As a result, the approach achieves an average reduction of 7.7%, with a maximum of 67%, in the number of gating functions compared to the pruned SSA form on the SPEC2000 benchmark suite. We present a solid and near-optimal framework for performing the inverse transformation from single assignment programs. We demonstrate the importance of unrestricted code motion and present RFPF. We develop algorithms which enable instruction movement in acyclic as well as cyclic regions, and show the ease of performing optimizations such as Partial Redundancy Elimination on RFPF.
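
    For readers who want something concrete, the short sketch below hand-writes the pruned-SSA-style baseline the abstract measures against: every variable is assigned exactly once and the merge of values at the join point is explicit. The FGSA and RFPF notations themselves are not reproduced, since the abstract does not spell out their concrete form; everything in the sketch is invented for illustration.

        # Illustrative only: the single-assignment baseline the abstract compares
        # against. The merge at the join (a phi function in SSA, a gating function
        # in gated forms such as FGSA) is written out explicitly.

        def original(a, b, p):
            x = a + b
            if p:
                x = x * 2
            return x + 1          # which definition of x reaches here depends on p

        def single_assignment(a, b, p):
            x0 = a + b
            x1 = x0 * 2           # computed unconditionally only to keep the Python simple
            x2 = x1 if p else x0  # explicit merge, playing the role of phi(x1, x0)
            return x2 + 1

        for args in [(1, 2, True), (1, 2, False), (5, -3, True)]:
            assert original(*args) == single_assignment(*args)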

    Profile-guided redundancy elimination

    Program optimisations analyse and transform programs so that better performance can be achieved. Classical optimisations mainly use the static properties of programs to analyse program code and ensure that the optimisations are valid for every possible combination of program and input data. This approach is conservative in those cases where the programs exhibit the same runtime behaviour for most of their execution time. Profile-guided optimisations, on the other hand, use runtime profiling information to discover these common behaviours and exploit optimisation opportunities that are missed by classical, non-profile-guided optimisations. Redundancy elimination is one of the most powerful optimisations in compilers. In this thesis, a new partial redundancy elimination (PRE) algorithm and a partial dead code elimination (PDE) algorithm are proposed for a profile-guided redundancy elimination framework. During the design and implementation of the algorithms, we address three critical issues: optimality, feasibility and profitability. First, we prove that both our speculative PRE algorithm and our region-based PDE algorithm are optimal for given edge profiling information: the total number of dynamic occurrences of redundant expressions or dead code cannot be further reduced by any other code motion. Moreover, our speculative PRE algorithm is lifetime optimal, which means that the lifetimes of newly introduced temporary variables are minimised. Second, we show that both algorithms are practical and can be efficiently implemented in production compilers. For the SPEC CPU2000 benchmarks, the average compilation overhead for our PRE algorithm is 3%, and the average overhead for our PDE algorithm is less than 2%. Moreover, edge profiling rather than expensive path profiling is sufficient to guarantee the optimality of the algorithms. Finally, we demonstrate that the proposed profile-guided redundancy elimination techniques can provide speedups on real machines by conducting a thorough performance evaluation. To the best of our knowledge, this is the first performance evaluation of profile-guided redundancy elimination techniques on real machines.
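
    The following invented before/after pair illustrates what "speculative" means here: the insertion on the rarely taken path may add a computation the original program never performed on that path, and it is justified only because the edge profile says the hot path dominates. It illustrates the general technique, not code or an algorithm from the thesis.

        # Invented example of speculative, profile-guided PRE (not from the thesis).
        # Assume edge profiling shows that p and q are almost always true.

        def before(a, b, p, q):
            r = 0
            if p:                # hot branch
                r = a * b        # a * b computed here
            if q:                # hot branch
                r += a * b       # redundant whenever p held: two multiplies on the hot path
            return r

        def after(a, b, p, q):
            if p:
                t = a * b        # hot path now multiplies only once
                r = t
            else:
                t = a * b        # speculative insertion: the rare (not p, not q) path
                r = 0            # now multiplies although `before` never did there
            if q:
                r += t
            return r

        for p in (True, False):
            for q in (True, False):
                assert before(3, 4, p, q) == after(3, 4, p, q)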

    Analyse und Transformation kontrollflußparalleler Programme (Analysis and Transformation of Control-Flow-Parallel Programs)


    On the fly type specialization without type analysis

    Dynamically typed programming languages such as JavaScript and Python defer type checking to run time. In order to maximize performance, dynamic language virtual machine implementations must attempt to eliminate redundant dynamic type checks. This is typically done using type inference analysis. However, type inference analyses are often costly and involve tradeoffs between compilation time and resulting precision. This has led to the creation of increasingly complex multi-tiered VM architectures. We introduce lazy basic block versioning, a simple just-in-time compilation technique which effectively removes redundant type checks from critical code paths. This novel approach lazily generates type-specialized versions of basic blocks on the fly while propagating context-dependent type information. This does not require the use of costly program analyses, is not restricted by the precision limitations of traditional type analyses and avoids the implementation complexity of speculative optimization techniques. Three extensions are made to the basic block versioning technique in order to give it interprocedural optimization capabilities. Typed object shapes give it the ability to attach type information to object properties and global variables. Entry point specialization allows it to pass type information from callers to callees, and call continuation specialization makes it possible to pass return value type information back to callers without dynamic overhead. We empirically demonstrate that these extensions enable basic block versioning to exceed the capabilities of static whole-program type analyses.
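
    As a purely illustrative sketch of the first extension, typed object shapes, the snippet below records a type tag alongside each property's storage slot in a shared shape, so a read of a property whose shape entry is already typed needs no dynamic type test. The names and data layout are invented, not taken from the thesis.

        # Invented sketch of typed object shapes (hypothetical names).
        # A shape maps each property name to a slot index and a type tag; objects
        # built the same way share a shape, so the recorded tag can be trusted.

        class Shape:
            def __init__(self):
                self.props = {}         # name -> (slot index, type tag)
                self.transitions = {}   # (name, type tag) -> successor shape

        EMPTY_SHAPE = Shape()

        class Obj:
            def __init__(self):
                self.shape = EMPTY_SHAPE
                self.slots = []

            def define(self, name, value):       # first definition only; redefinition
                tag = type(value).__name__       # and type changes are not modelled here
                key = (name, tag)
                succ = self.shape.transitions.get(key)
                if succ is None:
                    succ = Shape()
                    succ.props = dict(self.shape.props)
                    succ.props[name] = (len(self.shape.props), tag)
                    self.shape.transitions[key] = succ
                self.shape = succ
                self.slots.append(value)

        p1, p2 = Obj(), Obj()
        p1.define("x", 1)
        p2.define("x", 2)
        assert p1.shape is p2.shape              # same construction order: shared shape
        slot, tag = p1.shape.props["x"]
        print(tag, p1.slots[slot])               # 'int' 1 -- type known without a test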

    Reconfiguration of legacy software artifacts in resource constraint embedded systems

    Highly resource-constrained embedded systems are everywhere. Some of them can be found inside smartphones and electronic control units, others in wireless sensor networks or smart cards. The last two are among the most restrictive systems in terms of processing power, energy consumption and memory availability. Pricing policies often lead to a reduction in software functionality, as cheaper hardware with fewer resources is demanded for the final product. In order to allow more complex software to run on such constrained systems, this thesis proposes the use of software reconfiguration. In contrast to traditional uses of reconfiguration, this thesis proposes the use of reconfiguration mechanisms to reduce the footprint of a deeply embedded application while maintaining real-time constraints. Today's adaptable architectures require support for reconfigurability and adaptability at design level. However, modern software products are often constructed out of reusable but non-adaptable legacy software artifacts to meet early time-to-market requirements. This thesis proposes a methodology to semi-automatically use existing binaries in a reconfigurable manner. It is based on binary analysis techniques that reconstruct the semantics of the binary application, allowing the system developer to select meaningful code parts as components from the binary code. Using a set of high-level constraints, the user is able to extract components from the binary application. These components are then subject to a design space exploration step, which optimizes the resulting reconfigurable system with respect to parameters such as worst-case blocking time and flash lifetime. With this approach, reconfiguration can be added with low effort to non-adaptive binary software in order to decrease the footprint of the application while maintaining real-time constraints.
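
    A hypothetical sketch of the design space exploration step described above: choose which extracted components stay permanently resident and which are reconfigured on demand, minimizing resident footprint under a worst-case blocking time budget. Component names, sizes and timings are all invented, and the single-budget model is a simplification of the parameters named in the abstract.

        # Purely hypothetical design space exploration over extracted components.
        from itertools import combinations

        # component -> (code size in bytes, worst-case reload/blocking time in microseconds)
        components = {"crypto": (4096, 800), "logging": (2048, 300), "diag": (3072, 500)}
        WCBT_BUDGET_US = 900          # assumed worst-case blocking time budget

        best = None
        names = list(components)
        for k in range(len(names) + 1):
            for reconfigured in combinations(names, k):
                blocking = sum(components[c][1] for c in reconfigured)
                if blocking > WCBT_BUDGET_US:
                    continue                       # violates the real-time constraint
                resident = sum(components[c][0] for c in names if c not in reconfigured)
                if best is None or resident < best[0]:
                    best = (resident, reconfigured)

        print("resident footprint:", best[0], "bytes; reconfigured on demand:", best[1])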

    Practical Adaptation of the Global Optimization Algorithm of Morel and Renvoise

    We present some modifications to Morel and Renvoise's algorithm for global optimization by suppression of partial redundancies. The modifications are motivated by the desire to (1) eliminate redundant code motion, and (2) extend the scope of optimization to the movement of assignments. The complexity of the modified algorithm is compared with that of the original algorithm.
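
    The invented before/after pair below shows the basic partial redundancy that suppression of partial redundancies removes: an expression available on only one path into a join is inserted on the other path, so the later re-evaluation disappears. The abstract's specific modifications (avoiding redundant code motion and moving assignments) are not depicted.

        # Invented before/after pair showing the basic transformation (not code
        # from the paper).

        def before(a, b, p):
            if p:
                u = a + b        # a + b available on this path only
            else:
                u = 0
            return u + (a + b)   # partially redundant: recomputed even when p held

        def after(a, b, p):
            if p:
                t = a + b        # already computed here
                u = t
            else:
                t = a + b        # inserted on the other path into the join
                u = 0
            return u + t         # the re-evaluation after the join is eliminated

        for args in [(1, 2, True), (1, 2, False), (-3, 7, True)]:
            assert before(*args) == after(*args)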