602 research outputs found
A Formal Foundation for Trace-Based JIT Compilers
Trace-based JIT compilers identify frequently executed program paths at run-time and subsequently record, compile and optimize their execution. In order to improve the performance of the generated machine instructions, JIT compilers heavily rely on dynamic analysis of the code. Existing work treats the components of a JIT compiler as a monolithic whole, tied to particular execution semantics. We propose a formal framework that facilitates the design and implementation of a tracing JIT compiler and its accompanying dynamic analyses by decoupling the tracing, optimization, and interpretation processes. This results in a framework that is more configurable and extensible than existing formal tracing models. We formalize the tracer and interpreter as two abstract state machines that communicate through a minimal, well-defined interface. Developing a tracing JIT compiler becomes possible for arbitrary interpreters that implement this interface. The abstract machines also provide the necessary hooks to plug in custom analyses and optimizations
An Abstract Interpretation-based Model of Tracing Just-In-Time Compilation
Tracing just-in-time compilation is a popular compilation technique for the
efficient implementation of dynamic languages, which is commonly used for
JavaScript, Python and PHP. We provide a formal model of tracing JIT
compilation of programs using abstract interpretation. Hot path detection
corresponds to an abstraction of the trace semantics of the program. The
optimization phase corresponds to a transform of the original program that
preserves its trace semantics up to an observation modeled by some abstraction.
We provide a generic framework to express dynamic optimizations and prove them
correct. We instantiate it to prove the correctness of dynamic type
specialization and constant variable folding. We show that our framework is
more general than the model of tracing compilation introduced by Guo and
Palsberg [2011] based on operational bisimulations.Comment: To appear in ACM Transactions on Programming Languages and System
On-stack replacement, distilled
On-stack replacement (OSR) is essential technology for adaptive optimization, allowing changes to code actively executing in a managed runtime. The engineering aspects of OSR are well-known among VM architects, with several implementations available to date. However, OSR is yet to be explored as a general means to transfer execution between related program versions, which can pave the road to unprecedented applications that stretch beyond VMs. We aim at filling this gap with a constructive and provably correct OSR framework, allowing a class of general-purpose transformation functions to yield a special-purpose replacement. We describe and evaluate an implementation of our technique in LLVM. As a novel application of OSR, we present a feasibility study on debugging of optimized code, showing how our techniques can be used to fix variables holding incorrect values at breakpoints due to optimizations
Speculative Staging for Interpreter Optimization
Interpreters have a bad reputation for having lower performance than
just-in-time compilers. We present a new way of building high performance
interpreters that is particularly effective for executing dynamically typed
programming languages. The key idea is to combine speculative staging of
optimized interpreter instructions with a novel technique of incrementally and
iteratively concerting them at run-time.
This paper introduces the concepts behind deriving optimized instructions
from existing interpreter instructions---incrementally peeling off layers of
complexity. When compiling the interpreter, these optimized derivatives will be
compiled along with the original interpreter instructions. Therefore, our
technique is portable by construction since it leverages the existing
compiler's backend. At run-time we use instruction substitution from the
interpreter's original and expensive instructions to optimized instruction
derivatives to speed up execution.
Our technique unites high performance with the simplicity and portability of
interpreters---we report that our optimization makes the CPython interpreter up
to more than four times faster, where our interpreter closes the gap between
and sometimes even outperforms PyPy's just-in-time compiler.Comment: 16 pages, 4 figures, 3 tables. Uses CPython 3.2.3 and PyPy 1.
Micro Virtual Machines: A Solid Foundation for Managed Language Implementation
Today new programming languages proliferate, but many of them
suffer from
poor performance and inscrutable semantics. We assert that the
root of
many of the performance and semantic problems of today's
languages is
that language implementation is extremely difficult. This
thesis
addresses the fundamental challenges of efficiently developing
high-level
managed languages.
Modern high-level languages provide abstractions over execution,
memory
management and concurrency. It requires enormous intellectual
capability
and engineering effort to properly manage these concerns.
Lacking such
resources, developers usually choose naive implementation
approaches
in the early stages of language design, a strategy which too
often has
long-term consequences, hindering the future development of the
language. Existing language development platforms have failed
to
provide the right level of abstraction, and forced implementers
to
reinvent low-level mechanisms in order to obtain performance.
My thesis is that the introduction of micro virtual machines will
allow
the development of higher-quality, high-performance managed
languages.
The first contribution of this thesis is the design of Mu, with
the
specification of Mu as the main outcome. Mu is
the first micro virtual machine, a robust, performant, and
light-weight
abstraction over just three concerns: execution, concurrency and
garbage
collection. Such a foundation attacks three of the most
fundamental and
challenging issues that face existing language designs and
implementations, leaving the language implementers free to focus
on the
higher levels of their language design.
The second contribution is an in-depth analysis of on-stack
replacement
and its efficient implementation. This low-level mechanism
underpins
run-time feedback-directed optimisation, which is key to the
efficient
implementation of dynamic languages.
The third contribution is demonstrating the viability of Mu
through
RPython, a real-world non-trivial language implementation. We
also did
some preliminary research of GHC as a Mu client.
We have created the Mu specification and its reference
implementation,
both of which are open-source. We show that that Mu's on-stack
replacement API can gracefully support dynamic languages such as
JavaScript, and it is implementable on concrete hardware. Our
RPython
client has been able to translate and execute non-trivial
RPython
programs, and can run the RPySOM interpreter and the core of the
PyPy
interpreter.
With micro virtual machines providing a low-level substrate,
language
developers now have the option to build their next language on a
micro
virtual machine. We believe that the quality of programming
languages
will be improved as a result
Efficient Dispatch of Multi-object Polymorphic Call Sites in Contextual Role-Oriented Programming Languages
Adaptive software becomes more and more important as computing is increasingly context-dependent. Runtime adaptability can be achieved by dynamically selecting and applying context-specific code. Role-oriented programming has been proposed as a paradigm to enable runtime adaptive software by design. Roles change the objects’ behavior at runtime, thus adapting the software to a given context. The cost of adaptivity is however a high runtime overhead stemming from executing compositions of behavior-modifying code. It has been shown that the overhead can be reduced by optimizing dispatch plans at runtime when contexts do not change, but no method exists to reduce the overhead in cases with high context variability. This paper presents a novel approach to implement polymorphic role dispatch, taking advantage of run-time information to effectively guard abstractions and enable reuse even in the presence of variable contexts. The concept of polymorphic inline caches is extended to role invocations. We evaluate the implementation with a benchmark for role-oriented programming languages achieving a geometric mean speedup of 4.0× (3.8× up to 4.5×) with static contexts, and close to no overhead in the case of varying contexts over the current implementation of contextual roles in Object Team
Reusable semantics for implementation of Python optimizing compilers
Le langage de programmation Python est aujourd'hui parmi les plus populaires au monde grâce à son accessibilité ainsi que l'existence d'un grand nombre de librairies standards. Paradoxalement, Python est également reconnu pour ses performances médiocres lors de l'exécution de nombreuses tâches. Ainsi, l'écriture d’implémentations efficaces du langage est nécessaire. Elle est toutefois freinée par la sémantique complexe de Python, ainsi que par l’absence de sémantique formelle officielle.
Pour régler ce problème, nous présentons une sémantique formelle pour Python axée sur l’implémentation de compilateurs optimisants. Cette sémantique est écrite de manière à pouvoir être intégrée et analysée aisément par des compilateurs déjà existants.
Nous introduisons également semPy, un évaluateur partiel de notre sémantique formelle. Celui-ci permet d'identifier et de retirer automatiquement certaines opérations redondantes dans la sémantique de Python. Ce faisant, semPy génère une sémantique naturellement plus performante lorsqu'exécutée.
Nous terminons en présentant Zipi, un compilateur optimisant pour le langage Python développé avec l'assistance de semPy. Sur certaines tâches, Zipi offre des performances compétitionnant avec celle de PyPy, un compilateur Python reconnu pour ses bonnes performances. Ces résultats ouvrent la porte à des optimisations basées sur une évaluation partielle générant une implémentation spécialisée pour les cas d'usage fréquent du langage.Python is among the most popular programming language in the world due to its accessibility and extensive standard library. Paradoxically, Python is also known for its poor performance on many tasks. Hence, more efficient implementations of the language are required. The development of such optimized implementations is nevertheless hampered by the complex semantics of Python and the lack of an official formal semantics. We address this issue by presenting a formal semantics for Python focussed on the development of optimizing compilers. This semantics is written as to be easily reusable by existing compilers. We also introduce semPy, a partial evaluator of our formal semantics. This tool allows to automatically target and remove redundant operations from the semantics of Python. As such, semPy generates a semantics which naturally executes more efficiently. Finally, we present Zipi, a Python optimizing compiler developped with the aid of semPy. On some tasks, Zipi displays performance competing with those of PyPy, a Python compiler known for its good performance. These results open the door to optimizations based on a partial evaluation technique which generates specialized implementations for frequent use cases
MicroWalk: A Framework for Finding Side Channels in Binaries
Microarchitectural side channels expose unprotected software to information
leakage attacks where a software adversary is able to track runtime behavior of
a benign process and steal secrets such as cryptographic keys. As suggested by
incremental software patches for the RSA algorithm against variants of
side-channel attacks within different versions of cryptographic libraries,
protecting security-critical algorithms against side channels is an intricate
task. Software protections avoid leakages by operating in constant time with a
uniform resource usage pattern independent of the processed secret. In this
respect, automated testing and verification of software binaries for
leakage-free behavior is of importance, particularly when the source code is
not available. In this work, we propose a novel technique based on Dynamic
Binary Instrumentation and Mutual Information Analysis to efficiently locate
and quantify memory based and control-flow based microarchitectural leakages.
We develop a software framework named \tool~for side-channel analysis of
binaries which can be extended to support new classes of leakage. For the first
time, by utilizing \tool, we perform rigorous leakage analysis of two
widely-used closed-source cryptographic libraries: \emph{Intel IPP} and
\emph{Microsoft CNG}. We analyze different cryptographic implementations
consisting of million instructions in about minutes of CPU time. By
locating previously unknown leakages in hardened implementations, our results
suggest that \tool~can efficiently find microarchitectural leakages in software
binaries
A Flexible Framework for Studying Trace-Based Just-In-Time Compilation
Just-in-time compilation has proven an effective, though effort-intensive, choice for realizing performant language runtimes. Recently introduced JIT compilation frameworks advocate applying meta-compilation techniques such as partial evaluation or meta-tracing on simple interpreters to reduce the implementation effort. However, such frameworks are few and far between. Designed and highly optimized for performance, they are difficult to experiment with. We therefore present STRAF, a minimalistic yet flexible Scala framework for studying trace-based JIT compilation. STRAF is sufficiently general to support a diverse set of language interpreters, but also sufficiently extensible to enable experiments with trace recording and optimization. We demonstrate the former by plugging two different interpreters into STRAF. We demonstrate the latter by extending STRAF with e.g., constant folding and type-specialization optimizations, which are commonly found in dedicated trace-based JIT compilers. The evaluation shows that STRAF is suitable for prototyping new techniques and formalisms in the domain of trace-based JIT compilation
- …