Simple and Effective Type Check Removal through Lazy Basic Block Versioning
Dynamically typed programming languages such as JavaScript and Python defer
type checking to run time. In order to maximize performance, dynamic language
VM implementations must attempt to eliminate redundant dynamic type checks.
However, type inference analyses are often costly and involve tradeoffs between
compilation time and resulting precision. This has led to the creation of
increasingly complex multi-tiered VM architectures.
This paper introduces lazy basic block versioning, a simple JIT compilation
technique which effectively removes redundant type checks from critical code
paths. This novel approach lazily generates type-specialized versions of basic
blocks on-the-fly while propagating context-dependent type information. This
does not require the use of costly program analyses, is not restricted by the
precision limitations of traditional type analyses and avoids the
implementation complexity of speculative optimization techniques.
We have implemented intraprocedural lazy basic block versioning in a
JavaScript JIT compiler. This approach is compared with a classical flow-based
type analysis. Lazy basic block versioning performs as well or better on all
benchmarks. On average, 71% of type tests are eliminated, yielding speedups of
up to 50%. We also show that our implementation generates more efficient
machine code than TraceMonkey, a tracing JIT compiler for JavaScript, on
several benchmarks. The combination of implementation simplicity, low
algorithmic complexity and good run time performance makes basic block
versioning attractive for baseline JIT compilers.
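As a rough illustration of the idea described in this abstract, the following Python sketch keeps one specialized version of a block per (block, type context) pair and creates each version only when execution first reaches it. It is a toy interpreter, not the paper's JIT compiler: the block names `head`, `body`, and `slow`, the function `run_loop`, and the representation of type contexts are all assumptions made for the example.

```python
def run_loop(x, n):
    """Run a toy loop that adds 1 to x, n times (n >= 1).

    Returns (final x, number of type tests executed, number of versions made).
    """
    versions = {}                 # (block name, type context) -> specialized closure
    stats = {"tests": 0}
    env = {"x": x, "i": 0, "n": n}

    def compile_block(name, ctx):
        # Hypothetical "compiler": returns a closure that runs the block and
        # yields (next block, context for that block), or None to stop.
        if name == "head":
            if ("x", "int") in ctx:
                # The context already proves x is an int: no test is emitted.
                return lambda env: ("body", ctx)

            def head(env):
                stats["tests"] += 1
                if isinstance(env["x"], int):
                    # The successful test adds a fact to the outgoing context.
                    return ("body", ctx | {("x", "int")})
                return ("slow", ctx)
            return head

        if name == "body":
            def body(env):
                env["x"] += 1     # int + 1 stays int (overflow ignored here)
                env["i"] += 1
                return ("head", ctx) if env["i"] < env["n"] else None
            return body

        def slow(env):
            # Generic path: performs the add but keeps no type information.
            env["x"] = env["x"] + 1
            env["i"] += 1
            return ("head", frozenset()) if env["i"] < env["n"] else None
        return slow

    state = ("head", frozenset())
    while state is not None:
        if state not in versions:             # lazy: compile on first reach only
            versions[state] = compile_block(*state)
        state = versions[state](env)
    return env["x"], stats["tests"], len(versions)
```

With an integer input, the type test in `head` runs once; every later iteration jumps to a second version of `head` that was generated with the knowledge that `x` is an integer and therefore contains no test. With a float input, the context never gains that fact, so the tested version keeps being used.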
Factoring out ordered sections to expose thread-level parallelism
With the rise of multi-core processors, researchers are taking a new look at extending the applicability of auto-parallelization techniques. In this paper, we identify a dependence pattern on which auto-parallelization currently fails. This dependence pattern occurs for ordered sections, i.e., code fragments in a loop that must be executed atomically and in original program order. We discuss why these ordered sections prohibit current auto-parallelizers from working and we present a technique to deal with them. We experimentally demonstrate the efficacy of the technique, yielding significant overall program speedups.
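To make the dependence pattern concrete, the sketch below shows a loop whose iterations each have an independent part and an ordered section. It is not the technique from the paper, whose details the abstract does not give; it is one generic way to run the independent parts concurrently while forcing the ordered sections to execute one at a time in iteration order. The function `ordered_loop` and its parameters are hypothetical names chosen for the example.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def ordered_loop(items, work, workers=4):
    """Apply work() to each item in parallel, appending results in loop order."""
    out = []
    turn = 0                          # index of the iteration allowed to append
    cond = threading.Condition()

    def iteration(i, item):
        nonlocal turn
        result = work(item)           # independent part: any order is fine
        with cond:
            # Ordered section: atomic, and only entered when it is our turn.
            cond.wait_for(lambda: turn == i)
            out.append(result)
            turn += 1
            cond.notify_all()

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Tasks are submitted in loop order, so the lowest unfinished
        # iteration is always running and can make progress.
        futures = [pool.submit(iteration, i, item)
                   for i, item in enumerate(items)]
        for future in futures:
            future.result()           # re-raise any exception from a worker
    return out
```

For example, `ordered_loop(range(10), lambda v: v * v)` returns the squares in their original order even though the calls to `work` may finish in any order.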