
    ShareJIT: JIT Code Cache Sharing across Processes and Its Practical Implementation

    Just-in-time (JIT) compilation coupled with code caching is widely used to improve performance in dynamic programming language implementations. These code caches, along with the associated profiling data for the hot code, however, consume significant amounts of memory. Furthermore, they incur extra JIT compilation time for their creation. On Android, the current standard JIT compiler and its code caches are not shared among processes---that is, the runtime system maintains a private code cache, and its associated data, for each runtime process. However, applications running on the same platform tend to share multiple libraries in common. Sharing cached code across multiple applications and multiple processes can lead to a reduction in memory use. It can directly reduce compile time. It can also reduce the cumulative amount of time spent interpreting code. All three of these effects can improve actual runtime performance. In this paper, we describe ShareJIT, a global code cache for JITs that can share code across multiple applications and multiple processes. We implemented ShareJIT in the context of the Android Runtime (ART), a widely used, state-of-the-art system. To increase sharing, our implementation constrains the amount of context that the JIT compiler can use to optimize the code. This exposes a fundamental tradeoff: increased specialization to a single process's context decreases the extent to which the compiled code can be shared. In ShareJIT, we limit some optimization to increase shareability. To evaluate ShareJIT, we tested 8 popular Android apps in a total of 30 experiments. ShareJIT improved overall performance by 9% on average, while decreasing memory consumption by 16% on average and JIT compilation time by 37% on average. Comment: OOPSLA 2018
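    The key mechanism is making the cache key process-neutral, so that code compiled in one process can be reused by another. A minimal sketch of that idea follows, using a hypothetical SharedCodeCache/MethodKey design; it is illustrative only, not ART's or ShareJIT's actual implementation.

        import java.util.Map;
        import java.util.concurrent.ConcurrentHashMap;
        import java.util.function.Supplier;

        // Illustrative sketch of a cross-process JIT code cache in the spirit of
        // ShareJIT; names and structure are assumptions, not the paper's code.
        final class SharedCodeCache {

            // The key deliberately excludes process-specific context (e.g. profile
            // data, inlined app-private callees), trading optimization strength
            // for shareability -- the tradeoff the abstract describes.
            record MethodKey(String dexChecksum, String className, String signature) {}

            // In a real system this table would live in memory shared across
            // processes; a concurrent map stands in for it here.
            private final Map<MethodKey, Long> entryPoints = new ConcurrentHashMap<>();

            /** Returns the cached entry point for a method, compiling it once if absent. */
            long lookupOrCompile(MethodKey key, Supplier<Long> compile) {
                return entryPoints.computeIfAbsent(key, k -> compile.get());
            }
        }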

    Coverage-Based Debloating for Java Bytecode

    Software bloat is code that is packaged in an application but is not actually necessary to run it. The presence of software bloat is an issue for security, performance, and maintenance. In this paper, we introduce a novel technique for debloating Java bytecode, which we call coverage-based debloating. We leverage a combination of state-of-the-art Java bytecode coverage tools to precisely capture which parts of a project and its dependencies are used at runtime. Then, we automatically remove the parts that are not covered to generate a debloated version of the compiled project. We successfully generate debloated versions of 220 open-source Java libraries, which are syntactically correct and preserve their original behavior according to the workload. Our results indicate that 68.3% of the libraries' bytecode and 20.5% of their total dependencies can be removed through coverage-based debloating. In addition, we present the first experiment that assesses the utility of debloated libraries with respect to client applications that reuse them. We show that 80.9% of the clients with at least one test that uses the library successfully compile and pass their test suite when the original library is replaced by its debloated version.
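    The coverage-then-remove idea can be sketched as follows, assuming coverage has already been collected as a set of fully qualified covered class names; JarDebloater is a hypothetical helper, not the authors' tool, and a real debloater must also handle reflection, resources, and method-level removal.

        import java.io.IOException;
        import java.nio.file.Files;
        import java.nio.file.Path;
        import java.util.Set;
        import java.util.jar.JarEntry;
        import java.util.jar.JarInputStream;
        import java.util.jar.JarOutputStream;

        // Sketch of coverage-based debloating: copy a jar, dropping .class entries
        // that the coverage report never mentions.
        final class JarDebloater {
            static void debloat(Path inJar, Path outJar, Set<String> coveredClasses) throws IOException {
                try (JarInputStream in = new JarInputStream(Files.newInputStream(inJar));
                     JarOutputStream out = new JarOutputStream(Files.newOutputStream(outJar))) {
                    for (JarEntry e; (e = in.getNextJarEntry()) != null; ) {
                        String name = e.getName();
                        if (name.endsWith(".class")) {
                            String cls = name.substring(0, name.length() - 6).replace('/', '.');
                            if (!coveredClasses.contains(cls)) continue; // uncovered: drop it
                        }
                        out.putNextEntry(new JarEntry(name));
                        in.transferTo(out); // copy the entry's bytes unchanged
                        out.closeEntry();
                    }
                }
            }
        }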

    An Abstract Interpretation-based Model of Tracing Just-In-Time Compilation

    Tracing just-in-time compilation is a popular technique for the efficient implementation of dynamic languages, commonly used for JavaScript, Python, and PHP. We provide a formal model of tracing JIT compilation of programs using abstract interpretation. Hot path detection corresponds to an abstraction of the trace semantics of the program. The optimization phase corresponds to a transform of the original program that preserves its trace semantics up to an observation modeled by some abstraction. We provide a generic framework to express dynamic optimizations and prove them correct. We instantiate it to prove the correctness of dynamic type specialization and constant variable folding. We show that our framework is more general than the model of tracing compilation introduced by Guo and Palsberg [2011] based on operational bisimulations. Comment: To appear in ACM Transactions on Programming Languages and Systems
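    The correctness statement can be paraphrased in notation of our own choosing (not necessarily the paper's): writing tau(P) for the trace semantics of program P and alpha for the abstraction modeling what the observer may see, an optimization from P to P' is correct when it preserves trace semantics up to alpha.

        % Paraphrase under assumed notation, not necessarily the paper's own.
        % \tau(P): trace semantics of program P; \alpha: observation abstraction.
        \[
          \alpha\bigl(\tau(P)\bigr) \;=\; \alpha\bigl(\tau(P')\bigr)
        \]
        % Hot path detection is itself modeled as an abstraction of \tau(P),
        % taken with a coarser abstraction \alpha_{\mathrm{hot}}.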

    Just-In-Time Compilation: History, Architecture, Principles, and Systems

    Many implementations of high-level languages focus on systems built around just-in-time compilation mechanisms. This mechanism is attractive because it improves the performance of such languages while preserving portability, but at the price of adding compilation time to total execution time. Research in the area has therefore concentrated on balancing compilation cost against execution efficiency. The first just-in-time compilation systems employed static strategies to select and optimize the code regions likely to yield good performance. More sophisticated systems refined these strategies in order to apply optimizations more judiciously. This tutorial presents the principles underlying just-in-time compilation and its evolution over the years, as well as the approaches used by various systems to balance cost and efficiency. Although it is difficult to single out a best approach, recent work shows that strict strategies for code detection and optimization, together with the parallelism offered by multi-core architectures, will form the basis of future just-in-time compilation systems.
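    The cost/efficiency balance described here is classically implemented with hotness counters: interpret by default and compile a region only after its execution count crosses a threshold. A generic sketch (not taken from the tutorial or any particular system):

        import java.util.Map;
        import java.util.concurrent.ConcurrentHashMap;
        import java.util.concurrent.atomic.AtomicInteger;

        // Generic sketch of counter-based hot-region detection, the classic way
        // JIT systems balance compilation cost against execution efficiency.
        // The threshold value is illustrative.
        final class HotspotDetector {
            private static final int COMPILE_THRESHOLD = 10_000;
            private final Map<String, AtomicInteger> counters = new ConcurrentHashMap<>();

            /** Called by the interpreter on each entry to a method or loop header. */
            boolean shouldCompile(String region) {
                int count = counters
                        .computeIfAbsent(region, r -> new AtomicInteger())
                        .incrementAndGet();
                // Fire exactly once, when the threshold is first crossed.
                return count == COMPILE_THRESHOLD;
            }
        }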

    OpenISA, a Hybrid Instruction Set

    Advisor: Edson Borin. PhD thesis, Universidade Estadual de Campinas, Instituto de Computação.
    OpenISA is designed as the interface of processors that aim to be highly flexible. This is achieved by means of three strategies: first, the ISA is empirically chosen to be easily translated to others, providing software flexibility in case a physical OpenISA processor is not available; in that case there is no need to implement an OpenISA virtual processor in software, because the ISA is prepared to be statically translated to other ISAs. Second, the ISA is neither a concrete ISA nor a virtual ISA, but a hybrid one with the capability of admitting modifications to opcodes without breaking backwards compatibility. This mechanism allows future versions of the ISA to make real changes instead of simple extensions of previous versions, a common problem with concrete ISAs such as the x86. Third, the use of a permissive license allows the ISA to be freely used by any party interested in the project. In this PhD thesis, we focus on the user-level instructions of OpenISA. The thesis discusses (1) ISA alternatives, program distribution alternatives, and the impact of each choice, (2) the features of OpenISA that are important to achieving its goals, and (3) a thorough evaluation of the chosen ISA with respect to emulation performance on two popular host CPUs, one from Intel and another from ARM. We conclude that the version of OpenISA presented here can preserve close-to-native performance when translated to other hosts, working as a promising model for next-generation, flexible ISAs that can be easily extended while preserving backwards compatibility. Furthermore, we show how it can also serve as a user-level program distribution format. (Doctorate in Computer Science; grant 2011/09630-1, FAPESP)
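    One way the opcode-modification idea could work (an assumption on our part, not OpenISA's documented mechanism) is to route decoding through a per-version opcode table selected by the binary's declared ISA version, so a later version may reassign encodings while old binaries still decode correctly.

        import java.util.Map;

        // Sketch of version-tagged opcode decoding: each ISA version carries its
        // own opcode-to-operation table, so encodings can change across versions
        // without breaking backwards compatibility. Purely illustrative.
        final class VersionedDecoder {
            enum Op { ADD, SUB, LOAD, STORE }

            private static final Map<Integer, Map<Integer, Op>> TABLES = Map.of(
                    1, Map.of(0x01, Op.ADD, 0x02, Op.SUB, 0x03, Op.LOAD, 0x04, Op.STORE),
                    2, Map.of(0x01, Op.LOAD, 0x02, Op.ADD, 0x03, Op.SUB, 0x04, Op.STORE));

            Op decode(int isaVersion, int opcode) {
                Op op = TABLES.getOrDefault(isaVersion, Map.of()).get(opcode);
                if (op == null) throw new IllegalArgumentException("unknown opcode " + opcode);
                return op;
            }
        }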

    Low-Level Haskell Code: Measurements and Optimization Techniques

    Haskell is a lazy functional language with a strong static type system and excellent support for parallel programming. The language features of Haskell make it easier to write correct and maintainable programs, but execution speed often suffers from the high levels of abstraction. While much past research focuses on high-level optimizations that take advantage of the functional properties of Haskell, relatively little attention has been paid to the optimization opportunities in the low-level imperative code generated during translation to machine code. One problem with current low-level optimizations is that their effectiveness is limited by the obscured control flow caused by Haskell's high-level abstractions. My thesis is that trace-based optimization techniques can be used to improve the effectiveness of low-level optimizations for Haskell programs. I claim three unique contributions in this work. The first contribution is to expose some properties of low-level Haskell code by looking at the mix of operations performed by the selected benchmarks and comparing them to the low-level code coming from traditional programming languages. The low-level measurements reveal that the control flow is obscured by indirect jumps caused by the implementation of lazy evaluation, higher-order functions, and the separately managed stacks used by Haskell programs. My second contribution is a study on the effectiveness of a dynamic binary trace-based optimizer running on Haskell programs. My results show that while viable program traces frequently occur in Haskell programs, the overhead associated with maintaining the traces in a dynamic optimization system outweighs the benefits we get from running the traces. To reduce the runtime overheads, I explore a way to find traces in a separate profiling step. My final contribution is to build and evaluate a static trace-based optimizer for Haskell programs. The static optimizer uses profiling data to find traces in a Haskell program and then restructures the code around the traces to increase the scope available to the low-level optimizer. My results show that we can successfully build traces in Haskell programs, and the optimized code yields a speedup over existing low-level optimizers of up to 86% with an average speedup of 5% across 32 benchmarks.
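    The trace-formation step can be sketched generically: starting from a hot seed block, repeatedly follow the most frequently taken successor edge recorded by the profiler until the trace closes a cycle or hits a length limit. The data structures below are assumptions, not the thesis's implementation (and are written in Java for consistency with the other sketches rather than in Haskell).

        import java.util.ArrayList;
        import java.util.HashSet;
        import java.util.List;
        import java.util.Map;
        import java.util.Set;

        // Generic sketch of profile-guided trace formation (superblock growth):
        // from a hot seed block, keep following the hottest successor edge.
        final class TraceBuilder {
            // edgeCounts.get(b) maps each successor of block b to its taken count.
            static List<String> buildTrace(String seed,
                                           Map<String, Map<String, Long>> edgeCounts,
                                           int maxLen) {
                List<String> trace = new ArrayList<>();
                Set<String> seen = new HashSet<>();
                String block = seed;
                while (block != null && trace.size() < maxLen && seen.add(block)) {
                    trace.add(block);
                    block = edgeCounts.getOrDefault(block, Map.of()).entrySet().stream()
                            .max(Map.Entry.comparingByValue())
                            .map(Map.Entry::getKey)
                            .orElse(null); // no profiled successor ends the trace
                }
                return trace;
            }
        }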