4 research outputs found

    Improving instruction-level parallelism by loop unrolling and dynamic memory disambiguation

    No full text
    Exploitation of instruction-level parallelism is an effective mechanism for improving the performance of modern super-scalar/VLIW processors. Various software techniques can be applied to increase instruction-level parallelism. This paper describes and evaluates a software technique, dynamic memory disambiguation, that permits loops containing loads and stores to be scheduled more aggressively, thereby exposing more instruction-level parallelism. The results of our evaluation show that when dynamic memory disambiguation is applied in conjunction with loop unrolling, register renaming, and static memory disambiguation, the ILP of memory-intensive benchmarks can be increased by as much as 300 percent over loops where dynamic memory disambiguation is not performed. Our measurements also indicate that for the programs that benefit the most from these optimizations, the register usage does not exceed the number of registers on most high-performance processors. Keywords: loop unrolling, dyn..
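    As a rough illustration of the technique described in this abstract (not code from the paper), the sketch below combines 4x loop unrolling with a run-time disambiguation check in C: if the source and destination arrays are found not to overlap, an unrolled body whose loads and stores can be scheduled freely is executed; otherwise a conservative fallback loop runs. The function name, the unroll factor, and the pointer-based overlap test are illustrative assumptions, and the overlap test in particular is a simplification, since comparing pointers into unrelated objects is formally undefined in ISO C.

    #include <stddef.h>

    /* Sketch: dynamic memory disambiguation combined with 4x loop unrolling. */
    void scale_add(double *a, const double *b, size_t n)
    {
        /* Run-time disambiguation: do a[0..n) and b[0..n) overlap? */
        int disjoint = (a + n <= b) || (b + n <= a);

        size_t i = 0;
        if (disjoint) {
            /* No aliasing: loads may be hoisted above stores, so the
             * unrolled body exposes more instruction-level parallelism. */
            for (; i + 4 <= n; i += 4) {
                double b0 = b[i], b1 = b[i + 1], b2 = b[i + 2], b3 = b[i + 3];
                a[i]     += b0 * 2.0;
                a[i + 1] += b1 * 2.0;
                a[i + 2] += b2 * 2.0;
                a[i + 3] += b3 * 2.0;
            }
        }
        /* Conservative fallback, also used for the remainder iterations. */
        for (; i < n; i++)
            a[i] += b[i] * 2.0;
    }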

    Improving Instruction-level Parallelism by Loop Unrolling and Dynamic Memory Disambiguation

    No full text
    Exploitation of instruction-level parallelism is an effective mechanism for improving the performance of modern super-scalar/VLIW processors. Various software techniques can be applied to increase instruction-level parallelism. This paper describes and evaluates a software technique, dynamic memory disambiguation, that permits loops containing loads and stores to be scheduled more aggressively, thereby exposing more instruction-level parallelism. The results of our evaluation show that when dynamic memory disambiguation is applied in conjunction with loop unrolling, register renaming, and static memory disambiguation, the ILP of memory-intensive benchmarks can be increased by as much as 300 percent over loops where dynamic memory disambiguation is not performed. Our measurements also indicate that for the programs that benefit the most from these optimizations, the register usage does not exceed the number of registers on most high-performance processors.

    Loops optimization for Xingo Project

    Get PDF
    Advisor: Rodolfo Jardim de Azevedo. Dissertation (professional master's) - Universidade Estadual de Campinas, Instituto de Computação. Abstract: The optimizations implemented in compilers provide a significant improvement in program performance and, in many cases, also reduce program size. Almost all production programs are compiled with optimization directives to obtain maximum performance. Studying new optimization techniques requires a test environment into which those techniques can be incorporated easily; the Xingó project was developed with this goal in mind. By generating compilable C code, Xingó makes it easy to verify the result of each implemented optimization. This work presents the implementation of some loop optimizations in the Xingó project, demonstrating that new optimizations can be incorporated into it. In addition, it analyzes the results of using tools available on the market that check the correctness of each optimization and that evaluate the performance of the system with the optimizations implemented. Master's degree in Computer Engineering (Mestre em Computação).
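    The abstract does not name the specific loop optimizations implemented in Xingó, so the example below is only an assumed illustration of the kind of compilable C-to-C rewrite a source-level optimizer can emit: loop-invariant code motion, where a subexpression that does not change across iterations is hoisted out of the loop. Function and variable names are hypothetical.

    /* Before: scale * offset is recomputed on every iteration. */
    void before(float *v, int n, float scale, float offset)
    {
        for (int i = 0; i < n; i++)
            v[i] = v[i] * (scale * offset);
    }

    /* After: the loop-invariant product is hoisted out of the loop. */
    void after(float *v, int n, float scale, float offset)
    {
        float k = scale * offset;   /* hoisted invariant computation */
        for (int i = 0; i < n; i++)
            v[i] = v[i] * k;
    }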

    Instruction scheduling in micronet-based asynchronous ILP processors

    Get PDF