14 research outputs found

    Improved algorithm for identification of switch tables in executable code

    This paper introduces a novel approach for identifying switch tables in executable code. Compared to existing solutions based on an SSA intermediate representation and various propagation techniques, the developed algorithm is more accurate and more flexible. A set of iterative algorithms based on the Pearson, weighted Pearson, and Spearman correlation coefficients is introduced. Simple noise-value filtering and an improved segmentation algorithm are also described.
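
    The abstract names the correlation coefficients but does not say how they are applied to candidate tables. As a minimal sketch of the coefficients only, the hypothetical helper `weighted_pearson` below computes the weighted Pearson coefficient; with equal weights it reduces to the ordinary Pearson coefficient. The function name, the weighting scheme, and the convention for constant input are assumptions, not the paper's algorithm.

```python
import math

def weighted_pearson(xs, ys, ws=None):
    """Weighted Pearson correlation; equal weights give the ordinary coefficient."""
    if ws is None:
        ws = [1.0] * len(xs)
    total = sum(ws)
    mean_x = sum(w * x for w, x in zip(ws, xs)) / total
    mean_y = sum(w * y for w, y in zip(ws, ys)) / total
    cov = sum(w * (x - mean_x) * (y - mean_y) for w, x, y in zip(ws, xs, ys))
    var_x = sum(w * (x - mean_x) ** 2 for w, x in zip(ws, xs))
    var_y = sum(w * (y - mean_y) ** 2 for w, y in zip(ws, ys))
    if var_x == 0 or var_y == 0:
        # Correlation is undefined for constant input; 0.0 is a convention assumed here.
        return 0.0
    return cov / math.sqrt(var_x * var_y)
```

    Giving a suspected noise value a weight of zero removes its influence on the coefficient, which is one plausible reading of why a weighted variant is useful alongside noise filtering.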

    Um Algoritmo Inter-Procedural para Análise de Largura de Variáveis (An Inter-Procedural Algorithm for Variable Width Analysis)

    During this project we developed an inter-procedural algorithm capable of processing programs with millions of assembly instructions. Unlike many previous works, our algorithm handles comparisons between variables without resorting to costly algorithms. We obtain flow sensitivity by using the e-SSA (Extended Static Single Assignment) form described by Bodik as the intermediate representation. We also show that processing the strongly connected components of the graph in topological order not only reduces the running time of the analysis but also increases its precision. We implemented our technique in LLVM, an industrial-strength compiler, and were able to process about four million assembly instructions in a few seconds.
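
    Processing strongly connected components in topological order can be sketched as follows. This is an illustration using Tarjan's algorithm, not the authors' LLVM implementation; the function name `sccs_in_topological_order` and the dictionary-based graph encoding are assumptions.

```python
def sccs_in_topological_order(graph):
    """Return the SCCs of `graph` (dict: node -> list of successors),
    ordered so that each component comes before the components it points to."""
    index = {}       # discovery order of each visited node
    lowlink = {}     # smallest discovery index reachable from the node
    stack = []
    on_stack = set()
    components = []
    counter = 0

    def visit(node):
        nonlocal counter
        index[node] = lowlink[node] = counter
        counter += 1
        stack.append(node)
        on_stack.add(node)
        for succ in graph.get(node, []):
            if succ not in index:
                visit(succ)
                lowlink[node] = min(lowlink[node], lowlink[succ])
            elif succ in on_stack:
                lowlink[node] = min(lowlink[node], index[succ])
        if lowlink[node] == index[node]:
            # `node` is the root of a component: pop its members off the stack.
            component = []
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.append(member)
                if member == node:
                    break
            components.append(component)

    for node in graph:
        if node not in index:
            visit(node)
    # Tarjan's algorithm emits components in reverse topological order.
    components.reverse()
    return components
```

    With this ordering, an analysis can resolve each component completely before moving on to the components that depend on it, so cyclic dependencies are iterated only within their own component.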

    Software-controlled operand-gating

    Operand gating is a technique for improving processor energy efficiency by gating off sections of the data path that are not needed by short-precision (narrow) operands. A method for implementing software-controlled power gating is proposed and evaluated. The instruction set architecture (ISA) is enhanced to include opcodes that specify operand widths (if not already included in the ISA). A compiler or a binary translator uses statically available information to determine initial value ranges. The technique is enhanced through a profile-based analysis that specializes certain code regions for a given value range. After the analysis, instruction opcodes are assigned using the minimum required width. To evaluate this technique, the Alpha instruction set is enhanced to include opcodes for 8-, 16-, and 32-bit operands. Applying the proposed software technique to the SPECint95 benchmarks results in energy-delay savings of 14%. When combined with previously proposed hardware-based techniques, the energy-delay benefit is 28%. Peer Reviewed. Postprint (published version).
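
    The step of assigning the minimum required width from a value range can be illustrated with a small sketch. The helper `min_operand_width` is hypothetical, and it assumes signed two's-complement operands with a 64-bit fallback for the native Alpha width; the abstract does not describe the actual selection rules.

```python
def min_operand_width(lo, hi, widths=(8, 16, 32, 64)):
    """Smallest width in bits whose signed two's-complement range holds [lo, hi]."""
    if lo > hi:
        raise ValueError("empty value range")
    for bits in widths:
        smallest = -(1 << (bits - 1))
        largest = (1 << (bits - 1)) - 1
        if smallest <= lo and hi <= largest:
            return bits
    raise ValueError("value range does not fit the widest supported operand")
```

    For example, a variable known to stay within 0..100 could use an 8-bit opcode, while 0..200 already needs 16 bits under the signed assumption.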

    A Lightweight Security Analyzer inside GCC


    Neural Methods for Resolving Hard-to-Predict Branches

    This work presents a new category of branch predictors designed as add-ons to existing state-of-the-art prediction mechanisms. We call these neural-network-inspired predictors Shallow Online Neural (SON) Predictors, as they use easily quantizable shallow networks and, unlike other related works, are trained online. This predictor is suitable both as a branch prediction scheme and as a TAGE confidence predictor. (M.S.)

    BRAT: Branch Prediction Via Adaptive Training

    This thesis investigates BRAT, a new hardware structure for cost-efficient branch prediction. Relying on the fundamentals of machine learning, BRAT computes a branch decision through a multi-layer neural network. To demonstrate its merits, BRAT is used to predict branches in a typical pipeline and its accuracy is evaluated. By using a hidden layer and activation functions, BRAT introduces non-linearity and enables more accurate prediction of branch outcomes, because this structure exposes relationships that may not be easily captured by a perceptron-based approach or other popular methods. The memory used by BRAT scales linearly with the number of inputs in the decision process. At most memory footprints, BRAT is competitive with state-of-the-art branch predictors of equivalent memory budgets. Additionally, the thesis shows how BRAT scales as the memory footprint increases and how larger predictors may perform in the future. (M.S.)

    A compiler cost model for speculative multithreading chip-multiprocessor architectures


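    Análise de valor para determinação do tempo de execução no pior caso (WCET) de tarefas em sistemas de tempo real (Value analysis for determining the worst-case execution time (WCET) of tasks in real-time systems)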

    Dissertation (master's) - Universidade Federal de Santa Catarina, Centro Tecnológico, Programa de Pós-Graduação em Engenharia de Automação e Sistemas, 2015. Abstract: The use of computer systems in society has expanded, and applications with real-time requirements are becoming more common, varying in complexity and in how strongly they must guarantee that timing constraints (deadlines) are met. An important property in defining the temporal behavior of a task is its computation time, i.e., the time necessary to execute the task to completion. A major problem in obtaining it is the analysis of the processor microarchitecture. For a processor whose data memory has varying latency, value analysis is necessary to identify the memory region that each instruction accesses (main memory or ScratchPad Memory), so that the worst-case execution time of programs is not considerably overestimated. The objective of this work is to use value analysis to determine the correct memory access time by identifying the memory region each instruction accesses, in order to obtain a less pessimistic upper bound on the WCET.
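
    How value analysis tightens the memory access time can be sketched as below. The sketch assumes the analysis yields an inclusive address interval per instruction and that the scratchpad occupies one contiguous region; the function name, parameters, and any cycle counts passed in are hypothetical and not taken from the dissertation.

```python
def worst_case_latency(addr_lo, addr_hi, spm_base, spm_size, spm_cycles, main_cycles):
    """Worst-case access latency for an access whose address lies in [addr_lo, addr_hi]."""
    spm_end = spm_base + spm_size - 1
    if spm_base <= addr_lo and addr_hi <= spm_end:
        # Every possible address is in the scratchpad.
        return spm_cycles
    if addr_hi < spm_base or addr_lo > spm_end:
        # Every possible address is in main memory.
        return main_cycles
    # The interval straddles both regions, so the slower one must be assumed.
    return max(spm_cycles, main_cycles)
```

    Without any address information, every access would have to be charged the slower latency, which is the overestimation the abstract refers to.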

    Accurate Static Branch Prediction by Value Range Propagation

    The ability to predict at compile time the likelihood of a particular branch being taken provides valuable information for several optimizations, including global instruction scheduling, code layout, function inlining, interprocedural register allocation and many high level optimizations. Previous attempts at static branch prediction have either used simple heuristics, which can be quite inaccurate, or put the burden onto the programmer by using execution profiling data or source code hints. This paper presents a new approach to static branch prediction called value range propagation. This method tracks the weighted value ranges of variables through a program, much like constant propagation. These value ranges may be either numeric or symbolic in nature. Branch prediction is then performed by simply consulting the value range of the appropriate variable. Heuristics are used as a fallback for cases where the value range of the variable cannot be determined statically. In the process, va..
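
    Consulting a weighted value range to predict a branch can be sketched as follows. The sketch assumes integer ranges with inclusive bounds, weights that sum to 1, and values uniformly distributed within each range; these choices and the helper name `prob_less_than` are illustrative assumptions, not the representation used in the paper.

```python
def prob_less_than(weighted_ranges, bound):
    """Estimated probability that x < bound.

    weighted_ranges: list of (weight, lo, hi) tuples with inclusive integer bounds.
    """
    prob = 0.0
    for weight, lo, hi in weighted_ranges:
        size = hi - lo + 1
        # Number of values in [lo, hi] that are strictly below `bound`.
        below = min(max(bound - lo, 0), size)
        prob += weight * below / size
    return prob
```

    A branch guarded by `x < 10` would then be predicted taken when the estimate exceeds 0.5, with heuristics as the fallback when no range is known for `x`.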
