4 research outputs found

    Worst-Case Execution Time Analysis of Predicated Architectures

    The time-predictable design of computer architectures for use in (hard) real-time systems is becoming increasingly important due to the growing complexity of modern computer architectures. The design of predictable processor pipelines has recently received considerable attention. The goal is to find a trade-off between predictability and computing power. Branches and jumps are particularly problematic for high-performance processors. For one, branches are executed late in the pipeline. This leads either to high branch penalties (flushing) or to complex software/hardware techniques (branch predictors). Another side effect of branches is that they make it difficult to exploit instruction-level parallelism due to control dependencies. Predicated computer architectures allow a predicate to be attached to the instructions in a program. An instruction is then executed only when its predicate evaluates to true; otherwise it behaves like a simple nop instruction. Predicates can thus be used to convert control dependencies into data dependencies, which helps to address both of the aforementioned problems. A downside of predicated instructions is that they complicate precise worst-case execution time (WCET) analysis of the programs that use them. Predicated memory accesses, for instance, may or may not have an impact on the processor's cache and thus need to be considered by the cache analysis. Predication potentially affects all analysis phases of a WCET analysis tool. We thus explore a preprocessing step that explicitly unfolds the control-flow graph, which allows us to apply standard analyses that are themselves not aware of predication.
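
    To make the idea of predication concrete, the C sketch below is our own illustration, not taken from the paper: the "predicated" form computes both candidate values and lets the predicate select the result, turning the branch's control dependency into a data dependency. The function names and the select idiom are hypothetical; on a real predicated architecture the compiler would instead emit both instructions guarded by complementary predicates.

        #include <stdio.h>

        /* Branching version: which value is returned is control-dependent
           on the comparison (a > b). */
        static int max_branch(int a, int b) {
            if (a > b)
                return a;
            return b;
        }

        /* "Predicated" version: both operands are available and the predicate p
           only selects the result, so there is no branch to mispredict or flush. */
        static int max_predicated(int a, int b) {
            int p = (a > b);              /* predicate: 1 or 0 */
            return p * a + (1 - p) * b;   /* data dependency on p replaces the branch */
        }

        int main(void) {
            printf("%d %d\n", max_branch(3, 7), max_predicated(3, 7));
            return 0;
        }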

    IR-Level Versus Machine-Level If-Conversion for Predicated Architectures

    If-conversion is a simple yet powerful optimization that converts control dependences into data dependences. It allows the elimination of branches and increases the available instruction-level parallelism and thus overall performance. If-conversion can be applied either alone or in combination with other techniques that increase the size of scheduling regions. The presence of hardware support for predicated execution allows if-conversion to be applied broadly in a given program. This makes it necessary to guide the optimization using heuristic estimates of its potential benefit. Like other transformations in an optimizing compiler, if-conversion inherently suffers from phase-ordering issues. Driven by these facts, we developed two algorithms for if-conversion targeting the TI TMS320C64x+ architecture within the LLVM framework. Each implementation targets a different level of code abstraction: one targets the intermediate representation, while the other addresses machine-level code. Both make use of an adapted set of estimation heuristics and prove to be successful in general, but each exhibits different strengths and weaknesses. High-level if-conversion, applied before other control-flow transformations, has more freedom to operate, but in contrast to its more restricted machine-level counterpart, its runtime estimates are less accurate. Our experimental evaluation shows a mean speedup close to 14% for both algorithms on a set of programs from the MiBench and DSPstone benchmark suites. We compare the implemented optimizations and discuss the insights gained on if-conversion, phase-ordering issues, and profitability analysis.
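
    As a rough illustration of what if-conversion does to a diamond-shaped control-flow region, the C sketch below (our example, not the paper's algorithm; all names are hypothetical) shows the branching form next to the straight-line form in which both arms are evaluated and a predicate selects the result, which is approximately what a predicating compiler produces at the IR or machine level.

        #include <stdio.h>

        /* Before if-conversion: a diamond CFG with a then-block, an else-block,
           and a join block; the scheduler only sees small basic blocks. */
        static int diamond(int c, int x) {
            int y;
            if (c)
                y = x + 1;   /* then-block */
            else
                y = x - 1;   /* else-block */
            return y;        /* join block */
        }

        /* After if-conversion: both arms execute unconditionally (as if predicated
           on c and !c) and a select replaces the branch, yielding one larger block
           with more instruction-level parallelism. */
        static int flattened(int c, int x) {
            int t = x + 1;       /* would be predicated on c */
            int e = x - 1;       /* would be predicated on !c */
            return c ? t : e;    /* select instead of a branch */
        }

        int main(void) {
            printf("%d %d\n", diamond(1, 10), flattened(1, 10));
            return 0;
        }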

    Center for Programming Models for Scalable Parallel Computing: Future Programming Models


    If-Conversion in SSA Form
