39 research outputs found

    Alternative implementations of two-level adaptive branch prediction

    Get PDF
    As the issue rate and depth of pipelining of high performance Superscalar processors increase, the importance of an excellent branch predictor becomes more vital to delivering the potential performance of a wide-issue, deep pipelined microarchitecture. We propose a new dynamic branch predictor (Two-Level Adaptive Branch Prediction) that achieves substantially higher accuracy than any other scheme reported in the literature. The mechanism uses two levels of branch history information to make predictions, the history of the last L branches encountered, and the branch behavior for the last s occurrences of the specific pattern of these k branches. We have identified three variations of the Two-Level Adaptive Branch Prediction, depending on how finely we resolve the history information gathered. We compute the hardware costs of implementing each of the three variations, and use these costs in evaluating their relative effectiveness. We measure the branch prediction accuracy of the three variations of Two-Level Adaptive Branch Prediction, along with several other popular proposed dynamic and static prediction schemes, on the SPEC benchmarks. We show that the average prediction accuracy for TwoLevel Adaptive Branch Prediction is 97 percent, while the other known schemes achieve at most 94.4 percent average prediction accuracy. We measure the effectiveness of different prediction algorithms and different amounts of history and pattern information. We measure the costs of each variation to obtain the same prediction accuracy.

    [[alternative]]The Design of Advanced Value Predictors

    Get PDF
    計畫編號:NSC90-2213-E032-019研究期間:200108~200207研究經費:357,000[[sponsorship]]行政院國家科學委員

    Software trace cache

    Get PDF
    We explore the use of compiler optimizations, which optimize the layout of instructions in memory. The target is to enable the code to make better use of the underlying hardware resources regardless of the specific details of the processor/architecture in order to increase fetch performance. The Software Trace Cache (STC) is a code layout algorithm with a broader target than previous layout optimizations. We target not only an improvement in the instruction cache hit rate, but also an increase in the effective fetch width of the fetch engine. The STC algorithm organizes basic blocks into chains trying to make sequentially executed basic blocks reside in consecutive memory positions, then maps the basic block chains in memory to minimize conflict misses in the important sections of the program. We evaluate and analyze in detail the impact of the STC, and code layout optimizations in general, on the three main aspects of fetch performance; the instruction cache hit rate, the effective fetch width, and the branch prediction accuracy. Our results show that layout optimized, codes have some special characteristics that make them more amenable for high-performance instruction fetch. They have a very high rate of not-taken branches and execute long chains of sequential instructions; also, they make very effective use of instruction cache lines, mapping only useful instructions which will execute close in time, increasing both spatial and temporal locality.Peer ReviewedPostprint (published version

    Better branch prediction through prophet/critic hybrids

    Get PDF
    The prophet/critic hybrid conditional branch predictor has two component predictors. The prophet uses a branch's history to predict its direction. We call this prediction and the ones for branches following it the branch future. The critic uses the branch's history and future to critique the prophet's prediction. The hybrid combines the prophet's prediction with the critique, either agrees or disagree, forming the branch's overall prediction. Results shows these hybrids can reduce mispredicts by 39 percent and improve processor performance by 7.8 percent.Peer ReviewedPostprint (published version

    Timing analysis of embedded software for speculative processors

    Get PDF

    Energy-aware fetch mechanism: trace cache and BTB customization

    Get PDF
    corecore