    Characterization and Improvement of Load/Store Cache-Based Prefetching

    A common mechanism for hardware-based prefetching of regular accesses to arrays and linked lists is based on a Load/Store cache (LSC). Each LSC entry associates the address of a ld/st instruction with that instruction's individual behavior. We show that the implementation cost of the LSC is rather high and that its use is inefficient. We aim to reduce the cost of the LSC without reducing its performance. This can be done by preventing useless instructions from being stored in the LSC. We propose eliminating those instructions that never miss and those that follow a sequential pattern. This can be carried out by inserting a ld/st instruction in the LSC only when it misses in the data cache (on-miss insertion) and issuing sequential prefetching simultaneously. After analyzing the performance of this proposal through cycle-by-cycle simulation over a set of 25 benchmarks selected from SPEC95, SPEC92 and Perfect Club, we conclude that an LSC of only 8 entries, which combines on-miss insertion and sequential prefetching, performs better than a conventional LSC of 512 entries. We think that the low cost of the proposal makes it worth taking into account in the development of future microprocessors.
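The on-miss insertion idea described in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the authors' hardware design: the class name `OnMissLSC`, the 32-byte line size, LRU replacement, the rule that a stride must repeat once before it triggers a prefetch, and the choice to issue a next-line prefetch on every data-cache miss are all hypothetical details chosen for illustration.

```python
from collections import OrderedDict

LINE_SIZE = 32  # bytes per cache line (assumed value, not from the abstract)


class OnMissLSC:
    """Toy Load/Store cache with on-miss insertion (illustrative only)."""

    def __init__(self, entries=8):
        self.entries = entries
        # Maps instruction address (pc) -> [last data address, last stride].
        # Insertion order doubles as LRU order (an assumed policy).
        self.table = OrderedDict()

    def access(self, pc, addr, dcache_miss):
        """Return the list of addresses to prefetch for this ld/st."""
        prefetches = []
        if pc in self.table:
            last_addr, last_stride = self.table[pc]
            stride = addr - last_addr
            # Prefetch only when the same nonzero stride repeats.
            if stride != 0 and stride == last_stride:
                prefetches.append(addr + stride)
            self.table[pc] = [addr, stride]
            self.table.move_to_end(pc)  # mark as most recently used
        elif dcache_miss:
            # On-miss insertion: only a missing ld/st gets an entry.
            if len(self.table) >= self.entries:
                self.table.popitem(last=False)  # evict least recently used
            self.table[pc] = [addr, 0]
        if dcache_miss:
            # Sequential prefetch of the next line, issued with the miss.
            next_line = (addr // LINE_SIZE + 1) * LINE_SIZE
            if next_line not in prefetches:
                prefetches.append(next_line)
        return prefetches
```

In this model, a ld/st that always hits in the data cache never occupies an entry, and purely sequential streams are covered by the next-line prefetch, so the small table is left for instructions with other strides.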