
    Working Sets Past and Present


    Evaluation of Cache Inclusion Policies in Cache Management

    Processor speed has been increasing at a higher rate than memory speed for many years. Caches were designed to mitigate this gap and, ever since, many cache management techniques have been designed to further improve performance. Most of these techniques have been designed and evaluated on non-inclusive caches, even though many modern processors implement either inclusive or exclusive policies. Exclusive caches benefit from a larger effective capacity, so they may become more popular as the number of cores per last-level cache increases. This thesis aims to demonstrate that the best cache management techniques for exclusive caches are not necessarily the same as for non-inclusive or inclusive caches. To assess this claim we evaluated several cache management techniques with different inclusion policies, core counts, and cache sizes. We found that the best configurations for inclusive and non-inclusive policies usually performed similarly, but for exclusive caches the best configurations were indeed different. Prefetchers affected performance more than replacement policies and determined which configurations performed best. Exclusive caches also showed a higher speedup on multi-core configurations. The least recently used (LRU) replacement policy is among the best policies for any prefetcher combination in exclusive caches, yet it is the policy used merely as a baseline in most cache replacement research. We therefore conclude that these results motivate further research on prefetchers and replacement policies targeted at exclusive caches.
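
    Since the abstract singles out LRU as the reference replacement policy, a minimal sketch of how an LRU victim is chosen and how recency is updated in one set of a set-associative cache may help; the structure, field names, and the 8-way associativity below are illustrative assumptions, not details taken from the thesis.

    #include <stdint.h>
    #include <stdbool.h>

    #define WAYS 8  /* illustrative associativity, not from the thesis */

    struct line {
        bool     valid;
        uint64_t tag;   /* block tag (unused in this sketch) */
        uint32_t age;   /* 0 = most recently used, WAYS-1 = least */
    };

    /* Return the way to evict: an invalid line if one exists,
     * otherwise the valid line with the largest age (least recently used). */
    static int lru_victim(struct line set[WAYS])
    {
        int victim = 0;
        for (int w = 0; w < WAYS; w++) {
            if (!set[w].valid)
                return w;
            if (set[w].age > set[victim].age)
                victim = w;
        }
        return victim;
    }

    /* On a hit (or after a fill whose line was first given age WAYS-1),
     * promote the touched way and age every line that was more recent. */
    static void lru_touch(struct line set[WAYS], int way)
    {
        for (int w = 0; w < WAYS; w++)
            if (set[w].valid && set[w].age < set[way].age)
                set[w].age++;
        set[way].age = 0;
    }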

    Randomized cache placement for eliminating conflicts

    Applications with regular patterns of memory access can experience high levels of cache conflict misses. In shared-memory multiprocessors, conflict misses can be increased significantly by the data transpositions required for parallelization. Techniques such as blocking, which are introduced within a single thread to improve locality, can result in yet more conflict misses. The tension between minimizing cache conflicts and the other transformations needed for efficient parallelization leads to complex optimization problems for parallelizing compilers. This paper shows how the introduction of a pseudorandom element into the cache index function can effectively eliminate repetitive conflict misses and produce a cache whose miss ratio depends solely on working-set behavior. We examine the impact of pseudorandom cache indexing on processor cycle times and present practical solutions to some of the major implementation issues for this type of cache. Our conclusions are supported by simulations of a superscalar out-of-order processor executing the SPEC95 benchmarks, as well as by cache simulations of individual loop kernels that illustrate specific effects. We present measurements of instructions committed per cycle (IPC) when comparing the performance of different cache architectures on whole-program benchmarks such as the SPEC95 suite.
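
    The paper's exact index function is not given in this abstract; the sketch below shows a generic XOR-folding scheme of the kind commonly used to randomize cache placement, contrasted with conventional modulo indexing. The bit widths are illustrative assumptions.

    #include <stdint.h>

    #define INDEX_BITS  10                        /* 1024 sets, illustrative */
    #define BLOCK_BITS  6                         /* 64-byte blocks, illustrative */
    #define INDEX_MASK  ((1u << INDEX_BITS) - 1)

    /* Conventional indexing: the set index is a contiguous bit field of the
     * address, so addresses that differ by a multiple of (sets * block size)
     * always collide in the same set. */
    static uint32_t index_modulo(uint64_t addr)
    {
        return (uint32_t)((addr >> BLOCK_BITS) & INDEX_MASK);
    }

    /* Pseudorandom (XOR-based) indexing: fold higher address bits into the
     * index so that regular strides no longer map repeatedly to one set. */
    static uint32_t index_xor(uint64_t addr)
    {
        uint64_t line = addr >> BLOCK_BITS;
        uint32_t idx  = (uint32_t)(line & INDEX_MASK);
        idx ^= (uint32_t)((line >> INDEX_BITS) & INDEX_MASK);       /* fold tag bits once */
        idx ^= (uint32_t)((line >> (2 * INDEX_BITS)) & INDEX_MASK); /* and once more      */
        return idx;
    }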

    Context Switching with Multiple Register Windows: A RISC Performance Study

    Although previous studies have shown that a large file of overlapping register windows can greatly reduce procedure call/return overhead, the effects of register windows in a multiprogramming environment are poorly understood. This paper investigates the performance of multiprogrammed, reduced instruction set computers (RISCs) as a function of the window management strategy. Using an analytic model that reflects context-switch and procedure-call overheads, we analyze the performance of simple, linearly self-recursive programs. For more complex programs, we present the results of a simulation study. These studies show that a simple strategy that saves all windows prior to a context switch, but restores only a single window following a context switch, performs near-optimally.
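
    A minimal sketch of the save-all/restore-one strategy the study finds near-optimal, assuming each task keeps a memory backing store for its windows and that deeper windows are refilled later by window-underflow traps as the task returns down its call chain; the sizes and names are illustrative, not taken from the paper.

    #define NWINDOWS        8    /* illustrative size of the register window file */
    #define REGS_PER_WINDOW 16

    struct window { long regs[REGS_PER_WINDOW]; };

    struct task {
        struct window saved[NWINDOWS];  /* backing store in memory */
        int           depth;            /* number of windows the task had in use */
    };

    /* Save-all: before switching away, spill every window the outgoing
     * task currently occupies in the register file. */
    static void context_save_all(struct task *out, struct window winfile[NWINDOWS])
    {
        for (int w = 0; w < out->depth; w++)
            out->saved[w] = winfile[w];
    }

    /* Restore-one: when switching back, reload only the topmost window;
     * deeper windows are brought back on demand by underflow traps. */
    static void context_restore_one(struct task *in, struct window winfile[NWINDOWS])
    {
        if (in->depth > 0)
            winfile[in->depth - 1] = in->saved[in->depth - 1];
    }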

    Expanding symmetric multiprocessor capability through gang scheduling


    A Classification and Survey of Computer System Performance Evaluation Techniques


    Dynamic Programming as a Scheduling Tool in Multiprogrammed Computing Systems

    A potentially parallel iterative algorithm for the solution of the unconstrained N-stage decision problem of Dynamic Programming is developed. This new solution method, known as Variable Metric Dynamic Programming, is based on the use of variable metric minimisation techniques to develop quadratic approximations to the optimal cost function for each stage. The algorithm is applied to various test problems, and a comparison with an existing similar algorithm proves favourable. The Variable Metric Dynamic Programming solution method is then used to implement an adaptive high-level scheduling mechanism on a multiprogrammed computer in a university environment, demonstrating a practical application of the new algorithm. More importantly, the application of Variable Metric Dynamic Programming to a scheduling problem illustrates how Mathematical Programming may be used in complex computer scheduling problems to provide, in a natural way, the required dynamic feedback mechanisms.
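
    To make the N-stage setting concrete, the sketch below performs the plain backward dynamic-programming recursion J_k(x) = min_u [ stage cost + J_{k+1}(f(x,u)) ] on coarse scalar grids. The dynamics, costs, and grid sizes are illustrative assumptions; the thesis's Variable Metric method instead builds quadratic approximations to the optimal cost function at each stage rather than tabulating it on a grid.

    #include <float.h>
    #include <math.h>
    #include <stdio.h>

    #define STAGES 5
    #define NX     101      /* state grid points, illustrative */
    #define NU     101      /* decision grid points, illustrative */

    static double grid(int i, int n, double lo, double hi)
    {
        return lo + (hi - lo) * i / (n - 1);
    }

    static double dynamics(double x, double u)   { return 0.9 * x + u; }
    static double stage_cost(double x, double u) { return x * x + 0.1 * u * u; }

    int main(void)
    {
        static double J[STAGES + 1][NX];

        for (int i = 0; i < NX; i++)
            J[STAGES][i] = 0.0;                      /* terminal cost J_N(x) = 0 */

        for (int k = STAGES - 1; k >= 0; k--) {      /* backward sweep over stages */
            for (int i = 0; i < NX; i++) {
                double x = grid(i, NX, -5.0, 5.0);
                double best = DBL_MAX;
                for (int j = 0; j < NU; j++) {
                    double u  = grid(j, NU, -2.0, 2.0);
                    double xn = dynamics(x, u);
                    int idx = (int)lround((xn + 5.0) / 10.0 * (NX - 1));
                    if (idx < 0)   idx = 0;          /* clamp to the state grid */
                    if (idx >= NX) idx = NX - 1;
                    double c = stage_cost(x, u) + J[k + 1][idx];
                    if (c < best) best = c;
                }
                J[k][i] = best;                      /* J_k(x_i) */
            }
        }
        printf("J_0(0) = %f\n", J[0][NX / 2]);
        return 0;
    }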

    The evaluation of computer performance by means of state-dependent queueing network models
