
    Evaluation of the Cedar memory system: Configuration of 16 by 16

    Some basic results on the performance of the Cedar multiprocessor system are presented. Empirical results on the 16-processor, 16-memory-bank system configuration are presented, showing the behavior of the Cedar system under different modes of operation.

    Preliminary basic performance analysis of the Cedar multiprocessor memory system

    Some preliminary basic results on the performance of the Cedar multiprocessor memory system are presented. Empirical results are presented and used to calibrate a memory system simulator, which is then used to discuss the scalability of the system.

    Characterizing the Behavior of Sparse Algorithms on Caches

    Abstract Sparse computations constitute one of the most important areas of numerical algebra and scientific computing. While there are many studies on the locality of dense codes, few deal with the locality of sparse codes. Because of indirect addressing, sparse codes exhibit irregular patterns of references. In this paper, the cache behavior of one of the most frequent primitives, SpMxV (Sparse Matrix-Vector multiply), is analyzed. A model of its references is built, and the performance bottlenecks of SpMxV are then analyzed using the model and simulations. The main parameters are identified, and their role is explained and quantified. This analysis is then used to discuss optimizations of SpMxV. Moreover, a blocking technique which takes into account the specifics of sparse codes is proposed.

    A strategy for array management in local memory

    Projet CHLOE. One major point in loop restructuring for data locality optimization is the choice and evaluation of a data locality criterion. We show in this paper how to compute approximations of the window sets defined by Gannon, Jalby and Gallivan (the window associated with an iteration i describes the "active" portion of an array: the elements which have already been referenced before iteration i and which will be referenced after iteration i). Such a notion is extremely useful for data localization, since it identifies the portions of arrays which are worth keeping in local memory because they are going to be referenced later. The computation of these window approximations can be performed symbolically at compile time and generates simple geometrical shapes that simplify the management of the data transfers. This allows us to derive a global strategy of data management for local memories. Moreover, the effects of loop transformations fit naturally into the geometrical framework we use for the calculations. The determination of window approximations is studied from both a theoretical and a computational point of view, and examples of applications are given.
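    The window definition above can be illustrated by brute force: given an explicit per-iteration reference trace, the window at iteration i is the intersection of everything referenced so far with everything referenced later. This is only a sketch of the definition, not the paper's symbolic compile-time computation, and all names are illustrative:

    ```python
    def window_sets(trace):
        """trace[i] = set of array elements referenced at iteration i.
        Returns windows[i] = elements referenced at some iteration <= i
        and again at some iteration > i (the "active" portion of the array)."""
        n = len(trace)
        windows = []
        for i in range(n):
            seen = set().union(*trace[:i + 1])      # referenced up to iteration i
            future = set().union(*trace[i + 1:])    # referenced after iteration i
            windows.append(seen & future)
        return windows

    # Loop "for i in 0..3: use A[i], A[i+1]": A[i+1] is reused at iteration i+1,
    # so exactly one element is worth keeping in local memory at each step.
    trace = [{0, 1}, {1, 2}, {2, 3}, {3, 4}]
    print(window_sets(trace))  # -> [{1}, {2}, {3}, set()]
    ```

    A compile-time method must of course avoid enumerating iterations; the point of the paper is that simple geometric approximations of these sets can be derived symbolically.
    
    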

    Characterizing the Behavior of Sparse Algorithms on Caches

    Sparse computations constitute one of the most important areas of numerical algebra and scientific computing. While there are many studies on the locality of dense codes, few deal with the locality of sparse codes. Because of indirect addressing, sparse codes exhibit irregular patterns of references. In this paper, the cache behavior of one of the most frequent primitives, SpMxV (Sparse Matrix-Vector multiply), is analyzed. A model of its references is built, and the performance bottlenecks of SpMxV are then analyzed using the model and simulations. The main parameters are identified, and their role is explained and quantified. This analysis is then used to discuss optimizations of SpMxV. Moreover, a blocking technique which takes into account the specifics of sparse codes is proposed. Keywords: sparse primitives, cache, performance prediction, data locality. 1 Introduction Due to the increasing difference between memory speed and processor speed, it becomes critical to minimize communications bet..
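    The indirect addressing the abstract refers to is visible in any standard compressed-sparse-row (CSR) formulation of SpMxV: the source vector is accessed through a column-index array, so its reference pattern depends on the matrix's sparsity structure rather than on the loop indices. A minimal sketch (generic CSR, not the paper's specific code; names are illustrative):

    ```python
    def spmxv_csr(values, col_idx, row_ptr, x):
        """y = A @ x for a matrix A stored in CSR form:
        values[k] is a nonzero, col_idx[k] its column, and
        row_ptr[i]:row_ptr[i+1] spans the nonzeros of row i."""
        n = len(row_ptr) - 1
        y = [0.0] * n
        for i in range(n):
            acc = 0.0
            for k in range(row_ptr[i], row_ptr[i + 1]):
                # x is read through col_idx[k]: an indirect, data-dependent
                # reference, the source of the irregular cache behavior.
                acc += values[k] * x[col_idx[k]]
            y[i] = acc
        return y

    # 2x2 example: A = [[2, 0], [1, 3]], x = [1, 1]
    values, col_idx, row_ptr = [2.0, 1.0, 3.0], [0, 0, 1], [0, 1, 3]
    print(spmxv_csr(values, col_idx, row_ptr, [1.0, 1.0]))  # -> [2.0, 4.0]
    ```

    Accesses to values, col_idx and row_ptr are sequential and cache-friendly; only the reuse of x is structure-dependent, which is why blocking strategies for sparse codes focus on it.
    
    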

    Cache Interference Phenomena

    The impact of cache interferences on program performance (particularly for numerical codes, which heavily use the memory hierarchy) remains unknown. The general knowledge is that cache interferences are highly irregular in terms of occurrence and intensity. In this paper, the different types of cache interferences that can occur in numerical loop nests are identified. An analytical method is developed for detecting the occurrence of interferences and, more importantly, for computing the number of cache misses due to interferences. Simulations and experiments on real machines show that the model is generally accurate and that most interference phenomena are captured. Experiments also show that cache interferences can be intense and frequent. Certain parameters, such as array base addresses or dimensions, can have a strong impact on the occurrence of interferences. Modifying these parameters alone can induce global execution-time variations of 30% and more. Applications of these modeling techniq..
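    The sensitivity to array base addresses can be reproduced with a toy direct-mapped cache model. The sketch below (hypothetical parameters, not the paper's analytical method) counts misses for a loop that reads two arrays in lockstep; moving one base address by a single cache line turns pathological cross-interference into near-minimal miss traffic:

    ```python
    def misses(base_a, base_b, n, line=8, lines=64):
        """Count misses when a loop reads a[i] then b[i] (unit-sized elements)
        through a direct-mapped cache of `lines` lines of `line` elements each."""
        tags = [None] * lines
        count = 0
        for i in range(n):
            for addr in (base_a + i, base_b + i):
                idx = (addr // line) % lines          # cache set the address maps to
                tag = addr // (line * lines)          # which memory block is resident
                if tags[idx] != tag:
                    tags[idx] = tag                   # miss: fetch and evict
                    count += 1
        return count

    n = 256
    # b placed exactly one cache size (8*64 = 512) after a: every pair of
    # accesses maps to the same set, so the two arrays evict each other
    # on every reference.
    conflict = misses(0, 512, n)
    # b shifted by one extra cache line: only the unavoidable one-miss-per-line
    # "cold" misses remain.
    no_conflict = misses(0, 512 + 8, n)
    print(conflict, no_conflict)  # -> 512 64
    ```

    An 8x swing in miss count from a one-line change of base address is the kind of irregular, parameter-driven interference the paper sets out to model analytically.
    
    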