3 research outputs found

    Performance evaluation of NUMA and COMA distributed shared-memory multiprocessors

    No full text
    Includes bibliographical references. Issued also on microfiche from Lange Micrographics.
    Memory architecture is an important component in a distributed shared-memory parallel computer. This thesis studies three shared-memory architectures: Non-Uniform Memory Access (NUMA) with full-mapped directories, Cache-Only Memory Architecture (COMA) with full-mapped directories, and COMA with directories based on a new design using binomial trees. The three architectures were implemented in the Proteus execution-driven simulator, which simulated the execution of three applications taken from the SPLASH-2 suite of benchmark parallel programs. Six sets of simulations were run, providing performance data for a range of values of important design parameters: page size, block size, number of processors, memory controller speed, cache size, and interconnection network topology. These simulations have two major benefits. First, they aid in choosing the best values for key design parameters. Second, they allow a direct comparison of COMA with NUMA, as well as of the two directory designs. The simulations show that both COMA architectures generally perform better than NUMA. COMA also proved less sensitive to suboptimal choices of primary cache size and block size. In most cases, COMA with full-mapped directories performed slightly better than COMA with binomial-tree directories. However, the binomial-tree directories require significantly less hardware (eleven versus sixty-four bits per block for the machines simulated in this thesis).
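To put the directory cost quoted in the abstract in perspective, the following is a minimal sketch that expresses directory storage as a fraction of the data stored in one block. Only the two per-block figures (64 and 11 bits) come from the abstract. The function name `directory_overhead` and the block sizes in the loop are hypothetical, chosen purely for illustration; the abstract does not state which block sizes were simulated.

```python
def directory_overhead(dir_bits_per_block: int, block_bytes: int) -> float:
    """Return directory storage as a fraction of one block's data storage."""
    return dir_bits_per_block / (block_bytes * 8)


# Per-block directory sizes quoted in the abstract.
FULL_MAPPED_BITS = 64
BINOMIAL_TREE_BITS = 11

# Hypothetical block sizes (assumption, not taken from the thesis).
for block_bytes in (16, 32, 64, 128):
    full = directory_overhead(FULL_MAPPED_BITS, block_bytes)
    tree = directory_overhead(BINOMIAL_TREE_BITS, block_bytes)
    print(f"{block_bytes:4d}-byte block: "
          f"full-mapped {full:.1%}, binomial tree {tree:.1%}")
```

Under these assumptions the ratio between the two designs stays fixed at 64/11, roughly 5.8 to 1, while the absolute overhead shrinks as the block size grows.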