
    Performance analysis of Message Passing Interface collective communication on Intel Xeon quad-core Gigabit Ethernet and InfiniBand clusters

    The performance of MPI implementation operations still presents critical issues for high-performance computing systems, particularly on more advanced processor technology. Consequently, this study concentrates on benchmarking an MPI implementation on multi-core architecture by measuring the performance of Open MPI collective communication on Intel Xeon dual quad-core Gigabit Ethernet and InfiniBand clusters using SKaMPI. It focuses on well-known collective communication routines such as MPI_Bcast, MPI_Alltoall, MPI_Scatter and MPI_Gather. From the collected results, MPI collective communication on the InfiniBand cluster showed distinctly better performance in terms of latency and throughput. The analysis indicates that the algorithms used for collective communication performed very well for all message sizes, except for the MPI_Bcast and MPI_Alltoall operations in inter-node communication. However, InfiniBand provides the lowest latency for all operations because it offers applications an easy-to-use messaging service, whereas Gigabit Ethernet must still request access to the server's communication resources through the operating system, adding overhead to every exchange between an application and the network.
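
    To illustrate the kind of measurement the abstract describes, the following is a minimal sketch (not SKaMPI itself, and not the authors' code) of timing MPI_Bcast latency across a range of message sizes with Open MPI. The repetition count, size range, and reporting format are illustrative assumptions.

        /* Minimal sketch: MPI_Bcast latency over a range of message sizes.
         * Compile with mpicc, run with mpirun -np <N> ./a.out */
        #include <mpi.h>
        #include <stdio.h>
        #include <stdlib.h>

        int main(int argc, char **argv)
        {
            MPI_Init(&argc, &argv);

            int rank;
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);

            const int reps = 100;                      /* repetitions per message size (assumed) */
            for (int size = 1; size <= (1 << 20); size <<= 1) {
                char *buf = malloc(size);
                MPI_Barrier(MPI_COMM_WORLD);           /* synchronise ranks before timing */

                double t0 = MPI_Wtime();
                for (int i = 0; i < reps; i++)
                    MPI_Bcast(buf, size, MPI_CHAR, 0, MPI_COMM_WORLD);
                double t1 = MPI_Wtime();

                /* report the slowest rank's average as the collective's latency */
                double local = (t1 - t0) / reps, global;
                MPI_Reduce(&local, &global, 1, MPI_DOUBLE, MPI_MAX, 0, MPI_COMM_WORLD);
                if (rank == 0)
                    printf("%8d bytes  %12.3f us\n", size, global * 1e6);

                free(buf);
            }

            MPI_Finalize();
            return 0;
        }

    Dedicated benchmarks such as SKaMPI refine this basic pattern with warm-up runs, cache control, and statistical post-processing, which is why different benchmark suites can report different numbers for the same cluster.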

    MPI communication benchmarking on Intel Xeon dual quad-core processor cluster

    This paper reports measurements of MPI communication benchmarking on the Khaldun cluster, which ran on Linux-based IBM Blade HS21 servers with Intel Xeon dual quad-core processors and Gigabit Ethernet interconnect. The measurements were made using the SKaMPI and IMB benchmark programs. Significantly, these were the first results produced by using SKaMPI and IMB to analyze the performance of the Open MPI implementation on the Khaldun cluster. A comparison and analysis of the point-to-point and collective communication results from the two benchmark programs are then provided, showing that different MPI benchmark programs render different results because they use different measurement techniques. The results were also compared with those of experiments carried out on a cluster with Opteron dual quad-core processors and Gigabit Ethernet interconnect. The analysis indicated that the architecture of the machines used also affected the results.
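
    As a companion to the collective sketch above, the following is a minimal ping-pong sketch (again, not SKaMPI or IMB, and not the paper's code) of the point-to-point measurement pattern both suites build on: rank 0 and rank 1 exchange a fixed-size message repeatedly and the round-trip time is averaged. The message size and repetition count are illustrative assumptions.

        /* Minimal sketch: point-to-point ping-pong latency between ranks 0 and 1. */
        #include <mpi.h>
        #include <stdio.h>
        #include <stdlib.h>

        int main(int argc, char **argv)
        {
            MPI_Init(&argc, &argv);

            int rank, nprocs;
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);
            MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
            if (nprocs < 2) {
                if (rank == 0) fprintf(stderr, "need at least 2 ranks\n");
                MPI_Abort(MPI_COMM_WORLD, 1);
            }

            const int reps = 1000, size = 1024;        /* 1 KiB messages (assumed) */
            char *buf = malloc(size);

            MPI_Barrier(MPI_COMM_WORLD);
            double t0 = MPI_Wtime();
            for (int i = 0; i < reps; i++) {
                if (rank == 0) {
                    MPI_Send(buf, size, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
                    MPI_Recv(buf, size, MPI_CHAR, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
                } else if (rank == 1) {
                    MPI_Recv(buf, size, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
                    MPI_Send(buf, size, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
                }
            }
            double t1 = MPI_Wtime();

            /* half the average round-trip time approximates one-way latency */
            if (rank == 0)
                printf("avg one-way latency: %.3f us\n", (t1 - t0) / reps / 2 * 1e6);

            free(buf);
            MPI_Finalize();
            return 0;
        }

    How such a loop is synchronised, warmed up, and averaged differs between SKaMPI and IMB, which is consistent with the paper's observation that the two benchmarks render different results for the same hardware.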