55 research outputs found

    Comparison of MPI benchmark programs on shared memory and distributed memory machines (point-to-point communication)

    Several benchmark programs are available to measure the performance of MPI on parallel computers. The most commonly used MPI benchmarks are SKaMPI, Pallas MPI Benchmark, MPBench, Mpptest and MPIBench. It is of interest to analyze the differences between these benchmarks, yet few comparisons between them have been made so far. In this paper we therefore compare the techniques used and the functionality of each benchmark, and also compare their results on a distributed memory machine and a shared memory machine for point-to-point communication. All of the MPI benchmarks listed above are included in this analysis. The results from different benchmarks would be expected to be similar; however, this analysis found substantial differences in the results for certain MPI communications, particularly on shared memory machines.
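
    As a rough illustration of how point-to-point timings are often summarised (this is not the method of the paper above), one-way message times can be fitted to the common linear model T(n) = latency + n / bandwidth. The function name and the synthetic timings below are hypothetical; real input would come from a ping-pong benchmark run.

```python
def fit_latency_bandwidth(sizes_bytes, one_way_times_s):
    """Least-squares fit of T(n) = latency + n / bandwidth.

    sizes_bytes: message sizes in bytes.
    one_way_times_s: measured one-way times in seconds (for a ping-pong
    test this is usually half of the round-trip time).
    Returns (latency_seconds, bandwidth_bytes_per_second).
    """
    n = len(sizes_bytes)
    if n < 2 or n != len(one_way_times_s):
        raise ValueError("need at least two (size, time) pairs")

    mean_x = sum(sizes_bytes) / n
    mean_y = sum(one_way_times_s) / n
    sxx = sum((x - mean_x) ** 2 for x in sizes_bytes)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(sizes_bytes, one_way_times_s))
    if sxx == 0:
        raise ValueError("message sizes must not all be equal")

    slope = sxy / sxx                    # seconds per byte
    if slope <= 0:
        raise ValueError("times do not grow with message size")
    latency = mean_y - slope * mean_x    # intercept at n = 0
    return latency, 1.0 / slope


# Synthetic data (assumed values): 50 microseconds latency, 100 MB/s.
sizes = [0, 1024, 65536, 1048576]
times = [50e-6 + s / 1e8 for s in sizes]
latency, bandwidth = fit_latency_bandwidth(sizes, times)
```

    Running the same fit over the output of two different benchmarks is one simple way to see whether they agree on latency and bandwidth for the same machine.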

    Performance analysis of Message Passing Interface collective communication on Intel Xeon quad-core Gigabit Ethernet and InfiniBand clusters

    The performance of MPI implementations still presents critical issues for high performance computing systems, particularly on more advanced processor technology. This study therefore benchmarks an MPI implementation on a multi-core architecture by measuring the performance of Open MPI collective communication on Intel Xeon dual quad-core Gigabit Ethernet and InfiniBand clusters using SKaMPI. It focuses on well-known collective communication routines such as MPI_Bcast, MPI_Alltoall, MPI_Scatter and MPI_Gather. In the collected results, MPI collective communication on the InfiniBand cluster had distinctly better performance in terms of latency and throughput. The analysis indicates that the algorithms used for collective communication performed very well for all message sizes, except for MPI_Bcast and MPI_Alltoall in inter-node communication. InfiniBand provides the lowest latency for all operations because it offers applications an easy-to-use messaging service, whereas Gigabit Ethernet still has to request access to the server's communication resources from the operating system, which requires a complex exchange between the application and the network.

    Progress in various TCP variants (February 2009)

    Transmission Control Protocol (TCP), a basic communication protocol, consists of a set of rules that control communication. Many versions of TCP exist, each modified over time as needs changed. We first discuss the basic functions of TCP and their role in controlling congestion, then graphically examine slow start, congestion avoidance, fast retransmit and fast recovery. This paper compares the performance of different TCP variants, specifically Tahoe, Reno, New Reno, Westwood, Selective Acknowledgment (SACK), Forward Acknowledgment (FACK) and Vegas. The TCP Vegas algorithm is explained together with its new structure, its new congestion avoidance mechanism and its modified slow start mechanism. A derived table then evaluates the TCP variants on the basis of their algorithms. We discuss the progress made and evaluate the advantages and disadvantages of the above TCP variants.
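
    The difference between Tahoe and Reno after a loss can be sketched with a simplified, textbook-style model of the congestion window. This is an illustrative sketch, not the analysis from the paper above: it assumes one round per RTT, that every loss is detected by three duplicate ACKs, and a hypothetical initial slow start threshold.

```python
def simulate_cwnd(variant, rounds, loss_rounds, initial_ssthresh=16):
    """Return the congestion window (in segments) at the start of each round.

    variant: "tahoe" or "reno".
    loss_rounds: set of round indices in which a loss is detected
    (assumed to be signalled by duplicate ACKs, never by a timeout).
    """
    if variant not in ("tahoe", "reno"):
        raise ValueError("variant must be 'tahoe' or 'reno'")

    cwnd, ssthresh = 1, initial_ssthresh
    history = []
    for r in range(rounds):
        history.append(cwnd)
        if r in loss_rounds:
            ssthresh = max(cwnd // 2, 2)      # halve the threshold on loss
            if variant == "tahoe":
                cwnd = 1                      # Tahoe: back to slow start
            else:
                cwnd = ssthresh               # Reno: fast recovery
        elif cwnd < ssthresh:
            cwnd = min(cwnd * 2, ssthresh)    # slow start: double per RTT
        else:
            cwnd += 1                         # congestion avoidance: +1 per RTT
    return history


tahoe = simulate_cwnd("tahoe", rounds=12, loss_rounds={6}, initial_ssthresh=8)
reno = simulate_cwnd("reno", rounds=12, loss_rounds={6}, initial_ssthresh=8)
print("Tahoe:", tahoe)
print("Reno: ", reno)
```

    In this model both variants grow identically until the loss in round 6; Tahoe then restarts from a window of one segment, while Reno resumes from the halved threshold.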

    The affects of caching in browser stage on the performance of web items delivery

    Network congestion remains one of the main barriers to the continuing success of the Internet. Caching reduces the traffic load on the server and the network backbone, which improves the efficiency and scalability of web item delivery. In computer networks, caching can be performed at different stages. In this article, we investigate the load that web pages can put on a network and how caching can reduce the bandwidth requirements. The article concludes that caching at the browser stage improves the delivery of web items.
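
    The bandwidth effect of a browser-stage cache can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the experiment from the article above: the cache uses LRU eviction, cached items never expire and are never revalidated, and the URLs and sizes are made up.

```python
from collections import OrderedDict


def bytes_from_network(requests, sizes, cache_capacity_bytes):
    """Count the bytes fetched over the network for a request sequence.

    requests: list of URLs in request order.
    sizes: dict mapping URL -> object size in bytes.
    cache_capacity_bytes: size of the (assumed) LRU browser cache;
    0 behaves like having no cache for objects of non-zero size.
    """
    cache = OrderedDict()   # URL -> size, least recently used first
    used = 0
    transferred = 0
    for url in requests:
        if url in cache:
            cache.move_to_end(url)          # hit: served locally
            continue
        size = sizes[url]
        transferred += size                 # miss: fetch from the server
        if size > cache_capacity_bytes:
            continue                        # too large to cache at all
        while used + size > cache_capacity_bytes:
            _, evicted_size = cache.popitem(last=False)
            used -= evicted_size            # evict least recently used
        cache[url] = size
        used += size
    return transferred


# Hypothetical page with three objects, visited twice.
sizes = {"/index.html": 10_000, "/logo.png": 40_000, "/style.css": 5_000}
visits = ["/index.html", "/logo.png", "/style.css"] * 2
without_cache = bytes_from_network(visits, sizes, cache_capacity_bytes=0)
with_cache = bytes_from_network(visits, sizes, cache_capacity_bytes=100_000)
```

    With a cache large enough to hold the whole page, the second visit transfers nothing, so the network traffic in this example is halved.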

    A comprehensive survey of the current trends and extensions for the proxy mobile IPv6 protocol

    Network-based mobility management has attracted significant research interest because it relieves mobile nodes from participating in the mobility process. Delegating the mobility functions to network entities eases the deployment of mobility solutions. Proxy Mobile IPv6 (PMIPv6) is considered a promising network-based mobility management protocol for the next-generation mobile network. However, since the basic specification of PMIPv6 emerged, the protocol has continued to be developed in different directions to enhance its performance and ensure the best service for mobile users. This paper presents the basic PMIPv6 specification and surveys the extensions that have been considered by both standardization bodies and researchers to give the basic protocol the features needed for a richer mobility experience, namely clustering, fast handoff, route optimization, network mobility support, and load sharing. The research conducted on these extensions is analyzed to identify the main issues that should be considered when designing such extensions. In addition, an integrated solution is proposed to show that more than one enhancement feature can be combined into a single scheme.

    A low cost route optimization scheme for cluster-based proxy MIPv6 protocol

    Proxy Mobile IPv6 (PMIPv6) is a network-based mobility protocol designed to relieve the mobile nodes (MNs) from participating in the mobility process and to reduce the long handoff latency of the MIPv6 protocol. However, PMIPv6 incurs a long communication path due to the triangle routing problem, in which all packets sent by MNs must pass through the local mobility anchor. Several solutions have been proposed to mitigate this issue, but they still incur high signaling overhead to recover the Route Optimization (RO) status after handoff. In this paper, we propose a Cluster-Based RO (CBRO) scheme for the clustered architecture of PMIPv6, in which the Mobile Access Gateways (MAGs) are grouped into clusters, each with a distinguished Head MAG (HMAG). In the proposed CBRO, the RO process is delegated to the HMAGs to reduce the handoff latency while achieving fast recovery of the optimized path after handoff. The proposed CBRO is evaluated analytically and compared with basic PMIPv6 and the current RO schemes. The numerical results show that the proposed CBRO outperforms all other schemes in terms of both the signaling cost required to recover the RO status after handoff and the total cost.

    On-demand Multi-Rate, Carrier Sense and Hidden Node Interference-Aware Channel Assignment Scheme in Wireless Mesh Network

    The proposed interference avoidance channel assignment schemes can provide a significant advantage in terms of aggregated throughput by assigning channels only to the active nodes and minimizing the intra-flow and inter-flow interference caused by hidden nodes. However, the drawback of these channel assignment schemes in multi-rate multi-hop wireless mesh networks is the capacity reduction caused by channel reuse over low and high data rate links within the carrier sense and hidden node ranges. This paper proposes an On-demand Multi-Rate, Carrier Sense and Hidden node Interference-Aware channel assignment scheme (AODV-MRCSHDIA) to minimize the impact on network throughput of the interference caused by low rate links within the carrier sense and hidden node ranges. A simulation experiment was conducted to evaluate AODV-MRCSHDIA against the existing schemes in terms of packet delivery ratio and end-to-end delay.

    On modeling parallel programs for static mapping: a comparative study

    Heterogeneous parallel architectures (HPAs) are inherently more complicated than their homogeneous counterparts. HPAs combine conventional processors with specialised processors that target particular types of task. However, this makes mapping and scheduling of parallel applications even more complicated and difficult. It is therefore crucial to use a robust modelling approach that can capture all the critical characteristics of the application and help achieve an optimal mapping. In this study, we perform a concise theoretical analysis and a comparison of the existing modelling approaches for parallel applications. The theoretical perspective includes both formal concepts and mathematical definitions based on the existing scholarly literature. The important characteristics, success factors and challenges of these modelling approaches are compared and categorised. The results show that the existing modelling approaches still need improvement in many aspects of parallel application modelling, such as the metrics covered and the heterogeneity of processors and networks. Moreover, the results help us introduce a new approach, which improves the quality of mapping by taking heterogeneity into account and covering more metrics, so that the results can be justified more accurately.

    Towards accelerated agent-based crowd simulation for Hajj and Umrah

    Many scientific applications, ranging from weather prediction to oil and gas exploration, require high-performance computing, which helps industries and researchers further their advancements. With the advent of general purpose computing on GPUs, most of these applications are shifting towards High-Performance Computing (HPC). Agent-based crowd simulation is one of the candidates that requires high-performance computing. This type of application is used to predict crowd movement in highly congested areas. One of the most crucial scenarios in which it can be used is to mimic the movement of the multi-cultural crowd performing Hajj and Umrah in Masjid Al-Haram, Makkah. Achieving adequate performance for an agent-based crowd system is a common problem in computer science, and the existing event planning software, specifically for Hajj and Umrah, is unable to provide the required performance. The main reason is the increasing number of autonomous pilgrims every year. In this paper, we propose a high-performance agent-based crowd simulation that represents pilgrim movement during these rituals. The performance is achieved by parallelizing an open source steering library called OpenSteer using CUDA on the GPU. Using our technique, event organizers will be able to simulate large crowds and predict whether a developed event plan is viable. We also discuss the architecture and implementation of this parallel Hajj simulation.