
16 research outputs found

Overlapping of Communication and Computation and Early Binding: Fundamental Mechanisms for Improving Parallel Performance on Clusters of Workstations

Author: Dimitrov Rossen Petkov
Publication venue: Scholars Junction
Publication date: 12/05/2001

This study considers software techniques for improving performance on clusters of workstations and approaches for designing message-passing middleware that facilitate scalable, parallel processing. Early binding and overlapping of communication and computation are identified as fundamental approaches for improving parallel performance and scalability on clusters. Currently, cluster computers using the Message-Passing Interface for interprocess communication are the predominant choice for building high-performance computing facilities, which makes the findings of this work relevant to a wide audience from the areas of high-performance computing and parallel processing. The performance-enhancing techniques studied in this work are presently underutilized in practice because of the lack of adequate support by existing message-passing libraries and are also rarely considered by parallel algorithm designers. Furthermore, commonly accepted methods for performance analysis and evaluation of parallel systems omit these techniques and focus primarily on more obvious communication characteristics such as latency and bandwidth. This study provides a theoretical framework for describing early binding and overlapping of communication and computation in models for parallel programming. This framework defines four new performance metrics that facilitate new approaches for performance analysis of parallel systems and algorithms. This dissertation provides experimental data that validate the correctness and accuracy of the performance analysis based on the new framework. The theoretical results of this performance analysis can be used by designers of parallel system and application software for assessing the quality of their implementations and for predicting the effective performance benefits of early binding and overlapping. This work presents MPI/Pro, a new MPI implementation that is specifically optimized for clusters of workstations interconnected with high-speed networks. 
This MPI implementation emphasizes features such as persistent communication, asynchronous processing, low processor overhead, and independent message progress. These features are identified as critical for delivering maximum performance to applications. The experimental section of this dissertation demonstrates the capability of MPI/Pro to facilitate software techniques that result in significant application performance improvements. Specific demonstrations with Virtual Interface Architecture and TCP/IP over Ethernet are offered.

Scholars Junction - Mississippi State University Institutional Repository

M-VIA on the PowerPC architecture

Author: Hiedajat Nicky Sagitta
Publication venue: Iowa State University Digital Repository
Publication date: 01/01/2003

One of the basic principles of cluster communication is to minimize the time spent delivering messages between nodes. The Virtual Interface Architecture (VIA) is a communication protocol for system area networks (SAN) that bypasses much of the overhead of traditional network protocol stacks and provides more direct access to the network interface controller (NIC). The aim of our research was to investigate whether VIA would perform well on PowerPC processors, which have a different architecture from the processors previously used with VIA. For that reason, VIA was implemented on PowerPC and new driver support was added. The results indicate that VIA performs better than TCP/IP for large message sizes but not for small message sizes.

Digital Repository @ Iowa State University (ISU)

Performance measurement and analysis of PC based cluster server using SET of Architecture and modeling a scalable High performance cluster

Author: Mehta Mihir J.
Publication date: 01/04/2006

Not available

Etheses - A Saurashtra University Library Service

VIA Communication Performance on a Gigabit Ethernet Cluster

Author: Q. O. Snell
Publication venue: Springer Science and Business Media LLC

Crossref

A NEW GENERATION CHEMICAL FLOODING SIMULATOR Semi-annual Report for the Period

Author: Gary A Pope, Kamy Sepehrnoori, Mojdeh Delshad
Project Manager: Sue Mehlhoff
Publication date: 06/03/2020

Abstract; Summary; Task 1: Formulation and development of Solution Scheme

CiteSeerX


Implementation of MP_Lite for the VI Architecture

Author: Chen Weiyi
Publication venue: Ames Laboratory
Publication date: 31/12/2002

MP_Lite is a lightweight message-passing library designed to deliver the maximum performance to applications in a portable and user-friendly manner. The Virtual Interface (VI) architecture is a user-level communication protocol that bypasses the operating system to provide much better performance than traditional network architectures. By combining the high efficiency of MP_Lite and the high performance of the VI architecture, this work implements a high-performance message-passing library that has much lower latency and better throughput. The design and implementation of MP_Lite for M-VIA, which is a modular implementation of the VI architecture on Linux, is discussed in this thesis. By using the eager protocol for sending short messages, MP_Lite M-VIA has much lower latency on both Fast Ethernet and Gigabit Ethernet. The handshake protocol and RDMA mechanism provide double the throughput that MPICH can deliver for long messages. MP_Lite M-VIA can also channel-bond multiple network interface cards to increase the potential bandwidth between nodes. Using multiple Fast Ethernet cards can double or even triple the maximum throughput without greatly increasing the cost of a PC cluster.

UNT Digital Library

Optimizing message-passing performance within symmetric multiprocessor systems

Author: Chen Xuehua
Publication venue: Iowa State University Digital Repository
Publication date: 01/01/2003

The Message Passing Interface (MPI) has been widely used in the area of parallel computing due to its portability, scalability, and ease of use. Message passing within Symmetric Multiprocessor (SMP) systems is an important part of any MPI library since it enables parallel programs to run efficiently on SMP systems, or on clusters of SMP systems when combined with other means of communication such as TCP/IP. Most message-passing implementations use a shared memory pool as an intermediate buffer to hold messages, some lock mechanism to protect the pool, and some synchronization mechanism for coordinating the processes. However, the performance varies significantly depending on how these are implemented. The work here implements two SMP message-passing modules using lock-based and lock-free approaches for MP_Lite, a compact library that implements a subset of the most commonly used MPI functions. Various optimization techniques have been used to improve the performance. These two modules are evaluated using a communication performance analysis tool called NetPIPE, and compared with the implementations of other MPI libraries such as MPICH, MPICH2, LAM/MPI and MPI/Pro. Performance tools such as PAPI and VTune are used to gather runtime information at the hardware level. This information, together with some cache theory and the hardware configuration, is used to explain various performance phenomena. Tests using a real application have shown the performance of the different implementations in practice. These results all show the improvements of the new techniques over existing implementations.

Digital Repository @ Iowa State University (ISU)

One-Sided Communication for High Performance Computing Applications

Author: Barrett Brian William
Publication venue: [Bloomington, Ind.] : Indiana University
Publication date: 01/01/2009

Thesis (Ph.D.) - Indiana University, Computer Sciences, 2009. Parallel programming presents a number of critical challenges to application developers. Traditionally, message passing, in which a process explicitly sends data and another explicitly receives the data, has been used to program parallel applications. With the recent growth in multi-core processors, the level of parallelism necessary for next generation machines is cause for concern in the message passing community. The one-sided programming paradigm, in which only one of the two processes involved in communication actively participates in message transfer, has seen increased interest as a potential replacement for message passing. One-sided communication does not carry the heavy per-message overhead associated with modern message passing libraries. The paradigm offers lower synchronization costs and advanced data manipulation techniques such as remote atomic arithmetic and synchronization operations. These combine to present an appealing interface for applications with random communication patterns, which traditionally present message passing implementations with difficulties. This thesis presents a taxonomy of both the one-sided paradigm and of applications which are ideal for the one-sided interface. Three case studies, based on real-world applications, are used to motivate both taxonomies and verify the applicability of the MPI one-sided communication and Cray SHMEM one-sided interfaces to real-world problems. While our results show a number of shortcomings with existing implementations, they also suggest that a number of applications could benefit from the one-sided paradigm. Finally, an implementation of the MPI one-sided interface within Open MPI is presented, which provides a number of unique performance features necessary for efficient use of the one-sided programming paradigm.

IUScholarWorks (University of Indiana)