Search CORE

20 research outputs found

Overlapping of Communication and Computation and Early Binding: Fundamental Mechanisms for Improving Parallel Performance on Clusters of Workstations

Author: Dimitrov Rossen Petkov
Publication venue: Scholars Junction
Publication date: 12/05/2001
Field of study

This study considers software techniques for improving performance on clusters of workstations and approaches for designing message-passing middleware that facilitate scalable, parallel processing. Early binding and overlapping of communication and computation are identified as fundamental approaches for improving parallel performance and scalability on clusters. Currently, cluster computers using the Message-Passing Interface for interprocess communication are the predominant choice for building high-performance computing facilities, which makes the findings of this work relevant to a wide audience from the areas of high-performance computing and parallel processing. The performance-enhancing techniques studied in this work are presently underutilized in practice because of the lack of adequate support by existing message-passing libraries and are also rarely considered by parallel algorithm designers. Furthermore, commonly accepted methods for performance analysis and evaluation of parallel systems omit these techniques and focus primarily on more obvious communication characteristics such as latency and bandwidth. This study provides a theoretical framework for describing early binding and overlapping of communication and computation in models for parallel programming. This framework defines four new performance metrics that facilitate new approaches for performance analysis of parallel systems and algorithms. This dissertation provides experimental data that validate the correctness and accuracy of the performance analysis based on the new framework. The theoretical results of this performance analysis can be used by designers of parallel system and application software for assessing the quality of their implementations and for predicting the effective performance benefits of early binding and overlapping. This work presents MPI/Pro, a new MPI implementation that is specifically optimized for clusters of workstations interconnected with high-speed networks. This MPI implementation emphasizes features such as persistent communication, asynchronous processing, low processor overhead, and independent message progress. These features are identified as critical for delivering maximum performance to applications. The experimental section of this dissertation demonstrates the capability of MPI/Pro to facilitate software techniques that result in significant application performance improvements. Specific demonstrations with Virtual Interface Architecture and TCP/IP over Ethernet are offered

Scholars Junction - Mississippi State University Institutional Repository

A NEW GENERATION CHEMICAL FLOODING SIMULATOR Semi-annual Report for the Period

Author: Gary A Pope
Kamy Sepehrnoori
Mojdeh Delshad
Project Manager Sue Mehlhoff
Publication venue
Publication date: 06/03/2020
Field of study

ABSTRACT 4 SUMMARY 4 Task 1: Formulation and development of Solution Scheme

CiteSeerX

VIA Communication Performance on a Gigabit Ethernet Cluster

Author: Q. O. Snell
Publication venue: 'Springer Science and Business Media LLC'
Publication date
Field of study

Crossref

High Performance Java Remote Method Invocation for Parallel Computing on Clusters

Author: López Taboada Guillermo
Teijeiro Barjas Carlos
Touriño Juan
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 12/11/2007
Field of study

This is a post-peer-review, pre-copyedit version. The final authenticated version is available online at: http://dx.doi.org/10.1109/ISCC.2007.4381536[Abstract] This paper presents a more efficient Java remote method invocation (RMI) implementation for high-speed clusters. The use of Java for parallel programming on clusters is limited by the lack of efficient communication middleware and high-speed cluster interconnect support. This implementation overcomes these limitations through a more efficient Java RMI protocol based on several basic assumptions on clusters. Moreover, the use of a high performance sockets library provides with direct high-speed interconnect support. The performance evaluation of this middleware on a gigabit Ethernet (GbE) and a scalable coherent interface (SCI) cluster shows experimental evidence of throughput increase. Moreover, qualitative aspects of the solution such as transparency to the user, interoperability with other systems and no need of source code modification can augment the performance of existing parallel Java codes and boost the development of new high performance Java RMI applications.Ministerio de Education y Ciencia; TIN2004-07797-C02Xunta de Galicia; PGIDIT06PXIB105228PR

Repositorio da Universidade da Coruña

Crossref

Modeling of multi-cluster systems in the presence of network heterogeneity

Author: Abawajy Jemal
Akbari Mohammad
Javadi Bahman
Publication venue: Istanbul Technical University
Publication date: 01/01/2005
Field of study

Deakin Research Online

Performance measurement and analysis of PC based cluster server using SET of Architecture and modeling a scalable High performance cluster

Author: Mehta Mihir J.
Publication venue
Publication date: 01/04/2006
Field of study

Not availabl

Etheses - A Saurashtra University Library Service

Performance analysis of heterogeneous multi-cluster systems

Author: Abawajy Jemal
Akbari Mohammad K.
Javadi Bahman
Publication venue: IEEE Computer Society
Publication date: 01/01/2005
Field of study

When building a cost-effective high-performance parallel processing system, a performance model is a useful tool for exploring the design space and examining various parameters. However, performance analysis in such systems has proven to be a challenging task that requires the innovative performance analysis tools and methods to keep up with the rapid evolution and ever increasing complexity of such systems. To this end, we propose an analytical model for heterogeneous multi-cluster systems. The model takes into account stochastic quantities as well as network heterogeneity in bandwidth and latency in each cluster. Also, blocking and non-blocking network architecture model is proposed and are used in performance analysis of the system. The message latency is used as the primary performance metric. The model is validated by constructing a set of simulators to simulate different types of clusters, and by comparing the modeled results with the simulated ones.<br /

Deakin Research Online

TOP500 Sublist for November 2001

Author
Publication venue: 'Office of Scientific and Technical Information (OSTI)'
Publication date
Field of study

Crossref

Design of efficient Java communications for high performance computing

Author: López Taboada Guillermo
Publication venue
Publication date: 01/01/2009
Field of study

[Abstract] There is an increasing interest to adopt Java as the parallel programming language for the multi-core era. Although Java offers important advantages, such as built-in multithreading and networking support, productivity and portability, the lack of efficient communication middleware is an important drawback for its uptake in High Performance Computing (HPC). This PhD Thesis presents the design, implementation and evaluation of several solutions to improve this situation: (1) a high performance Java sockets implementation (JFS, Java Fast Sockets) on high-speed networks (e.g., Myrinet, InfiniBand) and shared memory (e.g., multi-core) machines; (2) a low-level messaging device, iodev, which efficiently overlaps communication and computation; and (3) a more scalable Java message-passing library, Fast MPJ (F-MPJ). Furthermore, new Java parallel benchmarks have been implemented and used for the performance evaluation of the developed middleware. The final and main conclusion is that the use of Java for HPC is feasible and even advisable when looking for productive development, provided that efficient communication middleware is made available, such as the projects presented in this Thesis.[Resumen] La tesis doctoral "Design of Efficient Java Communications for High Performance Computing" parte de la hipótesis inicial de que es posible desarrollar aplicaciones Java en computación de altas prestaciones, un ámbito en el que el rendimiento es crucial, siempre que esté disponible un middleware de comunicación eficiente. Así, se han diseñado, desarrollado y evaluado diferentes bibliotecas de comunicación en Java, desde el nivel de sockets al de paso de mensajes, obteniendo notables incrementos de eficiencia, confirmando que la hipótesis inicial es factible

Repositorio da Universidade da Coruña

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Recommended from our members

Implementation of MP{_}Lite for the VI Architecture

Author: Chen Weiyi
Publication venue: Ames Laboratory
Publication date: 31/12/2002
Field of study

MP{_}Lite is a light weight message-passing library designed to deliver the maximum performance to applications in a portable and user friendly manner. The Virtual Interface (VI) architecture is a user-level communication protocol that bypasses the operating system to provide much better performance than traditional network architectures. By combining the high efficiency of MP{_}Lite and high performance of the VI architecture, they are able to implement a high performance message-passing library that has much lower latency and better throughput. The design and implementation of MP{_}Lite for M-VIA, which is a modular implementation of the VI architecture on Linux, is discussed in this thesis. By using the eager protocol for sending short messages, MP{_}Lite M-VIA has much lower latency on both Fast Ethernet and Gigabit Ethernet. The handshake protocol and RDMA mechanism provides double the throughput that MPICH can deliver for long messages. MP{_}Lite M-VIA also has the ability to channel-bonding multiple network interface cards to increase the potential bandwidth between nodes. Using multiple Fast Ethernet cards can double or even triple the maximum throughput without increasing the cost of a PC cluster greatly

UNT Digital Library