Search CORE

21 research outputs found

Trilinos I/O Support (Trios)

Author
Publication venue: 'Hindawi Limited'
Publication date: 01/01/2012
Field of study

Crossref

Software Roadmap to Plug and Play Petaflop/s

Author
Publication venue: 'Office of Scientific and Technical Information (OSTI)'
Publication date
Field of study

Crossref

Automating Topology Aware Mapping for Supercomputers

Author: Bhatele Abhinav
Publication venue
Publication date: 01/01/2010
Field of study

Petascale machines with hundreds of thousands of cores are being built. These machines have varying interconnect topologies and large network diameters. Computation is cheap and communication on the network is becoming the bottleneck for scaling of parallel applications. Network contention, specifically, is becoming an increasingly important factor affecting overall performance. The broad goal of this dissertation is performance optimization of parallel applications through reduction of network contention. Most parallel applications have a certain communication topology. Mapping of tasks in a parallel application based on their communication graph, to the physical processors on a machine can potentially lead to performance improvements. Mapping of the communication graph for an application on to the interconnect topology of a machine while trying to localize communication is the research problem under consideration. The farther different messages travel on the network, greater is the chance of resource sharing between messages. This can create contention on the network for networks commonly used today. Evaluative studies in this dissertation show that on IBM Blue Gene and Cray XT machines, message latencies can be severely affected under contention. Realizing this fact, application developers have started paying attention to the mapping of tasks to physical processors to minimize contention. Placement of communicating tasks on nearby physical processors can minimize the distance traveled by messages and reduce the chances of contention. Performance improvements through topology aware placement for applications such as NAMD and OpenAtom are used to motivate this work. Building on these ideas, the dissertation proposes algorithms and techniques for automatic mapping of parallel applications to relieve the application developers of this burden. The effect of contention on message latencies is studied in depth to guide the design of mapping algorithms. The hop-bytes metric is proposed for the evaluation of mapping algorithms as a better metric than the previously used maximum dilation metric. The main focus of this dissertation is on developing topology aware mapping algorithms for parallel applications with regular and irregular communication patterns. The automatic mapping framework is a suite of such algorithms with capabilities to choose the best mapping for a problem with a given communication graph. The dissertation also briefly discusses completely distributed mapping techniques which will be imperative for machines of the future.published or submitted for publicationnot peer reviewe

CiteSeerX

Illinois Digital Environment for Access to Learning and Scholarship Repository

Recommended from our members

The portals 3.3 message passing interface document revision 2.1.

Author: Brightwell Ronald Brian
Hudson Trammell B. (OS Research, Washington, D.C.)
Maccabe Arthur Bernard (University of New Mexico, Albuquerque, NM)
Pedretti Kevin Thomas Tauke
Riesen Rolf E.
Publication venue: 'Office of Scientific and Technical Information (OSTI)'
Publication date: 01/04/2006
Field of study

UNT Digital Library

Coping at the User-Level with Resource Limitations in the Cray Message Passing Toolkit MPI at Scale: How Not to Spend Your Summer Vacation

Author: Art Mirin
Barry F Smith
Forrest M Hoffman
Glenn E Hammond
Kalyan S Perumalla
Patrick H Worley
Richard T Mills
Publication venue
Publication date: 11/04/2020
Field of study

ABSTRACT: As the number of processor cores available in Cray XT series computers has rapidly grown, users have increasingly encountered instances where an MPI code that has previously worked for years unexpectedly fails at high core counts ("at scale") due to resource limitations being exceeded within the MPI implementation. Here, we examine several examples drawn from user experiences and discuss strategies for working around these difficulties at the user level

CiteSeerX

The portals 3.3 message passing interface document revision 2.1.

Author
Publication venue: 'Office of Scientific and Technical Information (OSTI)'
Publication date
Field of study

Crossref

Summary of multi-core hardware and programming model investigations

Author
Publication venue: 'Office of Scientific and Technical Information (OSTI)'
Publication date
Field of study

Crossref

One-Sided Communication for High Performance Computing Applications

Author: Barrett Brian William
Publication venue: [Bloomington, Ind.] : Indiana University
Publication date: 01/01/2009
Field of study

Thesis (Ph.D.) - Indiana University, Computer Sciences, 2009Parallel programming presents a number of critical challenges to application developers. Traditionally, message passing, in which a process explicitly sends data and another explicitly receives the data, has been used to program parallel applications. With the recent growth in multi-core processors, the level of parallelism necessary for next generation machines is cause for concern in the message passing community. The one-sided programming paradigm, in which only one of the two processes involved in communication actively participates in message transfer, has seen increased interest as a potential replacement for message passing. One-sided communication does not carry the heavy per-message overhead associated with modern message passing libraries. The paradigm offers lower synchronization costs and advanced data manipulation techniques such as remote atomic arithmetic and synchronization operations. These combine to present an appealing interface for applications with random communication patterns, which traditionally present message passing implementations with difficulties. This thesis presents a taxonomy of both the one-sided paradigm and of applications which are ideal for the one-sided interface. Three case studies, based on real-world applications, are used to motivate both taxonomies and verify the applicability of the MPI one-sided communication and Cray SHMEM one-sided interfaces to real-world problems. While our results show a number of short-comings with existing implementations, they also suggest that a number of applications could beneﬁt from the one-sided paradigm. Finally, an implementation of the MPI one-sided interface within Open MPI is presented, which provides a number of unique performance features necessary for efficient use of the one-sided programming paradigm

IUScholarWorks (University of Indiana)

Recommended from our members

The Portals 4.0 network programming interface.

Author: Barrett Brian W.
Brightwell Ronald Brian
Hemmert Karl Scott
Hudson Trammell B.
Maccabe Arthur Bernard
Pedretti Kevin Thomas Tauke
Riesen Rolf E.
Underwood Keith Douglas
Wheeler Kyle Bruce
Publication venue: Sandia National Laboratories
Publication date: 01/11/2012
Field of study

This report presents a specification for the Portals 4.0 network programming interface. Portals 4.0 is intended to allow scalable, high-performance network communication between nodes of a parallel computing system. Portals 4.0 is well suited to massively parallel processing and embedded systems. Portals 4.0 represents an adaption of the data movement layer developed for massively parallel processing platforms, such as the 4500-node Intel TeraFLOPS machine. Sandia's Cplant cluster project motivated the development of Version 3.0, which was later extended to Version 3.3 as part of the Cray Red Storm machine and XT line. Version 4.0 is targeted to the next generation of machines employing advanced network interface architectures that support enhanced offload capabilities

UNT Digital Library

Red Storm usage model :Version 1.12.

Author
Publication venue: 'Office of Scientific and Technical Information (OSTI)'
Publication date
Field of study

Crossref