34,373 research outputs found
Libra: An Economy driven Job Scheduling System for Clusters
Clusters of computers have emerged as mainstream parallel and distributed
platforms for high-performance, high-throughput and high-availability
computing. To enable effective resource management on clusters, numerous
cluster managements systems and schedulers have been designed. However, their
focus has essentially been on maximizing CPU performance, but not on improving
the value of utility delivered to the user and quality of services. This paper
presents a new computational economy driven scheduling system called Libra,
which has been designed to support allocation of resources based on the users?
quality of service (QoS) requirements. It is intended to work as an add-on to
the existing queuing and resource management system. The first version has been
implemented as a plugin scheduler to the PBS (Portable Batch System) system.
The scheduler offers market-based economy driven service for managing batch
jobs on clusters by scheduling CPU time according to user utility as determined
by their budget and deadline rather than system performance considerations. The
Libra scheduler ensures that both these constraints are met within an O(n)
run-time. The Libra scheduler has been simulated using the GridSim toolkit to
carry out a detailed performance analysis. Results show that the deadline and
budget based proportional resource allocation strategy improves the utility of
the system and user satisfaction as compared to system-centric scheduling
strategies.Comment: 13 page
MPICH-G2: A Grid-Enabled Implementation of the Message Passing Interface
Application development for distributed computing "Grids" can benefit from
tools that variously hide or enable application-level management of critical
aspects of the heterogeneous environment. As part of an investigation of these
issues, we have developed MPICH-G2, a Grid-enabled implementation of the
Message Passing Interface (MPI) that allows a user to run MPI programs across
multiple computers, at the same or different sites, using the same commands
that would be used on a parallel computer. This library extends the Argonne
MPICH implementation of MPI to use services provided by the Globus Toolkit for
authentication, authorization, resource allocation, executable staging, and
I/O, as well as for process creation, monitoring, and control. Various
performance-critical operations, including startup and collective operations,
are configured to exploit network topology information. The library also
exploits MPI constructs for performance management; for example, the MPI
communicator construct is used for application-level discovery of, and
adaptation to, both network topology and network quality-of-service mechanisms.
We describe the MPICH-G2 design and implementation, present performance
results, and review application experiences, including record-setting
distributed simulations.Comment: 20 pages, 8 figure
Global Grids and Software Toolkits: A Study of Four Grid Middleware Technologies
Grid is an infrastructure that involves the integrated and collaborative use
of computers, networks, databases and scientific instruments owned and managed
by multiple organizations. Grid applications often involve large amounts of
data and/or computing resources that require secure resource sharing across
organizational boundaries. This makes Grid application management and
deployment a complex undertaking. Grid middlewares provide users with seamless
computing ability and uniform access to resources in the heterogeneous Grid
environment. Several software toolkits and systems have been developed, most of
which are results of academic research projects, all over the world. This
chapter will focus on four of these middlewares--UNICORE, Globus, Legion and
Gridbus. It also presents our implementation of a resource broker for UNICORE
as this functionality was not supported in it. A comparison of these systems on
the basis of the architecture, implementation model and several other features
is included.Comment: 19 pages, 10 figure
Scientific Computing Meets Big Data Technology: An Astronomy Use Case
Scientific analyses commonly compose multiple single-process programs into a
dataflow. An end-to-end dataflow of single-process programs is known as a
many-task application. Typically, tools from the HPC software stack are used to
parallelize these analyses. In this work, we investigate an alternate approach
that uses Apache Spark -- a modern big data platform -- to parallelize
many-task applications. We present Kira, a flexible and distributed astronomy
image processing toolkit using Apache Spark. We then use the Kira toolkit to
implement a Source Extractor application for astronomy images, called Kira SE.
With Kira SE as the use case, we study the programming flexibility, dataflow
richness, scheduling capacity and performance of Apache Spark running on the
EC2 cloud. By exploiting data locality, Kira SE achieves a 2.5x speedup over an
equivalent C program when analyzing a 1TB dataset using 512 cores on the Amazon
EC2 cloud. Furthermore, we show that by leveraging software originally designed
for big data infrastructure, Kira SE achieves competitive performance to the C
implementation running on the NERSC Edison supercomputer. Our experience with
Kira indicates that emerging Big Data platforms such as Apache Spark are a
performant alternative for many-task scientific applications
Developing High Performance Computing Resources for Teaching Cluster and Grid Computing courses
High-Performance Computing (HPC) and the ability to process large amounts of data are of
paramount importance for UK business and economy as outlined by Rt Hon David Willetts
MP at the HPC and Big Data conference in February 2014. However there is a shortage of
skills and available training in HPC to prepare and expand the workforce for the HPC and
Big Data research and development. Currently, HPC skills are acquired mainly by students
and staff taking part in HPC-related research projects, MSc courses, and at the dedicated
training centres such as Edinburgh University’s EPCC. There are few UK universities teaching
the HPC, Clusters and Grid Computing courses at the undergraduate level. To address the
issue of skills shortages in the HPC it is essential to provide teaching and training as part of
both postgraduate and undergraduate courses. The design and development of such courses is
challenging since the technologies and software in the fields of large scale distributed systems
such as Cluster, Cloud and Grid computing are undergoing continuous change. The students
completing the HPC courses should be proficient in these evolving technologies and equipped
with practical and theoretical skills for future jobs in this fast developing area.
In this paper we present our experience in developing the HPC, Cluster and Grid modules
including a review of existing HPC courses offered at the UK universities. The topics covered in
the modules are described, as well as the coursework projects based on practical laboratory work.
We conclude with an evaluation based on our experience over the last ten years in developing
and delivering the HPC modules on the undergraduate courses, with suggestions for future work
From Physics Model to Results: An Optimizing Framework for Cross-Architecture Code Generation
Starting from a high-level problem description in terms of partial
differential equations using abstract tensor notation, the Chemora framework
discretizes, optimizes, and generates complete high performance codes for a
wide range of compute architectures. Chemora extends the capabilities of
Cactus, facilitating the usage of large-scale CPU/GPU systems in an efficient
manner for complex applications, without low-level code tuning. Chemora
achieves parallelism through MPI and multi-threading, combining OpenMP and
CUDA. Optimizations include high-level code transformations, efficient loop
traversal strategies, dynamically selected data and instruction cache usage
strategies, and JIT compilation of GPU code tailored to the problem
characteristics. The discretization is based on higher-order finite differences
on multi-block domains. Chemora's capabilities are demonstrated by simulations
of black hole collisions. This problem provides an acid test of the framework,
as the Einstein equations contain hundreds of variables and thousands of terms.Comment: 18 pages, 4 figures, accepted for publication in Scientific
Programmin
- …