1,077 research outputs found
Many-Task Computing and Blue Waters
This report discusses many-task computing (MTC) generically and in the
context of the proposed Blue Waters systems, which is planned to be the largest
NSF-funded supercomputer when it begins production use in 2012. The aim of this
report is to inform the BW project about MTC, including understanding aspects
of MTC applications that can be used to characterize the domain and
understanding the implications of these aspects to middleware and policies.
Many MTC applications do not neatly fit the stereotypes of high-performance
computing (HPC) or high-throughput computing (HTC) applications. Like HTC
applications, by definition MTC applications are structured as graphs of
discrete tasks, with explicit input and output dependencies forming the graph
edges. However, MTC applications have significant features that distinguish
them from typical HTC applications. In particular, different engineering
constraints for hardware and software must be met in order to support these
applications. HTC applications have traditionally run on platforms such as
grids and clusters, through either workflow systems or parallel programming
systems. MTC applications, in contrast, will often demand a short time to
solution, may be communication intensive or data intensive, and may comprise
very short tasks. Therefore, hardware and software for MTC must be engineered
to support the additional communication and I/O and must minimize task dispatch
overheads. The hardware of large-scale HPC systems, with its high degree of
parallelism and support for intensive communication, is well suited for MTC
applications. However, HPC systems often lack a dynamic resource-provisioning
feature, are not ideal for task communication via the file system, and have an
I/O system that is not optimized for MTC-style applications. Hence, additional
software support is likely to be required to gain full benefit from the HPC
hardware
Large-scale grid-enabled lattice-Boltzmann simulations of complex fluid flow in porous media and under shear
Well designed lattice-Boltzmann codes exploit the essentially embarrassingly
parallel features of the algorithm and so can be run with considerable
efficiency on modern supercomputers. Such scalable codes permit us to simulate
the behaviour of increasingly large quantities of complex condensed matter
systems. In the present paper, we present some preliminary results on the large
scale three-dimensional lattice-Boltzmann simulation of binary immiscible fluid
flows through a porous medium derived from digitised x-ray microtomographic
data of Bentheimer sandstone, and from the study of the same fluids under
shear. Simulations on such scales can benefit considerably from the use of
computational steering and we describe our implementation of steering within
the lattice-Boltzmann code, called LB3D, making use of the RealityGrid steering
library. Our large scale simulations benefit from the new concept of capability
computing, designed to prioritise the execution of big jobs on major
supercomputing resources. The advent of persistent computational grids promises
to provide an optimal environment in which to deploy these mesoscale simulation
methods, which can exploit the distributed nature of compute, visualisation and
storage resources to reach scientific results rapidly; we discuss our work on
the grid-enablement of lattice-Boltzmann methods in this context.Comment: 17 pages, 6 figures, accepted for publication in
Phil.Trans.R.Soc.Lond.
The impact of supercomputers on experimentation: A view from a national laboratory
The relative roles of large scale scientific computers and physical experiments in several science and engineering disciplines are discussed. Increasing dependence on computers is shown to be motivated both by the rapid growth in computer speed and memory, which permits accurate numerical simulation of complex physical phenomena, and by the rapid reduction in the cost of performing a calculation, which makes computation an increasingly attractive complement to experimentation. Computer speed and memory requirements are presented for selected areas of such disciplines as fluid dynamics, aerodynamics, aerothermodynamics, chemistry, atmospheric sciences, astronomy, and astrophysics, together with some examples of the complementary nature of computation and experiment. Finally, the impact of the emerging role of computers in the technical disciplines is discussed in terms of both the requirements for experimentation and the attainment of previously inaccessible information on physical processes
DALiuGE: A Graph Execution Framework for Harnessing the Astronomical Data Deluge
The Data Activated Liu Graph Engine - DALiuGE - is an execution framework for
processing large astronomical datasets at a scale required by the Square
Kilometre Array Phase 1 (SKA1). It includes an interface for expressing complex
data reduction pipelines consisting of both data sets and algorithmic
components and an implementation run-time to execute such pipelines on
distributed resources. By mapping the logical view of a pipeline to its
physical realisation, DALiuGE separates the concerns of multiple stakeholders,
allowing them to collectively optimise large-scale data processing solutions in
a coherent manner. The execution in DALiuGE is data-activated, where each
individual data item autonomously triggers the processing on itself. Such
decentralisation also makes the execution framework very scalable and flexible,
supporting pipeline sizes ranging from less than ten tasks running on a laptop
to tens of millions of concurrent tasks on the second fastest supercomputer in
the world. DALiuGE has been used in production for reducing interferometry data
sets from the Karl E. Jansky Very Large Array and the Mingantu Ultrawide
Spectral Radioheliograph; and is being developed as the execution framework
prototype for the Science Data Processor (SDP) consortium of the Square
Kilometre Array (SKA) telescope. This paper presents a technical overview of
DALiuGE and discusses case studies from the CHILES and MUSER projects that use
DALiuGE to execute production pipelines. In a companion paper, we provide
in-depth analysis of DALiuGE's scalability to very large numbers of tasks on
two supercomputing facilities.Comment: 31 pages, 12 figures, currently under review by Astronomy and
Computin
Investigating an API for resilient exascale computing.
Increased HPC capability comes with increased complexity, part counts, and fault occurrences. In- creasing the resilience of systems and applications to faults is a critical requirement facing the viability of exascale systems, as the overhead of traditional checkpoint/restart is projected to outweigh its bene ts due to fault rates outpacing I/O bandwidths. As faults occur and propagate throughout hardware and software layers, pervasive noti cation and handling mechanisms are necessary. This report describes an initial investigation of fault types and programming interfaces to mitigate them. Proof-of-concept APIs are presented for the frequent and important cases of memory errors and node failures, and a strategy proposed for lesystem failures. These involve changes to the operating system, runtime, I/O library, and application layers. While a single API for fault handling among hardware and OS and application system-wide remains elusive, the e ort increased our understanding of both the mountainous challenges and the promising trailheads.
AstroGrid-D: Grid Technology for Astronomical Science
We present status and results of AstroGrid-D, a joint effort of
astrophysicists and computer scientists to employ grid technology for
scientific applications. AstroGrid-D provides access to a network of
distributed machines with a set of commands as well as software interfaces. It
allows simple use of computer and storage facilities and to schedule or monitor
compute tasks and data management. It is based on the Globus Toolkit middleware
(GT4). Chapter 1 describes the context which led to the demand for advanced
software solutions in Astrophysics, and we state the goals of the project. We
then present characteristic astrophysical applications that have been
implemented on AstroGrid-D in chapter 2. We describe simulations of different
complexity, compute-intensive calculations running on multiple sites, and
advanced applications for specific scientific purposes, such as a connection to
robotic telescopes. We can show from these examples how grid execution improves
e.g. the scientific workflow. Chapter 3 explains the software tools and
services that we adapted or newly developed. Section 3.1 is focused on the
administrative aspects of the infrastructure, to manage users and monitor
activity. Section 3.2 characterises the central components of our architecture:
The AstroGrid-D information service to collect and store metadata, a file
management system, the data management system, and a job manager for automatic
submission of compute tasks. We summarise the successfully established
infrastructure in chapter 4, concluding with our future plans to establish
AstroGrid-D as a platform of modern e-Astronomy.Comment: 14 pages, 12 figures Subjects: data analysis, image processing,
robotic telescopes, simulations, grid. Accepted for publication in New
Astronom
Adaptation, deployment and evaluation of a railway simulator in cloud environments
Many scientific areas make extensive use of computer simulations to study realworld
processes. As they become more complex and resource-intensive, traditional
programming paradigms running on supercomputers have shown to be limited by
their hardware resources.
The Cloud and its elastic nature has been increasingly seen as a valid alternative
for simulation execution, as it aims to provide virtually infinite resources, thus
unlimited scalability. In order to bene t from this, simulators must be adapted to
this paradigm since cloud migration tends to add virtualization and communication
overhead.
This work has the main objective of migrating a power consumption railway
simulator to the Cloud, with minimal impact in the original code and preserving
performance. We propose a data-centric adaptation based in MapReduce to distribute
the simulation load across several nodes while minimising data transmission.
We deployed our solution on an Amazon EC2 virtual cluster and measured its
performance. We did the same in in our local cluster to compare the solution's performance
against the original application when the Cloud's overhead is not present.
Our tests show that the resulting application is highly scalable and shows a better
overall performance regarding the original simulator in both environments.
This document summarises the author's work during the whole adaptation development
process .Ingeniería Informátic
Steering in computational science: mesoscale modelling and simulation
This paper outlines the benefits of computational steering for high
performance computing applications. Lattice-Boltzmann mesoscale fluid
simulations of binary and ternary amphiphilic fluids in two and three
dimensions are used to illustrate the substantial improvements which
computational steering offers in terms of resource efficiency and time to
discover new physics. We discuss details of our current steering
implementations and describe their future outlook with the advent of
computational grids.Comment: 40 pages, 11 figures. Accepted for publication in Contemporary
Physic
- …