10,015 research outputs found
Parallel containers: a tool for applying parallel computing applications on clusters
Parallel and cluster computing remain somewhat difficult to apply quickly for many applications
domains. Recent developments in computer libraries such as the Standard Template
Library of the C++ language and the Message Passing Package associated with the Python
Language provide a way to implement very high level parallel containers in support of application
programming. A parallel container is an implementation of a data structure such as a
list, or vector, or set, that has associated with it the necessary methods and state knowledge
to distribute the contents of the structure across the memory of a parallel computer or a
computer cluster. A key idea is that of the parallel iterator which allows a single high level
statement written by the applications programmer to invoke a parallel operation across the
entire data structure’s contents while avoiding the need for knowledge of how the distribution
is actually carried out. This transparency approach means that optimised parallel algorithms
can be separated from the applications domain code, maximising reuse of the parallel computing
infrastructure and libraries. This paper describes our initial experiments with C++
parallel containers
Transparent Orchestration of Task-based Parallel Applications in Containers Platforms
This paper presents a framework to easily build and execute parallel applications in container-based distributed computing platforms in a user-transparent way. The proposed framework is a combination of the COMP Superscalar (COMPSs) programming model and runtime, which provides a straightforward way to develop task-based parallel applications from sequential codes, and containers management platforms that ease the deployment of applications in computing environments (as Docker, Mesos or Singularity). This framework provides scientists and developers with an easy way to implement parallel distributed applications and deploy them in a one-click fashion. We have built a prototype which integrates COMPSs with different containers engines in different scenarios: i) a Docker cluster, ii) a Mesos cluster, and iii) Singularity in an HPC cluster. We have evaluated the overhead in the building phase, deployment and execution of two benchmark applications compared to a Cloud testbed based on KVM and OpenStack and to the usage of bare metal nodes. We have observed an important gain in comparison to cloud environments during the building and deployment phases. This enables better adaptation of resources with respect to the computational load. In contrast, we detected an extra overhead during the execution, which is mainly due to the multi-host Docker networking.This work is partly supported by the Spanish Government through Programa Severo Ochoa (SEV-2015-0493), by the Spanish Ministry of Science and Technology through TIN2015-65316 project, by the Generalitat de Catalunya under contracts 2014-SGR-1051 and 2014-SGR-1272, and by the European Union through the Horizon 2020 research and innovation program under grant 690116 (EUBra-BIGSEA Project). Results presented in this paper were obtained using the Chameleon testbed supported by the National Science Foundation.Peer ReviewedPostprint (author's final draft
A Multi-GPU Programming Library for Real-Time Applications
We present MGPU, a C++ programming library targeted at single-node multi-GPU
systems. Such systems combine disproportionate floating point performance with
high data locality and are thus well suited to implement real-time algorithms.
We describe the library design, programming interface and implementation
details in light of this specific problem domain. The core concepts of this
work are a novel kind of container abstraction and MPI-like communication
methods for intra-system communication. We further demonstrate how MGPU is used
as a framework for porting existing GPU libraries to multi-device
architectures. Putting our library to the test, we accelerate an iterative
non-linear image reconstruction algorithm for real-time magnetic resonance
imaging using multiple GPUs. We achieve a speed-up of about 1.7 using 2 GPUs
and reach a final speed-up of 2.1 with 4 GPUs. These promising results lead us
to conclude that multi-GPU systems are a viable solution for real-time MRI
reconstruction as well as signal-processing applications in general.Comment: 15 pages, 10 figure
funcX: A Federated Function Serving Fabric for Science
Exploding data volumes and velocities, new computational methods and
platforms, and ubiquitous connectivity demand new approaches to computation in
the sciences. These new approaches must enable computation to be mobile, so
that, for example, it can occur near data, be triggered by events (e.g.,
arrival of new data), be offloaded to specialized accelerators, or run remotely
where resources are available. They also require new design approaches in which
monolithic applications can be decomposed into smaller components, that may in
turn be executed separately and on the most suitable resources. To address
these needs we present funcX---a distributed function as a service (FaaS)
platform that enables flexible, scalable, and high performance remote function
execution. funcX's endpoint software can transform existing clouds, clusters,
and supercomputers into function serving systems, while funcX's cloud-hosted
service provides transparent, secure, and reliable function execution across a
federated ecosystem of endpoints. We motivate the need for funcX with several
scientific case studies, present our prototype design and implementation, show
optimizations that deliver throughput in excess of 1 million functions per
second, and demonstrate, via experiments on two supercomputers, that funcX can
scale to more than more than 130000 concurrent workers.Comment: Accepted to ACM Symposium on High-Performance Parallel and
Distributed Computing (HPDC 2020). arXiv admin note: substantial text overlap
with arXiv:1908.0490
HPC Cloud for Scientific and Business Applications: Taxonomy, Vision, and Research Challenges
High Performance Computing (HPC) clouds are becoming an alternative to
on-premise clusters for executing scientific applications and business
analytics services. Most research efforts in HPC cloud aim to understand the
cost-benefit of moving resource-intensive applications from on-premise
environments to public cloud platforms. Industry trends show hybrid
environments are the natural path to get the best of the on-premise and cloud
resources---steady (and sensitive) workloads can run on on-premise resources
and peak demand can leverage remote resources in a pay-as-you-go manner.
Nevertheless, there are plenty of questions to be answered in HPC cloud, which
range from how to extract the best performance of an unknown underlying
platform to what services are essential to make its usage easier. Moreover, the
discussion on the right pricing and contractual models to fit small and large
users is relevant for the sustainability of HPC clouds. This paper brings a
survey and taxonomy of efforts in HPC cloud and a vision on what we believe is
ahead of us, including a set of research challenges that, once tackled, can
help advance businesses and scientific discoveries. This becomes particularly
relevant due to the fast increasing wave of new HPC applications coming from
big data and artificial intelligence.Comment: 29 pages, 5 figures, Published in ACM Computing Surveys (CSUR
- …