28,037 research outputs found
Evaluation of Docker Containers for Scientific Workloads in the Cloud
The HPC community is actively researching and evaluating tools to support
execution of scientific applications in cloud-based environments. Among the
various technologies, containers have recently gained importance as they have
significantly better performance compared to full-scale virtualization, support
for microservices and DevOps, and work seamlessly with workflow and
orchestration tools. Docker is currently the leader in containerization
technology because it offers low overhead, flexibility, portability of
applications, and reproducibility. Singularity is another container solution
that is of interest as it is designed specifically for scientific applications.
It is important to conduct performance and feature analysis of the container
technologies to understand their applicability for each application and target
execution environment. This paper presents a (1) performance evaluation of
Docker and Singularity on bare metal nodes in the Chameleon cloud (2) mechanism
by which Docker containers can be mapped with InfiniBand hardware with RDMA
communication and (3) analysis of mapping elements of parallel workloads to the
containers for optimal resource management with container-ready orchestration
tools. Our experiments are targeted toward application developers so that they
can make informed decisions on choosing the container technologies and
approaches that are suitable for their HPC workloads on cloud infrastructure.
Our performance analysis shows that scientific workloads for both Docker and
Singularity based containers can achieve near-native performance. Singularity
is designed specifically for HPC workloads. However, Docker still has
advantages over Singularity for use in clouds as it provides overlay networking
and an intuitive way to run MPI applications with one container per rank for
fine-grained resources allocation
Deploying AI Frameworks on Secure HPC Systems with Containers
The increasing interest in the usage of Artificial Intelligence techniques
(AI) from the research community and industry to tackle "real world" problems,
requires High Performance Computing (HPC) resources to efficiently compute and
scale complex algorithms across thousands of nodes. Unfortunately, typical data
scientists are not familiar with the unique requirements and characteristics of
HPC environments. They usually develop their applications with high-level
scripting languages or frameworks such as TensorFlow and the installation
process often requires connection to external systems to download open source
software during the build. HPC environments, on the other hand, are often based
on closed source applications that incorporate parallel and distributed
computing API's such as MPI and OpenMP, while users have restricted
administrator privileges, and face security restrictions such as not allowing
access to external systems. In this paper we discuss the issues associated with
the deployment of AI frameworks in a secure HPC environment and how we
successfully deploy AI frameworks on SuperMUC-NG with Charliecloud.Comment: 6 pages, 2 figures, 2019 IEEE High Performance Extreme Computing
Conferenc
- …