491 research outputs found
Virtualizing the Stampede2 Supercomputer with Applications to HPC in the Cloud
Methods developed at the Texas Advanced Computing Center (TACC) are described
and demonstrated for automating the construction of an elastic, virtual cluster
emulating the Stampede2 high performance computing (HPC) system. The cluster
can be built and/or scaled in a matter of minutes on the Jetstream self-service
cloud system and shares many properties of the original Stampede2, including:
i) common identity management, ii) access to the same file systems, iii)
equivalent software application stack and module system, iv) similar job
scheduling interface via Slurm.
We measure time-to-solution for a number of common scientific applications on
our virtual cluster against equivalent runs on Stampede2 and develop an
application profile where performance is similar or otherwise acceptable. For
such applications, the virtual cluster provides an effective form of "cloud
bursting" with the potential to significantly improve overall turnaround time,
particularly when Stampede2 is experiencing long queue wait times. In addition,
the virtual cluster can be used for test and debug without directly impacting
Stampede2. We conclude with a discussion of how science gateways can leverage
the TACC Jobs API web service to incorporate this cloud bursting technique
transparently to the end user.Comment: 6 pages, 0 figures, PEARC '18: Practice and Experience in Advanced
Research Computing, July 22--26, 2018, Pittsburgh, PA, US
Software-Defined Cloud Computing: Architectural Elements and Open Challenges
The variety of existing cloud services creates a challenge for service
providers to enforce reasonable Software Level Agreements (SLA) stating the
Quality of Service (QoS) and penalties in case QoS is not achieved. To avoid
such penalties at the same time that the infrastructure operates with minimum
energy and resource wastage, constant monitoring and adaptation of the
infrastructure is needed. We refer to Software-Defined Cloud Computing, or
simply Software-Defined Clouds (SDC), as an approach for automating the process
of optimal cloud configuration by extending virtualization concept to all
resources in a data center. An SDC enables easy reconfiguration and adaptation
of physical resources in a cloud infrastructure, to better accommodate the
demand on QoS through a software that can describe and manage various aspects
comprising the cloud environment. In this paper, we present an architecture for
SDCs on data centers with emphasis on mobile cloud applications. We present an
evaluation, showcasing the potential of SDC in two use cases-QoS-aware
bandwidth allocation and bandwidth-aware, energy-efficient VM placement-and
discuss the research challenges and opportunities in this emerging area.Comment: Keynote Paper, 3rd International Conference on Advances in Computing,
Communications and Informatics (ICACCI 2014), September 24-27, 2014, Delhi,
Indi
Next Generation Cloud Computing: New Trends and Research Directions
The landscape of cloud computing has significantly changed over the last
decade. Not only have more providers and service offerings crowded the space,
but also cloud infrastructure that was traditionally limited to single provider
data centers is now evolving. In this paper, we firstly discuss the changing
cloud infrastructure and consider the use of infrastructure from multiple
providers and the benefit of decentralising computing away from data centers.
These trends have resulted in the need for a variety of new computing
architectures that will be offered by future cloud infrastructure. These
architectures are anticipated to impact areas, such as connecting people and
devices, data-intensive computing, the service space and self-learning systems.
Finally, we lay out a roadmap of challenges that will need to be addressed for
realising the potential of next generation cloud systems.Comment: Accepted to Future Generation Computer Systems, 07 September 201
Design Space Exploration and Resource Management of Multi/Many-Core Systems
The increasing demand of processing a higher number of applications and related data on computing platforms has resulted in reliance on multi-/many-core chips as they facilitate parallel processing. However, there is a desire for these platforms to be energy-efficient and reliable, and they need to perform secure computations for the interest of the whole community. This book provides perspectives on the aforementioned aspects from leading researchers in terms of state-of-the-art contributions and upcoming trends
Single-Board-Computer Clusters for Cloudlet Computing in Internet of Things
The number of connected sensors and devices is expected to increase to billions in the near
future. However, centralised cloud-computing data centres present various challenges to meet the
requirements inherent to Internet of Things (IoT) workloads, such as low latency, high throughput
and bandwidth constraints. Edge computing is becoming the standard computing paradigm for
latency-sensitive real-time IoT workloads, since it addresses the aforementioned limitations related
to centralised cloud-computing models. Such a paradigm relies on bringing computation close to
the source of data, which presents serious operational challenges for large-scale cloud-computing
providers. In this work, we present an architecture composed of low-cost Single-Board-Computer
clusters near to data sources, and centralised cloud-computing data centres. The proposed
cost-efficient model may be employed as an alternative to fog computing to meet real-time IoT
workload requirements while keeping scalability. We include an extensive empirical analysis to
assess the suitability of single-board-computer clusters as cost-effective edge-computing micro data
centres. Additionally, we compare the proposed architecture with traditional cloudlet and cloud
architectures, and evaluate them through extensive simulation. We finally show that acquisition costs
can be drastically reduced while keeping performance levels in data-intensive IoT use cases.Ministerio de Economía y Competitividad TIN2017-82113-C2-1-RMinisterio de Economía y Competitividad RTI2018-098062-A-I00European Union’s Horizon 2020 No. 754489Science Foundation Ireland grant 13/RC/209
RFaaS: RDMA-Enabled FaaS Platform for Serverless High-Performance Computing
The rigid MPI programming model and batch scheduling dominate
high-performance computing. While clouds brought new levels of elasticity into
the world of computing, supercomputers still suffer from low resource
utilization rates. To enhance supercomputing clusters with the benefits of
serverless computing, a modern cloud programming paradigm for pay-as-you-go
execution of stateless functions, we present rFaaS, the first RDMA-aware
Function-as-a-Service (FaaS) platform. With hot invocations and decentralized
function placement, we overcome the major performance limitations of FaaS
systems and provide low-latency remote invocations in multi-tenant
environments. We evaluate the new serverless system through a series of
microbenchmarks and show that remote functions execute with negligible
performance overheads. We demonstrate how serverless computing can bring
elastic resource management into MPI-based high-performance applications.
Overall, our results show that MPI applications can benefit from modern cloud
programming paradigms to guarantee high performance at lower resource costs
Accelerating Climate and Weather Simulations through Hybrid Computing
Unconventional multi- and many-core processors (e.g. IBM (R) Cell B.E.(TM) and NVIDIA (R) GPU) have emerged as effective accelerators in trial climate and weather simulations. Yet these climate and weather models typically run on parallel computers with conventional processors (e.g. Intel, AMD, and IBM) using Message Passing Interface. To address challenges involved in efficiently and easily connecting accelerators to parallel computers, we investigated using IBM's Dynamic Application Virtualization (TM) (IBM DAV) software in a prototype hybrid computing system with representative climate and weather model components. The hybrid system comprises two Intel blades and two IBM QS22 Cell B.E. blades, connected with both InfiniBand(R) (IB) and 1-Gigabit Ethernet. The system significantly accelerates a solar radiation model component by offloading compute-intensive calculations to the Cell blades. Systematic tests show that IBM DAV can seamlessly offload compute-intensive calculations from Intel blades to Cell B.E. blades in a scalable, load-balanced manner. However, noticeable communication overhead was observed, mainly due to IP over the IB protocol. Full utilization of IB Sockets Direct Protocol and the lower latency production version of IBM DAV will reduce this overhead
- …