
    A SOFTWARE DEFINED NETWORKING ARCHITECTURE FOR HIGH PERFORMANCE CLOUDS

    Multi-tenant clouds with resource virtualization offer elastic resources and eliminate the initial cost and time of cluster setup for applications. However, poor network performance, performance variation, and noisy neighbors are some of the challenges of executing high performance applications on public clouds. Utilizing these virtualized resources for scientific applications, which have complex communication patterns, requires low latency communication mechanisms and a rich set of communication constructs. To minimize the virtualization overhead, a novel approach for low latency networking for HPC clouds is proposed and implemented over a multi-technology software defined network. The efficiency of the proposed low-latency SDN is analyzed and evaluated for high performance applications. The results of the experiments show that the latest Mellanox FDR InfiniBand interconnect and the Mellanox OpenStack plugin give the best performance for implementing virtual machine based high performance clouds with large message sizes.

    Evaluation of the network performance in a high performance computing cloud

    Cloud services enable flexible use of resources. In Infrastructure-as-a-Service cloud services in particular, users can run their own applications in their own virtual machines and so customize the whole execution environment as needed. However, virtualization introduces an overhead that decreases the performance of computation and I/O device access. This work evaluates the network performance of such a cloud service. The service uses InfiniBand as its network interconnect, a technology often used in high performance computing clusters. The evaluation methods study network latency and throughput in different scenarios, in which the metrics are measured both with and without virtualization. The purpose of these scenarios is to identify the major sources of the introduced overhead. This work also evaluates SR-IOV technology, which presents a physical device as multiple virtual functions that can be assigned directly to virtual machines. The technology can be used to improve the performance of I/O devices in virtual machines. The InfiniBand devices used in this evaluation had only a development version of SR-IOV support, which was tested in the evaluated system. The evaluation results show that the tunneling protocol used and the lack of hardware support for virtualized I/O cause the biggest performance losses in the evaluated scenarios. Based on the results, the evaluated SR-IOV technology is recommended in all cases for improving performance.

    Performance characterization of containerization for HPC workloads on InfiniBand clusters: an empirical study

    Containerization technology offers an appealing alternative for encapsulating and operating applications (and all their dependencies) without the performance penalties of virtual machines and, as a result, has attracted the interest of the High-Performance Computing (HPC) community as a way to obtain fast, customized, portable, flexible, and reproducible deployments of their workloads. Previous work in this area has demonstrated that containerized HPC applications can exploit InfiniBand networks, but has ignored the potential of multi-container deployments, which partition the processes that belong to each application into multiple containers in each host. Partitioning HPC applications has proven useful when using virtual machines by constraining them to a single NUMA (Non-Uniform Memory Access) domain. This paper conducts a systematic study of the performance of multi-container deployments with different network fabrics and protocols, focusing especially on InfiniBand networks. We analyze the impact of container granularity and its potential to exploit processor and memory affinity to improve applications’ performance. Our results show that default Singularity can achieve near bare-metal performance but does not support fine-grain multi-container deployments. Docker and Singularity-instance behave similarly in terms of the performance of deployment schemes with different container granularity and affinity. This behavior differs across network fabrics and protocols, and also depends on the application communication patterns and the message size. Moreover, deployments on InfiniBand are more affected by computation and memory allocation and, because of that, can exploit affinity better. We thank Lenovo for providing the testbed to run the experiments in this paper. This work was partially supported by Lenovo as part of the Lenovo-BSC collaboration agreement, by the Spanish Government under contract PID2019-107255GB-C22, and by the Generalitat de Catalunya under contract 2017-SGR-1414 and under Grant No. 2020 FI-B 00257. Peer reviewed. Postprint (published version).

    HPC Cloud for Scientific and Business Applications: Taxonomy, Vision, and Research Challenges

    High Performance Computing (HPC) clouds are becoming an alternative to on-premise clusters for executing scientific applications and business analytics services. Most research efforts in HPC cloud aim to understand the cost-benefit of moving resource-intensive applications from on-premise environments to public cloud platforms. Industry trends show hybrid environments are the natural path to get the best of on-premise and cloud resources: steady (and sensitive) workloads can run on on-premise resources, and peak demand can leverage remote resources in a pay-as-you-go manner. Nevertheless, there are plenty of questions to be answered in HPC cloud, ranging from how to extract the best performance from an unknown underlying platform to which services are essential to make its usage easier. Moreover, the discussion of the right pricing and contractual models to fit small and large users is relevant for the sustainability of HPC clouds. This paper presents a survey and taxonomy of efforts in HPC cloud and a vision of what we believe is ahead of us, including a set of research challenges that, once tackled, can help advance businesses and scientific discoveries. This becomes particularly relevant given the fast-growing wave of new HPC applications coming from big data and artificial intelligence. Comment: 29 pages, 5 figures. Published in ACM Computing Surveys (CSUR).

    Supercomputing Frontiers

    This open access book constitutes the refereed proceedings of the 6th Asian Supercomputing Conference, SCFA 2020, which was planned to be held in February 2020; the physical conference was cancelled due to the COVID-19 pandemic. The 8 full papers presented in this book were carefully reviewed and selected from 22 submissions. They cover a range of topics including file systems, memory hierarchy, HPC cloud platform, container image configuration workflow, large-scale applications, and scheduling.

    Evaluating Performance of Serverless Virtualization

    Serverless computing has posed new challenges for cloud vendors that are difficult to solve with existing virtualization technologies. Maintaining security, resource isolation, backwards compatibility, and scalability is extremely difficult when the platform must also deliver native performance. This paper contains a literature review of recently published results on the performance of virtualization technologies such as KVM and Docker, and reports a DESMET benchmarking evaluation of KVM and Docker, as well as Firecracker and gVisor, which are used by Amazon Web Services and Google Cloud in their cloud services. The context for this research comes from education, where students submit their programming assignments to a source code repository system that triggers automated tests and potentially other tasks against the submitted code. The environment used consists of several software components, such as a web server, a database, and a job executor, and thus represents a common architecture in web-based applications. The results show that Docker is still the most performant virtualization technology among those selected. Additionally, Firecracker and gVisor perform better than KVM in some areas and are thus viable options for single-tenant environments. Lastly, applications that run untrusted code or otherwise have very high security requirements could benefit from using either Firecracker or gVisor.

    Enabling Hyperscale Web Services

    Modern web services such as social media, online messaging, web search, video streaming, and online banking often support billions of users, requiring data centers that scale to hundreds of thousands of servers, i.e., hyperscale. In fact, the world continues to expect hyperscale computing to drive more futuristic applications such as virtual reality, self-driving cars, conversational AI, and the Internet of Things. This dissertation presents technologies that will enable tomorrow’s web services to meet the world’s expectations. The key challenge in enabling hyperscale web services arises from two important trends. First, over the past few years, there has been a radical shift in hyperscale computing due to an unprecedented growth in data, users, and web service software functionality. Second, modern hardware can no longer support this growth in hyperscale trends due to a decline in hardware performance scaling. To enable this new hyperscale era, hardware architects must become more aware of hyperscale software needs and software researchers can no longer expect unlimited hardware performance scaling. In short, systems researchers can no longer follow the traditional approach of building each layer of the systems stack separately. Instead, they must rethink the synergy between the software and hardware worlds from the ground up. This dissertation establishes such a synergy to enable futuristic hyperscale web services. This dissertation bridges the software and hardware worlds, demonstrating the importance of that bridge in realizing efficient hyperscale web services via solutions that span the systems stack. The specific goal is to design software that is aware of new hardware constraints and architect hardware that efficiently supports new hyperscale software requirements. 
This dissertation spans two broad thrusts: (1) a software and (2) a hardware thrust to analyze the complex hyperscale design space and use insights from these analyses to design efficient cross-stack solutions for hyperscale computation. In the software thrust, this dissertation contributes uSuite, the first open-source benchmark suite of web services built with a new hyperscale software paradigm, that is used in academia and industry to study hyperscale behaviors. Next, this dissertation uses uSuite to study software threading implications in light of today’s hardware reality, identifying new insights in the age-old research area of software threading. Driven by these insights, this dissertation demonstrates how threading models must be redesigned at hyperscale by presenting an automated approach and tool, uTune, that makes intelligent run-time threading decisions. In the hardware thrust, this dissertation architects both commodity and custom hardware to efficiently support hyperscale software requirements. First, this dissertation characterizes commodity hardware’s shortcomings, revealing insights that influenced commercial CPU designs. Based on these insights, this dissertation presents an approach and tool, SoftSKU, that enables cheap commodity hardware to efficiently support new hyperscale software paradigms, improving the efficiency of real-world web services that serve billions of users, saving millions of dollars, and meaningfully reducing the global carbon footprint. This dissertation also presents a hardware-software co-design, uNotify, that redesigns commodity hardware with minimal modifications by using existing hardware mechanisms more intelligently to overcome new hyperscale overheads. Next, this dissertation characterizes how custom hardware must be designed at hyperscale, resulting in industry-academia benchmarking efforts, commercial hardware changes, and improved software development. 
Based on this characterization’s insights, this dissertation presents Accelerometer, an analytical model that estimates gains from hardware customization. Multiple hyperscale enterprises and hardware vendors use Accelerometer to make well-informed hardware decisions. PhD, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/169802/1/akshitha_1.pd