
7 research outputs found

Enabling Fairness in Cloud Computing Infrastructures

Author: Kannan Ram Srivatsa

Cloud computing has emerged as a key technology over the past few years, as evidenced by the fact that 93% of organizations are either running applications on or experimenting with Infrastructure-as-a-Service (IaaS) clouds. To meet the demands of a large target audience, IaaS cloud service providers consolidate applications belonging to multiple tenants. However, consolidated applications interfere with each other's performance because they compete for shared resources, violating the QoS of the executing tenants. This dissertation investigates the implications of interference in consolidated cloud computing environments to enable fairness in the execution of applications across tenants. In this context, this dissertation identifies three key issues in cloud computing infrastructures. We observe that tenants using IaaS public clouds share multi-core datacenter servers. In such a situation, we identify that the applications belonging to tenants might compete for shared architectural resources like Last Level Cache (LLC) and bandwidth to memory, slowing down the applications. This calls for a technique that can accurately estimate the slowdown in execution time caused by multi-tenant execution. Such slowdown estimates can be used to bill tenants appropriately, enabling fairness among tenants. For private datacenters, where performance degradation cannot be tolerated, it becomes critical to detect interference and investigate its root cause. Under such circumstances, there is a need for a real-time, lightweight and scalable mechanism that can detect performance degradation and identify the root-cause resource that applications are contending for (I/O, network, CPU, shared cache). Finally, the advent of microservice computing environments calls for rethinking resource management strategies in multi-tenant execution scenarios. Specifically, we observe that the visibility enabled by the microservices execution framework can be exploited to achieve high throughput and resource utilization while still meeting Service Level Agreements (SLAs) in multi-tenant execution scenarios. To enable this, we propose techniques that can dynamically batch and reorder requests propagating through individual microservice stages within an application.

PhD, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. https://deepblue.lib.umich.edu/bitstream/2027.42/149844/1/ramsri_1.pd

Deep Blue Documents at the University of Michigan

System Abstractions for Scalable Application Development at the Edge

Author: Hu Bo
Publication venue: EliScholar – A Digital Platform for Scholarly Publishing at Yale
Publication date: 01/04/2022

Recent years have witnessed an explosive growth of Internet of Things (IoT) devices, which collect or generate huge amounts of data. Given diverse device capabilities and application requirements, data processing takes place across a range of settings, from on-device to a nearby edge server/cloud and remote cloud. Consequently, edge-cloud coordination has been studied extensively from the perspectives of job placement, scheduling and joint optimization. Typical approaches focus on performance optimization for individual applications. This not only requires domain knowledge of the applications but also leads to application-specific solutions. Application development and deployment over diverse scenarios thus incur repetitive manual effort. There are two overarching challenges in providing system-level support for application development at the edge. First, there is inherent heterogeneity at the device hardware level. The execution settings may range from a small cluster as an edge cloud to on-device inference on embedded devices, differing in hardware capability and programming environments. Further, application performance requirements vary significantly, making it even more difficult to map different applications to already heterogeneous hardware. Second, there are trends towards incorporating edge and cloud and multi-modal data. Together, these add further dimensions to the design space and increase the complexity significantly. In this thesis, we propose a novel framework to simplify application development and deployment over a continuum of edge to cloud. Our framework provides key connections between different dimensions of design considerations, corresponding to the application abstraction, data abstraction and resource management abstraction respectively. First, our framework masks hardware heterogeneity with abstract resource types through containerization, and abstracts away the application processing pipelines into generic flow graphs. Our framework further supports a notion of degradable computing for application scenarios at the edge that are driven by multimodal sensory input. Next, as video analytics is the killer app of edge computing, we include a generic data management service between video query systems and a video store to organize video data at the edge. We propose a video data unit abstraction based on a notion of distance between objects in the video, quantifying the semantic similarity among video data. Last, considering concurrent application execution, our framework supports multi-application offloading with device-centric control, with a userspace scheduler service that wraps over the operating system scheduler.

Yale University

Power Modeling and Resource Optimization in Virtualized Environments

Author: ABDUL SALAM HUMAIRA
Publication venue: Università degli studi di Genova
Publication date: 26/02/2020

The provisioning of on-demand cloud services has revolutionized the IT industry. This emerging paradigm has drastically increased the growth of data centers (DCs) worldwide. Consequently, this rising number of DCs is contributing to a large amount of world total power consumption. This has directed the attention of researchers and service providers to investigate a power-aware solution for the deployment and management of these systems and networks. However, these solutions could be beneficial only if derived from a precisely estimated power consumption at run-time. Accuracy in power estimation is a challenge in virtualized environments due to the lack of certainty of actual resources consumed by virtualized entities and of their impact on applications' performance. The heterogeneous cloud, composed of multi-tenancy architecture, has also raised several management challenges for both service providers and their clients. Task scheduling and resource allocation in such a system are considered as an NP-hard problem. The inappropriate allocation of resources causes the under-utilization of servers, hence reducing throughput and energy efficiency. In this context, the cloud framework needs an effective management solution to maximize the use of available resources and capacity, and also to reduce the impact of their carbon footprint on the environment with reduced power consumption. This thesis addresses the issues of power measurement and resource utilization in virtualized environments as two primary objectives. At first, a survey on prior work of server power modeling and methods in virtualization architectures is carried out. This helps investigate the key challenges that elude the precision of power estimation when dealing with virtualized entities. A different systematic approach is then presented to improve the prediction accuracy in these networks, considering the resource abstraction at different architectural levels. Resource usage monitoring at the host and guest helps in identifying the difference in performance between the two. Using virtual Performance Monitoring Counters (vPMCs) at a guest level provides detailed information that helps in improving the prediction accuracy and can be further used for resource optimization, consolidation and load balancing. Later, the research also targets the critical issue of optimal resource utilization in cloud computing. This study seeks a generic, robust but simple approach to deal with resource allocation in cloud computing and networking. The inappropriate scheduling in the cloud causes under- and over-utilization of resources which in turn increases the power consumption and also degrades the system performance. This work first addresses some of the major challenges related to task scheduling in heterogeneous systems. After a critical analysis of existing approaches, this thesis presents a rather simple scheduling scheme based on the combination of heuristic solutions. Improved resource utilization with reduced processing time can be achieved using the proposed energy-efficient scheduling algorithm.

Archivio istituzionale della ricerca - Università di Genova

TACKLING PERFORMANCE AND SECURITY ISSUES FOR CLOUD STORAGE SYSTEMS

Author: Kang Luyi
Publication date: 01/01/2022

Building data-intensive applications and emerging computing paradigms (e.g., Machine Learning (ML), Artificial Intelligence (AI), Internet of Things (IoT)) in cloud computing environments is becoming the norm, given the many advantages in scalability, reliability, security and performance. However, under rapid changes in applications, system middleware and underlying storage devices, service providers are facing new challenges to deliver performance and security isolation in the context of shared resources among multiple tenants. The gap between the decades-old storage abstraction and modern storage devices keeps widening, calling for software/hardware co-designs to approach more effective performance and security protocols. This dissertation rethinks the storage subsystem from device level to system level and proposes new designs at different levels to tackle performance and security issues for cloud storage systems. In the first part, we present an event-based SSD (Solid State Drive) simulator that models modern protocols, firmware and storage backend in detail. The proposed simulator can capture the nuances of SSD internal states under various I/O workloads, which helps researchers understand the impact of various SSD designs and workload characteristics on end-to-end performance. In the second part, we study the security challenges of shared in-storage computing infrastructures. Many cloud providers offer isolation at multiple levels to secure data and instances; however, security measures in emerging in-storage computing infrastructures have not been studied. We first investigate the attacks that could be conducted by offloaded in-storage programs in a multi-tenancy cloud environment. To defend against these attacks, we build a lightweight Trusted Execution Environment, IceClave, to enable security isolation between in-storage programs and internal flash management functions. We show that while enforcing security isolation in the SSD controller with minimal hardware cost, IceClave still keeps the performance benefit of in-storage computing by delivering up to 2.4x better performance than the conventional host-based trusted computing approach. In the third part, we investigate the performance interference problem caused by other tenants' I/O flows. We demonstrate that I/O resource sharing can often lead to performance degradation and instability. The block device abstraction fails to expose SSD parallelism and pass application requirements. To this end, we propose a software/hardware co-design to enforce performance isolation by bridging the semantic gap. Our design can significantly improve QoS (Quality of Service) by reducing throughput penalties and tail latency spikes. Lastly, we explore more effective I/O control to address contention in the storage software stack. We illustrate that the state-of-the-art resource control mechanism, Linux cgroups, is insufficient for controlling I/O resources. Inappropriate cgroup configurations may even hurt the performance of co-located workloads under memory-intensive scenarios. We add kernel support for limiting page cache usage per cgroup and achieving I/O proportionality.

Digital Repository at the University of Maryland

Control over the Cloud : Offloading, Elastic Computing, and Predictive Control

Author: Skarin Per
Publication venue: Department of Automatic Control, Lund University
Publication date: 24/11/2021

The thesis studies the use of cloud native software and platforms to implement critical closed loop control. It considers technologies that provide low latency and reliable wireless communication, in terms of edge clouds and massive MIMO, but also approaches industrial IoT and the services of a distributed cloud, as an extension of commercial off-the-shelf software and systems. First, the thesis defines the cloud control challenge, as control over the cloud and controller offloading. This is followed by a demonstration of closed loop control, using MPC, running on a testbed representing the distributed cloud. The testbed is implemented using an IoT device, clouds, next generation wireless technology, and a distributed execution platform. Platform details are provided and feasibility of the approach is shown. Evaluation includes relocating an on-line MPC to various locations in the distributed cloud. Offloaded control is examined next, through further evaluation of cloud native software and frameworks. This is followed by three controller designs, tailored for use with the cloud. The first controller solves MPC problems in parallel, to implement a variable horizon controller. The second is a hierarchical design, in which rate switching is used to implement constrained control, with a local and a remote mode. The third design focuses on reliability. Here, the MPC problem is extended to include recovery paths that represent a fallback mode. This is used by a control client if it experiences connectivity issues. An implementation is detailed and examined. In the final part of the thesis, the focus is on latency and congestion. A cloud control client can experience long and variable delays, from network and computations, and used services can become overloaded. These problems are approached by using predicted control inputs, dynamically adjusting the control frequency, and using horizontal scaling of the cloud service. Several examples are shown through simulation and on real clouds, including admitting control clients into a cluster that becomes temporarily overloaded.
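
The idea of falling back on predicted control inputs when the cloud is late can be illustrated with a minimal sketch. It is not the thesis's implementation: the class name `PredictedInputBuffer`, its methods, and the `safe_input` default are all hypothetical, and the sketch assumes that each cloud response carries the input for the current period followed by the controller's predictions for later periods.

```python
from collections import deque
from typing import Deque, Sequence


class PredictedInputBuffer:
    """Hypothetical client-side buffer of control inputs (illustration only)."""

    def __init__(self, safe_input: float = 0.0) -> None:
        self._plan: Deque[float] = deque()
        self._safe_input = safe_input  # assumed fallback when the plan runs out

    def on_response(self, inputs: Sequence[float]) -> None:
        # A fresh response replaces whatever is left of the old plan.
        # inputs[0] is for the current period, the rest are predictions.
        self._plan = deque(inputs)

    def next_input(self) -> float:
        # Called once per control period. If no new response arrived,
        # the next predicted input from the last plan is used instead.
        if self._plan:
            return self._plan.popleft()
        return self._safe_input


buffer = PredictedInputBuffer(safe_input=0.0)
buffer.on_response([1.0, 0.8, 0.5])   # response arrives on time
applied = [buffer.next_input() for _ in range(4)]  # next three responses are missed
```

In this usage the client applies the fresh input once, then the two stored predictions, and only reverts to the assumed safe input once the plan is exhausted.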

Lund University Publications

Adaptive CPU Allocation for Resource Isolation and Work Conservation

Author: Guo Cong
Publication venue: University of Waterloo
Publication date: 22/06/2020

Consolidating multiple workloads on the same physical machine is an effective measure for utilizing resources efficiently and reducing costs. The main objective is to execute multiple demanding workloads using no more resources than necessary while simultaneously maximizing performance. Conventional work-conserving resource managers are designed for this purpose. However, without adequate control, the performance of consolidated workloads may degrade dramatically or become unpredictable because of contention for shared resources. Hence, resource isolation should be enforced according to a sharing policy when there is resource contention among workloads, i.e., each workload should obtain a theoretical share of resources. In reality, it is challenging for state-of-the-art resource managers to achieve both resource isolation and work conservation simultaneously due to complex and dynamic workloads. This thesis proposes adaptive resource allocation to address this sharing problem and studies CPU management as an example. A novel feedback-based resource manager is designed to perform adaptive allocation of CPU resources, taking into account each workload's requirements. First, an application-agnostic metric is proposed as the feedback signal, which can be used to measure the performance change of various applications in a non-invasive and timely way. Second, two alternative feedback-based algorithms are designed to search for the optimal resource allocation for each workload. The adaptive allocation is modelled as a dynamic optimization problem. The algorithms solve this problem by assessing performance changes in response to a change in resource allocation. The algorithms are demonstrated to be capable of handling complex and dynamic workloads. The resource manager proposed in this thesis uses these algorithms to determine the CPU allocation for multiple tenants. A prototype is implemented with four different sharing policies. For three common policies, the experimental evaluation confirms that the resource manager can achieve resource isolation and work conservation simultaneously, while the existing best-practice mechanisms cannot. Moreover, the resource manager can support a novel efficiency policy, which determines CPU sharing based on the overall system efficiency. In addition, a preliminary study shows that the feedback-based methodology for CPU management can be extended to control I/O bandwidth.
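
The general idea of searching for an allocation by observing how performance responds to allocation changes can be sketched as follows. This is a plain step-down search written for illustration, not either of the thesis's two algorithms; the function `adapt_cpu_share`, its parameters, and the synthetic `measure` callback are all assumptions.

```python
from typing import Callable


def adapt_cpu_share(measure: Callable[[float], float],
                    share: float = 1.0,
                    step: float = 0.1,
                    tolerance: float = 0.05,
                    min_share: float = 0.1) -> float:
    """Shrink a workload's CPU share while its performance stays near baseline.

    `measure(share)` is an assumed callback that returns a performance
    metric (higher is better) observed under the given CPU share.
    """
    baseline = measure(share)
    while share - step >= min_share - 1e-9:
        candidate = share - step
        perf = measure(candidate)
        if perf < baseline * (1.0 - tolerance):
            break  # performance dropped noticeably: keep the previous share
        share = candidate  # no meaningful loss: release the extra CPU
    return round(share, 6)


def synthetic_workload(share: float) -> float:
    # Hypothetical workload that cannot use more than 40% of the CPU.
    return min(share, 0.4) / 0.4


chosen = adapt_cpu_share(synthetic_workload)
```

With the synthetic workload above, the search stops at a share of about 0.4, since any smaller share reduces the measured performance by more than the tolerance. A real manager would repeat the search as workloads change, which this sketch leaves out.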

University of Waterloo's Institutional Repository

Cloud Computing Technologies with Emphasis on the Dynamic Evaluation of Provided Services Based on Application Performance Analysis and Benchmarking

Author: Ευαγγελινού Αθανασία-Χαραλαμπία
Publication date: 26/04/2017

DSpace at NTUA