17 research outputs found
Resource-Efficient Replication and Migration of Virtual Machines.
Continuous replication and live migration of Virtual Machines (VMs) are two vital tools in a virtualized environment, but they are resource-expensive. Continuously replicating a VM's checkpointed state to a backup host maintains high-availability (HA) of the VM despite host failures, but checkpoint replication can generate significant network traffic. Each replicated VM also incurs a 100% memory overhead, since the backup unproductively reserves the same amount of memory to hold the redundant VM state. Live migration, though being widely used for load-balancing, power-saving, etc., can also generate excessive network traffic, by transferring VM state iteratively. In addition, it can incur a long completion time and degrade application performance.
This thesis explores ways to replicate VMs for HA using resources efficiently, and to migrate VMs fast, with minimal execution disruption and using resources efficiently. First, we investigate the tradeoffs in using different compression methods to reduce the network traffic of checkpoint replication in a HA system. We evaluate gzip, delta and similarity compressions based on metrics that are specifically important in a HA system, and then suggest guidelines for their selection.
Next, we propose HydraVM, a storage-based HA approach that eliminates the unproductive memory reservation made in backup hosts. HydraVM maintains a recent image of a protected VM in a shared storage by taking and consolidating incremental VM checkpoints. When a failure occurs, HydraVM quickly resumes the execution of a failed VM by loading a small amount of essential VM state from the storage. As the VM executes, the VM state not yet loaded is supplied on-demand.
Finally, we propose application-assisted live migration, which skips transfer of VM memory that need not be migrated to execute running applications at the destination. We develop a generic framework for the proposed approach, and then use the framework to build JAVMM, a system that migrates VMs running Java applications skipping transfer of garbage in Java memory. Our evaluation results show that compared to Xen live migration, which is agnostic of running applications, JAVMM can reduce the completion time, network traffic and application downtime caused by Java VM migration, all by up to over 90%.PhDComputer Science and EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/111575/1/karenhou_1.pd
Small cell cloud proof of concept implementation and monitoring schemes analysis
Cloud Computing has grown exponentially in popularity in the last few years, becoming a key technology for both personal and enterprise applications due to the numerous benefits it offers. On the other hand, Small Cell technology is considered by many to be the solution to the challenges that are expected to arise caused by the continuously increasing number of interconnected mobile devices.
This project presents a basic design and a proof of concept implementation of a Small Cell Cloud, a current research field on mobile communications that aims to leverage the capabilities offered by the parallel and distributed computation of Cloud Computing to enhance Small Cells functionality.
The purpose of the described Small Cell Cloud is to allow application offloading of mobile devices to Small Cells, allowing the execution of more resource demanding applications at the same time energy consumption is reduced in those devices.
Furthermore, a detailed analysis on different Small Cell monitoring schemes is carried out, comparing the achieved performance with each of them in terms of data reliability and generated network traffic.
Finally, based on the proof of concept implementation and a series of stress performance test, conclusions on the viability of the proposed Small Cell Cloud design and the most appropriate monitoring scheme are presented. Guidelines for future research work are also provided, considering the work developed in this project as a first step towards a new mobile technology.IngenierĂa de TelecomunicaciĂł
Recommended from our members
Transiency-driven Resource Management for Cloud Computing Platforms
Modern distributed server applications are hosted on enterprise or cloud data centers that provide computing, storage, and networking capabilities to these applications. These applications are built using the implicit assumption that the underlying servers will be stable and normally available, barring for occasional faults. In many emerging scenarios, however, data centers and clouds only provide transient, rather than continuous, availability of their servers. Transiency in modern distributed systems arises in many contexts, such as green data centers powered using renewable intermittent sources, and cloud platforms that provide lower-cost transient servers which can be unilaterally revoked by the cloud operator.
Transient computing resources are increasingly important, and existing fault-tolerance and resource management techniques are inadequate for transient servers because applications typically assume continuous resource availability. This thesis presents research in distributed systems design that treats transiency as a first-class design principle. I show that combining transiency-specific fault-tolerance mechanisms with resource management policies to suit application characteristics and requirements, can yield significant cost and performance benefits. These mechanisms and policies have been implemented and prototyped as part of software systems, which allow a wide range of applications, such as interactive services and distributed data processing, to be deployed on transient servers, and can reduce cloud computing costs by up to 90\%.
This thesis makes contributions to four areas of computer systems research: transiency-specific fault-tolerance, resource allocation, abstractions, and resource reclamation. For reducing the impact of transient server revocations, I develop two fault-tolerance techniques that are tailored to transient server characteristics and application requirements. For interactive applications, I build a derivative cloud platform that masks revocations by transparently moving application-state between servers of different types. Similarly, for distributed data processing applications, I investigate the use of application level periodic checkpointing to reduce the performance impact of server revocations. For managing and reducing the risk of server revocations, I investigate the use of server portfolios that allow transient resource allocation to be tailored to application requirements.
Finally, I investigate how resource providers (such as cloud platforms) can provide transient resource availability without revocation, by looking into alternative resource reclamation techniques. I develop resource deflation, wherein a server\u27s resources are fractionally reclaimed, allowing the application to continue execution albeit with fewer resources. Resource deflation generalizes revocation, and the deflation mechanisms and cluster-wide policies can yield both high cluster utilization and low application performance degradation
Programming Persistent Memory
Beginning and experienced programmers will use this comprehensive guide to persistent memory programming. You will understand how persistent memory brings together several new software/hardware requirements, and offers great promise for better performance and faster application startup times—a huge leap forward in byte-addressable capacity compared with current DRAM offerings. This revolutionary new technology gives applications significant performance and capacity improvements over existing technologies. It requires a new way of thinking and developing, which makes this highly disruptive to the IT/computing industry. The full spectrum of industry sectors that will benefit from this technology include, but are not limited to, in-memory and traditional databases, AI, analytics, HPC, virtualization, and big data. Programming Persistent Memory describes the technology and why it is exciting the industry. It covers the operating system and hardware requirements as well as how to create development environments using emulated or real persistent memory hardware. The book explains fundamental concepts; provides an introduction to persistent memory programming APIs for C, C++, JavaScript, and other languages; discusses RMDA with persistent memory; reviews security features; and presents many examples. Source code and examples that you can run on your own systems are included. What You’ll Learn Understand what persistent memory is, what it does, and the value it brings to the industry Become familiar with the operating system and hardware requirements to use persistent memory Know the fundamentals of persistent memory programming: why it is different from current programming methods, and what developers need to keep in mind when programming for persistence Look at persistent memory application development by example using the Persistent Memory Development Kit (PMDK) Design and optimize data structures for persistent memory Study how real-world applications are modified to leverage persistent memory Utilize the tools available for persistent memory programming, application performance profiling, and debugging Who This Book Is For C, C++, Java, and Python developers, but will also be useful to software, cloud, and hardware architects across a broad spectrum of sectors, including cloud service providers, independent software vendors, high performance compute, artificial intelligence, data analytics, big data, etc
Sinatra: Stateful Instantaneous Updates for Commercial Browsers Through Multi-Version eXecution
Browsers are the main way in which most users experience the internet, which makes them a prime target for malicious entities. The best defense for the common user is to keep their browser always up-to-date, installing updates as soon as they are available. Unfortunately, updating a browser is disruptive as it results in loss of user state. Even though modern browsers reopen all pages (tabs) after an update to minimize inconvenience, this approach still loses all local user state in each page (e.g., contents of unsubmitted forms, including associated JavaScript validation state) and assumes that pages can be refreshed and result in the same contents. We believe this is an important barrier that keeps users from updating their browsers as frequently as possible.
In this paper, we present the design, implementation, and evaluation of Sinatra, which supports instantaneous browser updates that do not result in any data loss through a novel Multi-Version eXecution (MVX) approach for JavaScript programs, combined with a sophisticated proxy. Sinatra works in pure JavaScript, does not require any browser support, thus works on closed-source browsers, and requires trivial changes to each target page, that can be automated. First, Sinatra captures all the non-determinism available to a JavaScript program (e.g., event handlers executed, expired timers, invocations of Math.random). Our evaluation shows that Sinatra requires 6MB to store such events, and the memory grows at a modest rate of 253KB/s as the user keeps interacting with each page. When an update becomes available, Sinatra transfer the state by re-executing the same set of non-deterministic events on the new browser. During this time, which can be as long as 1.5 seconds, Sinatra uses MVX to allow the user to keep interacting with the old browser. Finally, Sinatra changes the roles in less than 10ms, and the user starts interacting with the new browser, effectively performing a browser update with zero downtime and no loss of state
Performance Benchmarking of Application Monitoring Frameworks
Application-level monitoring of continuously operating software systems provides insights into their dynamic behavior, helping to maintain their performance and availability during runtime. Such monitoring may cause a significant runtime overhead to the monitored system, depending on the number and location of used instrumentation probes. In order to improve a system’s instrumentation and to reduce the caused monitoring overhead, it is necessary to know the performance impact of each probe. While many monitoring frameworks are claiming to have minimal impact on the performance, these claims are often not backed up with a detailed performance evaluation determining the actual cost of monitoring. Benchmarks can be used as an effective and affordable way for these evaluations. However, no benchmark specifically targeting the overhead of monitoring itself exists. Furthermore, no established benchmark engineering methodology exists that provides guidelines for the design, execution, and analysis of benchmarks. This thesis introduces a benchmark approach to measure the performance overhead of application-level monitoring frameworks. The core contributions of this approach are 1) a definition of common causes of monitoring overhead, 2) a general benchmark engineering methodology, 3) the MooBench micro-benchmark to measure and quantify causes of monitoring overhead, and 4) detailed performance evaluations of three different application-level monitoring frameworks. Extensive experiments demonstrate the feasibility and practicality of the approach and validate the benchmark results. The developed benchmark is available as open source software and the results of all experiments are available for download to facilitate further validation and replication of the results
Softwarization of Large-Scale IoT-based Disasters Management Systems
The Internet of Things (IoT) enables objects to interact and cooperate with each other for reaching common objectives. It is very useful in large-scale disaster management systems where humans are likely to fail when they attempt to perform search and rescue operations in high-risk sites. IoT can indeed play a critical role in all phases of large-scale disasters (i.e. preparedness, relief, and recovery). Network softwarization aims at designing, architecting, deploying, and managing network components primarily based on software programmability properties. It relies on key technologies, such as cloud computing, Network Functions Virtualization (NFV), and Software Defined Networking (SDN). The key benefits are agility and cost efficiency. This thesis proposes softwarization approaches to tackle the key challenges related to large-scale IoT based disaster management systems.
A first challenge faced by large-scale IoT disaster management systems is the dynamic formation of an optimal coalition of IoT devices for the tasks at hand. Meeting this challenge is critical for cost efficiency. A second challenge is an interoperability. IoT environments remain highly heterogeneous. However, the IoT devices need to interact. Yet another challenge is Quality of Service (QoS). Disaster management applications are known to be very QoS sensitive, especially when it comes to delay.
To tackle the first challenge, we propose a cloud-based architecture that enables the formation of efficient coalitions of IoT devices for search and rescue tasks. The proposed architecture enables the publication and discovery of IoT devices belonging to different cloud providers. It also comes with a coalition formation algorithm. For the second challenge, we propose an NFV and SDN based - architecture for on-the-fly IoT gateway provisioning. The gateway functions are provisioned as Virtual Network Functions (VNFs) that are chained on-the-fly in the IoT domain using SDN. When it comes to the third challenge, we rely on fog computing to meet the QoS and propose algorithms that provision IoT applications components in hybrid NFV based - cloud/fogs. Both stationary and mobile fog nodes are considered. In the case of mobile fog nodes, a Tabu Search-based heuristic is proposed. It finds a near-optimal solution and we numerically show that it is faster than the Integer Linear Programming (ILP) solution by several orders of magnitude
Recommended from our members
Finding, Measuring, and Reducing Inefficiencies in Contemporary Computer Systems
Computer systems have become increasingly diverse and specialized in recent years. This complexity supports a wide range of new computing uses and users, but is not without cost: it has become difficult to maintain the efficiency of contemporary general purpose computing systems. Computing inefficiencies, which include nonoptimal runtimes, excessive energy use, and limits to scalability, are a serious problem that can result in an inability to apply computing to solve the world's most important problems. Beyond the complexity and vast diversity of modern computing platforms and applications, a number of factors make improving general purpose efficiency challenging, including the requirement that multiple levels of the computer system stack be examined, that legacy hardware devices and software may stand in the way of achieving efficiency, and the need to balance efficiency with reusability, programmability, security, and other goals.
This dissertation presents five case studies, each demonstrating different ways in which the measurement of emerging systems can provide actionable advice to help keep general purpose computing efficient. The first of the five case studies is Parallel Block Vectors, a new profiling method for understanding parallel programs with a fine-grained, code-centric perspective aids in both future hardware design and in optimizing software to map better to existing hardware. Second is a project that defines a new way of measuring application interference on a datacenter's worth of chip-multiprocessors, leading to improved scheduling where applications can more effectively utilize available hardware resources. Next is a project that uses the GT-Pin tool to define a method for accelerating the simulation of GPGPUs, ultimately allowing for the development of future hardware with fewer inefficiencies. The fourth project is an experimental energy survey that compares and combines the latest energy efficiency solutions at different levels of the stack to properly evaluate the state of the art and to find paths forward for future energy efficiency research. The final project presented is NRG-Loops, a language extension that allows programs to measure and intelligently adapt their own power and energy use
Datacenter Architectures for the Microservices Era
Modern internet services are shifting away from single-binary, monolithic services into numerous loosely-coupled microservices that interact via Remote Procedure Calls (RPCs), to improve programmability, reliability, manageability, and scalability of cloud services.
Computer system designers are faced with many new challenges with microservice-based architectures, as individual RPCs/tasks are only a few microseconds in most microservices. In this dissertation, I seek to address the most notable challenges that arise due to the dissimilarities of the modern microservice based and classic monolithic cloud services, and design novel server architectures and runtime systems that enable efficient execution of µs-scale microservices on modern hardware.
In the first part of my dissertation, I seek to address the problem of Killer Microseconds, which refers to µs-scale “holes” in CPU schedules caused by stalls to access fast I/O devices or brief idle times between requests in high throughput µs-scale microservices. Whereas modern computing platforms can efficiently hide ns-scale and ms-scale stalls through micro-architectural techniques and OS context switching, they lack efficient support to hide the latency of µs-scale stalls. In chapter II, I propose Duplexity, a heterogeneous server architecture that employs aggressive multithreading to hide the latency of killer microseconds, without sacrificing the Quality-of-Service (QoS) of latency-sensitive microservices. Duplexity is able to achieve 1.9× higher core utilization and 2.7× lower iso-throughput 99th-percentile tail latency over an SMT-based server design, on average.
In chapters III-IV, I comprehensively investigate the problem of tail latency in the context of microservices and address multiple aspects of it. First, in chapter III, I characterize the tail latency behavior of microservices and provide general guidelines for optimizing computer systems from a queuing perspective to minimize tail latency. Queuing is a major contributor to end-to-end tail latency, wherein nominal tasks are enqueued behind rare, long ones, due to Head-of-Line (HoL) blocking. Next, in chapter IV, I introduce Q-Zilla,
a scheduling framework to tackle tail latency from a queuing perspective, and CoreZilla, a microarchitectural instantiation of the framework. Q-Zilla is composed of the ServerQueue Decoupled Size-Interval Task Assignment (SQD-SITA) scheduling algorithm and the Express-lane Simultaneous Multithreading (ESMT) microarchitecture, which together seek to address HoL blocking by providing an “express-lane” for short tasks, protecting them from queuing behind rare, long ones. By combining the ESMT microarchitecture and the SQD-SITA scheduling algorithm, CoreZilla is able to improves tail latency over a conventional SMT core with 2, 4, and 8 contexts by 2.25×, 3.23×, and 4.38×, on average, respectively, and outperform a theoretical 32-core scale-up organization by 12%, on average, with 8 contexts.
Finally, in chapters V-VI, I investigate the tail latency problem of microservices from a cluster, rather than server-level, perspective. Whereas Service Level Objectives (SLOs) define end-to-end latency targets for the entire service to ensure user satisfaction, with microservice-based applications, it is unclear how to scale individual microservices when end-to-end SLOs are violated or underutilized. I introduce Parslo as an analytical framework for partial SLO allocation in virtualized cloud microservices. Parslo takes a microservice
graph as an input and employs a Gradient Descent-based approach to allocate “partial SLOs” to different microservice nodes, enabling independent auto-scaling of individual microservices. Parslo achieves the optimal solution, minimizing the total cost for the entire service deployment, and is applicable to general microservice graphs.PHDComputer Science & EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/167978/1/miramir_1.pd
Intelligent Load Balancing in Cloud Computer Systems
Cloud computing is an established technology allowing users to share resources on a large scale, never before seen in IT history. A cloud system connects multiple individual servers in order to process related tasks in several environments at the same time. Clouds are typically more cost-effective than single computers of comparable computing performance. The sheer physical size of the system itself means that thousands of machines may be involved. The focus of this research was to design a strategy to dynamically allocate tasks without overloading Cloud nodes which would result in system stability being maintained at minimum cost. This research has added the following new contributions to the state of knowledge: (i) a novel taxonomy and categorisation of three classes of schedulers, namely OS-level, Cluster and Big Data, which highlight their unique evolution and underline their different objectives; (ii) an abstract model of cloud resources utilisation is specified, including multiple types of resources and consideration of task migration costs; (iii) a virtual machine live migration was experimented with in order to create a formula which estimates the network traffic generated by this process; (iv) a high-fidelity Cloud workload simulator, based on a month-long workload traces from Google's computing cells, was created; (v) two possible approaches to resource management were proposed and examined in the practical part of the manuscript: the centralised metaheuristic load balancer and the decentralised agent-based system. The project involved extensive experiments run on the University of Westminster HPC cluster, and the promising results are presented together with detailed discussions and a conclusion