
    Resource-Efficient Replication and Migration of Virtual Machines.

    Continuous replication and live migration of Virtual Machines (VMs) are two vital tools in a virtualized environment, but they are resource-expensive. Continuously replicating a VM's checkpointed state to a backup host maintains high availability (HA) of the VM despite host failures, but checkpoint replication can generate significant network traffic. Each replicated VM also incurs a 100% memory overhead, since the backup unproductively reserves the same amount of memory to hold the redundant VM state. Live migration, though widely used for load balancing, power saving, and other purposes, can also generate excessive network traffic by transferring VM state iteratively. In addition, it can incur a long completion time and degrade application performance. This thesis explores ways to replicate VMs for HA using resources efficiently, and to migrate VMs fast, with minimal execution disruption and efficient resource use. First, we investigate the tradeoffs in using different compression methods to reduce the network traffic of checkpoint replication in an HA system. We evaluate gzip, delta, and similarity compression based on metrics that are specifically important in an HA system, and then suggest guidelines for their selection. Next, we propose HydraVM, a storage-based HA approach that eliminates the unproductive memory reservation made in backup hosts. HydraVM maintains a recent image of a protected VM in shared storage by taking and consolidating incremental VM checkpoints. When a failure occurs, HydraVM quickly resumes the execution of a failed VM by loading a small amount of essential VM state from storage. As the VM executes, the VM state not yet loaded is supplied on demand. Finally, we propose application-assisted live migration, which skips the transfer of VM memory that is not needed to execute the running applications at the destination. We develop a generic framework for the proposed approach, and then use the framework to build JAVMM, a system that migrates VMs running Java applications while skipping the transfer of garbage in Java memory. Our evaluation results show that, compared to Xen live migration, which is agnostic to running applications, JAVMM can reduce the completion time, network traffic, and application downtime caused by Java VM migration, all by up to over 90%. PhD, Computer Science and Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/111575/1/karenhou_1.pd
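
    To make the compression tradeoff concrete, the sketch below is a hypothetical Python illustration (not code from the thesis) of why delta compression can beat plain gzip for checkpoint replication: a page that changed only slightly since the previous checkpoint XORs down to mostly zeros, which compress very well.

        import os
        import zlib

        PAGE_SIZE = 4096

        def gzip_size(page: bytes) -> int:
            """Bytes sent if the page is simply compressed and replicated."""
            return len(zlib.compress(page))

        def delta_gzip_size(old: bytes, new: bytes) -> int:
            """Bytes sent if the page is XOR-delta encoded against the previous
            checkpoint and then compressed; small changes leave long zero runs."""
            delta = bytes(a ^ b for a, b in zip(old, new))
            return len(zlib.compress(delta))

        old_page = os.urandom(PAGE_SIZE)          # page in the previous checkpoint
        new_page = bytearray(old_page)
        new_page[100:116] = os.urandom(16)        # only 16 bytes dirtied since then
        new_page = bytes(new_page)

        print("gzip only :", gzip_size(new_page), "bytes")
        print("delta+gzip:", delta_gzip_size(old_page, new_page), "bytes")

    Broadly speaking, similarity compression applies the same idea across pages, delta-encoding a page against a similar reference page rather than only against that page's own previous version.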

    Small cell cloud proof of concept implementation and monitoring schemes analysis

    Cloud Computing has grown exponentially in popularity in the last few years, becoming a key technology for both personal and enterprise applications due to the numerous benefits it offers. On the other hand, Small Cell technology is considered by many to be the solution to the challenges expected to arise from the continuously increasing number of interconnected mobile devices. This project presents a basic design and a proof-of-concept implementation of a Small Cell Cloud, a current research field in mobile communications that aims to leverage the parallel and distributed computation capabilities of Cloud Computing to enhance Small Cell functionality. The purpose of the described Small Cell Cloud is to allow mobile devices to offload applications to Small Cells, enabling the execution of more resource-demanding applications while reducing energy consumption on those devices. Furthermore, a detailed analysis of different Small Cell monitoring schemes is carried out, comparing the performance achieved with each of them in terms of data reliability and generated network traffic. Finally, based on the proof-of-concept implementation and a series of stress performance tests, conclusions are presented on the viability of the proposed Small Cell Cloud design and the most appropriate monitoring scheme. Guidelines for future research work are also provided, considering the work developed in this project as a first step towards a new mobile technology. Ingeniería de Telecomunicació
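
    As a back-of-the-envelope illustration of the monitoring tradeoff studied in the project (the parameter values below are assumptions, not measurements from the implementation), a periodic-push scheme generates traffic inversely proportional to its reporting interval, while longer intervals mean staler, less reliable monitoring data:

        # Rough traffic estimate for periodic push monitoring of Small Cells.
        # All parameter values are illustrative assumptions.
        NUM_CELLS = 50            # monitored Small Cells
        REPORT_BYTES = 512        # size of one status report, headers included

        def traffic_bytes_per_s(report_interval_s: float) -> float:
            """Aggregate monitoring traffic generated by all Small Cells."""
            return NUM_CELLS * REPORT_BYTES / report_interval_s

        for interval in (1, 5, 30, 60):
            print(f"interval {interval:>2}s -> {traffic_bytes_per_s(interval):8.1f} B/s, "
                  f"data up to {interval}s stale")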

    Programming Persistent Memory

    Beginning and experienced programmers will use this comprehensive guide to persistent memory programming. You will understand how persistent memory brings together several new software/hardware requirements and offers great promise for better performance and faster application startup times: a huge leap forward in byte-addressable capacity compared with current DRAM offerings. This revolutionary new technology gives applications significant performance and capacity improvements over existing technologies. It requires a new way of thinking and developing, which makes this highly disruptive to the IT/computing industry. The full spectrum of industry sectors that will benefit from this technology includes, but is not limited to, in-memory and traditional databases, AI, analytics, HPC, virtualization, and big data. Programming Persistent Memory describes the technology and why it is exciting the industry. It covers the operating system and hardware requirements as well as how to create development environments using emulated or real persistent memory hardware. The book explains fundamental concepts; provides an introduction to persistent memory programming APIs for C, C++, JavaScript, and other languages; discusses RDMA with persistent memory; reviews security features; and presents many examples. Source code and examples that you can run on your own systems are included. What you'll learn: understand what persistent memory is, what it does, and the value it brings to the industry; become familiar with the operating system and hardware requirements to use persistent memory; know the fundamentals of persistent memory programming, why it is different from current programming methods, and what developers need to keep in mind when programming for persistence; look at persistent memory application development by example using the Persistent Memory Development Kit (PMDK); design and optimize data structures for persistent memory; study how real-world applications are modified to leverage persistent memory; utilize the tools available for persistent memory programming, application performance profiling, and debugging. Who this book is for: C, C++, Java, and Python developers; it will also be useful to software, cloud, and hardware architects across a broad spectrum of sectors, including cloud service providers, independent software vendors, high-performance computing, artificial intelligence, data analytics, big data, etc.
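
    The book's own examples target the PMDK C/C++ libraries; as a language-neutral sketch of the underlying idea (stores become durable only after the affected range is explicitly flushed, rather than by issuing write() calls), here is a minimal Python analogue using an ordinary memory-mapped file. The file path and record layout are made up for the example, and mmap.flush() stands in for what pmem_persist() does on real persistent memory.

        import mmap
        import os
        import struct

        PATH = "/tmp/pmem_demo.bin"   # stand-in for a file on a DAX filesystem
        SIZE = 4096

        # Create and map the backing file. On real persistent memory the mapping
        # would let loads and stores reach the media directly.
        with open(PATH, "wb") as f:
            f.truncate(SIZE)
        fd = os.open(PATH, os.O_RDWR)
        buf = mmap.mmap(fd, SIZE)

        # Update a record in place with ordinary stores...
        struct.pack_into("<Q", buf, 0, 42)      # 8-byte counter at offset 0

        # ...then explicitly flush the affected range so the update is durable.
        buf.flush(0, mmap.PAGESIZE)

        buf.close()
        os.close(fd)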

    Sinatra: Stateful Instantaneous Updates for Commercial Browsers Through Multi-Version eXecution

    Browsers are the main way in which most users experience the internet, which makes them a prime target for malicious entities. The best defense for the common user is to keep their browser always up-to-date, installing updates as soon as they are available. Unfortunately, updating a browser is disruptive because it results in loss of user state. Even though modern browsers reopen all pages (tabs) after an update to minimize inconvenience, this approach still loses all local user state in each page (e.g., contents of unsubmitted forms, including associated JavaScript validation state) and assumes that pages can be refreshed and result in the same contents. We believe this is an important barrier that keeps users from updating their browsers as frequently as possible. In this paper, we present the design, implementation, and evaluation of Sinatra, which supports instantaneous browser updates that do not result in any data loss through a novel Multi-Version eXecution (MVX) approach for JavaScript programs, combined with a sophisticated proxy. Sinatra works in pure JavaScript and does not require any browser support, so it works on closed-source browsers, and it requires only trivial changes to each target page, which can be automated. First, Sinatra captures all the non-determinism available to a JavaScript program (e.g., event handlers executed, expired timers, invocations of Math.random). Our evaluation shows that Sinatra requires 6MB to store such events, and the memory grows at a modest rate of 253KB/s as the user keeps interacting with each page. When an update becomes available, Sinatra transfers the state by re-executing the same set of non-deterministic events on the new browser. During this time, which can be as long as 1.5 seconds, Sinatra uses MVX to allow the user to keep interacting with the old browser. Finally, Sinatra changes the roles in less than 10ms, and the user starts interacting with the new browser, effectively performing a browser update with zero downtime and no loss of state.
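
    To illustrate the record/replay idea behind Sinatra's state transfer, here is a simplified Python analogue (Sinatra itself does this for JavaScript sources of non-determinism such as Math.random, timers, and event handlers; the names below are hypothetical):

        import random

        event_log = []   # recorded non-deterministic events, kept per page

        def recorded_random() -> float:
            """Old version: draw a random number and log it for later replay."""
            value = random.random()
            event_log.append(("random", value))
            return value

        def replayed_random(log_iter) -> float:
            """New version: consume the logged value instead of drawing a fresh one,
            so the new instance converges to exactly the same state as the old one."""
            kind, value = next(log_iter)
            assert kind == "random"
            return value

        # The old version runs and accumulates state...
        old_state = [recorded_random() for _ in range(3)]

        # ...an update arrives: the new version replays the log and reaches the same state.
        log_iter = iter(event_log)
        new_state = [replayed_random(log_iter) for _ in range(3)]
        assert new_state == old_state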

    Performance Benchmarking of Application Monitoring Frameworks

    Application-level monitoring of continuously operating software systems provides insights into their dynamic behavior, helping to maintain their performance and availability during runtime. Such monitoring may cause a significant runtime overhead to the monitored system, depending on the number and location of the instrumentation probes used. In order to improve a system's instrumentation and to reduce the monitoring overhead it causes, it is necessary to know the performance impact of each probe. While many monitoring frameworks claim to have minimal impact on performance, these claims are often not backed up by a detailed performance evaluation determining the actual cost of monitoring. Benchmarks can be used as an effective and affordable way to perform these evaluations. However, no benchmark specifically targeting the overhead of monitoring itself exists. Furthermore, no established benchmark engineering methodology exists that provides guidelines for the design, execution, and analysis of benchmarks. This thesis introduces a benchmark approach to measure the performance overhead of application-level monitoring frameworks. The core contributions of this approach are (1) a definition of common causes of monitoring overhead, (2) a general benchmark engineering methodology, (3) the MooBench micro-benchmark to measure and quantify causes of monitoring overhead, and (4) detailed performance evaluations of three different application-level monitoring frameworks. Extensive experiments demonstrate the feasibility and practicality of the approach and validate the benchmark results. The developed benchmark is available as open source software, and the results of all experiments are available for download to facilitate further validation and replication of the results.
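
    A toy version of the kind of measurement such a micro-benchmark performs (illustrative only; MooBench itself targets Java-based monitoring frameworks) compares a call wrapped in a minimal probe against an uninstrumented baseline:

        import time
        import timeit

        def plain_work():
            """Baseline without instrumentation."""
            return sum(range(100))

        def monitored_work():
            """The same work, wrapped in a minimal probe collecting entry/exit timestamps."""
            t_enter = time.perf_counter_ns()      # monitoring overhead: data collection
            result = sum(range(100))              # the monitored operation itself
            t_exit = time.perf_counter_ns()
            _record = (t_enter, t_exit)           # a real framework would also write this out
            return result

        baseline = timeit.timeit(plain_work, number=100_000)
        monitored = timeit.timeit(monitored_work, number=100_000)
        print(f"baseline : {baseline:.3f}s")
        print(f"monitored: {monitored:.3f}s")
        print(f"overhead : {100 * (monitored - baseline) / baseline:.1f}% per call")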

    Softwarization of Large-Scale IoT-based Disasters Management Systems

    The Internet of Things (IoT) enables objects to interact and cooperate with each other to reach common objectives. It is very useful in large-scale disaster management systems, where humans are likely to fail when they attempt to perform search and rescue operations in high-risk sites. IoT can indeed play a critical role in all phases of large-scale disasters (i.e., preparedness, relief, and recovery). Network softwarization aims at designing, architecting, deploying, and managing network components primarily based on software programmability properties. It relies on key technologies such as cloud computing, Network Functions Virtualization (NFV), and Software Defined Networking (SDN). The key benefits are agility and cost efficiency. This thesis proposes softwarization approaches to tackle the key challenges related to large-scale IoT-based disaster management systems. A first challenge faced by large-scale IoT disaster management systems is the dynamic formation of an optimal coalition of IoT devices for the tasks at hand. Meeting this challenge is critical for cost efficiency. A second challenge is interoperability: IoT environments remain highly heterogeneous, yet the IoT devices need to interact. Yet another challenge is Quality of Service (QoS); disaster management applications are known to be very QoS-sensitive, especially when it comes to delay. To tackle the first challenge, we propose a cloud-based architecture that enables the formation of efficient coalitions of IoT devices for search and rescue tasks. The proposed architecture enables the publication and discovery of IoT devices belonging to different cloud providers, and it also comes with a coalition formation algorithm. For the second challenge, we propose an NFV- and SDN-based architecture for on-the-fly IoT gateway provisioning. The gateway functions are provisioned as Virtual Network Functions (VNFs) that are chained on-the-fly in the IoT domain using SDN. For the third challenge, we rely on fog computing to meet the QoS requirements and propose algorithms that provision IoT application components in hybrid NFV-based cloud/fog settings. Both stationary and mobile fog nodes are considered. In the case of mobile fog nodes, a Tabu Search-based heuristic is proposed; it finds a near-optimal solution, and we show numerically that it is faster than the Integer Linear Programming (ILP) solution by several orders of magnitude.
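
    As a rough sketch of how a Tabu Search heuristic of this kind explores component-to-fog-node placements (a generic skeleton with a made-up cost function and parameters, not the thesis's algorithm):

        import random

        def tabu_search(components, nodes, cost, iters=200, tabu_len=10):
            """Generic Tabu Search over placements mapping each component to a node."""
            current = {c: random.choice(nodes) for c in components}
            best, best_cost = dict(current), cost(current)
            tabu = []                                   # recent moves we refuse to undo
            for _ in range(iters):
                comp = random.choice(components)        # neighbourhood: move one component
                candidates = [n for n in nodes if (comp, n) not in tabu]
                if not candidates:
                    continue
                node = min(candidates, key=lambda n: cost({**current, comp: n}))
                tabu.append((comp, current[comp]))      # forbid moving it straight back
                tabu[:] = tabu[-tabu_len:]
                current[comp] = node
                if cost(current) < best_cost:
                    best, best_cost = dict(current), cost(current)
            return best, best_cost

        # Toy usage: minimise total (made-up) delay of two components over two fog nodes.
        delay = {("detector", "fog1"): 3, ("detector", "fog2"): 1,
                 ("aggregator", "fog1"): 2, ("aggregator", "fog2"): 4}
        placement, total_delay = tabu_search(
            ["detector", "aggregator"], ["fog1", "fog2"],
            lambda p: sum(delay[(c, n)] for c, n in p.items()))
        print(placement, total_delay)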

    Datacenter Architectures for the Microservices Era

    Modern internet services are shifting away from single-binary, monolithic services into numerous loosely-coupled microservices that interact via Remote Procedure Calls (RPCs), to improve the programmability, reliability, manageability, and scalability of cloud services. Computer system designers face many new challenges with microservice-based architectures, as individual RPCs/tasks last only a few microseconds in most microservices. In this dissertation, I seek to address the most notable challenges that arise from the dissimilarities between modern microservice-based and classic monolithic cloud services, and to design novel server architectures and runtime systems that enable efficient execution of µs-scale microservices on modern hardware. In the first part of my dissertation, I address the problem of Killer Microseconds, which refers to µs-scale “holes” in CPU schedules caused by stalls to access fast I/O devices or brief idle times between requests in high-throughput µs-scale microservices. Whereas modern computing platforms can efficiently hide ns-scale and ms-scale stalls through micro-architectural techniques and OS context switching, they lack efficient support to hide the latency of µs-scale stalls. In chapter II, I propose Duplexity, a heterogeneous server architecture that employs aggressive multithreading to hide the latency of killer microseconds without sacrificing the Quality-of-Service (QoS) of latency-sensitive microservices. Duplexity achieves 1.9× higher core utilization and 2.7× lower iso-throughput 99th-percentile tail latency over an SMT-based server design, on average. In chapters III-IV, I comprehensively investigate the problem of tail latency in the context of microservices and address multiple aspects of it. First, in chapter III, I characterize the tail latency behavior of microservices and provide general guidelines for optimizing computer systems from a queuing perspective to minimize tail latency. Queuing is a major contributor to end-to-end tail latency, wherein nominal tasks are enqueued behind rare, long ones, due to Head-of-Line (HoL) blocking. Next, in chapter IV, I introduce Q-Zilla, a scheduling framework to tackle tail latency from a queuing perspective, and CoreZilla, a microarchitectural instantiation of the framework. Q-Zilla is composed of the ServerQueue Decoupled Size-Interval Task Assignment (SQD-SITA) scheduling algorithm and the Express-lane Simultaneous Multithreading (ESMT) microarchitecture, which together address HoL blocking by providing an “express-lane” for short tasks, protecting them from queuing behind rare, long ones. By combining the ESMT microarchitecture and the SQD-SITA scheduling algorithm, CoreZilla improves tail latency over a conventional SMT core with 2, 4, and 8 contexts by 2.25×, 3.23×, and 4.38×, on average, respectively, and outperforms a theoretical 32-core scale-up organization by 12%, on average, with 8 contexts. Finally, in chapters V-VI, I investigate the tail latency problem of microservices from a cluster, rather than server-level, perspective. Whereas Service Level Objectives (SLOs) define end-to-end latency targets for the entire service to ensure user satisfaction, with microservice-based applications it is unclear how to scale individual microservices when end-to-end SLOs are violated or underutilized. I introduce Parslo, an analytical framework for partial SLO allocation in virtualized cloud microservices. Parslo takes a microservice graph as input and employs a Gradient Descent-based approach to allocate “partial SLOs” to different microservice nodes, enabling independent auto-scaling of individual microservices. Parslo achieves the optimal solution, minimizing the total cost for the entire service deployment, and is applicable to general microservice graphs. PhD, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/167978/1/miramir_1.pd
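
    To make the “express-lane” idea concrete, here is a small queueing simulation (an illustrative analogue of size-interval task assignment, not the Q-Zilla implementation; arrival and service parameters are invented) showing how reserving one of two servers for short tasks protects them from queuing behind rare, long ones:

        import random

        random.seed(1)
        INTERARRIVAL = 20                        # one task arrives every 20 time units
        # (arrival time, service time): mostly short tasks, 2% very long ones.
        tasks = [(i * INTERARRIVAL, 1000 if random.random() < 0.02 else 10)
                 for i in range(5000)]

        def shared_fifo(tasks):
            """Two servers drain one FIFO queue; short tasks can sit behind long ones."""
            free = [0.0, 0.0]
            waits = []
            for arrival, service in tasks:
                i = free.index(min(free))        # head of queue takes the next free server
                start = max(arrival, free[i])
                waits.append((start - arrival, service))
                free[i] = start + service
            return waits

        def sita(tasks, cutoff=100):
            """Size-interval split: one server becomes an express lane for short tasks."""
            free_short = free_long = 0.0
            waits = []
            for arrival, service in tasks:
                if service <= cutoff:
                    start = max(arrival, free_short)
                    free_short = start + service
                else:
                    start = max(arrival, free_long)
                    free_long = start + service
                waits.append((start - arrival, service))
            return waits

        def p99_short_wait(waits, cutoff=100):
            short = sorted(w for w, s in waits if s <= cutoff)
            return short[int(0.99 * len(short))]

        print("shared FIFO  p99 short-task wait:", p99_short_wait(shared_fifo(tasks)))
        print("express lane p99 short-task wait:", p99_short_wait(sita(tasks)))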

    Intelligent Load Balancing in Cloud Computer Systems

    Cloud computing is an established technology that allows users to share resources on a scale never before seen in IT history. A cloud system connects multiple individual servers in order to process related tasks in several environments at the same time. Clouds are typically more cost-effective than single computers of comparable computing performance. The sheer physical size of the system itself means that thousands of machines may be involved. The focus of this research was to design a strategy to dynamically allocate tasks without overloading Cloud nodes, so that system stability is maintained at minimum cost. This research has added the following new contributions to the state of knowledge: (i) a novel taxonomy and categorisation of three classes of schedulers, namely OS-level, Cluster and Big Data, which highlight their unique evolution and underline their different objectives; (ii) an abstract model of cloud resource utilisation is specified, including multiple types of resources and consideration of task migration costs; (iii) virtual machine live migration was studied experimentally in order to derive a formula that estimates the network traffic generated by this process; (iv) a high-fidelity Cloud workload simulator, based on month-long workload traces from Google's computing cells, was created; (v) two possible approaches to resource management were proposed and examined in the practical part of the manuscript: a centralised metaheuristic load balancer and a decentralised agent-based system. The project involved extensive experiments run on the University of Westminster HPC cluster, and the promising results are presented together with detailed discussions and a conclusion.
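
    The thesis's own traffic-estimation formula is not reproduced here; as a hedged illustration, the sketch below implements the standard pre-copy live-migration model, in which the first round transfers all of the VM's memory, each subsequent round retransmits the pages dirtied during the previous round, and the rounds stop once the remaining dirty data is small enough for a brief stop-and-copy. All parameter values are assumptions.

        def precopy_traffic(mem_bytes, dirty_rate, bandwidth, max_rounds=30,
                            stop_threshold=50 * 1024 * 1024):
            """Estimate total network traffic of pre-copy live migration.

            mem_bytes      -- VM memory transferred in the first round (bytes)
            dirty_rate     -- rate at which the VM dirties memory (bytes/second)
            bandwidth      -- migration link bandwidth (bytes/second)
            stop_threshold -- remaining dirty data below which the VM is paused (bytes)
            """
            total = 0
            to_send = mem_bytes
            for _ in range(max_rounds):
                total += to_send
                round_time = to_send / bandwidth
                to_send = dirty_rate * round_time    # pages dirtied while copying
                if to_send <= stop_threshold:
                    break
            return total + to_send                   # final stop-and-copy round

        # Example: 8 GiB VM, 200 MiB/s dirty rate, 1 GiB/s link (all illustrative).
        GiB, MiB = 1024 ** 3, 1024 ** 2
        print(f"estimated traffic: {precopy_traffic(8 * GiB, 200 * MiB, GiB) / GiB:.2f} GiB")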