292 research outputs found

    Experimental Performance Evaluation of Cloud-Based Analytics-as-a-Service

    A growing number of Analytics-as-a-Service solutions have recently emerged in the landscape of cloud-based services. These services allow flexible composition of compute and storage components to create powerful data ingestion and processing pipelines. This work is a first attempt at an experimental evaluation of the performance of analytic applications executed over a wide range of storage service configurations. We present an intuitive notion of data locality, which we use as a proxy to rank different service compositions in terms of expected performance. Through an empirical analysis, we dissect the performance achieved by analytic workloads and unveil problems due to the impedance mismatch that arises in some configurations. Our work paves the way to a better understanding of modern cloud-based analytic services and their performance, both for their end users and for their providers.
    Comment: Longer version of the paper in Submission at IEEE CLOUD'1

    High availability using virtualization

    High availability has always been one of the main problems for a data center. Until now, high availability has been achieved through host-by-host redundancy, a method that is highly expensive in terms of hardware and human costs. Virtualization offers a new approach to the problem: it makes it possible to build a redundancy system for all the services running in a data center. This approach distributes the running virtual machines across the servers that are up, exploiting the features of the virtualization layer: starting, stopping, and moving virtual machines between physical hosts. The system (3RC) is based on a finite state machine with hysteresis, which makes it possible to restart each virtual machine on any physical host or to reinstall it from scratch. A complete infrastructure has been developed to install the operating system and middleware in a few minutes. To virtualize the main servers of a data center, a new procedure has been developed to migrate physical hosts to virtual ones. The whole SNS-PISA Grid data center is currently running in a virtual environment under the high availability system. As an extension of the 3RC architecture, several storage solutions, from NAS to SAN, have been tested to store and centralize all the virtual disks, ensuring data safety and access from everywhere. By exploiting virtualization and the ability to automatically reinstall a host, we provide a form of on-demand host, where action is taken on a virtual machine only when a disaster occurs.
    Comment: PhD Thesis in Information Technology Engineering: Electronics, Computer Science, Telecommunications, pp. 94, University of Pisa [Italy]

    An Analysis of Storage Virtualization

    Investigating technologies and writing extensive documentation on their capabilities is like hitting a moving target. Technology evolves and expands what it can do every day, which makes it very difficult to draw a line and compare competing technologies. Storage virtualization is one of those moving targets. Large corporations develop software and hardware solutions that try to one-up the competition by releasing firmware and patch updates that include their latest developments. Recent innovations include differing RAID levels, virtualized storage, data compression, data deduplication, file deduplication, thin provisioning, new file system types, tiered storage, solid state disks, and software updates that align these technologies with their applicable hardware. Even data center environmental considerations, such as reusable energies, data center environmental characteristics, and geographic location, are being used by companies both small and large to reduce operating costs and limit environmental impact. Companies are even moving to entirely cloud-based setups to limit their environmental impact, since maintaining their own corporate infrastructure can be cost-prohibitive. The trifecta of integrating smart storage architectures that include storage virtualization technologies, reducing footprint to promote energy savings, and migrating to cloud-based services will ensure a long-term sustainable storage subsystem.

    Differential virtualization for large-scale system modeling

    Today's computer networks have become more complex than ever, with a vast number of connected host systems running a variety of operating systems and services. Academia and industry alike recognize that education in managing such complex systems is extremely important for computer professionals, because computers involve many levels of detailed configuration. Configuration points can occur in every phase of a computer system's life, including the design, implementation, and maintenance stages. To explore hypotheses about configurations, computer professionals and researchers employ system modeling: they build test environments. Modeling environments require observable systems that can be configured easily and quickly. Observation abilities increase through re-use and preservation of models. Historical modeling solutions do not use computing resources efficiently and incur high preservation or restoration costs as the number of modeled systems increases. This research compares a workstation-oriented virtualization modeling solution based on system differences with a workstation-oriented imaging modeling solution based on full system states. The solutions are compared on computing resource utilization and administrative cost with respect to the number of modeled systems. Our experiments show that, when the number of models increases from 30 to 60, the imaging solution requires an additional 75 minutes, whereas the difference-based virtualization solution requires an additional three (3) minutes. The imaging solution requires 151 minutes to prepare 60 models, while the difference-based virtualization solution requires 7 minutes to prepare 60 models. Therefore, the cost of model archival and restoration in the difference-based virtualization modeling solution is lower than in the full system imaging-based modeling solution.
    In addition, with a virtualization solution, multiple systems can be modeled on a single workstation, which increases workstation resource utilization. Since virtualization abstracts hardware, virtualized models are less dependent on physical hardware, and this lower hardware dependency makes a virtualized model more re-usable than a traditional system image. If an organization must perform system modeling and has sufficient workstation resources, a differential virtualization approach will decrease the time required for model preservation, increase resource utilization, and therefore provide an efficient, scalable, and modular modeling solution.

    Client-side Flash Caching for Cloud Systems


    Flash Caching for Cloud Computing Systems

    As the size of cloud systems and the number of hosted virtual machines (VMs) rapidly grow, the scalability of shared VM storage systems becomes a serious issue. Client-side flash-based caching has the potential to improve the performance of cloud VM storage by employing flash storage available on the VM hosts to exploit the locality inherent in VM IOs. However, there are several challenges to the effective use of flash caching in cloud systems. First, cache configurations such as size, write policy, metadata persistency, and RAID level have significant impacts on flash caching. Second, the typical capacity of flash devices is limited compared to the dataset size of consolidated VMs. Finally, flash devices wear out and face serious endurance issues, which are aggravated by their use for caching. This dissertation presents research addressing these problems of cloud flash caching in three aspects. First, it presents a thorough study of different cache configurations, including a new cache-optimized RAID configuration, using a large amount of long-term traces collected from real-world public and private clouds. Second, it studies an on-demand flash cache management solution for meeting VM cache demands and minimizing device wear-out. It uses a new cache demand model, Reuse Working Set (RWS), to capture the data with good temporal locality, and uses the RWS size (RWSS) to model a workload's cache demand. Finally, to handle situations where a cache is insufficient for the VMs' demands, it employs dynamic cache migration to balance cache load across hosts by live migrating cached data along with the VMs. The results show that the cache-optimized RAID improves performance by 137% without sacrificing reliability, compared to traditional RAID. The RWSS-based on-demand cache allocation reduces a workload's cache usage by 78% and lowers the amount of writes sent to the cache device by 40%, compared to traditional working-set-based cache allocation.
    Combining on-demand cache allocation with dynamic cache migration for 12 concurrent VMs, results show a 28% higher hit ratio and 28% lower 90th percentile IO latency, compared to the case without cache allocation.

    Cloud computing: survey on energy efficiency

    Cloud computing is today's most emphasized Information and Communications Technology (ICT) paradigm, used directly or indirectly by almost every online user. However, this significance rests on a large infrastructure that includes data centers comprising thousands of server units and other supporting equipment. These data centers account for between 1.1% and 1.5% of total electricity use worldwide, a share that is projected to rise further. Such alarming numbers demand rethinking the energy efficiency of these infrastructures. However, before making any changes to infrastructure, an analysis of the current status is required. In this article, we perform a comprehensive analysis of an infrastructure supporting the cloud computing paradigm with regard to energy efficiency. First, we define a systematic approach for analyzing the energy efficiency of the most important data center domains, including server and network equipment, as well as cloud management systems and appliances consisting of software used by end users. Second, we use this approach to analyze the available scientific and industrial literature on state-of-the-art practices in data centers and their equipment. Finally, we extract existing challenges and highlight future research directions.

    Infrastructure Operations Final Report

    This document serves as the final report on the activities and achievements of WP5 over the whole duration of the project. It covers the areas of infrastructure operation, service provisioning, support, testing, and benchmarking. In addition, it provides a record of the practical knowledge accumulated while providing various public cloud services over a period of almost two years.