
    The Hipster Approach for Improving Cloud System Efficiency

    In 2013, U.S. data centers accounted for 2.2% of the country's total electricity consumption, a figure that is projected to increase rapidly over the next decade. Many important data center workloads in cloud computing are interactive, and they demand strict levels of quality-of-service (QoS) to meet user expectations, making it challenging to optimize power consumption while meeting increasing performance demands. This article introduces Hipster, a technique that combines heuristics and reinforcement learning to improve resource efficiency in cloud systems. Hipster exploits heterogeneous multi-cores and dynamic voltage and frequency scaling (DVFS) to reduce energy consumption while managing the QoS of latency-critical workloads. To improve data center utilization and make the best use of the available resources, Hipster can dynamically assign the remaining cores to batch workloads without violating the QoS constraints of the latency-critical workloads. We perform experiments using a 64-bit ARM big.LITTLE platform and show that, compared to prior work, Hipster improves the QoS guarantee for Web-Search from 80% to 96%, and for Memcached from 92% to 99%, while reducing energy consumption by up to 18%. Hipster is also effective in learning and adapting automatically to the specific requirements of new incoming workloads, just enough to meet the QoS and optimize resource consumption. This work has been partially supported by the European Union FP7 program through the Mont-Blanc-3 (FP7-ICT-671697) and EUROSERVER (FP7-ICT-610456) projects, by the Ministerio de Economia y Competitividad under contract Computación de Altas Prestaciones VII (TIN2015-65316-P), and by the Departament d'Innovació, Universitats i Empresa de la Generalitat de Catalunya under project MPEXPAR: Models de Programació i Entorns d'Execució Paral·lels (2014-SGR-1051). Prior publication: Rajiv Nishtala, Paul Carpenter, Vinicius Petrucci and Xavier Martorell. Hipster: Hybrid Task Manager for Latency-Critical Cloud Workloads. In Proceedings of the 23rd IEEE International Symposium on High Performance Computer Architecture (HPCA 2017). In this work, we extend our previous work in several ways. First, we present an analysis of the size of the reward lookup table and an optimization of the table to improve the scalability of our reinforcement learning mechanism. Second, we demonstrate Hipster's capability to adapt to changes in the latency-critical application at runtime and still satisfy the QoS guarantees of the new incoming applications. Lastly, we present a deployment methodology for setting up new applications managed by Hipster's runtime system.
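
    The reward lookup table mentioned above maps observed system states and candidate core/DVFS configurations to learned rewards. The paper's exact state encoding, action set, and reward function are not reproduced here; the following Python fragment is only a minimal sketch of such a tabular, QoS-aware controller, and all names, thresholds, and the bandit-style update rule are illustrative assumptions.

```python
# Minimal sketch of a tabular, QoS-aware controller in the spirit of Hipster.
# Assumptions (not from the paper): states are discretized latency buckets,
# actions are (big cores, LITTLE cores, frequency) configurations, and the
# reward favors low power while penalizing QoS violations.
import random
from collections import defaultdict

ACTIONS = [  # hypothetical (big cores, LITTLE cores, frequency in GHz) configs
    (0, 4, 0.6), (0, 4, 1.0), (2, 2, 1.0), (4, 0, 1.4), (4, 4, 2.0),
]
QOS_TARGET_MS = 100.0          # assumed tail-latency target
ALPHA, EPSILON = 0.3, 0.1      # learning rate and exploration rate

q_table = defaultdict(float)   # (state, action) -> estimated reward

def state_from_latency(latency_ms):
    """Discretize measured tail latency into a small number of buckets."""
    slack = (QOS_TARGET_MS - latency_ms) / QOS_TARGET_MS
    return min(4, max(0, int((slack + 0.5) * 5)))  # 5 coarse buckets

def reward(latency_ms, power_w):
    """Reward low power; heavily penalize QoS violations."""
    if latency_ms > QOS_TARGET_MS:
        return -10.0
    return 1.0 / power_w

def choose_action(state):
    """Epsilon-greedy selection over the reward lookup table."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q_table[(state, a)])

def update(state, action, latency_ms, power_w):
    """Single-step update of the lookup-table entry for (state, action)."""
    r = reward(latency_ms, power_w)
    q_table[(state, action)] += ALPHA * (r - q_table[(state, action)])
```

    In a scheme of this shape the table grows with the number of states times the number of configurations, which is why the article analyses and optimizes the table size to keep the reinforcement learning mechanism scalable.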

    Energy optimizing methodologies on heterogeneous data centers

    In 2013, U.S. data centers accounted for 2.2% of the country's total electricity consumption, a figure that is projected to increase rapidly over the next decade. Many important workloads are interactive, and they demand strict levels of quality-of-service (QoS) to meet user expectations, making it challenging to reduce power consumption in the face of increasing performance demands.

    Adaptive and Application-agnostic Caching in Service Meshes for Resilient Cloud Applications

    Service meshes factor out code dealing with inter-micro-service communication. The overall resilience of a cloud application is improved if constituent micro-services return stale data instead of no data at all. This paper proposes and implements application-agnostic caching for micro-services. While caching is widely employed for serving web service traffic, its usage in inter-micro-service communication is lacking. Micro-service responses are highly dynamic, which requires carefully chosen adaptive time-to-live caching algorithms. Our approach is application-agnostic, cloud native, and supports gRPC. We evaluate our approach and implementation using Hipster Shop, the micro-service benchmark by Google Cloud. Our approach results in caching of about 80% of requests. The results show the feasibility and efficiency of our approach, which encourages implementing caching in service meshes. Additionally, we make the code, experiments, and data publicly available.
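
    The abstract does not spell out its adaptive time-to-live (TTL) algorithm, so the sketch below shows one simple adaptive-TTL policy for inter-service responses: the per-key TTL shrinks when a refreshed response differs from the cached one and grows when it is unchanged. The class name, bounds, and grow/shrink factors are assumptions for illustration, not the paper's design.

```python
import time

class AdaptiveTTLCache:
    """Sketch of per-key adaptive TTL caching for inter-service responses.

    The TTL for a key grows while refreshed responses match the cached value
    (data is stable) and shrinks when they differ (data is dynamic), which
    bounds staleness for frequently changing responses.
    """

    def __init__(self, min_ttl=0.5, max_ttl=30.0, grow=2.0, shrink=0.5):
        self.min_ttl, self.max_ttl = min_ttl, max_ttl
        self.grow, self.shrink = grow, shrink
        self._entries = {}  # key -> (value, expiry, ttl)

    def get(self, key, fetch):
        """Return a cached value if still fresh, otherwise call fetch() upstream."""
        now = time.monotonic()
        entry = self._entries.get(key)
        if entry and now < entry[1]:
            return entry[0]                      # fresh hit: no upstream call
        value = fetch()                          # miss or expired: refresh
        ttl = entry[2] if entry else self.min_ttl
        if entry is not None:
            factor = self.grow if value == entry[0] else self.shrink
            ttl = min(self.max_ttl, max(self.min_ttl, ttl * factor))
        self._entries[key] = (value, now + ttl, ttl)
        return value
```

    A mesh sidecar could wrap each upstream call as cache.get(request_key, fetch_fn), where fetch_fn performs the actual gRPC call; the paper's real integration point in the service mesh may differ.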

    Hipster: hybrid task manager for latency-critical cloud workloads

    In 2013, U.S. data centers accounted for 2.2% of the country's total electricity consumption, a figure that is projected to increase rapidly over the next decade. Many important workloads are interactive, and they demand strict levels of quality-of-service (QoS) to meet user expectations, making it challenging to reduce power consumption in the face of increasing performance demands. This paper introduces Hipster, a technique that combines heuristics and reinforcement learning to manage latency-critical workloads. Hipster's goal is to improve resource efficiency in data centers while respecting the QoS of the latency-critical workloads. Hipster achieves this by exploiting heterogeneous multi-cores and dynamic voltage and frequency scaling (DVFS). To improve data center utilization and make the best use of the available resources, Hipster can dynamically assign the remaining cores to batch workloads without violating the QoS constraints of the latency-critical workloads. We perform experiments using a 64-bit ARM big.LITTLE platform and show that, compared to prior work, Hipster improves the QoS guarantee for Web-Search from 80% to 96%, and for Memcached from 92% to 99%, while reducing energy consumption by up to 18%.
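
    The step of granting the latency-critical workload just enough cores to meet QoS and handing the remainder to batch jobs can be pictured with the hedged sketch below; the core IDs, thresholds, prefix-based core ordering, and taskset-based pinning are illustrative assumptions rather than Hipster's actual mechanism.

```python
import subprocess

BIG_CORES = [4, 5, 6, 7]       # assumed core IDs on a big.LITTLE board
LITTLE_CORES = [0, 1, 2, 3]

def allocate_cores(latency_ms, qos_target_ms, lc_cores):
    """Grow the latency-critical core set when QoS is at risk, shrink it otherwise.

    The latency-critical set is kept as a prefix of the core list (big cores
    first); all remaining cores are returned for batch workloads.
    """
    all_cores = BIG_CORES + LITTLE_CORES
    if latency_ms > 0.9 * qos_target_ms and len(lc_cores) < len(all_cores):
        lc_cores = all_cores[:len(lc_cores) + 1]       # give one more core
    elif latency_ms < 0.5 * qos_target_ms and len(lc_cores) > 1:
        lc_cores = all_cores[:len(lc_cores) - 1]       # reclaim one core
    batch_cores = [c for c in all_cores if c not in lc_cores]
    return lc_cores, batch_cores

def pin(pid, cores):
    """Pin a process to a core list using taskset (Linux)."""
    subprocess.run(["taskset", "-cp", ",".join(map(str, cores)), str(pid)],
                   check=True)
```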

    A Novel Fog Computing Approach for Minimization of Latency in Healthcare using Machine Learning

    In the current scenario, one of the most challenging requirements is handling the massive volume of multimedia data generated by Internet of Things (IoT) devices, which is very difficult to manage through the cloud alone. Fog computing emerges as an intelligent solution that operates in a distributed environment. The objective of this paper is latency minimization in e-healthcare through fog computing. In IoT multimedia data transmission, parameters such as transmission delay, network delay, and computation delay must therefore be reduced, as there is a high demand for healthcare multimedia analytics. Fog computing provides processing, storage, and analysis of data closer to IoT devices and end-users to overcome latency. In this paper, a novel Intelligent Multimedia Data Segregation (IMDS) scheme using machine learning (k-fold random forest) is proposed for the fog computing environment; it segregates the multimedia data, and the model is used to calculate total latency (transmission, computation, and network). With simulated results, we achieved a classification accuracy of 92%, reduced latency by approximately 95% compared with the pre-existing model, and improved the quality of service in e-healthcare.
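
    The total latency above is the sum of transmission, computation, and network delay, and the segregation model is a random forest validated with k-fold cross-validation. The sketch below illustrates both pieces using scikit-learn; the feature columns, class labels, and synthetic data are assumptions, not the paper's dataset.

```python
# Sketch of the latency accounting and the k-fold random-forest segregation
# step described in the abstract; feature columns and data are assumed.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def total_latency(transmission_ms, computation_ms, network_ms):
    """Total latency = transmission delay + computation delay + network delay."""
    return transmission_ms + computation_ms + network_ms

# Hypothetical feature matrix: [size_kb, sample_rate, is_video, priority]
X = np.random.rand(200, 4)
y = np.random.randint(0, 3, size=200)        # e.g. {normal, sensitive, critical}

clf = RandomForestClassifier(n_estimators=100, random_state=0)
scores = cross_val_score(clf, X, y, cv=5)    # 5-fold cross-validation
print("mean accuracy:", scores.mean())
```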

    Towards Soft Circuit Breaking in Service Meshes via Application-agnostic Caching

    Service meshes factor out code dealing with inter-micro-service communication, such as circuit breaking. Circuit breaking actuation is currently limited to an "on/off" switch, i.e., a tripped circuit breaker will return an application-level error indicating service unavailability to the calling micro-service. This paper proposes a soft circuit breaker actuator, which returns cached data instead of an error. The overall resilience of a cloud application is improved if constituent micro-services return stale data instead of no data at all. While caching is widely employed for serving web service traffic, its usage in inter-micro-service communication is lacking. Micro-service responses are highly dynamic, which requires carefully chosen adaptive time-to-live caching algorithms. We evaluate our approach through two experiments. First, we quantify the trade-off between traffic reduction and data staleness using a purpose-built service, thereby identifying algorithm configurations that keep data staleness at about 3% or less while reducing network load by up to 30%. Second, we quantify the network load reduction with Hipster Shop, the micro-service benchmark by Google Cloud. Our approach results in caching of about 80% of requests. The results show the feasibility and efficiency of our approach, which encourages implementing caching as a circuit breaking actuator in service meshes.
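
    A soft circuit breaker, as described above, returns cached (possibly stale) data once the breaker trips instead of an application-level error. The sketch below shows the basic actuation logic; the class name, failure threshold, and reset window are illustrative assumptions, not the paper's implementation.

```python
import time

class SoftCircuitBreaker:
    """Sketch: once the breaker trips, serve stale cached data instead of erroring."""

    def __init__(self, failure_threshold=3, reset_after=10.0):
        self.failures = 0
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self.opened_at = None
        self.cache = {}          # key -> last successful response

    def call(self, key, upstream):
        now = time.monotonic()
        if self.opened_at and now - self.opened_at < self.reset_after:
            if key in self.cache:
                return self.cache[key]          # soft actuation: stale data
            raise RuntimeError("service unavailable and no cached response")
        try:
            response = upstream()
            self.cache[key] = response
            self.failures, self.opened_at = 0, None
            return response
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = now            # trip the breaker
            if key in self.cache:
                return self.cache[key]          # soften the failure if possible
            raise
```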

    RAPID: Enabling Fast Online Policy Learning in Dynamic Public Cloud Environments

    Resource sharing between multiple workloads has become a prominent practice among cloud service providers, motivated by demand for improved resource utilization and reduced cost of ownership. Effective resource sharing, however, remains an open challenge due to the adverse effects that resource contention can have on high-priority, user-facing workloads with strict Quality of Service (QoS) requirements. Although recent approaches have demonstrated promising results, those works remain largely impractical in public cloud environments, since workloads are not known in advance and may only run for a brief period, thus prohibiting offline learning and significantly hindering online learning. In this paper, we propose RAPID, a novel framework for fast, fully-online resource allocation policy learning in highly dynamic operating environments. RAPID leverages lightweight QoS predictions, enabled by domain-knowledge-inspired techniques for sample efficiency and bias reduction, to decouple control from conventional feedback sources and guide policy learning at a rate orders of magnitude faster than prior work. Evaluation on a real-world server platform with representative cloud workloads confirms that RAPID learns stable resource allocation policies in minutes, compared with hours for the prior state of the art, while improving QoS by 9.0x and increasing best-effort workload performance by 19-43%.
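
    RAPID's key idea, as summarized above, is to guide policy learning with lightweight QoS predictions rather than slow end-to-end feedback. The sketch below illustrates that shape of control loop with an online linear predictor scoring candidate allocations; the predictor, feature inputs, and allocation choice rule are assumptions, not RAPID's actual design.

```python
# Sketch of a fast online allocation loop: a lightweight QoS predictor,
# updated online from low-level counters, scores candidate allocations in
# place of slow end-to-end latency feedback. All names are assumptions.
import numpy as np

class OnlineQoSPredictor:
    """Online least-mean-squares model: counters -> predicted QoS slack."""

    def __init__(self, n_features, lr=0.01):
        self.w = np.zeros(n_features)
        self.lr = lr

    def predict(self, counters):
        return float(self.w @ counters)

    def update(self, counters, observed_slack):
        error = observed_slack - self.predict(counters)
        self.w += self.lr * error * counters      # single gradient step

def choose_allocation(predictor, candidates, counters_for):
    """Pick the smallest allocation predicted to keep positive QoS slack."""
    safe = [a for a in candidates if predictor.predict(counters_for(a)) > 0]
    return min(safe, default=max(candidates))
```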

    Adaptive CPU Resource Allocation for Emulator in Kernel-based Virtual Machine

    Heterogeneous multi-core architectures, co-location, and virtualization are three important technologies for data centers that can be used to reduce server power consumption and improve system utilization. This article explores the scheduling strategy for Emulator threads within virtual machine processes when multiple virtual machines are co-located on heterogeneous multi-core architectures. In this co-location scenario, the scheduling strategy for Emulator threads significantly affects the performance of virtual machines; this article is the first in the field to focus on these threads. The article finds that the scheduling latency metric is a good indicator of the running status of the vCPU threads and Emulator threads in a virtualized environment, and applies this metric to the design of the scheduling strategy. It presents an Emulator thread scheduler based on heuristic rules which, in coordination with the host operating system's scheduler, dynamically adjusts the scheduling scope of Emulator threads to improve the overall performance of virtual machines. In real application scenarios, the scheduler effectively improves the performance of applications within virtual machines, with a maximum performance improvement of 40.7%.
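
    The key observation above is that per-thread scheduling latency indicates how vCPU and Emulator threads are faring, and that the Emulator threads' scheduling scope can be adjusted accordingly. The sketch below reads cumulative run-queue wait time from /proc/<pid>/task/<tid>/schedstat and widens or narrows the Emulator thread's CPU affinity; the core sets, threshold, and policy are illustrative assumptions, not the article's scheduler.

```python
# Sketch of a heuristic emulator-thread scheduler driven by scheduling latency.
# The second field of /proc/<pid>/task/<tid>/schedstat is the cumulative time
# (ns) the thread has spent waiting on a run queue.
import os

VCPU_CORES = {4, 5, 6, 7}          # assumed cores reserved for vCPU threads
SHARED_CORES = {0, 1, 2, 3}        # assumed cores shared with other threads

def runqueue_wait_ns(pid, tid):
    """Return the cumulative run-queue wait time for one thread."""
    with open(f"/proc/{pid}/task/{tid}/schedstat") as f:
        _on_cpu, wait, _slices = f.read().split()
    return int(wait)

def adjust_emulator_affinity(pid, tid, prev_wait, threshold_ns=2_000_000):
    """Narrow the Emulator thread's scope when it waits too long, widen it otherwise."""
    wait = runqueue_wait_ns(pid, tid)
    if wait - prev_wait > threshold_ns:
        os.sched_setaffinity(tid, SHARED_CORES)                 # escape contention
    else:
        os.sched_setaffinity(tid, SHARED_CORES | VCPU_CORES)    # allow wider scope
    return wait
```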

    Improving interoperability in distributed multi-tier software stacks

    Distributed multi-tier software stacks organise and deploy software components as a hierarchy of interacting tiers. The components are typically heterogeneous, i.e. each component may be written in a different language and may interoperate using a variety of protocols. Tiered software is modular but leads to a range of interoperability challenges, including the following. (1) Interoperating components in multiple languages and paradigms increases developer cognitive load, since developers must simultaneously reason in multiple languages and paradigms. (2) Components must interoperate correctly, e.g. adhere to the APIs or communication protocols between components. (3) Interoperation between different components can lead to diverse modes of failure, as each component can fail in unique ways. Many of these challenges are the result of contributing factors like tight coupling or polyglot programming. This thesis investigates techniques to improve heterogeneous interoperability in distributed multi-tier software stacks. Common approaches include microservices and tierless languages.

    Microservices are perceived to offer better reliability than components in multi-tier software stacks through the loose coupling of services. The reliability of microservices is investigated by combining the established properties of dependence and state with reliability. This defines a new three-dimensional space: the Microservices Dependency State Reliability (MDSR) classification with six classes. The feasibility of statically identifying MDSR classes is demonstrated with a prototype analyser that identifies all six classes in Flask microservices web applications. The reliability implications of the different MDSR classes are evaluated by running three case study applications (Hipster-Shop, JPyL & WordPress) against a fault injector (see the sketch after this abstract). Key results are as follows. (1) All applications fail catastrophically if a critical microservice fails. (2) Applications survive the failure of individual minor microservice(s). (3) The failure of any chain of microservices in JPyL & Hipster is catastrophic. (4) Individual microservices do not necessarily have minor reliability implications.

    In a tierless language, the compiler generates the code for each component and ensures their correct interoperation. Tierless languages are mainly used to implement web stacks, and their use in implementing IoT stacks is less common. This investigation compares interoperation in tiered and tierless IoT stacks through the systematic evaluation of four implementations of the prototype UoG smart campus IoT system: two tierless and two Python-based tiered. Key results of the study are as follows. (1) Tierless languages have the potential to significantly reduce the development effort for IoT systems, requiring 70% less code than the tiered implementations. (2) Tierless languages have the potential to significantly improve the reliability of IoT systems. (3) The first comparison of a tierless codebase for resource-rich sensor nodes and one for resource-constrained sensor nodes shows that they have very similar functional structure and code sizes, within 7%.

    Tier elimination is a technique that removes a tier/component by integrating two tiers. Specifically, this thesis investigates the implications of eliminating the Apache web server in a 4-tier web stack, Jupyter Notebook, Apache, Python, Linux (JAPyL), and replacing it with PHP libraries in the frontend webpage to obtain the 3-tier JPL stack. The study reveals the following. (1) The JPL 3-tier web stack requires the developer to use fewer programming languages and paradigms than JAPyL: two languages rather than four, and two paradigms rather than three. (2) JPL requires 42% less code than JAPyL. (3) In JPL, some functionality can be automated due to the decreased abstraction levels at the upper layers of the stack. (4) However, the latency in JPL is two to three times greater than that of JAPyL. So while tier elimination reduces developer effort and semantic friction, the tradeoffs are higher performance overhead and resource consumption and increased code complexity.
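
    The reliability evaluation above runs the case-study applications against a fault injector and checks whether the application survives individual microservice failures. The sketch below illustrates one way such an injector could be built for containerised microservices; the docker commands, container names, and probe URL are assumptions about a typical setup, not the thesis's tooling.

```python
# Sketch of a microservice fault injector: stop one service, probe the
# frontend, and record whether the application survives.
import subprocess
import urllib.request

def inject_failure(service):
    """Stop a single microservice container."""
    subprocess.run(["docker", "stop", service], check=True)

def restore(service):
    subprocess.run(["docker", "start", service], check=True)

def frontend_alive(url="http://localhost:8080/"):
    """Return True if the application's frontend still answers requests."""
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return resp.status == 200
    except Exception:
        return False

def run_experiment(services):
    """Report which single-service failures the application survives."""
    results = {}
    for svc in services:
        inject_failure(svc)
        results[svc] = frontend_alive()
        restore(svc)
    return results
```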