
    Using Performance Forecasting to Accelerate Elasticity

    Cloud computing facilitates dynamic resource provisioning. The automation of resource management, known as elasticity, has been subject to much research. In this context, monitoring of a running service plays a crucial role, and adjustments are made when certain thresholds are crossed. On such occasions, it is common practice to simply add or remove resources. In this paper we investigate how to predict the performance of a service so that allocated resources can be adjusted dynamically based on those predictions. In other words, instead of “repairing” because a threshold has been crossed, we attempt to stay ahead and allocate an optimized amount of resources in advance. To do so, we need accurate predictive models that are based on workloads. We present our approach, based on the Universal Scalability Law, and discuss initial experiments.
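    The Universal Scalability Law models how throughput scales with added resources, so a fitted model can be inverted to provision ahead of forecast demand. A minimal sketch of that idea follows; the coefficient values are illustrative, not fitted values from the paper:

```python
# Minimal sketch of USL-based capacity planning. Gunther's Universal
# Scalability Law predicts throughput with n servers as
#   X(n) = lam * n / (1 + sigma*(n-1) + kappa*n*(n-1)),
# where lam is single-server throughput, sigma models contention and
# kappa models coherency delay. The coefficients below are hypothetical.

def usl_throughput(n: int, lam: float, sigma: float, kappa: float) -> float:
    """Predicted throughput with n servers under the USL."""
    return (lam * n) / (1 + sigma * (n - 1) + kappa * n * (n - 1))

def servers_for_demand(demand: float, lam: float, sigma: float, kappa: float,
                       max_n: int = 256) -> int:
    """Smallest resource count whose predicted throughput covers the forecast demand."""
    for n in range(1, max_n + 1):
        if usl_throughput(n, lam, sigma, kappa) >= demand:
            return n
    return max_n  # demand exceeds the scalability ceiling; cap the allocation

# Example: provision ahead of a forecast demand of 900 req/s.
print(servers_for_demand(900.0, lam=100.0, sigma=0.02, kappa=0.0005))
```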

    Adaptive microservice scaling for elastic applications


    A study to define meteorological uses and performance requirements for the Synchronous Earth Observatory Satellite

    The potential meteorological uses of the Synchronous Earth Observatory Satellite (SEOS) were studied for detecting and predicting hazards to life, property, or the quality of the environment. Mesoscale meteorological phenomena and the observational requirements for SEOS are discussed, along with the sensor parameters.

    Seamless Support of Low Latency Mobile Applications with NFV-Enabled Mobile Edge-Cloud

    Emerging mobile multimedia applications, such as augmented reality, have stringent latency requirements and high computational cost. To address this, mobile edge-cloud (MEC) has been proposed as an approach to bring resources closer to users. Recently, in contrast to conventional fixed cloud locations, the advent of network function virtualization (NFV) has, with some added cost due to the necessary decentralization, enhanced MEC with new flexibility in placing MEC services on any nodes capable of virtualizing their resources. In this work, we address the question of how to optimally place resources among NFV-enabled nodes to support mobile multimedia applications with low latency requirements, and when to adapt the current resource placements to address workload changes. We first show that the placement optimization problem is NP-hard and propose an online dynamic resource allocation scheme that consists of an adaptive greedy heuristic algorithm and a detection mechanism to identify when the system will no longer be able to satisfy the applications’ delay requirement. Our scheme takes into account the effect of existing techniques (i.e., auto-scaling and load balancing). We design and implement a realistic NFV-enabled MEC simulation framework and show through extensive simulations that our proposal always manages to allocate sufficient resources on time to guarantee continuous satisfaction of the application latency requirements under changing workload, while incurring up to 40% less cost in comparison to existing overprovisioning approaches.
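    The paper's actual algorithm is not reproduced here, but a toy greedy placement in its spirit — serve each user region from the lowest-latency virtualized node with spare capacity, and signal when the delay budget can no longer be met — could look like the sketch below. All node names, capacities, and latencies are hypothetical:

```python
# Toy greedy latency-aware placement, a sketch in the spirit of the adaptive
# heuristic described above (not the paper's algorithm). A failed placement
# plays the role of the detection mechanism: it signals that current
# resources can no longer satisfy the delay requirement.
from dataclasses import dataclass

@dataclass
class Node:
    name: str
    capacity: int        # service instances the node can host
    latency_ms: dict     # region -> estimated latency in milliseconds

def greedy_place(nodes, regions, demand, delay_budget_ms):
    """Map each region to the lowest-latency node with spare capacity,
    or return None if some region cannot be served within the budget."""
    placement = {}
    free = {n.name: n.capacity for n in nodes}
    for region in sorted(regions, key=lambda r: -demand[r]):  # biggest demand first
        for node in sorted(nodes, key=lambda n: n.latency_ms[region]):
            if free[node.name] >= demand[region] and node.latency_ms[region] <= delay_budget_ms:
                placement[region] = node.name
                free[node.name] -= demand[region]
                break
        else:
            return None  # trigger re-optimization or scaling
    return placement

nodes = [Node("edge-a", 4, {"r1": 5, "r2": 20}), Node("core", 16, {"r1": 40, "r2": 35})]
print(greedy_place(nodes, ["r1", "r2"], {"r1": 3, "r2": 2}, delay_budget_ms=50))
```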

    Adaptive sliding windows for improved estimation of data center resource utilization

    Accurate prediction of data center resource utilization is required for capacity planning, job scheduling, energy saving, workload placement, and load balancing to utilize the resources efficiently. However, accurately predicting those resources is challenging due to dynamic workloads, heterogeneous infrastructures, and multi-tenant co-hosted applications. Existing prediction methods use fixed-size observation windows, which cannot produce accurate results because they are not adaptively adjusted to capture local trends in the most recent data. Those methods either train on large fixed sliding windows, using an irrelevantly large number of observations that yields inaccurate estimations, or suffer degraded estimations when short windows meet quickly changing trends. In this paper we propose a deep learning-based adaptive window size selection method that dynamically limits the sliding window size to capture the trend of the latest resource utilization, then builds an estimation model for each trend period. We evaluate the proposed method against multiple baseline and state-of-the-art methods, using real data-center workload data sets. The experimental evaluation shows that the proposed solution outperforms those state-of-the-art approaches and yields 16 to 54% improved prediction accuracy compared to the baseline methods. This work is partially supported by the European Research Council (ERC) under the EU Horizon 2020 programme (GA 639595), the Spanish Ministry of Economy, Industry and Competitiveness (TIN2015-65316-P and IJCI2016-27485), the Generalitat de Catalunya, Spain (2014-SGR-1051), and the University of the Punjab, Pakistan. The statements made herein are solely the responsibility of the authors.
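    The paper selects window sizes with a deep learning model; as a simplified stand-in, the heuristic below illustrates the underlying idea of trimming the training window back to the current trend. Thresholds and window sizes are illustrative assumptions:

```python
# Heuristic stand-in for adaptive window sizing (the paper's actual selector
# is a deep learning model). The window grows backwards from the newest data
# and stops where older observations stop agreeing with the recent level,
# so estimation uses only the current trend.
import statistics

def adaptive_window(series, min_w=8, max_w=256, shift_threshold=2.0):
    """Return the size of the largest trailing window consistent with the recent trend."""
    w = min(max_w, len(series))
    recent = series[-min_w:]
    mu_recent = statistics.mean(recent)
    sd_recent = statistics.pstdev(recent) or 1e-9  # avoid a zero threshold on flat data
    for size in range(min_w, w + 1):
        if abs(statistics.mean(series[-size:]) - mu_recent) > shift_threshold * sd_recent:
            return size - 1  # the previous size was the last trend-consistent one
    return w

# Example: a level shift 40 points back limits the usable window to 40.
series = [10.0] * 60 + [25.0] * 40
print(adaptive_window(series))
```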

    Resource provisioning and scheduling algorithms for hybrid workflows in edge cloud computing

    In recent years, Internet of Things (IoT) technology has been applied in a wide range of application domains to provide real-time monitoring, tracking, and analysis services. The worldwide number of IoT-connected devices is projected to reach 43 billion by 2023, and IoT technologies are expected to be involved in 25% of the business sector. Latency-sensitive applications in the scope of intelligent video surveillance, smart homes, autonomous vehicles, and augmented reality are all emergent research directions in industry and academia. These applications require connecting large numbers of sensing devices to attain the desired level of service quality for decision accuracy in a time-sensitive manner. Moreover, continuous data streams impose processing large amounts of data, which adds a huge overhead on computing and network resources. Thus, latency-sensitive and resource-intensive applications introduce new challenges for current computing models, i.e., batch and stream. In this thesis, we refer to the integrated application model of stream and batch applications as a hybrid workflow model. The main challenge of the hybrid model is achieving the quality of service (QoS) requirements of the two computation systems. This thesis provides a systematic and detailed model for hybrid workflows which describes the internal structure of each application type for the purposes of resource estimation, system tuning, and cost modeling. For optimizing the execution of hybrid workflows, this thesis proposes algorithms, techniques, and frameworks for resource provisioning and task scheduling on various computing systems, including cloud, edge cloud, and cooperative edge cloud. Overall, the experimental results in this thesis provide strong evidence for the proposed understanding and vision of integrating stream and batch applications, and for how edge computing and other emergent technologies, such as 5G networks and IoT, will contribute to more sophisticated and intelligent solutions across many disciplines for a safer, more secure, healthy, smart, and sustainable society.
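    As a purely illustrative reading of the hybrid model (the field names and structure below are assumptions, not the thesis's notation), the two QoS families could be encoded together like this:

```python
# Hypothetical encoding of a hybrid workflow: latency-bound stream operators
# plus deadline-bound batch tasks, each carrying the QoS fields a provisioner
# would need. This is an illustration of the concept, not the thesis's model.
from dataclasses import dataclass, field

@dataclass
class StreamTask:
    name: str
    rate_eps: float        # expected input events per second
    max_latency_ms: float  # per-event latency bound (stream QoS)

@dataclass
class BatchTask:
    name: str
    input_gb: float        # volume of accumulated data to process
    deadline_s: float      # completion deadline (batch QoS)

@dataclass
class HybridWorkflow:
    stream: list = field(default_factory=list)
    batch: list = field(default_factory=list)

    def meets_qos(self, est_latency_ms: dict, est_runtime_s: dict) -> bool:
        """Check both QoS families against externally estimated performance."""
        ok_stream = all(est_latency_ms[t.name] <= t.max_latency_ms for t in self.stream)
        ok_batch = all(est_runtime_s[t.name] <= t.deadline_s for t in self.batch)
        return ok_stream and ok_batch
```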

    Data-Driven Methods for Data Center Operations Support

    During the last decade, cloud technologies have been evolving at an impressive pace, such that we are now living in a cloud-native era where developers can leverage an unprecedented landscape of (possibly managed) services for orchestration, compute, storage, load balancing, monitoring, etc. The possibility of on-demand access to a diverse set of configurable virtualized resources allows for building more elastic, flexible and highly resilient distributed applications. Behind the scenes, cloud providers sustain the heavy burden of maintaining the underlying infrastructures, consisting of large-scale distributed systems, partitioned and replicated among many geographically dislocated data centers to guarantee scalability, robustness to failures, high availability and low latency. The larger the scale, the more cloud providers have to deal with complex interactions among the various components, such that monitoring, diagnosing and troubleshooting issues become incredibly daunting tasks. To keep up with these challenges, development and operations practices have undergone significant transformations, especially in terms of improving the automations that make releasing new software, and responding to unforeseen issues, faster and sustainable at scale. The resulting paradigm is nowadays referred to as DevOps. However, while such automations can be very sophisticated, traditional DevOps practices fundamentally rely on reactive mechanisms that typically require careful manual tuning and supervision from human experts. To minimize the risk of outages, and the related costs, it is crucial to provide DevOps teams with suitable tools that can enable a proactive approach to data center operations. This work presents a comprehensive data-driven framework to address the most relevant problems that can be experienced in large-scale distributed cloud infrastructures. These environments are indeed characterized by a very large availability of diverse data, collected at each level of the stack, such as: time series (e.g., physical host measurements, virtual machine or container metrics, networking component logs, application KPIs); graphs (e.g., network topologies, fault graphs reporting dependencies among hardware and software components, performance-issue propagation networks); and text (e.g., source code, system logs, version control history, code review feedback). Such data are also typically updated with relatively high frequency, and subject to distribution drifts caused by continuous configuration changes to the underlying infrastructure. In such a highly dynamic scenario, traditional model-driven approaches alone may be inadequate at capturing the complexity of the interactions among system components. DevOps teams would certainly benefit from robust data-driven methods to support their decisions based on historical information. For instance, effective anomaly detection capabilities may help in conducting more precise and efficient root-cause analysis, and leveraging accurate forecasting and intelligent control strategies would improve resource management. Given their ability to deal with high-dimensional, complex data, Deep Learning-based methods are the most straightforward option for the realization of the aforementioned support tools. On the other hand, because of their complexity, models of this kind often require huge processing power, and suitable hardware, to be operated effectively at scale. These aspects must be carefully addressed when applying such methods in the context of data center operations: automated operations approaches must be dependable and cost-efficient, so as not to degrade the services they are built to improve.
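    As a tiny illustration of the forecasting-plus-anomaly-detection idea described above (the framework itself relies on deep learning models; the EWMA forecaster and thresholds here are stand-in assumptions):

```python
# Forecast-residual anomaly detection on a metric time series: flag points
# whose deviation from a one-step forecast exceeds k standard deviations of
# past residuals. An exponentially weighted moving average stands in for the
# deep learning forecasters mentioned in the abstract.
def detect_anomalies(series, alpha=0.3, k=3.0, warmup=10):
    """Yield indices where the metric deviates anomalously from its forecast."""
    forecast = series[0]
    residuals = []
    for i, x in enumerate(series[1:], start=1):
        err = x - forecast
        if i > warmup and residuals:
            mu = sum(residuals) / len(residuals)
            var = sum((r - mu) ** 2 for r in residuals) / len(residuals)
            if abs(err - mu) > k * (var ** 0.5 + 1e-9):
                yield i
        residuals.append(err)
        forecast = alpha * x + (1 - alpha) * forecast  # EWMA update

# Example: a spike in otherwise steady CPU utilization is flagged.
cpu = [50.0] * 30 + [95.0] + [50.0] * 10
print(list(detect_anomalies(cpu)))
```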