988 research outputs found

    A methodological framework for cloud resource provisioning and scheduling of data parallel applications under uncertainty

    Data parallel applications are being extensively deployed in cloud environments because of the possibility of dynamically provisioning storage and computation resources. To identify cost-effective solutions that satisfy the desired service levels, resource provisioning and scheduling play a critical role. Nevertheless, the unpredictable behavior of cloud performance makes the estimation of the resources actually needed quite complex. In this paper we propose a provisioning and scheduling framework that explicitly tackles uncertainties and performance variability of the cloud infrastructure and of the workload. This framework allows cloud users to estimate in advance, i.e., prior to the actual execution of the applications, the resource settings that cope with uncertainty. We formulate an optimization problem where the characteristics not perfectly known or affected by uncertain phenomena are represented as random variables modeled by the corresponding probability distributions. Provisioning and scheduling decisions, while optimizing various metrics such as monetary leasing costs of cloud resources and application execution time, take full account of the uncertainties encountered in cloud environments. To test our framework, we consider data parallel applications characterized by a deadline constraint and we investigate the impact of their characteristics and of the variability of the cloud infrastructure. The experiments show that the resource provisioning and scheduling plans identified by our approach cope well with uncertainties and ensure that the application deadline is satisfied.
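
The idea of treating uncertain quantities as random variables and choosing resources ahead of execution can be illustrated with a small Monte Carlo sketch. This is not the paper's optimization model: the log-normal task durations, the greedy task assignment, and every name and default value below (`completion_time`, `deadline_probability`, `smallest_fleet`, the 60 s mean, the 95% target) are assumptions made purely for illustration.

```python
import math
import random


def completion_time(num_vms, num_tasks, rng, mean_s=60.0, sigma=0.3):
    """Simulate one run; each task goes to the VM that becomes free first.

    Task durations are drawn from a log-normal distribution whose mean is
    mean_s seconds (an assumed model of performance variability).
    """
    mu = math.log(mean_s) - sigma ** 2 / 2  # makes the distribution mean equal mean_s
    vm_free = [0.0] * num_vms
    for _ in range(num_tasks):
        earliest = min(range(num_vms), key=vm_free.__getitem__)
        vm_free[earliest] += rng.lognormvariate(mu, sigma)
    return max(vm_free)


def deadline_probability(num_vms, num_tasks, deadline_s, trials=2000, seed=0):
    """Estimate the probability that all tasks finish within the deadline."""
    rng = random.Random(seed)
    hits = sum(
        completion_time(num_vms, num_tasks, rng) <= deadline_s
        for _ in range(trials)
    )
    return hits / trials


def smallest_fleet(num_tasks, deadline_s, target=0.95, max_vms=64):
    """Return the smallest VM count meeting the deadline with probability >= target."""
    for num_vms in range(1, max_vms + 1):
        if deadline_probability(num_vms, num_tasks, deadline_s) >= target:
            return num_vms
    return None  # no fleet size up to max_vms reaches the target


if __name__ == "__main__":
    print(smallest_fleet(num_tasks=100, deadline_s=600.0))
```

Because the leasing cost in this toy setting grows with the number of VMs, the smallest fleet that reaches the target probability is also the cheapest one; a real formulation would weigh cost and execution time jointly.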

    Design and Performance Guarantees in Cloud Computing: Challenges and Opportunities

    In recent years, cloud computing has received increasing attention from both academia and industry. Most of the solutions proposed in the literature strive to limit the effect of uncertain and unpredictable behaviors that may occur in cloud environments, such as flash crowds or hardware failures. However, managing uncertainty in a cloud environment is still an open problem. In such a setting, the service provider is not able to define suitable Service Level Objectives (SLOs) that are easy to measure and control. In this work we analyze two critical problems that are encountered in cloud environments but seldom discussed or addressed in the literature: (1) how to reduce uncertainty by providing suitable control interfaces at different levels of the computing infrastructure; (2) how to evaluate performance in order to obtain probabilistic guarantees for the SLOs. We briefly describe the two problems and envision some possible control-theoretical solutions.

    Incorporating Probabilistic Optimizations for Resource Provisioning of Data Processing Workflows

    Workflow is an important model for big data processing, and resource provisioning is crucial to the performance of workflows. Recently, system variations in the cloud and large-scale clusters, such as those in I/O and network performance, have been observed to greatly affect the performance of workflows. Traditional resource provisioning methods, which overlook these variations, can lead to suboptimal resource provisioning results. In this paper, we provide a general solution for workflow performance optimization that takes system variations into account. Specifically, we model system variations as time-dependent random variables and take their probability distributions as optimization input. Despite its effectiveness, this solution involves heavy computation overhead. Thus, we propose three pruning techniques to simplify the workflow structure and reduce the probability evaluation overhead. We implement our techniques in a runtime library, which allows users to incorporate efficient probabilistic optimization into existing resource provisioning methods. Experiments show that probabilistic solutions can improve performance by 51% compared to state-of-the-art static solutions while guaranteeing the budget constraint, and that our pruning techniques can greatly reduce the overhead of probabilistic optimization.

    DISCO: Achieving Low Latency and High Reliability in Scheduling of Graph-Structured Tasks over Mobile Vehicular Cloud

    To effectively process data across a fleet of dynamic and distributed vehicles, it is crucial to implement resource provisioning techniques that provide reliable, cost-effective, and real-time computing services. This article explores resource provisioning for computation-intensive tasks over mobile vehicular clouds (MVCs). We use undirected weighted graphs (UWGs) to model both the execution of tasks and communication patterns among vehicles in an MVC. We then study low-latency and reliable scheduling of UWG tasks through a novel methodology named double-plan-promoted isomorphic subgraph search and optimization (DISCO). In DISCO, two complementary plans are envisioned to ensure effective task completion: Plan A and Plan B. Plan A analyzes past data to create an optimal mapping (α) between tasks and the MVC in advance of the actual task scheduling. Plan B serves as a dependable backup, designed to find a feasible mapping (β) in case α fails during task scheduling due to the unpredictable nature of the network. We delve into DISCO's procedure and the key factors that contribute to its success. Additionally, we provide a case study that includes comprehensive comparisons to demonstrate DISCO's performance with regard to time efficiency and overhead. We further discuss a series of open directions for future research.

    EdgeAISim: A Toolkit for Simulation and Modelling of AI Models in Edge Computing Environments

    To meet next-generation Internet of Things (IoT) application demands, edge computing moves processing power and storage closer to the network edge to minimize latency and bandwidth utilization. Edge computing is becoming increasingly popular as a result of these benefits, but it comes with challenges such as managing resources efficiently. Researchers are utilising Artificial Intelligence (AI) models to solve the challenge of resource management in edge computing systems. However, existing simulation tools are concerned only with typical resource management policies, not with the adoption and implementation of AI models for resource management. Consequently, researchers find it hard and time-consuming to use AI models when designing novel resource management policies for edge computing with existing simulation tools. To overcome these issues, we propose a lightweight Python-based toolkit called EdgeAISim for the simulation and modelling of AI models for designing resource management policies in edge computing environments. In EdgeAISim, we extended the basic components of the EdgeSimPy framework and developed new AI-based simulation models for task scheduling, energy management, service migration, network flow scheduling, and mobility support for edge computing environments. We have utilized advanced AI models such as Multi-Armed Bandit with Upper Confidence Bound, Deep Q-Networks, Deep Q-Networks with Graphical Neural Network, and Actor-Critic Network to optimize power usage while efficiently managing task migration within the edge computing environment. The performance of these proposed models is compared with a baseline that uses a worst-fit algorithm-based resource management policy in different settings. Experimental results indicate that EdgeAISim achieves a substantial reduction in power consumption compared with this baseline. The development of EdgeAISim represents a promising step towards sustainable edge computing, providing energy-efficient solutions that facilitate efficient task management in edge environments for different large-scale scenarios.
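
For readers unfamiliar with the worst-fit baseline mentioned above, the following is a minimal generic sketch of the policy: each task is placed on the server with the most remaining capacity. It does not use the EdgeSimPy or EdgeAISim APIs; the function name `worst_fit` and the single-number capacity model are assumptions for illustration only.

```python
def worst_fit(task_demands, server_capacities):
    """Place each task on the server with the largest remaining capacity.

    Returns (placement, remaining), where placement[i] is the index of the
    server chosen for task i, or None if no server can host it.
    Capacities and demands are plain numbers (an assumed simplification).
    """
    remaining = list(server_capacities)
    placement = []
    for demand in task_demands:
        # Worst fit: pick the server with the most free capacity
        # (ties go to the lowest index).
        idx = max(range(len(remaining)), key=remaining.__getitem__)
        if remaining[idx] < demand:
            placement.append(None)  # even the emptiest server is too full
        else:
            remaining[idx] -= demand
            placement.append(idx)
    return placement, remaining


if __name__ == "__main__":
    print(worst_fit([4, 3, 2, 5], [10, 8]))  # ([0, 1, 0, 1], [4, 0])
```

Worst fit spreads load across servers, which tends to keep more machines active; that is one plausible reason it serves as a reference point for power-oriented policies.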