
    Dynamic Vector Bin Packing for Online Resource Allocation in the Cloud

    Several cloud-based applications, such as cloud gaming, rent servers to execute jobs that arrive in an online fashion. Each job has a resource demand and must be dispatched to a cloud server with enough resources to execute it; the job departs after its completion. Under the 'pay-as-you-go' billing model, the server rental cost is proportional to the total time that servers are actively running jobs. The problem of efficiently allocating a sequence of online jobs to servers, without exceeding the resource capacity of any server, while minimizing the total server usage time can be modelled as a variant of the dynamic bin packing problem (DBP) called MinUsageTime DBP. In this work, we initiate the study of the problem with multi-dimensional resource demands (e.g. CPU/GPU usage, memory requirement, bandwidth usage, etc.), called MinUsageTime Dynamic Vector Bin Packing (DVBP). We study the competitive ratio (CR) of Any Fit packing algorithms for this problem and show almost-tight bounds on the CR of three specific Any Fit packing algorithms, namely First Fit, Next Fit, and Move To Front. We prove that the CR of Move To Front is at most $(2\mu+1)d+1$, where $\mu$ is the ratio of the maximum to the minimum item duration and $d$ is the number of dimensions. For $d=1$, this significantly improves the previously known upper bound of $6\mu+7$ (Kamali & Lopez-Ortiz, 2015). We then prove that the CRs of First Fit and Next Fit are bounded by $(\mu+2)d+1$ and $2\mu d+1$, respectively. Next, we prove a lower bound of $(\mu+1)d$ on the CR of any Any Fit packing algorithm, an improved lower bound of $2\mu d$ for Next Fit, and a lower bound of $2\mu$ for Move To Front in the 1-D case. All our bounds improve or match the best-known bounds for the 1-D case. Finally, we experimentally study the average-case performance of these algorithms on randomly generated synthetic data and observe that Move To Front outperforms the other Any Fit packing algorithms. Comment: 24 pages, to appear at SPAA 202
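    The Move To Front rule above is easy to state operationally. Below is a minimal, hypothetical Python sketch of Any Fit packing with the Move To Front heuristic for MinUsageTime DVBP; the Server class, the event handling, and the usage-time accounting are our own illustration under simplifying assumptions (jobs given as (arrival, duration, demand) tuples, each demand fitting in an empty server), not the paper's code.

```python
from dataclasses import dataclass, field

@dataclass(eq=False)
class Server:
    load: list                                     # current usage per dimension
    intervals: list = field(default_factory=list)  # (start, end) of jobs placed here

def fits(server, demand, capacity):
    return all(l + x <= c for l, x, c in zip(server.load, demand, capacity))

def move_to_front(jobs, capacity):
    """jobs: (arrival, duration, demand-vector) tuples; capacity: per-server
    capacity vector. Returns the total server usage time of the packing."""
    servers, active = [], []          # front of `servers` = most recently used
    for arrival, duration, demand in sorted(jobs):
        # Release jobs that have departed by this arrival.
        still_active = []
        for end, srv, dem in active:
            if end <= arrival:
                srv.load = [l - x for l, x in zip(srv.load, dem)]
            else:
                still_active.append((end, srv, dem))
        active = still_active
        # Any Fit rule: scan front to back, take the first server that fits.
        target = next((s for s in servers if fits(s, demand, capacity)), None)
        if target is None:
            target = Server(load=[0.0] * len(capacity))   # open a new server
        else:
            servers.remove(target)
        servers.insert(0, target)                         # Move To Front
        target.load = [l + x for l, x in zip(target.load, demand)]
        target.intervals.append((arrival, arrival + duration))
        active.append((arrival + duration, target, demand))
    # A server's usage time is the length of the union of its busy intervals.
    def union_length(intervals):
        total, frontier = 0.0, float("-inf")
        for start, end in sorted(intervals):
            total += max(0.0, end - max(start, frontier))
            frontier = max(frontier, end)
        return total
    return sum(union_length(s.intervals) for s in servers)

# e.g. move_to_front([(0, 5, [2, 1]), (1, 2, [3, 2]), (3, 4, [2, 2])], [4, 4]) -> 11.0
```

    For d = 1 this reduces to the classic MinUsageTime DBP setting studied by Kamali & Lopez-Ortiz.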

    Exploring the Tradeoff between Competitive Ratio and Variance in Online-Matching Markets

    In this paper, we propose an online-matching-based model to study the assignment problems arising in a wide range of online-matching markets, including online recommendations, ride-hailing platforms, and crowdsourcing markets. Its distinguishing feature is that each assignment can request a random set of resources and yield a random utility, and the two (resource cost and utility) can be arbitrarily correlated with each other. We present two linear-programming-based parameterized policies to study the tradeoff between the \emph{competitive ratio} (CR) on the total utility and the \emph{variance} on the total number of matches (unweighted version). The first (SAMP) samples an edge according to the distribution extracted from the clairvoyant optimal, while the second (ATT) features a time-adaptive attenuation framework that improves on the state-of-the-art competitive-ratio result. We also consider the problem under a large-budget assumption and show that SAMP achieves asymptotically optimal performance in terms of competitive ratio. Comment: This paper was accepted to the 18th Conference on Web and Internet Economics (WINE), 202
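    SAMP's central step, sampling an edge from the distribution extracted from the clairvoyant optimal, can be sketched roughly as follows. This is a hedged illustration rather than the paper's algorithm: lp_mass, arrival_rate, alpha, and the deterministic per-edge resource set and utility are our own simplifying assumptions (the model allows both to be random).

```python
import random

def samp_policy(arrival_type, lp_mass, arrival_rate, inventory, alpha=1.0):
    """Hypothetical sketch of an LP-guided sampling policy in the spirit of SAMP.

    lp_mass[edge]: mass the clairvoyant LP assigns to edge = (type, resources, utility);
    arrival_rate[t]: expected number of arrivals of type t;
    inventory: remaining units of each resource (mutated on a match).
    On an arrival of `arrival_type`, sample at most one incident edge with
    probability alpha * lp_mass[edge] / arrival_rate[type]; match it only if
    all requested resources are still available."""
    r, cumulative = random.random(), 0.0
    for edge, mass in lp_mass.items():
        etype, resources, utility = edge        # resources: tuple of names
        if etype != arrival_type:
            continue
        cumulative += alpha * mass / arrival_rate[arrival_type]
        if r < cumulative:
            if all(inventory.get(res, 0) >= 1 for res in resources):
                for res in resources:
                    inventory[res] -= 1          # consume the requested set
                return edge, utility
            return None, 0.0                     # sampled edge infeasible; skip
    return None, 0.0                             # no edge sampled this arrival
```

    Feasibility of the clairvoyant LP ensures that the sampling probabilities incident to each arrival type sum to at most one, so the loop defines a valid sub-distribution; alpha stands in for the tradeoff parameter of the parameterized policy.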

    A Framework for Approximate Optimization of BoT Application Deployment in Hybrid Cloud Environment

    We adopt a systematic approach to investigate the efficiency of near-optimal deployment of large-scale CPU-intensive Bag-of-Tasks (BoT) applications running on cloud resources with non-proportional cost-to-performance ratios. Our analytical solutions apply whether the running time of the given application is known or unknown, and aim to optimize the user's utility by choosing the most desirable tradeoff between the makespan and the total incurred expense. We propose a schema that provides a near-optimal deployment of a BoT application with respect to the user's preferences: the user is presented with a set of Pareto-optimal solutions and may select one of the possible scheduling points based on her internal utility function. Our framework can also cope with uncertainty in the tasks' execution times using two methods. First, we present an estimation method based on Monte Carlo sampling, called the AA algorithm, which uses the minimum possible number of samples to predict the average task running time. Second, assuming access to code analyzers, code profilers, or other estimation tools, we present a hybrid method that evaluates the accuracy of each estimation tool over certain time intervals to improve resource-allocation decisions. We propose approximate deployment strategies that run on a hybrid cloud. In essence, the proposed strategies first determine either an estimated or an exact optimal schema based on the information provided by the user and environmental parameters. Then, we exploit dynamic methods to assign tasks to resources so as to approach the optimal schema as closely as possible, using two methods: a fast yet simple method based on the First Fit Decreasing algorithm, and a more complex approach based on an approximate solution of the problem transformed into a subset-sum problem. Extensive experimental results conducted on a hybrid cloud platform confirm that our framework can deliver a near-optimal solution respecting the user's utility function.
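    The First Fit Decreasing step lends itself to a short sketch. The helper below is a minimal, hypothetical version that packs estimated task running times onto rented VMs with a fixed time budget; the actual framework additionally balances rental cost against makespan when choosing that budget.

```python
def first_fit_decreasing(task_times, vm_capacity):
    """task_times: estimated running times of the BoT tasks;
    vm_capacity: time budget of each rented VM (the makespan target).
    Returns a list of VMs, each a list of assigned task times."""
    vms = []
    for t in sorted(task_times, reverse=True):   # largest tasks first
        for vm in vms:
            if sum(vm) + t <= vm_capacity:       # first VM with enough room
                vm.append(t)
                break
        else:
            vms.append([t])                      # no VM fits: rent a new one
    return vms

# e.g. first_fit_decreasing([4, 8, 1, 4, 2, 1], vm_capacity=10)
# -> [[8, 2], [4, 4, 1, 1]]
```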

    Scheduling theory since 1981: an annotated bibliography


    Online Scheduling with Predictions

    Online scheduling is the process of allocating resources to tasks to achieve objectives under uncertain information about future conditions or task characteristics. This thesis presents a new online scheduling framework, called online scheduling with predictions, which uses predictions about unknowns to manage uncertainty in decision-making. It allows the predictions to be imperfect and contain errors, going beyond the traditional assumptions of either complete information (online clairvoyant scheduling) or zero information (online non-clairvoyant scheduling). The goal is to create algorithms that perform better with high-quality predictions while retaining bounded performance under poor predictions. The framework includes metrics such as consistency, robustness, and smoothness to evaluate algorithm performance, and we prove fundamental theorems that give tight lower bounds for these metrics. We apply the framework to central scheduling problems and cyber-physical-system applications, including minimizing makespan in uniform machine scheduling with job size predictions, minimizing mean response time in single and parallel identical machine scheduling with job size predictions, and maximizing energy output in pulsed power load scheduling with normal load predictions. Analysis and simulations show that this framework outperforms state-of-the-art methods by leveraging predictions.
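    To make the consistency/robustness tradeoff concrete, here is a hedged sketch of a standard technique from the scheduling-with-predictions literature: preferential time-sharing between a prediction-following rule (Shortest Predicted Job First) and Round Robin for mean response time on a single machine. It illustrates the kind of guarantee the framework targets, but it is not necessarily the thesis's own algorithm.

```python
def preferential_time_sharing(jobs, lam=0.75, dt=0.01):
    """jobs: list of (true_size, predicted_size), all released at time 0.
    A lam fraction of the processor follows Shortest Predicted Job First;
    the remaining 1 - lam fraction runs Round Robin, so accurate predictions
    help while no job finishes more than 1/(1-lam) later than under pure RR."""
    remaining = [size for size, _ in jobs]
    predicted = [pred for _, pred in jobs]
    finish = [None] * len(jobs)
    t = 0.0
    while any(f is None for f in finish):
        alive = [i for i, f in enumerate(finish) if f is None]
        favorite = min(alive, key=lambda i: predicted[i])        # SPJF pick
        share = {i: (1 - lam) * dt / len(alive) for i in alive}  # RR slice
        share[favorite] += lam * dt                              # SPJF slice
        t += dt
        for i in alive:
            remaining[i] -= share[i]
            if remaining[i] <= 1e-12:
                finish[i] = t
    return sum(finish) / len(finish)    # mean response time

# Perfect predictions push this toward shortest-job-first behavior; adversarial
# predictions keep it within a 1/(1-lam) factor of plain Round Robin.
```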

    Parallel Real-Time Scheduling for Latency-Critical Applications

    In order to provide safety guarantees or quality-of-service guarantees, many of today's systems consist of latency-critical applications, i.e. applications with timing constraints. The problem of scheduling multiple latency-critical jobs on a multiprocessor or multicore machine has been extensively studied for sequential (non-parallelizable) jobs, and different system models and objectives have been considered. However, the computational requirement of a single job is still limited by the capacity of a single core. To provide the increasingly complex functionality of applications and to complete their higher computational demands within the same or even more stringent timing constraints, we must exploit the internal parallelism of jobs, where individual jobs are parallel programs that can potentially utilize more than one core in parallel. However, there is little work on scheduling multiple parallel jobs that are latency-critical. This dissertation focuses on developing new scheduling strategies, analysis tools, and practical platform-design techniques to enable efficient and scalable parallel real-time scheduling for latency-critical applications on multicore systems. In particular, the research focuses on two types of systems: (1) static real-time systems for tasks with deadlines, where the temporal properties of the tasks that need to execute are known a priori and the goal is to guarantee the temporal correctness of the tasks prior to their execution; and (2) online systems for latency-critical jobs, where multiple jobs arrive over time and the goal is to optimize a performance objective of the jobs during execution. For static real-time systems with parallel tasks, several scheduling strategies, including global earliest deadline first, global rate monotonic, and a novel federated scheduling, are proposed, analyzed, and implemented. These scheduling strategies have the best known theoretical performance for parallel real-time tasks under any global strategy, any fixed-priority scheduling, and any scheduling strategy, respectively. In addition, federated scheduling is generalized to systems with multiple criticality levels and systems with stochastic tasks. Both numerical and empirical experiments show that federated scheduling and its variations have good schedulability performance and are efficient in practice. For online systems with multiple latency-critical jobs, different online scheduling strategies are proposed and analyzed for different objectives, including maximizing the number of jobs meeting a target latency, maximizing the profit of jobs, minimizing the maximum latency, and minimizing the average latency. For example, a simple First-In-First-Out scheduler is proven to be scalable for minimizing the maximum latency; based on this theoretical intuition, a more practical work-stealing scheduler is developed, analyzed, and implemented. Empirical evaluations indicate that, on both real-world and synthetic workloads, this work-stealing implementation performs almost as well as an optimal scheduler.
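    The federated scheduling strategy mentioned above admits a compact core-allocation rule, sketched here under the standard formulation: a parallel task with total work C, critical-path length L, and deadline D (with L < D) is high-utilization if C > D, in which case it receives ceil((C - L) / (D - L)) dedicated cores; the remaining low-utilization tasks can be scheduled sequentially on shared cores. The function name and the simplified two-way split are our own illustration.

```python
import math

def federated_core_allocation(tasks):
    """tasks: list of (work, span, deadline) triples with span < deadline.
    Returns ({task index: dedicated core count}, [low-utilization indices])."""
    dedicated, low_util = {}, []
    for i, (work, span, deadline) in enumerate(tasks):
        if work > deadline:
            # High-utilization task: enough dedicated cores to meet its deadline.
            dedicated[i] = math.ceil((work - span) / (deadline - span))
        else:
            # Low-utilization task: can run sequentially on shared cores.
            low_util.append(i)
    return dedicated, low_util

# e.g. federated_core_allocation([(100, 10, 20), (5, 1, 10)]) -> ({0: 9}, [1])
```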