Dynamic Vector Bin Packing for Online Resource Allocation in the Cloud
Several cloud-based applications, such as cloud gaming, rent servers to
execute jobs that arrive in an online fashion. Each job has a resource demand
and must be dispatched to a cloud server with enough resources to execute
it; the job departs upon completion. Under the `pay-as-you-go' billing
model, the server rental cost is proportional to the total time that servers
are actively running jobs. The problem of efficiently allocating a sequence of
online jobs to servers without exceeding the resource capacity of any server
while minimizing total server usage time can be modelled as a variant of the
dynamic bin packing problem (DBP), called MinUsageTime DBP.
In this work, we initiate the study of the problem with multi-dimensional
resource demands (e.g. CPU/GPU usage, memory requirement, bandwidth usage,
etc.), called MinUsageTime Dynamic Vector Bin Packing (DVBP). We study the
competitive ratio (CR) of Any Fit packing algorithms for this problem. We show
almost-tight bounds on the CR of three specific Any Fit packing algorithms,
namely First Fit, Next Fit, and Move To Front. We prove that the CR of Move To
Front is at most , where is the ratio of the max/min item
durations. For , this significantly improves the previously known upper
bound of (Kamali & Lopez-Ortiz, 2015). We then prove the CR of First
Fit and Next Fit are bounded by and , respectively.
Next, we prove a lower bound of on the CR of any Any Fit packing
algorithm, an improved lower bound of for Next Fit, and a lower bound
of for Move To Front in the 1-D case. All our bounds improve or match
the best-known bounds for the 1-D case. Finally, we experimentally study the
average-case performance of these algorithms on randomly generated synthetic
data, and observe that Move To Front outperforms other Any Fit packing
algorithms.
Comment: 24 pages, to appear at SPAA 202
Exploring the Tradeoff between Competitive Ratio and Variance in Online-Matching Markets
In this paper, we propose an online-matching-based model to study the
assignment problems arising in a wide range of online-matching markets,
including online recommendations, ride-hailing platforms, and crowdsourcing
markets. A distinctive feature is that each assignment can request a random
set of resources and yield a random utility, and the two (resource cost and
utility) can be arbitrarily correlated with each other. We present two
linear-programming-based
parameterized policies to study the tradeoff between the \emph{competitive
ratio} (CR) on the total utilities and the \emph{variance} on the total number
of matches (unweighted version). The first policy (SAMP) samples an edge
according to the distribution extracted from the clairvoyant optimal
solution, while the
second (ATT) features a time-adaptive attenuation framework that leads to an
improvement over the state-of-the-art competitive-ratio result. We also
consider the problem under a large-budget assumption and show that SAMP
achieves asymptotically optimal performance in terms of competitive ratio.
Comment: This paper was accepted to the 18th Conference on Web and Internet
Economics (WINE), 202
A Framework for Approximate Optimization of BoT Application Deployment in Hybrid Cloud Environment
We adopt a systematic approach to investigate the efficiency of near-optimal deployment of large-scale CPU-intensive Bag-of-Tasks (BoT) applications running on cloud resources with non-proportional cost-to-performance ratios. Our analytical solutions handle both known and unknown running times of the given application, and aim to optimize the user's utility by choosing the most desirable tradeoff between makespan and total incurred expense. We propose a scheme that provides a near-optimal deployment of a BoT application with respect to the user's preferences: the user is presented with a set of Pareto-optimal solutions and may then select one of the possible scheduling points based on her internal utility function. Our framework can also cope with uncertainty in task execution times using two methods. First, we present an estimation method based on Monte Carlo sampling, called the AA algorithm, which uses the minimum possible number of samples to predict the average task running time. Second, assuming access to code analysis, code profiling, or estimation tools, we present a hybrid method that evaluates the accuracy of each estimation tool over certain time intervals to improve resource allocation decisions. We further propose approximate deployment strategies that run on a hybrid cloud. In essence, the proposed strategies first determine either an estimated or an exact optimal schema based on the information provided by the user and on environmental parameters. They then exploit dynamic methods to assign tasks to resources, approaching the optimal schema as closely as possible in one of two ways: a fast yet simple method based on the First Fit Decreasing algorithm, and a more complex approach based on an approximate solution of the problem transformed into a subset-sum problem. Extensive experimental results conducted on a hybrid cloud platform confirm that our framework can deliver a near-optimal solution respecting the user's utility function.
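The fast assignment method named in the abstract above is the classical First Fit Decreasing heuristic. As a minimal sketch (not the framework's implementation; names are hypothetical), tasks are sorted by decreasing running time and each is placed on the first machine slot with enough remaining capacity:

```python
# Illustrative First Fit Decreasing (FFD) sketch: pack task running times
# into fixed-capacity bins (machine time slots), opening a new bin whenever
# no existing bin has room.

def first_fit_decreasing(task_times, capacity):
    """Returns a list of bins; each bin is a list of task times whose sum
    does not exceed `capacity`."""
    bins, remaining = [], []
    for t in sorted(task_times, reverse=True):  # largest tasks first
        for i, r in enumerate(remaining):
            if t <= r:                 # first bin with enough room
                bins[i].append(t)
                remaining[i] -= t
                break
        else:                          # no bin fits: open a new one
            bins.append([t])
            remaining.append(capacity - t)
    return bins
```

Sorting first is what distinguishes FFD from plain First Fit: placing the large tasks early leaves the small ones to fill leftover gaps, which is why FFD is a popular fast heuristic for near-optimal packings.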
Online Scheduling with Predictions
Online scheduling is the process of allocating resources to tasks to achieve objectives with uncertain information about future conditions or task characteristics. This thesis presents a new online scheduling framework named online scheduling with predictions. The framework uses predictions about unknowns to manage uncertainty in decision-making. It considers that the predictions may be imperfect and include errors, surpassing the traditional assumptions of either complete information in online clairvoyant scheduling or zero information in online non-clairvoyant scheduling. The goal is to create algorithms with predictions that perform better with quality predictions while having bounded performance with poor predictions. The framework includes metrics such as consistency, robustness, and smoothness to evaluate algorithm performance. We prove fundamental theorems that give tight lower bounds for these metrics. We apply the framework to central scheduling problems and cyber-physical system applications, including minimizing makespan in uniform machine scheduling with job size predictions, minimizing mean response time in single and parallel identical machine scheduling with job size predictions, and maximizing energy output in pulsed power load scheduling with normal load predictions. Analysis and simulations show that this framework outperforms state-of-the-art methods by leveraging predictions.
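One of the problems listed above, minimizing mean response time with job size predictions, can be illustrated with a tiny sketch (a simplified toy, not the thesis's algorithms): run jobs in increasing order of *predicted* size, then measure the mean response time that results from the *true* sizes. Accurate predictions recover the optimal shortest-job-first schedule; inaccurate ones degrade it gracefully:

```python
# Toy illustration of scheduling with (possibly erroneous) job size
# predictions: all jobs are released at t=0 and run nonpreemptively on one
# machine in increasing predicted-size order.

def mean_response_time(true_sizes, predicted_sizes):
    """Schedule by predicted size, charge completion times by true size,
    and return the mean response time."""
    order = sorted(range(len(true_sizes)), key=lambda i: predicted_sizes[i])
    clock, total = 0.0, 0.0
    for i in order:
        clock += true_sizes[i]   # job i occupies the machine for its true size
        total += clock           # response time = completion time (release at 0)
    return total / len(true_sizes)
```

With perfect predictions this is exactly shortest-job-first, which is optimal for mean response time in this setting; as prediction error grows the schedule order degrades, which is the behavior the consistency/robustness/smoothness metrics are designed to quantify.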
Parallel Real-Time Scheduling for Latency-Critical Applications
In order to provide safety guarantees or quality of service guarantees, many of today's systems consist of latency-critical applications, e.g. applications with timing constraints. The problem of scheduling multiple latency-critical jobs on a multiprocessor or multicore machine has been extensively studied for sequential (non-parallelizable) jobs, and different system models and objectives have been considered. However, the computational requirement of a single job is still limited by the capacity of a single core. To provide increasingly complex application functionality and to complete higher computational demands within the same or even more stringent timing constraints, we must exploit the internal parallelism of jobs, where individual jobs are parallel programs that can potentially utilize more than one core in parallel. However, there is little work considering the scheduling of multiple parallel jobs that are latency-critical.
This dissertation focuses on developing new scheduling strategies, analysis tools, and practical platform design techniques to enable efficient and scalable parallel real-time scheduling for latency-critical applications on multicore systems. In particular, the research focuses on two types of systems: (1) static real-time systems for tasks with deadlines, where the temporal properties of the tasks that need to execute are known a priori and the goal is to guarantee the temporal correctness of the tasks prior to their execution; and (2) online systems for latency-critical jobs, where multiple jobs arrive over time and the goal is to optimize a performance objective of the jobs during execution.
For static real-time systems for parallel tasks, several scheduling strategies, including global earliest deadline first, global rate monotonic and a novel federated scheduling, are proposed, analyzed and implemented. These scheduling strategies have the best known theoretical performance for parallel real-time tasks under any global strategy, any fixed priority scheduling and any scheduling strategy, respectively. In addition, federated scheduling is generalized to systems with multiple criticality levels and systems with stochastic tasks. Both numerical and empirical experiments show that federated scheduling and its variations have good schedulability performance and are efficient in practice.
For online systems with multiple latency-critical jobs, different online scheduling strategies are proposed and analyzed for different objectives, including maximizing the number of jobs meeting a target latency, maximizing the profit of jobs, minimizing the maximum latency, and minimizing the average latency. For example, a simple First-In-First-Out scheduler is proven to be scalable for minimizing the maximum latency. Based on this theoretical intuition, a more practical work-stealing scheduler is developed, analyzed, and implemented. Empirical evaluations indicate that, on both real-world and synthetic workloads, this work-stealing implementation performs almost as well as an optimal scheduler.
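The First-In-First-Out strategy mentioned above can be sketched for the simplified sequential-job case (the dissertation treats parallel jobs; this toy, with hypothetical names, ignores intra-job parallelism): jobs are started strictly in arrival order, each on the machine that becomes free earliest, and we track the worst latency.

```python
import heapq

# Toy FIFO scheduler on m identical machines for sequential jobs.
# jobs: (arrival, size) pairs in arrival order; returns the maximum
# latency (completion time minus arrival time) over all jobs.

def fifo_max_latency(jobs, machines):
    free_at = [0.0] * machines   # min-heap of machine free times
    heapq.heapify(free_at)
    worst = 0.0
    for arrival, size in jobs:
        # FIFO: the next job in arrival order starts as soon as the
        # earliest machine frees up (never reordered past later jobs).
        start = max(arrival, heapq.heappop(free_at))
        finish = start + size
        heapq.heappush(free_at, finish)
        worst = max(worst, finish - arrival)
    return worst
```

The appeal of FIFO for the maximum-latency objective is visible even in this sketch: no job can be starved by later arrivals, so the oldest waiting job is always served next.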