Scheduling for Cloud-Based Computing Systems to Support Soft Real-Time Applications
Cloud-based computing infrastructure provides an efficient means to support
real-time processing workloads, e.g., virtualized base station processing and
collaborative video conferencing. This paper addresses resource allocation for
a computing system with multiple resources supporting heterogeneous soft
real-time applications subject to Quality of Service (QoS) constraints on
failures to meet processing deadlines. We develop a general outer bound on the
feasible QoS region for non-clairvoyant resource allocation policies, and an
inner bound for a natural class of policies based on dynamically prioritizing
applications' tasks by favoring those with the largest (QoS) deficits. This
provides an avenue to study the efficiency of two natural resource allocation
policies: (1) priority-based greedy task scheduling for applications with
variable workloads, and (2) priority-based task selection and optimal
scheduling for applications with deterministic workloads. The near-optimality
of these simple policies emerges when task processing deadlines are relatively
large and/or when the number of compute resources is large. Analysis and
simulations show substantial resource savings for such policies over
reservation-based designs.
Comment: This is an extended version of this pape
Efficiency and Optimality of Largest Deficit First Prioritization: Resource Allocation for Real-Time Applications
An increasing number of real-time applications with compute and/or
communication deadlines are being supported on shared infrastructure. Such
applications can often tolerate occasional deadline violations without
substantially impacting their Quality of Service (QoS). A fundamental problem
in such systems is deciding how to allocate shared resources so as to meet
applications' QoS requirements. A simple framework to address this problem is
to (1) dynamically prioritize users as a possibly complex function of their
deficits (the difference between achieved and required QoS), and (2) allocate
resources so as to expedite users with higher priority. This paper focuses on a general
class of systems using such priority-based resource allocation. We first
characterize the set of feasible QoS requirements and show the optimality of
max weight-like prioritization. We then consider simple weighted Largest
Deficit First (w-LDF) prioritization policies, where users with higher weighted
QoS deficits are given higher priority. The paper gives an inner bound for the
feasible set under w-LDF policies, and, under an additional monotonicity
assumption, characterizes its geometry, leading to a sufficient condition for
optimality. Additional insights on the efficiency ratio of w-LDF policies, the
optimality of hierarchical-LDF, and the characterization of failure clustering
are also discussed.
Comment: This is an extended version of this pape
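The weighted Largest Deficit First idea described in this abstract can be illustrated with a minimal sketch. It is not the paper's model: it assumes a slotted system in which every user submits one task per period, a hypothetical `capacity` gives the number of tasks that can finish by the deadline in each period, and the deficit is taken to be the cumulative shortfall (required completions so far minus achieved completions). The names `User`, `wldf_order`, and `run_period` are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class User:
    name: str
    weight: float        # hypothetical w-LDF weight for this user
    required_qos: float  # assumed target: long-run fraction of deadlines met
    met: int = 0         # tasks that met their deadline so far
    total: int = 0       # tasks generated so far

    def deficit(self) -> float:
        # Assumed deficit: required completions so far minus achieved ones.
        return self.required_qos * self.total - self.met


def wldf_order(users):
    """Return users sorted by decreasing weighted deficit (ties keep input order)."""
    return sorted(users, key=lambda u: u.weight * u.deficit(), reverse=True)


def run_period(users, capacity):
    """Serve the `capacity` highest-priority users; the rest miss this period's deadline."""
    served = wldf_order(users)[:capacity]
    served_names = {u.name for u in served}
    for u in users:
        u.total += 1
        if u.name in served_names:
            u.met += 1
    return [u.name for u in served]
```

Under these assumptions, a user that has recently been served sees its weighted deficit drop and falls back in the ordering, while a user that keeps missing deadlines accumulates deficit and eventually rises to the top. The general framework in the abstract allows priorities to be a more complex function of the deficits than the simple product used here.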