Dynamic Service Rate Control for a Single Server Queue with Markov Modulated Arrivals
We consider the problem of service rate control of a single server queueing
system with a finite-state Markov-modulated Poisson arrival process. We show
that the optimal service rate is non-decreasing in the number of customers in
the system; higher congestion levels warrant higher service rates. In
contrast, we show that the optimal service rate is not necessarily
monotone in the current arrival rate. If the modulating process satisfies a
stochastic monotonicity property, the monotonicity is recovered. We examine
several heuristics and show where heuristics are reasonable substitutes for the
optimal control. None of the heuristics performs well in all regimes.
We also discuss when the Markov-modulated Poisson process with service
rate control can act as a heuristic itself to approximate the control of a
system with a periodic non-homogeneous Poisson arrival process. Not only is the
current model of interest in the control of Internet or mobile networks with
bursty traffic, but it is also useful in providing a tractable alternative for
the control of service centers with non-stationary arrival rates.
Comment: 32 pages, 7 figures
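For readers unfamiliar with the arrival model named in the abstract above, the following is a minimal sketch of a two-state Markov-modulated Poisson process. It is not taken from the paper: the function name `simulate_mmpp`, the restriction to two states, and the example rates are all assumptions made for illustration. It relies on the standard competing-exponentials construction, in which the next event is an arrival with probability equal to the arrival rate divided by the total event rate.

```python
import random


def simulate_mmpp(rates, switch_rates, horizon, seed=0):
    """Return sorted arrival times of a two-state MMPP on [0, horizon).

    rates[i]        -- Poisson arrival rate while the modulating chain is in state i
    switch_rates[i] -- rate at which the chain leaves state i (it jumps to 1 - i)
    All names and the two-state restriction are illustrative assumptions.
    """
    rng = random.Random(seed)
    t, state, arrivals = 0.0, 0, []
    while True:
        total = rates[state] + switch_rates[state]
        if total == 0:
            break  # absorbing state with no arrivals: nothing more can happen
        # Time to the next event (arrival or state switch) is exponential(total).
        t += rng.expovariate(total)
        if t >= horizon:
            break
        # The event is an arrival with probability rates[state] / total.
        if rng.random() < rates[state] / total:
            arrivals.append(t)
        else:
            state = 1 - state
    return arrivals


# Hypothetical bursty source: quiet state (rate 1) and busy state (rate 10).
arrival_times = simulate_mmpp(rates=(1.0, 10.0), switch_rates=(0.5, 0.5), horizon=50.0)
```

A service rate controller of the kind the abstract studies would observe both the queue length and the current modulating state; this sketch only generates the arrival stream.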
Towards Optimality in Parallel Scheduling
To keep pace with Moore's law, chip designers have focused on increasing the
number of cores per chip rather than single core performance. In turn, modern
jobs are often designed to run on any number of cores. However, to effectively
leverage these multi-core chips, one must address the question of how many
cores to assign to each job. Given that jobs receive sublinear speedups from
additional cores, there is an obvious tradeoff: allocating more cores to an
individual job reduces the job's runtime, but in turn decreases the efficiency
of the overall system. We ask how the system should schedule jobs across cores
so as to minimize the mean response time over a stream of incoming jobs.
To answer this question, we develop an analytical model of jobs running on a
multi-core machine. We prove that EQUI, a policy which continuously divides
cores evenly across jobs, is optimal when all jobs follow a single speedup
curve and have exponentially distributed sizes. EQUI requires jobs to change
their level of parallelization while they run. Since this is not possible for
all workloads, we consider a class of "fixed-width" policies, which choose a
single level of parallelization, k, to use for all jobs. We prove that,
surprisingly, it is possible to achieve EQUI's performance without requiring
jobs to change their levels of parallelization by using the optimal fixed level
of parallelization, k*. We also show how to analytically derive the optimal k*
as a function of the system load, the speedup curve, and the job size
distribution.
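The tradeoff between per-job speed and system efficiency described above can be sketched as follows. This is an illustration under stated assumptions, not the paper's model: the power-law speedup `cores ** p` with `0 < p < 1` is an assumed sublinear curve, the helper names are hypothetical, and the rule that jobs beyond `num_cores // k` wait with zero cores is one simple reading of a fixed-width policy.

```python
def speedup(cores, p=0.5):
    """Assumed sublinear speedup curve: service rate of a job on `cores` cores."""
    return cores ** p


def equi_allocation(num_cores, num_jobs):
    """EQUI: divide the cores evenly across all jobs currently in the system."""
    if num_jobs == 0:
        return []
    return [num_cores / num_jobs] * num_jobs


def fixed_width_allocation(num_cores, num_jobs, k):
    """Fixed-width: every running job gets exactly k cores.

    Assumption: at most num_cores // k jobs run at once; the rest wait with 0 cores.
    """
    running = min(num_jobs, num_cores // k)
    return [k] * running + [0] * (num_jobs - running)


def total_rate(allocation, p=0.5):
    """Total rate at which the system completes work under an allocation."""
    return sum(speedup(c, p) for c in allocation if c > 0)


# Hypothetical system: 16 cores, 4 jobs, square-root speedup.
equi = equi_allocation(16, 4)               # 4 cores per job
wide = fixed_width_allocation(16, 4, k=16)  # one job takes every core
narrow = fixed_width_allocation(16, 4, k=4)
```

In this toy setting, giving all 16 cores to a single job makes that job run faster but halves the total completion rate compared with EQUI, while the fixed width `k = 4` happens to match EQUI's allocation when exactly four jobs are present.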
In the case where jobs may follow different speedup curves, finding a good
scheduling policy is even more challenging. We find that policies like EQUI,
which performed well in the case of a single speedup function, now perform
poorly. We propose a very simple policy, GREEDY*, which performs near-optimally
when compared to the numerically-derived optimal policy.
A Hierarchical Framework of Cloud Resource Allocation and Power Management Using Deep Reinforcement Learning
Automatic decision-making approaches, such as reinforcement learning (RL),
have been applied to (partially) solve the resource allocation problem
adaptively in the cloud computing system. However, a complete cloud resource
allocation framework exhibits high dimensions in state and action spaces, which
limit the usefulness of traditional RL techniques. In addition, high power
consumption has become one of the critical concerns in the design and control of
cloud computing systems, which degrades system reliability and increases
cooling cost. An effective dynamic power management (DPM) policy should
minimize power consumption while maintaining performance degradation within an
acceptable level. Thus, a joint virtual machine (VM) resource allocation and
power management framework is critical to the overall cloud computing system.
Moreover, a novel solution framework is necessary to address the even higher
dimensions in state and action spaces. In this paper, we propose a novel
hierarchical framework for solving the overall resource allocation and power
management problem in cloud computing systems. The proposed hierarchical
framework comprises a global tier for VM resource allocation to the servers and
a local tier for distributed power management of local servers. The emerging
deep reinforcement learning (DRL) technique, which can deal with complicated
control problems with large state space, is adopted to solve the global tier
problem. Furthermore, an autoencoder and a novel weight sharing structure are
adopted to handle the high-dimensional state space and accelerate the
convergence speed. On the other hand, the local tier of distributed server
power management comprises an LSTM-based workload predictor and a model-free
RL-based power manager, operating in a distributed manner.
Comment: accepted by 37th IEEE International Conference on Distributed
Computing (ICDCS 2017)