Opportunistic Scheduling as Restless Bandits
In this paper we consider energy efficient scheduling in a multiuser setting
where each user has a finite sized queue and there is a cost associated with
holding packets (jobs) in each queue (modeling the delay constraints). The
packets of each user need to be sent over a common channel. The channel
qualities seen by the users are time-varying and differ across users; also, the
cost incurred, i.e., energy consumed, in packet transmission is a function of
the channel quality. We pose the problem as an average cost Markov Decision
Problem, and prove that this problem is Whittle Indexable. Based on this
result, we propose an algorithm in which the Whittle index of each user is
computed and the user who has the lowest value is selected for transmission. We
evaluate the performance of this algorithm via simulations and show that it
achieves a lower average cost than the Maximum Weight Scheduling and Weighted
Fair Scheduling strategies.
Comment: 10 pages, 7 figures
Whittle Indexability in Egalitarian Processor Sharing Systems
The egalitarian processor sharing model is viewed as a restless bandit and
its Whittle indexability is established. A numerical scheme for computing the
Whittle indices is provided, along with supporting numerical experiments.
Comment: 27 pages, 6 figures
Distributed Server Allocation for Content Delivery Networks
We propose a dynamic formulation of file-sharing networks in terms of an
average cost Markov decision process with constraints. By analyzing a
Whittle-like relaxation thereof, we propose an index policy in the spirit of
Whittle and compare it by simulations with other natural heuristics.
Comment: 22 pages, 10 figures
Optimal Energy-Efficient Policies for Data Centers through Sensitivity-Based Optimization
In this paper, we propose a novel dynamic decision method by applying the
sensitivity-based optimization theory to find the optimal energy-efficient
policy of a data center with two groups of heterogeneous servers. Servers in
Group 1 always work at high energy consumption, while servers in Group 2 may
either work at high energy consumption or sleep at low energy consumption. An
energy-efficient control policy determines the switch between work and sleep
states of servers in Group 2 in a dynamic way. Since servers in Group 1 are
always working with high priority to jobs, a transfer rule is proposed to
migrate the jobs in Group 2 to idle servers in Group 1. To find the optimal
energy-efficient policy, we set up a policy-based Poisson equation, and provide
explicit expressions for its unique solution of performance potentials by means
of the RG-factorization. Based on this, we characterize monotonicity and
optimality of the long-run average profit with respect to the policies under
different service prices. We prove that the bang-bang control is always optimal
for this optimization problem, i.e., we should either keep all servers asleep or
turn on the servers such that the number of working servers equals that of
waiting jobs in Group 2. As a policy form that is easy to adopt, we further study
the threshold-type policy and obtain a necessary condition of the optimal
threshold policy. We hope the methodology and results derived in this paper can
shed light on the study of more general energy-efficient data centers.
Comment: 50 pages, 3 figures. The paper discusses the energy-efficient policy of
a data center with scheduling of two different groups of servers
A verification theorem for threshold-indexability of real-state discounted restless bandits
The Whittle index, which characterizes optimal policies for controlling
certain single restless bandit projects (a Markov decision process with two
actions: active and passive) is the basis for a widely used heuristic index
policy for the intractable restless multiarmed bandit problem. Yet two
roadblocks need to be overcome to apply such a policy: the individual projects
in the model at hand must be shown to be indexable, so that they possess a
Whittle index; and the index must be evaluated. Such roadblocks can be
especially vexing when project state spaces are real intervals, as in recent
sensor scheduling applications. This paper presents sufficient conditions for
indexability (relative to a generalized Whittle index) of general real-state
discrete-time restless bandits under the discounted criterion, which are not
based on elucidating properties of the optimal value function and do not
require proving beforehand optimality of threshold policies as in prevailing
approaches. The main contribution is a verification theorem establishing that,
if project performance metrics under threshold policies and an explicitly
defined marginal productivity (MP) index satisfy three conditions, then the
project is indexable with its generalized Whittle index being given by the MP
index, and threshold policies are optimal for dynamic project control.
Comment: 1 figure. arXiv admin note: substantial text overlap with
arXiv:1512.0440
Optimal Routing for Delay-Sensitive Traffic in Overlay Networks
We design dynamic routing policies for an overlay network which meet delay
requirements of real-time traffic being served on top of an underlying legacy
network, where the overlay nodes do not know the underlay characteristics. We
pose the problem as a constrained MDP, and show that when the underlay
implements static policies such as FIFO with randomized routing, then a
decentralized policy, that can be computed efficiently in a distributed
fashion, is optimal. Our algorithm utilizes multi-timescale stochastic
approximation techniques, and its convergence relies on the fact that the
recursions asymptotically track a nonlinear differential equation, namely the
replicator equation. Extensive simulations show that the proposed policy indeed
outperforms the existing policies.
Channels, Remote Estimation and Queueing Systems With A Utilization-Dependent Component: A Unifying Survey Of Recent Results
In this article, we survey the main models, techniques, concepts, and results
centered on the design and performance evaluation of engineered systems that
rely on a utilization-dependent component (UDC) whose operation may depend on
its usage history or assigned workload. Specifically, we report on research
themes concentrating on the characterization of the capacity of channels and
the design with performance guarantees of remote estimation and queueing
systems. Causes for the dependency of a UDC on past utilization include the use
of replenishable energy sources to power the transmission of information among
the sub-components of a networked system, and the assistance of a human
operator for servicing a queue. Our analysis unveils the similarity of the UDC
models typically adopted in each of the research themes, and it reveals the
differences in the objectives and technical approaches employed. We also
identify new challenges and future research directions inspired by the
cross-pollination among the central concepts, techniques, and problem
formulations of the research themes discussed.
A Verification Theorem for Threshold-Indexability of Real-State Discounted Restless Bandits
This paper presents sufficient conditions for indexability (existence of the
Whittle index) of general real-state discrete-time restless bandit projects
under the discounted optimality criterion, which are not based on dynamic
programming and do not require establishing first optimality of threshold
policies as in prevailing approaches. The main contribution is a verification
theorem establishing that, if project performance metrics under threshold
policies and an explicitly defined marginal productivity (MP) index satisfy
three conditions, then the project is indexable with its Whittle index being
given by the MP index, in a form implying optimality of threshold policies for
dynamic project control. Further contributions include characterizations of the
index as a Radon-Nikodym derivative and as a shadow price, and a recursive
index-computing scheme.
Comment: 2 figures; under review
A Gradient-Aware Search Algorithm for Constrained Markov Decision Processes
The canonical solution methodology for finite constrained Markov decision
processes (CMDPs), where the objective is to maximize the expected
infinite-horizon discounted rewards subject to constraints on the expected
infinite-horizon discounted costs, is based on convex linear programming. In this
brief, we first prove that the optimization objective in the dual linear
program of a finite CMDP is a piece-wise linear convex function (PWLC) with
respect to the Lagrange penalty multipliers. Next, we propose a novel two-level
Gradient-Aware Search (GAS) algorithm which exploits the PWLC structure to find
the optimal state-value function and Lagrange penalty multipliers of a finite
CMDP. The proposed algorithm is applied in two stochastic control problems with
constraints: robot navigation in a grid world and solar-powered unmanned aerial
vehicle (UAV)-based wireless network management. We empirically compare the
convergence performance of the proposed GAS algorithm with binary search (BS),
Lagrangian primal-dual optimization (PDO), and Linear Programming (LP).
Compared with benchmark algorithms, it is shown that the proposed GAS algorithm
converges to the optimal solution faster, does not require hyper-parameter
tuning, and is not sensitive to initialization of the Lagrange penalty
multiplier.
Comment: Submitted as a brief paper to the IEEE TNNL
A numerical scheme for a mean field game in some queueing systems based on Markov chain approximation method
We use the Markov chain approximation method to construct approximations for
the solution of the mean field game (MFG) with reflecting barriers studied in
Bayraktar, Budhiraja, and Cohen (2017). The MFG is formulated in terms of a
controlled reflected diffusion with a cost function that depends on the
reflection terms in addition to the standard variables: state, control, and the
mean field term. This MFG arises from the asymptotic analysis of an $N$-player
game for single server queues with strategic servers. By showing that our
scheme is an almost contraction, we establish the convergence of this numerical
scheme over a small time interval.
Comment: arXiv admin note: text overlap with arXiv:1605.0901