    Opportunistic Scheduling as Restless Bandits

    In this paper we consider energy-efficient scheduling in a multiuser setting where each user has a finite-sized queue and there is a cost associated with holding packets (jobs) in each queue (modeling the delay constraints). The packets of each user need to be sent over a common channel. The channel qualities seen by the users are time-varying and differ across users; also, the cost incurred, i.e., energy consumed, in packet transmission is a function of the channel quality. We pose the problem as an average cost Markov Decision Problem, and prove that this problem is Whittle indexable. Based on this result, we propose an algorithm in which the Whittle index of each user is computed and the user who has the lowest value is selected for transmission. We evaluate the performance of this algorithm via simulations and show that it achieves a lower average cost than the Maximum Weight Scheduling and Weighted Fair Scheduling strategies. Comment: 10 pages, 7 figures
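
    The selection step described in this abstract (compute one index per user, transmit for the user with the lowest value) can be sketched as follows. This is only an illustrative sketch: the function name `select_user`, the user labels, and the index values are hypothetical, and computing the Whittle indices themselves (which depend on each user's queue length and channel state) is not shown.

```python
from typing import Dict, Hashable


def select_user(indices: Dict[Hashable, float]) -> Hashable:
    """Return the user whose index is smallest.

    Ties go to the user that appears first in the dictionary.
    """
    if not indices:
        raise ValueError("no users to schedule")
    return min(indices, key=indices.get)


# Hypothetical index values for one time slot (assumed, not from the paper).
slot_indices = {"user_a": 0.8, "user_b": -0.3, "user_c": 1.2}
chosen = select_user(slot_indices)
```

    In a full scheduler this selection would be repeated in every slot after the per-user indices are refreshed from the current queue and channel states.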

    Whittle Indexability in Egalitarian Processor Sharing Systems

    The egalitarian processor sharing model is viewed as a restless bandit and its Whittle indexability is established. A numerical scheme for computing the Whittle indices is provided, along with supporting numerical experiments. Comment: 27 pages, 6 figures

    Distributed Server Allocation for Content Delivery Networks

    We propose a dynamic formulation of file-sharing networks in terms of an average cost Markov decision process with constraints. By analyzing a Whittle-like relaxation thereof, we propose an index policy in the spirit of Whittle and compare it by simulations with other natural heuristics. Comment: 22 pages, 10 figures

    Optimal Energy-Efficient Policies for Data Centers through Sensitivity-Based Optimization

    In this paper, we propose a novel dynamic decision method by applying the sensitivity-based optimization theory to find the optimal energy-efficient policy of a data center with two groups of heterogeneous servers. Servers in Group 1 always work at high energy consumption, while servers in Group 2 may either work at high energy consumption or sleep at low energy consumption. An energy-efficient control policy determines the switch between work and sleep states of servers in Group 2 in a dynamic way. Since servers in Group 1 are always working with high priority to jobs, a transfer rule is proposed to migrate the jobs in Group 2 to idle servers in Group 1. To find the optimal energy-efficient policy, we set up a policy-based Poisson equation, and provide explicit expressions for its unique solution of performance potentials by means of the RG-factorization. Based on this, we characterize monotonicity and optimality of the long-run average profit with respect to the policies under different service prices. We prove that the bang-bang control is always optimal for this optimization problem, i.e., we should either keep all servers asleep or turn on the servers such that the number of working servers equals that of waiting jobs in Group 2. As an easy adoption of policy forms, we further study the threshold-type policy and obtain a necessary condition of the optimal threshold policy. We hope the methodology and results derived in this paper can shed light on the study of more general energy-efficient data centers. Comment: 50 pages, 3 figures. A paper that discusses the energy-efficient policy of a data center with the scheduling of 2 different groups of servers

    A verification theorem for threshold-indexability of real-state discounted restless bandits

    The Whittle index, which characterizes optimal policies for controlling certain single restless bandit projects (a Markov decision process with two actions: active and passive), is the basis for a widely used heuristic index policy for the intractable restless multiarmed bandit problem. Yet two roadblocks need to be overcome to apply such a policy: the individual projects in the model at hand must be shown to be indexable, so that they possess a Whittle index; and the index must be evaluated. Such roadblocks can be especially vexing when project state spaces are real intervals, as in recent sensor scheduling applications. This paper presents sufficient conditions for indexability (relative to a generalized Whittle index) of general real-state discrete-time restless bandits under the discounted criterion, which are not based on elucidating properties of the optimal value function and do not require proving beforehand optimality of threshold policies as in prevailing approaches. The main contribution is a verification theorem establishing that, if project performance metrics under threshold policies and an explicitly defined marginal productivity (MP) index satisfy three conditions, then the project is indexable with its generalized Whittle index being given by the MP index, and threshold policies are optimal for dynamic project control. Comment: 1 figure. arXiv admin note: substantial text overlap with arXiv:1512.0440
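
    For readers unfamiliar with the term, indexability can be stated roughly as follows. This is a textbook-style formulation given under assumed notation ($\lambda$ for the price charged per unit of active work, $\lambda^*(x)$ for the index of state $x$); it is not the generalized definition or the three conditions of the paper, which the abstract does not spell out.

```latex
% Single project with state x and actions a \in \{0 \text{ (passive)}, 1 \text{ (active)}\}.
% Charge a price \lambda for each period in which the active action is used.
% The project is indexable if there is a function \lambda^*(x) such that,
% for every price \lambda and every state x,
\[
  \text{the active action is optimal in state } x
  \iff \lambda \le \lambda^*(x) .
\]
% The function \lambda^* is then called the Whittle index of the project.
```

    When $\lambda^*$ is monotone in the state, the optimal policies for the priced single-project problem are threshold policies, which is the link to the threshold policies mentioned in the abstract.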

    Optimal Routing for Delay-Sensitive Traffic in Overlay Networks

    We design dynamic routing policies for an overlay network which meet delay requirements of real-time traffic being served on top of an underlying legacy network, where the overlay nodes do not know the underlay characteristics. We pose the problem as a constrained MDP, and show that when the underlay implements static policies such as FIFO with randomized routing, then a decentralized policy, which can be computed efficiently in a distributed fashion, is optimal. Our algorithm utilizes multi-timescale stochastic approximation techniques, and its convergence relies on the fact that the recursions asymptotically track a nonlinear differential equation, namely the replicator equation. Extensive simulations show that the proposed policy indeed outperforms the existing policies.
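
    For reference, the replicator equation named in this abstract has the following standard form. The symbols $x_i$ (the share of probability mass on option $i$) and $f_i$ (its payoff or fitness) are assumed notation; the abstract does not specify the particular payoff functions that arise in the routing problem.

```latex
\[
  \dot{x}_i(t) \;=\; x_i(t)\,\Bigl( f_i\bigl(x(t)\bigr) - \sum_{j} x_j(t)\, f_j\bigl(x(t)\bigr) \Bigr),
  \qquad x_i \ge 0,\quad \sum_i x_i = 1 .
\]
% Options whose payoff exceeds the population average gain share;
% those below the average lose share, and the simplex is preserved.
```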

    Channels, Remote Estimation and Queueing Systems With A Utilization-Dependent Component: A Unifying Survey Of Recent Results

    In this article, we survey the main models, techniques, concepts, and results centered on the design and performance evaluation of engineered systems that rely on a utilization-dependent component (UDC) whose operation may depend on its usage history or assigned workload. Specifically, we report on research themes concentrating on the characterization of the capacity of channels and the design with performance guarantees of remote estimation and queueing systems. Causes for the dependency of a UDC on past utilization include the use of replenishable energy sources to power the transmission of information among the sub-components of a networked system, and the assistance of a human operator for servicing a queue. Our analysis unveils the similarity of the UDC models typically adopted in each of the research themes, and it reveals the differences in the objectives and technical approaches employed. We also identify new challenges and future research directions inspired by the cross-pollination among the central concepts, techniques, and problem formulations of the research themes discussed.

    A Verification Theorem for Threshold-Indexability of Real-State Discounted Restless Bandits

    This paper presents sufficient conditions for indexability (existence of the Whittle index) of general real-state discrete-time restless bandit projects under the discounted optimality criterion, which are not based on dynamic programming and do not require establishing first optimality of threshold policies as in prevailing approaches. The main contribution is a verification theorem establishing that, if project performance metrics under threshold policies and an explicitly defined marginal productivity (MP) index satisfy three conditions, then the project is indexable with its Whittle index being given by the MP index, in a form implying optimality of threshold policies for dynamic project control. Further contributions include characterizations of the index as a Radon-Nikodym derivative and as a shadow price, and a recursive index-computing scheme. Comment: 2 figures; under review

    A Gradient-Aware Search Algorithm for Constrained Markov Decision Processes

    The canonical solution methodology for finite constrained Markov decision processes (CMDPs), where the objective is to maximize the expected infinite-horizon discounted rewards subject to the expected infinite-horizon discounted costs constraints, is based on convex linear programming. In this brief, we first prove that the optimization objective in the dual linear program of a finite CMDP is a piecewise linear convex (PWLC) function with respect to the Lagrange penalty multipliers. Next, we propose a novel two-level Gradient-Aware Search (GAS) algorithm which exploits the PWLC structure to find the optimal state-value function and Lagrange penalty multipliers of a finite CMDP. The proposed algorithm is applied in two stochastic control problems with constraints: robot navigation in a grid world and solar-powered unmanned aerial vehicle (UAV)-based wireless network management. We empirically compare the convergence performance of the proposed GAS algorithm with binary search (BS), Lagrangian primal-dual optimization (PDO), and Linear Programming (LP). Compared with benchmark algorithms, it is shown that the proposed GAS algorithm converges to the optimal solution faster, does not require hyper-parameter tuning, and is not sensitive to initialization of the Lagrange penalty multiplier. Comment: Submitted as a brief paper to the IEEE TNNLS

    A numerical scheme for a mean field game in some queueing systems based on Markov chain approximation method

    We use the Markov chain approximation method to construct approximations for the solution of the mean field game (MFG) with reflecting barriers studied in Bayraktar, Budhiraja, and Cohen (2017). The MFG is formulated in terms of a controlled reflected diffusion with a cost function that depends on the reflection terms in addition to the standard variables: state, control, and the mean field term. This MFG arises from the asymptotic analysis of an N-player game for single server queues with strategic servers. By showing that our scheme is an almost contraction, we establish the convergence of this numerical scheme over a small time interval. Comment: arXiv admin note: text overlap with arXiv:1605.0901