
    The Value-of-Information in Matching with Queues

    We consider the problem of \emph{optimal matching with queues} in dynamic systems and investigate the value-of-information. In such systems, operators match tasks and resources stored in queues, with the objective of maximizing the system utility of the matching reward profile minus the average matching cost. This problem appears in many practical systems; the main challenges are the no-underflow constraints and the lack of matching-reward information and system-dynamics statistics. We develop two online matching algorithms, Learning-aided Reward optimAl Matching ($\mathtt{LRAM}$) and Dual-$\mathtt{LRAM}$ ($\mathtt{DRAM}$), to effectively resolve both challenges. Both algorithms are equipped with a learning module for estimating the matching-reward information, while $\mathtt{DRAM}$ incorporates an additional module for learning the system dynamics. We show that both algorithms achieve an $O(\epsilon+\delta_r)$ close-to-optimal utility performance for any $\epsilon>0$, while $\mathtt{DRAM}$ achieves a faster convergence speed and a better delay compared to $\mathtt{LRAM}$, i.e., $O(\delta_z/\epsilon + \log^2(1/\epsilon))$ delay and $O(\delta_z/\epsilon)$ convergence under $\mathtt{DRAM}$ compared to $O(1/\epsilon)$ delay and convergence under $\mathtt{LRAM}$ ($\delta_r$ and $\delta_z$ are the maximum estimation errors for the reward and the system dynamics). Our results reveal that information about different system components can play very different roles in algorithm performance, and they provide a systematic way to design joint learning-control algorithms for dynamic systems.
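
    As a rough illustration of the drift-plus-penalty flavor of such learning-aided matching, the sketch below implements a single matching step that combines an empirical-mean reward estimate with queue backlogs under a no-underflow constraint. All names and parameters (V, match_cost, the Gaussian reward model) are illustrative assumptions, not the paper's actual LRAM/DRAM specification.

```python
import random

# Illustrative sketch only: one learning-aided matching step with a single
# queue per task/resource type. V, match_cost, and the reward model are
# hypothetical, not the paper's LRAM/DRAM construction.

class LearningAidedMatcher:
    def __init__(self, num_types, V=10.0, match_cost=1.0):
        self.V = V                      # utility-backlog tradeoff parameter
        self.match_cost = match_cost
        self.queues = [0] * num_types   # backlog of stored tasks/resources
        self.reward_sum = [0.0] * num_types
        self.reward_cnt = [0] * num_types

    def est_reward(self, i):
        # Empirical-mean estimate of the unknown matching reward of type i.
        if self.reward_cnt[i] == 0:
            return 0.0
        return self.reward_sum[i] / self.reward_cnt[i]

    def step(self, arrivals):
        for i, a in enumerate(arrivals):
            self.queues[i] += a
        # Match one unit of the type with the largest weighted gain,
        # respecting the no-underflow constraint (queue must be nonempty).
        gains = [self.V * (self.est_reward(i) - self.match_cost) + self.queues[i]
                 for i in range(len(self.queues))]
        best = max(range(len(gains)), key=lambda i: gains[i])
        if self.queues[best] > 0 and gains[best] > 0:
            self.queues[best] -= 1
            r = random.gauss(1.0 + 0.1 * best, 0.2)  # observed reward sample
            self.reward_sum[best] += r               # update the estimator
            self.reward_cnt[best] += 1
```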

    A Survey on Delay-Aware Resource Control for Wireless Systems --- Large Deviation Theory, Stochastic Lyapunov Drift and Distributed Stochastic Learning

    In this tutorial paper, a comprehensive survey is given of several major systematic approaches to delay-aware control problems, namely the equivalent rate-constraint approach, the Lyapunov stability-drift approach, and the approximate Markov Decision Process (MDP) approach using stochastic learning. These approaches cover most of the existing literature on delay-aware resource control in wireless systems, and each has its relative pros and cons in terms of performance, complexity, and implementation. For each approach, the problem setup, the general solution, and the design methodology are discussed. Applications of these approaches to delay-aware resource allocation are illustrated with examples in single-hop wireless networks. Furthermore, recent results on delay-aware multi-hop routing designs in general multi-hop networks are elaborated. Finally, the delay performance of the various approaches is compared through simulations using an example of uplink OFDMA systems.
    Comment: 58 pages, 8 figures; IEEE Transactions on Information Theory, 201
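
    To make the Lyapunov drift approach concrete, here is a minimal one-slot drift-plus-penalty scheduler for a single-hop uplink. The channel model, the candidate power levels, and the parameter V are illustrative assumptions rather than the survey's exact formulation.

```python
import math
import random

# Illustrative drift-plus-penalty scheduling for one slot of a single-hop
# uplink. Channel model and power levels are assumptions for the sketch.

def drift_plus_penalty_schedule(queues, channel_gains, powers, V=5.0, noise=1.0):
    """Pick the (user, power) pair maximizing Q*rate - V*power this slot."""
    best, best_score = None, 0.0
    for u, (q, g) in enumerate(zip(queues, channel_gains)):
        for p in powers:
            rate = math.log2(1.0 + g * p / noise)   # achievable rate
            score = q * rate - V * p                # backlog gain minus penalty
            if score > best_score:
                best, best_score = (u, p), score
    return best  # None means the scheduler stays idle this slot

# One simulated slot with three users:
queues = [8.0, 3.0, 5.0]
gains = [random.expovariate(1.0) for _ in queues]   # Rayleigh-like fading power
decision = drift_plus_penalty_schedule(queues, gains, powers=[0.5, 1.0, 2.0])
```

    Larger V weights the power penalty more heavily, trading longer queues (higher delay) for lower average power, which is the utility-delay tradeoff these approaches formalize.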

    Stable Wireless Network Control Under Service Constraints

    We consider the design of wireless queueing network control policies, with particular focus on combining stability with additional application-dependent requirements. To this end, we pursue a cost-function based approach that provides the flexibility to incorporate constraints and requirements of particular services or applications. As typical examples of such requirements, we consider the reduction of buffer underflows in the case of streaming traffic, and energy efficiency in networks of battery-powered nodes. Compared to the classical throughput-optimal control problem, such requirements significantly complicate the control problem. We provide easily verifiable theoretical conditions for stability and, additionally, compare various candidate cost functions applied to wireless networks with streaming media traffic. Moreover, we demonstrate how the framework can be applied to the problem of energy-efficient routing, and we demonstrate its application to cross-layer control problems for wireless multihop networks, using an advanced power control scheme for interference mitigation based on successive convex approximation. In all scenarios, the performance of our control framework is evaluated using extensive numerical simulations.
    Comment: Accepted for publication in IEEE Transactions on Control of Network Systems. arXiv admin note: text overlap with arXiv:1208.297
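
    One way to picture such a cost-function based policy is as backpressure with a nonlinear queue differential. The sketch below uses a quadratic cost around a target buffer level as one hypothetical choice for streaming receivers; the paper's actual cost functions may differ.

```python
# Illustrative sketch: cost-function based link weights that generalize the
# linear queue differential of classical backpressure. The quadratic cost
# g(q) = (q - b_target)^2 / 2 is a hypothetical choice, not the paper's.

def g_prime(q, b_target=20.0):
    # Derivative of g: pushes buffers toward b_target, discouraging both
    # congestion (q too high) and playout underflows (q too low).
    return q - b_target

def link_weight(q_tx, q_rx):
    # Serve a link only if doing so decreases the total cost.
    return max(g_prime(q_tx) - g_prime(q_rx), 0.0)

# Max-weight scheduling then activates the feasible link set with the
# largest sum of link_weight * rate, just as in classical backpressure,
# which corresponds to b_target = 0 (purely linear weights).
print(link_weight(q_tx=30.0, q_rx=10.0))  # positive weight: serve this link
```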

    Learning-aided Stochastic Network Optimization with Imperfect State Prediction

    We investigate the problem of stochastic network optimization in the presence of imperfect state prediction and non-stationarity. Based on a novel distribution-accuracy curve prediction model, we develop the predictive learning-aided control (PLC) algorithm, which jointly utilizes historic and predicted network state information for decision making. PLC is an online algorithm that requires zero a priori system statistical information and consists of three key components, namely sequential distribution estimation and change detection, dual learning, and online queue-based control. We show that PLC simultaneously achieves good long-term performance, short-term queue size reduction, accurate change detection, and fast algorithm convergence. In particular, for stationary networks, PLC achieves a near-optimal $[O(\epsilon), O(\log^2(1/\epsilon))]$ utility-delay tradeoff. For non-stationary networks, PLC obtains an $[O(\epsilon), O(\log^2(1/\epsilon) + \min(\epsilon^{c/2-1}, e_w/\epsilon))]$ utility-backlog tradeoff for distributions that last $\Theta(\frac{\max(\epsilon^{-c}, e_w^{-2})}{\epsilon^{1+a}})$ time, where $e_w$ is the prediction accuracy and $a=\Theta(1)>0$ is a constant (the Backpressure algorithm \cite{neelynowbook} requires an $O(\epsilon^{-2})$ length for the same utility performance with a larger backlog). Moreover, PLC detects distribution changes $O(w)$ slots faster with high probability ($w$ is the prediction size) and achieves an $O(\min(\epsilon^{-1+c/2}, e_w/\epsilon)+\log^2(1/\epsilon))$ convergence time. Our results demonstrate that state prediction (even when imperfect) can help (i) achieve faster detection and convergence, and (ii) obtain better utility-delay tradeoffs.
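
    The following is a minimal sketch of the distribution-estimation and change-detection idea behind PLC, assuming a total-variation test that fuses a historical sample window with a (possibly imperfect) prediction window; the window handling and threshold are illustrative, not the paper's parameters.

```python
from collections import Counter

# Illustrative sketch of learning-aided distribution estimation with change
# detection: fuse history with predicted samples unless the two empirical
# distributions diverge, in which case drop the stale history.

def empirical(samples):
    n = max(len(samples), 1)
    return {s: k / n for s, k in Counter(samples).items()}

def total_variation(p, q):
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)

def plc_update(history, predicted, threshold=0.25):
    """Return (fused_distribution, change_detected)."""
    hist_dist = empirical(history)
    pred_dist = empirical(predicted)
    if total_variation(hist_dist, pred_dist) > threshold:
        return pred_dist, True        # change detected: trust the prediction
    fused = empirical(list(history) + list(predicted))
    return fused, False               # fused estimate drives dual learning

# Example: a shifted prediction window triggers detection.
dist, changed = plc_update([0, 0, 1, 0, 1], [2, 2, 1, 2, 2])
```

    The fused estimate would then feed a dual-learning step (e.g., warm-starting queue-based control near the new optimal multipliers), which is how prediction can speed up convergence even when it is imperfect.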