
1,501 research outputs found

Fast-Convergent Learning-aided Control in Energy Harvesting Networks

Author: Huang Longbo
Publication date: 20/03/2015

In this paper, we present a novel learning-aided energy management scheme ($\mathtt{LEM}$) for multihop energy harvesting networks. Different from prior works on this problem, our algorithm explicitly incorporates information learning into system control via a step called \emph{perturbed dual learning}. $\mathtt{LEM}$ does not require any statistical information of the system dynamics for implementation, and efficiently resolves the challenging energy outage problem. We show that $\mathtt{LEM}$ achieves the near-optimal $[O(\epsilon), O(\log(1/\epsilon)^2)]$ utility-delay tradeoff with $O(1/\epsilon^{1-c/2})$ energy buffers ($c\in(0,1)$). More interestingly, $\mathtt{LEM}$ possesses a \emph{convergence time} of $O(1/\epsilon^{1-c/2} + 1/\epsilon^c)$, which is much faster than the $\Theta(1/\epsilon)$ time of pure queue-based techniques or the $\Theta(1/\epsilon^2)$ time of approaches that rely purely on learning the system statistics. This fast convergence property makes $\mathtt{LEM}$ more adaptive and efficient in resource allocation in dynamic environments. The design and analysis of $\mathtt{LEM}$ demonstrate how system control algorithms can be augmented by learning and what the benefits are. The methodology and algorithm can also be applied to similar problems, e.g., processing networks, where nodes require a nonzero amount of contents to support their actions
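As a reading aid for the convergence-time bound quoted in this abstract, the following is simple arithmetic on the stated expression, not a result taken from the paper. It assumes $\epsilon<1$ and ignores all hidden constants; the choice $\epsilon = 10^{-3}$ is purely illustrative.

```latex
% Dominant exponent of 1/\epsilon in the stated bound, as a function of c
\[
  \frac{1}{\epsilon^{1-c/2}} + \frac{1}{\epsilon^{c}}
  = O\!\left(\epsilon^{-\max\{1-c/2,\; c\}}\right),
  \qquad
  \min_{c\in(0,1)} \max\{1-c/2,\; c\} = \tfrac{2}{3}
  \ \text{at}\ c = \tfrac{2}{3}.
\]
% Illustrative orders of magnitude for \epsilon = 10^{-3}
\[
  \epsilon^{-2/3} = 10^{2}, \qquad
  \epsilon^{-1} = 10^{3}, \qquad
  \epsilon^{-2} = 10^{6}.
\]
```

The two terms balance where $1-c/2 = c$, so under these assumptions the bound is smallest in order at $c = 2/3$, and the energy buffer size $O(1/\epsilon^{1-c/2})$ has the same order there.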

arXiv.org e-Print Archive

Crossref

Learning Aided Optimization for Energy Harvesting Devices with Outdated State Information

Author: Neely Michael J.
Yu Hao
Publication date: 25/08/2019

This paper considers utility optimal power control for energy harvesting wireless devices with a finite capacity battery. The distribution information of the underlying wireless environment and harvestable energy is unknown and only outdated system state information is known at the device controller. This scenario shares similarity with Lyapunov opportunistic optimization and online learning but is different from both. By a novel combination of Zinkevich's online gradient learning technique and the drift-plus-penalty technique from Lyapunov opportunistic optimization, this paper proposes a learning-aided algorithm that achieves utility within $O(\epsilon)$ of the optimal, for any desired $\epsilon>0$, by using a battery with an $O(1/\epsilon)$ capacity. The proposed algorithm has low complexity and makes power investment decisions based on system history, without requiring knowledge of the system state or its probability distribution. Comment: This version extends v1 (our INFOCOM 2018 paper): (1) add a new section (Section V) to study the case where utility functions are non-i.i.d. arbitrarily varying (2) add more simulation experiments. The current version is published in IEEE/ACM Transactions on Networkin

arXiv.org e-Print Archive

Crossref

Efficient Gauss Elimination for Near-Quadratic Matrices with One Short Random Block per Row, with Applications

Author: Dietzfelbinger Martin
Walzer Stefan
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 27th Annual European Symposium on Algorithms (ESA 2019)
Publication date: 01/01/2019

In this paper we identify a new class of sparse near-quadratic random Boolean matrices that have full row rank over F_2 = {0,1} with high probability and can be transformed into echelon form in almost linear time by a simple version of Gauss elimination. The random matrix with dimensions n(1-epsilon) x n is generated as follows: In each row, identify a block of length L = O((log n)/epsilon) at a random position. The entries outside the block are 0, the entries inside the block are given by fair coin tosses. Sorting the rows according to the positions of the blocks transforms the matrix into a kind of band matrix, on which, as it turns out, Gauss elimination works very efficiently with high probability. For the proof, the effects of Gauss elimination are interpreted as a ("coin-flipping") variant of Robin Hood hashing, whose behaviour can be captured in terms of a simple Markov model from queuing theory. Bounds for expected construction time and high success probability follow from results in this area. They readily extend to larger finite fields in place of F_2. By employing hashing, this matrix family leads to a new implementation of a retrieval data structure, which represents an arbitrary function f: S -> {0,1} for some set S of m = (1-epsilon)n keys. It requires m/(1-epsilon) bits of space, construction takes O(m/epsilon^2) expected time on a word RAM, while queries take O(1/epsilon) time and access only one contiguous segment of O((log m)/epsilon) bits in the representation (O(1/epsilon) consecutive words on a word RAM). The method is readily implemented and highly practical, and it is competitive with state-of-the-art methods. In a more theoretical variant, which works only for unrealistically large S, we can even achieve construction time O(m/epsilon) and query time O(1), accessing O(1) contiguous memory words for a query. 
By well-established methods the retrieval data structure leads to efficient constructions of (static) perfect hash functions and (static) Bloom filters with almost optimal space and very local storage access patterns for queries
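The construction described in this abstract can be illustrated with a small Python sketch. It is a toy under stated assumptions, not the authors' implementation: the helper names (`key_row`, `solve_f2`, `build`, `query`), the use of BLAKE2b as the hash, the retry-over-seeds loop, and the demo parameters (`N = 250`, `BLOCK_LEN = 32`, 200 keys) are all hypothetical choices. Each key is hashed to one random block of bits at a random position, the rows are sorted by block position, and the resulting system over F_2 is solved by Gauss elimination with back-substitution.

```python
import hashlib

def key_row(key, seed, n, block_len):
    """Hash a key to (block start, row bitmask); bit j of the mask is column j."""
    digest = hashlib.blake2b(key.encode(), digest_size=16,
                             salt=seed.to_bytes(8, "little")).digest()
    start = int.from_bytes(digest[:8], "little") % (n - block_len + 1)
    pattern = int.from_bytes(digest[8:], "little") & ((1 << block_len) - 1)
    return start, pattern << start

def solve_f2(equations):
    """Solve rows (start, mask, rhs) over F_2; return solution bits or None."""
    pivots = {}  # pivot column -> (reduced row, rhs), lowest set bit is the pivot
    for _, row, rhs in sorted(equations):  # sorted by block start (band-like order)
        while row:
            col = (row & -row).bit_length() - 1  # index of lowest set bit
            if col not in pivots:
                pivots[col] = (row, rhs)
                break
            prow, prhs = pivots[col]
            row ^= prow
            rhs ^= prhs
        if row == 0 and rhs:
            return None  # reduced to 0 = 1: system is inconsistent
    x = 0
    # Back-substitution from the highest pivot column down; free variables stay 0.
    for col in sorted(pivots, reverse=True):
        row, rhs = pivots[col]
        if rhs ^ (bin(row & x).count("1") & 1):
            x |= 1 << col
    return x

def build(f, n, block_len, max_seeds=100):
    """Try hash seeds until the system for all keys of f is solvable."""
    for seed in range(max_seeds):
        eqs = [key_row(k, seed, n, block_len) + (v,) for k, v in f.items()]
        x = solve_f2(eqs)
        if x is not None:
            return seed, x
    raise RuntimeError("no seed gave a solvable system")

def query(key, seed, x, n, block_len):
    """Return the stored bit: parity of the key's block against the solution."""
    _, row = key_row(key, seed, n, block_len)
    return bin(row & x).count("1") & 1

# Demo with hypothetical parameters: 200 keys, n = 250 solution bits.
N, BLOCK_LEN = 250, 32
F = {f"key{i}": (i * i) % 3 for i in range(200)}  # values are 0 or 1
SEED, X = build(F, N, BLOCK_LEN)
```

After construction, only `SEED` and the `N`-bit integer `X` need to be kept; `query` reads the bits of `X` inside one block of length `BLOCK_LEN`, which mirrors the "one contiguous segment" access pattern mentioned in the abstract. Querying a key that was never inserted returns an arbitrary bit.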

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Digitale Bibliothek Thüringen

Provably Efficient Model-Free Algorithm for MDPs with Peak Constraints

Author: Aggarwal Vaneet
Bai Qinbo
Gattami Ather
Publication date: 30/01/2021

In the optimization of dynamic systems, the variables typically have constraints. Such problems can be modeled as a Constrained Markov Decision Process (CMDP). This paper considers peak constraints, where the agent chooses the policy to maximize the long-term average reward while satisfying the constraints at each time. We propose a model-free algorithm that converts the CMDP problem to an unconstrained problem and then applies a Q-learning based approach. We extend the concept of probably approximately correct (PAC) to define a criterion of $\epsilon$-optimal policy. The proposed algorithm is proved to achieve an $\epsilon$-optimal policy with probability at least $1-p$ when the number of episodes satisfies $K\geq\Omega(\frac{I^2H^6SA\ell}{\epsilon^2})$, where $S$ and $A$ are the numbers of states and actions, respectively, $H$ is the number of steps per episode, $I$ is the number of constraint functions, and $\ell=\log(\frac{SAT}{p})$. We note that this is the first result on PAC kind of analysis for CMDP with peak constraints, where the transition probabilities are not known a priori. We demonstrate the proposed algorithm on an energy harvesting problem where it outperforms state-of-the-art and performs close to the theoretical upper bound of the studied optimization problem

arXiv.org e-Print Archive

A Bandit Approach to Online Pricing for Heterogeneous Edge Resource Allocation

Author: Bhargava Vijay K.
Cheng Jiaming
Nguyen Duong Thuy Anh
Nguyen Duong Tung
Wang Lele
Publication date: 14/02/2023

Edge Computing (EC) offers a superior user experience by positioning cloud resources in close proximity to end users. The challenge of allocating edge resources efficiently while maximizing profit for the EC platform remains a sophisticated problem, especially with the added complexity of the online arrival of resource requests. To address this challenge, we propose to cast the problem as a multi-armed bandit problem and develop two novel online pricing mechanisms, the Kullback-Leibler Upper Confidence Bound (KL-UCB) algorithm and the Min-Max Optimal algorithm, for heterogeneous edge resource allocation. These mechanisms operate in real-time and do not require prior knowledge of demand distribution, which can be difficult to obtain in practice. The proposed posted pricing schemes allow users to select and pay for their preferred resources, with the platform dynamically adjusting resource prices based on observed historical data. Numerical results show the advantages of the proposed mechanisms compared to several benchmark schemes derived from traditional bandit algorithms, including the Epsilon-Greedy, basic UCB, and Thompson Sampling algorithms
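To make the bandit framing in this abstract concrete, here is a minimal Python sketch of a KL-UCB rule applied to posted prices. It is an illustration under assumptions, not the paper's mechanism: each candidate price is treated as one arm, a buyer accepts a price with an unknown fixed probability, and the arm index is the price times a KL-UCB upper bound on that acceptance probability. The function names, the price grid, and the acceptance probabilities are all hypothetical.

```python
import math
import random

def kl_bernoulli(p, q):
    """KL divergence between Bernoulli(p) and Bernoulli(q), clipped away from 0 and 1."""
    eps = 1e-12
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

def kl_ucb_bound(mean, pulls, t):
    """Largest q >= mean with pulls * KL(mean, q) <= log(t), found by bisection."""
    budget = math.log(t) / pulls
    lo, hi = mean, 1.0
    for _ in range(30):
        mid = (lo + hi) / 2
        if kl_bernoulli(mean, mid) <= budget:
            lo = mid
        else:
            hi = mid
    return lo

def run_pricing(prices, accept_prob, rounds, rng):
    """Simulate posted pricing; returns (pulls per price, total revenue)."""
    k = len(prices)
    pulls = [0] * k
    sales = [0] * k
    revenue = 0.0
    for t in range(1, rounds + 1):
        if t <= k:
            arm = t - 1  # post each price once to initialize the estimates
        else:
            # Optimistic revenue estimate: price times upper bound on acceptance.
            arm = max(range(k), key=lambda i: prices[i]
                      * kl_ucb_bound(sales[i] / pulls[i], pulls[i], t))
        sold = rng.random() < accept_prob[arm]  # simulated buyer decision
        pulls[arm] += 1
        sales[arm] += sold
        revenue += prices[arm] * sold
    return pulls, revenue

# Hypothetical demand: expected revenues are 0.9, 1.6, 0.9, 0.4 per round.
PRICES = [1.0, 2.0, 3.0, 4.0]
ACCEPT = [0.9, 0.8, 0.3, 0.1]
PULLS, REVENUE = run_pricing(PRICES, ACCEPT, 5000, random.Random(0))
```

In this toy setup the price 2.0 has the highest expected revenue, so a well-behaved index policy should post it in most rounds. Multiplying the bound by the price is one simple way to adapt a Bernoulli index to revenue; other scalings are possible.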

arXiv.org e-Print Archive