
    Fast-Convergent Learning-aided Control in Energy Harvesting Networks

    In this paper, we present a novel learning-aided energy management scheme ($\mathtt{LEM}$) for multihop energy harvesting networks. Unlike prior work on this problem, our algorithm explicitly incorporates information learning into system control via a step called \emph{perturbed dual learning}. $\mathtt{LEM}$ does not require any statistical information about the system dynamics for implementation, and it efficiently resolves the challenging energy outage problem. We show that $\mathtt{LEM}$ achieves the near-optimal $[O(\epsilon), O(\log(1/\epsilon)^2)]$ utility-delay tradeoff with energy buffers of size $O(1/\epsilon^{1-c/2})$, where $c \in (0,1)$. More interestingly, $\mathtt{LEM}$ possesses a \emph{convergence time} of $O(1/\epsilon^{1-c/2} + 1/\epsilon^{c})$, which is much faster than the $\Theta(1/\epsilon)$ time of pure queue-based techniques or the $\Theta(1/\epsilon^2)$ time of approaches that rely purely on learning the system statistics. This fast convergence makes $\mathtt{LEM}$ more adaptive and efficient at resource allocation in dynamic environments. The design and analysis of $\mathtt{LEM}$ demonstrate how system control algorithms can be augmented by learning, and what the benefits are. The methodology and algorithm can also be applied to similar problems, e.g., processing networks, where nodes require a nonzero amount of content to support their actions.
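    The abstract describes combining a sample-based estimate of a Lagrange multiplier with a queue-based controller. Below is a minimal single-node sketch of that general idea, assuming a drift-plus-penalty controller whose energy queue is offset by a multiplier learned from a short sample window; the utility function, harvesting model, candidate spending levels, and the perturbation constant are all illustrative assumptions, not the paper's multihop formulation or code.

    ```python
    import random

    V = 50.0        # utility weight; larger V corresponds to smaller epsilon
    P_MAX = 2.0     # maximum per-slot energy spending (assumed)
    LEVELS = (0.0, 0.5, 1.0, 1.5, P_MAX)   # discrete spending choices (assumed)

    def utility(p):
        # Concave utility of spending p units of energy this slot (assumed form).
        return (1.0 + p) ** 0.5 - 1.0

    def empirical_multiplier(harvest_samples):
        """Stand-in for the learning step: minimize an empirical dual function
        built from observed harvesting samples, via a crude 1-D grid search."""
        mean_h = sum(harvest_samples) / len(harvest_samples)
        best_mu, best_val = 0.0, float("inf")
        for i in range(400):
            mu = i * 0.05
            # Empirical dual value: max_p { V*utility(p) - mu*p } + mu * mean(h)
            inner = max(V * utility(p) - mu * p for p in LEVELS)
            val = inner + mu * mean_h
            if val < best_val:
                best_mu, best_val = mu, val
        return best_mu

    def run(T=10000, learn_slots=200, seed=0):
        rng = random.Random(seed)
        harvest = lambda: rng.uniform(0.0, 1.0)   # i.i.d. harvesting (assumed)

        # Phase 1: learn the multiplier from a short window of samples.
        mu_hat = empirical_multiplier([harvest() for _ in range(learn_slots)])
        theta = mu_hat + 5.0    # perturbed offset; the constant 5.0 is illustrative

        # Phase 2: drift-plus-penalty control with the learned offset.
        E, total_u = 0.0, 0.0
        for _ in range(T):
            # Spend p maximizing V*utility(p) + (E - theta)*p, limited by battery E.
            cands = [p for p in LEVELS if p <= E]
            p = max(cands, key=lambda p: V * utility(p) + (E - theta) * p)
            total_u += utility(p)
            E = E - p + harvest()
        return total_u / T

    if __name__ == "__main__":
        print("avg utility:", run())
    ```

    The point of the two-phase structure is the convergence claim above: the offset is learned from a short sample window instead of waiting for the queue itself to drift toward the optimal multiplier, which is what pure queue-based schemes implicitly do.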

    Optimization of Energy Harvesting Mobile Nodes Within Scalable Converter System Based on Reinforcement Learning

    Microgrid monitoring focused on power data, such as voltage and current, has become more significant with the development of decentralized power supply systems. The power data transmission delay between distributed generators is vital for evaluating the stability and financial outcome of overall grid performance. In this thesis, both hardware and simulation are discussed for optimizing data packet transmission delay, energy consumption, and collision rate. To minimize the transmission delay and collision rate, the state-action-reward-state-action (SARSA) and Q-learning methods, based on a Markov decision process (MDP) model, are used to search for the most efficient data transmission scheme for each agent device. A comparison of the training processes of SARSA and Q-learning is given to illustrate the training speed of the two methodologies in a source-relay-destination scenario. To balance the exploration and exploitation involved in these two methods, a parameter is introduced to reduce the time cost of the training process. Finally, simulation results for average throughput and data packet collision rate in a network with 20 agent nodes are presented to demonstrate the feasibility of applying reinforcement learning algorithms to the development of scalable networks. The results show that the average throughput and collision rate stay at the expected ideal performance level for the overall network when the number of nodes is not too large. In addition, hardware development based on Bluetooth Low Energy (BLE) is used to illustrate the data packet transmission process.
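    To make the SARSA/Q-learning contrast concrete, here is a minimal sketch of the two tabular updates with an epsilon-greedy exploration parameter, on a toy transmission MDP. The states, rewards, and collision probability are invented placeholders, not the thesis's source-relay-destination model; only the update rules themselves are the standard textbook forms.

    ```python
    import random

    ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1   # learning rate, discount, exploration parameter
    ACTIONS = ("wait", "transmit")
    STATES = ("idle", "backlogged")

    def step(state, action, rng):
        """Toy dynamics: transmitting may collide; waiting incurs a delay penalty."""
        if action == "transmit":
            collided = rng.random() < 0.3          # assumed collision probability
            return "idle", (-1.0 if collided else 1.0)
        return "backlogged", -0.1

    def eps_greedy(Q, state, rng):
        # The exploration/exploitation balance is controlled by EPS.
        if rng.random() < EPS:
            return rng.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: Q[(state, a)])

    def train(method="q_learning", steps=5000, seed=0):
        rng = random.Random(seed)
        Q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
        state = "idle"
        action = eps_greedy(Q, state, rng)
        for _ in range(steps):
            nxt, reward = step(state, action, rng)
            nxt_action = eps_greedy(Q, nxt, rng)
            if method == "sarsa":
                # On-policy SARSA: bootstrap on the action actually taken next.
                target = reward + GAMMA * Q[(nxt, nxt_action)]
            else:
                # Off-policy Q-learning: bootstrap on the greedy action.
                target = reward + GAMMA * max(Q[(nxt, a)] for a in ACTIONS)
            Q[(state, action)] += ALPHA * (target - Q[(state, action)])
            state, action = nxt, nxt_action
        return Q

    if __name__ == "__main__":
        for m in ("sarsa", "q_learning"):
            print(m, train(m))
    ```

    The only difference between the two methods is the bootstrap target: SARSA evaluates the exploratory policy it actually follows, while Q-learning evaluates the greedy policy, which is why their training curves can diverge in the comparison the thesis describes.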