Dynamic Network State Learning Model for Mobility Based WMSN Routing Protocol
Rising demand for wireless multimedia sensor networks (WMSNs) has motivated academia and industry to develop energy-efficient, Quality of Service (QoS)-aware and delay-sensitive communication systems for real-world applications such as multimedia broadcast, security and surveillance, and intelligent transport systems. Energy efficiency, QoS and delay-sensitive transmission are essential requirements of WMSNs. Most existing approaches use either physical layer or system level schemes, which individually cannot assure optimal transmission decisions. Combining physical layer power control, adaptive modulation and coding, and system level dynamic power management (DPM) is found to be significant in meeting these demands. With this motivation, this paper derives a unified model using enhanced reinforcement learning and stochastic optimization. Exploiting physical as well as system level network state information, our proposed dynamic network state learning model (NSLM) applies stochastic optimization to learn network state-activity, from which it derives an optimal DPM policy and PHY switching schedule. NSLM uses known as well as unknown network state variables to derive the transmission and PHY switching policy, treating DPM as a constrained Markov decision process (MDP) problem. The use of a Hidden Markov Model and Lagrangian relaxation speeds up NSLM convergence, which supports delay-sensitive, QoS-enriched, and bandwidth- and energy-efficient transmission for WMSNs under uncertain network conditions. Our proposed NSLM DPM model outperformed traditional Q-Learning based DPM in terms of buffer cost, holding cost, overflow, energy consumption and bandwidth utilization.
Scheduling and Power Control for Wireless Multicast Systems via Deep Reinforcement Learning
Multicasting in wireless systems is a natural way to exploit the redundancy
in user requests in a Content Centric Network. Power control and optimal
scheduling can significantly improve the wireless multicast network's
performance under fading. However, the model-based approaches to power control
and scheduling studied earlier do not scale to large state spaces or
changing system dynamics. In this paper, we use deep reinforcement learning,
approximating the Q-function with a deep neural network,
to obtain a power control policy that matches the optimal policy for a small
network. We show that a power control policy can be learnt for reasonably large
systems via this approach. Further, we use multi-timescale stochastic
optimization to maintain the average power constraint. We demonstrate that a
slight modification of the learning algorithm allows tracking of time-varying
system statistics. Finally, we extend the multi-timescale approach to
simultaneously learn the optimal queueing strategy along with power control. We
demonstrate the scalability, tracking and cross-layer optimization capabilities of
our algorithms via simulations. The proposed multi-timescale approach can be
used in general large state space dynamical systems with multiple objectives
and constraints, and may be of independent interest.
Comment: arXiv admin note: substantial text overlap with arXiv:1910.0530
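As a rough illustration of the multi-timescale idea described in this abstract, the sketch below runs Q-learning on a fast step size while a Lagrange multiplier for the average power constraint is updated on a slower one. It is not the authors' algorithm: it uses a small lookup table instead of a deep network, and the toy queue model, the function name `constrained_q_learning`, and every parameter value are assumptions made for the example.

```python
import numpy as np

def constrained_q_learning(steps=20000, buffer_size=5, power_levels=3,
                           power_budget=0.8, arrival_prob=0.5, gamma=0.95,
                           epsilon=0.1, seed=0):
    """Toy two-timescale Q-learning with an average power constraint.

    Assumed model (hypothetical): a single queue, where power level `a`
    serves up to `a` packets per slot and one packet arrives with
    probability `arrival_prob`.
    """
    rng = np.random.default_rng(seed)
    q = np.zeros((buffer_size + 1, power_levels))  # Q-table of costs
    lam = 0.0                                      # Lagrange multiplier
    state = 0
    for n in range(1, steps + 1):
        # Epsilon-greedy action selection; costs are minimised.
        if rng.random() < epsilon:
            action = int(rng.integers(power_levels))
        else:
            action = int(np.argmin(q[state]))

        # Toy queue transition.
        served = min(state, action)
        arrival = int(rng.random() < arrival_prob)
        next_state = min(state - served + arrival, buffer_size)

        # Lagrangian cost: queue length plus priced power usage.
        cost = state + lam * action

        fast = 1.0 / n ** 0.6  # Q-update step size (fast timescale)
        slow = 1.0 / n         # multiplier step size (slow timescale)

        target = cost + gamma * q[next_state].min()
        q[state, action] += fast * (target - q[state, action])

        # The multiplier rises when power exceeds the budget and is
        # projected back onto the non-negative reals.
        lam = max(0.0, lam + slow * (action - power_budget))
        state = next_state
    return q, lam

if __name__ == "__main__":
    q_table, multiplier = constrained_q_learning()
    print("greedy power level per queue length:", q_table.argmin(axis=1))
    print("final multiplier:", multiplier)
```

Because the ratio of the slow step size to the fast one tends to zero, the Q-update sees the multiplier as almost constant, which is the usual intuition behind two-timescale schemes. A single global step counter is used here for brevity; per-state-action visit counts would be a more careful choice.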
Distributed Learning Policies for Power Allocation in Multiple Access Channels
We analyze the problem of distributed power allocation for orthogonal
multiple access channels by considering a continuous non-cooperative game whose
strategy space represents the users' distribution of transmission power over
the network's channels. When the channels are static, we find that this game
admits an exact potential function and this allows us to show that it has a
unique equilibrium almost surely. Furthermore, using the game's potential
property, we derive a modified version of the replicator dynamics of
evolutionary game theory which applies to this continuous game, and we show
that if the network's users employ a distributed learning scheme based on these
dynamics, then they converge to equilibrium exponentially quickly. On the other
hand, a major challenge occurs if the channels do not remain static but
fluctuate stochastically over time, following a stationary ergodic process. In
that case, the associated ergodic game still admits a unique equilibrium, but
the learning analysis becomes much more complicated because the replicator
dynamics are no longer deterministic. Nonetheless, by employing results from
the theory of stochastic approximation, we show that users still converge to
the game's unique equilibrium.
Our analysis hinges on a game-theoretical result which is of independent
interest: in finite-player games which admit a (possibly nonlinear) convex
potential function, the replicator dynamics (suitably modified to account for
nonlinear payoffs) converge to an eps-neighborhood of an equilibrium in time of
order O(log(1/eps)).
Comment: 11 pages, 8 figures. Revised manuscript structure and added more
material and figures for the case of stochastically fluctuating channels.
This version will appear in the IEEE Journal on Selected Areas in
Communications, Special Issue on Game Theory in Wireless Communications
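To make the static-channel result in this abstract more concrete, here is a minimal sketch of replicator-style learning for power allocation over orthogonal channels. The abstract does not give the utility or the exact form of the modified dynamics, so several things are assumptions: each user's rate is taken to be a sum of log(1 + SINR) terms, the potential is taken to be the sum over channels of the log of noise plus total received power, and the continuous dynamics are discretised with a plain Euler step. The names `potential` and `replicator_power_allocation` are hypothetical.

```python
import numpy as np

def potential(shares, gains, power, noise=1.0):
    """Assumed potential: sum over channels of log(noise + received power)."""
    received = noise + (gains * shares * power[:, None]).sum(axis=0)
    return float(np.log(received).sum())

def replicator_power_allocation(gains, power, noise=1.0, dt=0.05, steps=2000):
    """Euler-discretised replicator dynamics on each user's power shares.

    gains: (users, channels) array of channel gains (assumed static).
    power: (users,) array of total transmit power per user.
    Returns a (users, channels) array; each row is a share vector on the simplex.
    """
    users, channels = gains.shape
    shares = np.full((users, channels), 1.0 / channels)  # start from uniform
    for _ in range(steps):
        received = noise + (gains * shares * power[:, None]).sum(axis=0)
        # Marginal rate of user k on channel a with respect to its share.
        marginal = gains * power[:, None] / received
        # Each user's own share-weighted average marginal rate.
        average = (shares * marginal).sum(axis=1, keepdims=True)
        # Replicator step: shares grow on channels that beat the average.
        shares = shares + dt * shares * (marginal - average)
        # Guard against numerical drift off the simplex.
        shares = np.clip(shares, 1e-12, None)
        shares /= shares.sum(axis=1, keepdims=True)
    return shares

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    g = rng.uniform(0.5, 2.0, size=(3, 4))
    p = np.ones(3)
    x = replicator_power_allocation(g, p)
    print(np.round(x, 3))
    print("potential:", potential(x, g, p))
```

Each user needs only its own gains and the aggregate received power per channel, which is why the update can be read as a distributed scheme. The step size `dt` must be small enough that shares stay positive; the value here suits the gain range used in the example and is not a general recommendation.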