Echo State Learning for Wireless Virtual Reality Resource Allocation in UAV-enabled LTE-U Networks
In this paper, the problem of resource management is studied for a network of
wireless virtual reality (VR) users communicating using an unmanned aerial
vehicle (UAV)-enabled LTE-U network. In the studied model, the UAVs act as VR
control centers that collect tracking information from the VR users over the
wireless uplink and, then, send the constructed VR images to the VR users over
an LTE-U downlink. Therefore, resource allocation in such a UAV-enabled LTE-U
network must jointly consider the uplink and downlink links over both licensed
and unlicensed bands. In such a VR setting, the UAVs can dynamically adjust the
quality and format of each VR image to change its data size and thereby meet
the delay requirement. Therefore, resource allocation must
also take into account the image quality and format. This VR-centric resource
allocation problem is formulated as a noncooperative game that enables a joint
allocation of licensed and unlicensed spectrum bands, as well as a dynamic
adaptation of VR image quality and format. To solve this game, a learning
algorithm based on the machine learning tools of echo state networks (ESNs)
with leaky-integrator neurons is proposed. Unlike conventional ESN-based
learning algorithms that are suitable for discrete-time systems, the proposed
algorithm can dynamically adjust the update speed of the ESN's state and,
hence, it can enable the UAVs to learn the continuous dynamics of their
associated VR users. Simulation results show that the proposed algorithm
achieves up to 14% and 27.1% gains in terms of total VR QoE for all users
compared to Q-learning using LTE-U and Q-learning using LTE, respectively.
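The leaky-integrator reservoir update that this abstract builds on can be sketched as follows. The reservoir size, leak rate, and ridge-regression readout here are illustrative assumptions, not the paper's design (which further adapts the update speed online to track continuous dynamics):

```python
import numpy as np

rng = np.random.default_rng(0)

class LeakyESN:
    """Echo state network with leaky-integrator neurons (generic sketch).

    The leak rate alpha controls how quickly the reservoir state moves
    toward the new activation, i.e. the update speed of the ESN's state.
    All sizes and hyperparameters are illustrative assumptions.
    """
    def __init__(self, n_in, n_res, n_out, alpha=0.3, rho=0.9):
        self.alpha = alpha
        self.W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
        W = rng.uniform(-0.5, 0.5, (n_res, n_res))
        # Rescale the recurrent weights to spectral radius rho (< 1),
        # a standard sufficient condition for the echo state property.
        self.W = W * (rho / max(abs(np.linalg.eigvals(W))))
        self.W_out = np.zeros((n_out, n_res))
        self.x = np.zeros(n_res)

    def step(self, u):
        # Leaky-integrator update: the state advances only a fraction
        # alpha of the way toward the new tanh activation each step.
        pre = np.tanh(self.W_in @ u + self.W @ self.x)
        self.x = (1.0 - self.alpha) * self.x + self.alpha * pre
        return self.W_out @ self.x

    def fit_readout(self, states, targets, ridge=1e-6):
        # Only the linear readout is trained, via ridge regression
        # over collected reservoir states.
        S = np.asarray(states)
        Y = np.asarray(targets)
        self.W_out = (Y.T @ S) @ np.linalg.inv(
            S.T @ S + ridge * np.eye(S.shape[1]))
```

Lowering `alpha` slows the state update, which is the knob a continuous-time variant can tune to match the dynamics of its VR users.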
Joint Channel Selection and Power Control in Infrastructureless Wireless Networks: A Multi-Player Multi-Armed Bandit Framework
This paper deals with the problem of efficient resource allocation in dynamic
infrastructureless wireless networks. Assuming a reactive interference-limited
scenario, each transmitter is allowed to select one frequency channel (from a
common pool) together with a power level at each transmission trial; hence, for
all transmitters, not only the fading gain, but also the number of interfering
transmissions and their transmit powers are varying over time. Due to the
absence of a central controller and time-varying network characteristics, it is
highly inefficient for transmitters to acquire global channel and network
knowledge. Therefore, a reasonable assumption is that transmitters have no
knowledge of fading gains, interference, and network topology. Each
transmitting node selfishly aims at maximizing its average reward (or
minimizing its average cost), which is a function of the action of that
specific transmitter as well as those of all other transmitters. This scenario
is modeled as a multi-player multi-armed adversarial bandit game, in which
multiple players receive an a priori unknown reward with an arbitrarily
time-varying distribution by sequentially pulling an arm, selected from a known
and finite set of arms. Since players do not know the arm with the highest
average reward in advance, they attempt to minimize their so-called regret,
determined by the set of players' actions, while attempting to achieve
equilibrium in some sense. To this end, we design two joint power-level and
channel-selection strategies. We prove that the gap between the
average reward achieved by our approaches and that based on the best fixed
strategy converges to zero asymptotically. Moreover, the empirical joint
frequencies of the game converge to the set of correlated equilibria. We
further characterize this set for two special cases of our designed game
- …
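The regret-minimizing arm selection this abstract describes is commonly built on Exp3-style importance-weighted updates. The sketch below is a generic single-player Exp3 loop, with each arm standing in for a hypothetical joint (channel, power level) choice; it is not the paper's actual strategies:

```python
import math
import random

random.seed(1)

def exp3(n_arms, rewards, gamma=0.1):
    """Exp3 for an adversarial bandit (illustrative sketch).

    rewards[t][i] is the reward in [0, 1] that arm i would yield at
    round t; the player only observes the reward of the arm it pulls.
    In the paper's setting an "arm" would be a (channel, power) pair.
    """
    weights = [1.0] * n_arms
    chosen = []
    for t in range(len(rewards)):
        total = sum(weights)
        # Mix the exponential-weights distribution with uniform
        # exploration of rate gamma.
        probs = [(1 - gamma) * w / total + gamma / n_arms for w in weights]
        arm = random.choices(range(n_arms), weights=probs)[0]
        r = rewards[t][arm]
        # Importance-weighted estimate keeps the reward estimate
        # unbiased despite observing only the chosen arm.
        est = r / probs[arm]
        weights[arm] *= math.exp(gamma * est / n_arms)
        chosen.append(arm)
    return chosen
```

Exp3's guarantee matches the property claimed in the abstract: the gap between the average reward obtained and that of the best fixed arm vanishes asymptotically, even against arbitrarily time-varying rewards.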