
    Echo State Learning for Wireless Virtual Reality Resource Allocation in UAV-enabled LTE-U Networks

    In this paper, the problem of resource management is studied for a network of wireless virtual reality (VR) users communicating over an unmanned aerial vehicle (UAV)-enabled LTE-U network. In the studied model, the UAVs act as VR control centers that collect tracking information from the VR users over the wireless uplink and then send the constructed VR images to the VR users over an LTE-U downlink. Therefore, resource allocation in such a UAV-enabled LTE-U network must jointly consider the uplink and the downlink over both licensed and unlicensed bands. In such a VR setting, the UAVs can dynamically adjust the quality and format of each VR image to change its data size and, thus, meet the delay requirement. Therefore, resource allocation must also take the image quality and format into account. This VR-centric resource allocation problem is formulated as a noncooperative game that enables a joint allocation of licensed and unlicensed spectrum bands, as well as a dynamic adaptation of VR image quality and format. To solve this game, a learning algorithm based on the machine learning tools of echo state networks (ESNs) with leaky integrator neurons is proposed. Unlike conventional ESN-based learning algorithms that are suitable only for discrete-time systems, the proposed algorithm can dynamically adjust the update speed of the ESN's state and, hence, enable the UAVs to learn the continuous dynamics of their associated VR users. Simulation results show that the proposed algorithm achieves up to 14% and 27.1% gains in the total VR QoE of all users compared to Q-learning using LTE-U and Q-learning using LTE, respectively.
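The leaky-integrator state update at the core of the ESN approach above can be sketched as follows. This is a minimal illustration, not the paper's algorithm: all dimensions, weight scales, the leak rate, and the input sequence are assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: a few tracking inputs, a small reservoir.
n_in, n_res = 3, 50
leak = 0.3  # leaky-integrator rate a in (0, 1]; assumed value

W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))   # input weights
W = rng.uniform(-0.5, 0.5, (n_res, n_res))     # recurrent weights
# Scale the spectral radius below 1 to support the echo state property.
W *= 0.9 / max(abs(np.linalg.eigvals(W)))

def update(x, u, a=leak):
    """Leaky-integrator ESN state update:
    x' = (1 - a) * x + a * tanh(W_in @ u + W @ x).
    A smaller a slows the state's evolution, which is what lets
    the reservoir track slower continuous-time dynamics; the paper's
    algorithm adjusts this update speed dynamically."""
    return (1 - a) * x + a * np.tanh(W_in @ u + W @ x)

x = np.zeros(n_res)
for _ in range(100):
    u = rng.standard_normal(n_in)  # placeholder input sequence
    x = update(x, u)
```

Because `tanh` is bounded in (-1, 1) and the update is a convex combination, the state stays bounded regardless of the input, which keeps the reservoir stable while the leak rate is tuned.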

    Joint Channel Selection and Power Control in Infrastructureless Wireless Networks: A Multi-Player Multi-Armed Bandit Framework

    This paper deals with the problem of efficient resource allocation in dynamic infrastructureless wireless networks. Assuming a reactive, interference-limited scenario, each transmitter is allowed to select one frequency channel (from a common pool) together with a power level at each transmission trial; hence, for all transmitters, not only the fading gain but also the number of interfering transmissions and their transmit powers vary over time. Due to the absence of a central controller and the time-varying network characteristics, it is highly inefficient for transmitters to acquire global channel and network knowledge. Therefore, a reasonable assumption is that transmitters have no knowledge of fading gains, interference, or network topology. Each transmitting node selfishly aims at maximizing its average reward (or minimizing its average cost), which is a function of its own action as well as those of all other transmitters. This scenario is modeled as a multi-player multi-armed adversarial bandit game, in which multiple players receive an a priori unknown reward with an arbitrarily time-varying distribution by sequentially pulling an arm selected from a known and finite set of arms. Since players do not know the arm with the highest average reward in advance, they attempt to minimize their so-called regret, determined by the set of players' actions, while attempting to achieve equilibrium in some sense. To this end, we design two joint power level and channel selection strategies. We prove that the gap between the average reward achieved by our approaches and that based on the best fixed strategy converges to zero asymptotically. Moreover, the empirical joint frequencies of the game converge to the set of correlated equilibria. We further characterize this set for two special cases of our designed game.
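Adversarial bandit problems of this kind are commonly tackled with EXP3-style importance-weighted updates. The sketch below treats each (channel, power level) pair as one arm, as the abstract describes; the channel set, power levels, exploration rate, and reward model are all hypothetical placeholders, not the paper's actual strategies.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical joint action set: each arm is a (channel, power) pair.
channels = [0, 1, 2]
powers = [0.1, 0.5, 1.0]
arms = [(c, p) for c in channels for p in powers]
K = len(arms)

gamma = 0.1           # exploration rate (assumed value)
weights = np.ones(K)  # EXP3 weight per arm

def select_arm():
    """Sample an arm from the weight distribution mixed with
    uniform exploration, as in EXP3."""
    probs = (1 - gamma) * weights / weights.sum() + gamma / K
    return rng.choice(K, p=probs), probs

def update(arm, reward, probs):
    """Importance-weighted exponential update; reward assumed in [0, 1].
    Dividing by the selection probability gives an unbiased estimate
    of the reward of the pulled arm under the adversarial model."""
    est = reward / probs[arm]
    weights[arm] *= np.exp(gamma * est / K)

for t in range(200):
    a, probs = select_arm()
    # Placeholder reward: pretend channel 0 suffers less interference.
    reward = rng.uniform(0, 1) * (0.5 + 0.5 * (arms[a][0] == 0))
    update(a, min(reward, 1.0), probs)
```

EXP3 is the standard single-player baseline whose regret against the best fixed arm vanishes on average; the paper's contribution lies in the multi-player setting, where the empirical joint play additionally converges to correlated equilibria.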