Reinforcement Learning for Joint Optimization of Multiple Rewards
Reinforcement Learning (RL) algorithms such as DQN owe their success to
Markov Decision Processes, and the fact that maximizing the sum of rewards
allows the use of backward induction and a reduction to the Bellman optimality equation.
However, many real-world problems require optimization of an objective that is
non-linear in cumulative rewards for which dynamic programming cannot be
applied directly. For example, in a resource allocation problem, one of the
objectives is to maximize long-term fairness among the users. We notice that
when a function of the sum of rewards is considered, the problem loses its
Markov nature. This paper formalizes and addresses the problem of optimizing a
non-linear function of the long-term average of rewards. We propose model-based
and model-free algorithms to learn the policy, where the model-based policy is
shown to achieve a regret of $\tilde{O}\left(KDSA\sqrt{\frac{A}{T}}\right)$ for
$K$ users. Further, using fairness in cellular base-station scheduling and
queueing system scheduling as examples, the proposed algorithm is shown to
significantly outperform conventional RL approaches.
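The gap between a sum-of-rewards objective and a non-linear fairness objective can be illustrated with a minimal sketch. It assumes a proportional-fairness objective (the sum of the logarithms of each user's long-term average reward), a two-user system, and made-up per-slot rates; none of these specifics come from the abstract, and the function names are hypothetical.

```python
import math


def proportional_fairness(avg_rewards, eps=1e-9):
    """Sum of logs of per-user average rewards (eps avoids log(0))."""
    return sum(math.log(r + eps) for r in avg_rewards)


def average_rewards(schedule, rates):
    """Per-user average reward when schedule[t] is the user served in slot t."""
    totals = [0.0] * len(rates)
    for user in schedule:
        totals[user] += rates[user]
    return [total / len(schedule) for total in totals]


# Hypothetical per-slot rates for two users (illustrative values only).
rates = [2.0, 1.0]

greedy = [0] * 10         # always serve the higher-rate user
alternating = [0, 1] * 5  # split the slots evenly

greedy_avg = average_rewards(greedy, rates)
alt_avg = average_rewards(alternating, rates)

# The greedy schedule wins on the plain sum of rewards ...
print(sum(greedy_avg), sum(alt_avg))
# ... but loses on the non-linear fairness objective.
print(proportional_fairness(greedy_avg), proportional_fairness(alt_avg))
```

Under these assumptions the value of serving a user in a given slot depends on how much that user has already received, which is why a per-step reward alone no longer determines the best action.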
Power-spectrum trading for full-duplex D2D communications in cellular networks
Device-to-device (D2D) communications allows two adjacent mobile terminals to transmit signals directly without going through base stations, and has been considered one of the key technologies for future mobile networks. As full-duplex (FD) communications can improve the performance (i.e., throughput and energy efficiency (EE)) of communications systems, it is commonly used in practical D2D communications scenarios. However, FD-enabled D2D communications also results in self-interference. To fully realize the potential benefits of FD-enabled D2D communications, an effective resource allocation mechanism is critical to avoid not only the self-interference of FD-enabled D2D communications but also the interference between D2D users (DUs) and cellular users (CUs). In this paper, we investigate the resource allocation issue for FD-enabled DUs and traditional CUs. Considering the asymmetry of energy and spectrum resources between DUs and CUs, we propose a power-spectrum trading mechanism to achieve mutual benefits for both types of users. A concave-convex procedure algorithm is employed to solve the power allocation optimization problem, and a method based on maximum weighted bipartite matching is then proposed to select proper D2D pairs to maximize the overall system throughput. Numerical results show that the proposed scheme can remarkably improve the overall throughput and EE of FD-enabled D2D communications systems.
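The pairing step mentioned above can be sketched as a maximum weighted bipartite matching. This is a minimal, exhaustive illustration for very small instances, not the paper's method: the weight matrix, the function name, and the reading of `weights[i][j]` as the throughput obtained when D2D pair `i` shares the channel of CU `j` are all assumptions made for the example.

```python
from itertools import permutations


def max_weight_matching(weights):
    """Exhaustively assign each row to a distinct column, maximizing total weight.

    weights[i][j] is a hypothetical throughput when D2D pair i reuses the
    channel of cellular user j. Requires len(weights) <= len(weights[0]).
    The search is factorial in size, so it only suits toy instances.
    """
    n_rows, n_cols = len(weights), len(weights[0])
    best_total, best_assignment = float("-inf"), None
    for cols in permutations(range(n_cols), n_rows):
        total = sum(weights[i][j] for i, j in enumerate(cols))
        if total > best_total:
            best_total, best_assignment = total, cols
    return best_total, list(best_assignment)


# Illustrative weights only (3 D2D pairs x 3 cellular users).
weights = [
    [4.0, 1.0, 3.0],
    [2.0, 0.5, 5.0],
    [3.0, 2.0, 2.0],
]
total, assignment = max_weight_matching(weights)
print(total, assignment)  # 11.0 [0, 2, 1]
```

For realistic problem sizes a polynomial-time method such as the Hungarian algorithm would be the usual choice; the brute-force search is used here only to keep the sketch dependency-free.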
Resource Allocation for Underlay D2D Communication with Proportional Fairness
As an emerging paradigm, device-to-device (D2D) communication has the capability to complement and enhance the conventional cellular network by offering high spectral and energy efficiency. However, the problem of cochannel interference makes resource allocation very complex and challenging in underlay D2D communication networks. This paper proposes a novel joint power control and resource scheduling scheme to enhance both the network throughput and the users' fairness in underlay D2D communication networks. Unlike previous work in this area, our scheme aims at maximizing the sum of all users' proportional fairness functions, while simultaneously taking into account factors such as fairness, signal-to-interference-plus-noise ratio requirements, and severe interference. The proposed scheme offers a practical solution because it works for lengthy time slots, a realistic scenario for underlay D2D communication systems. We also take into consideration the time-varying nature of users' channel conditions in our proposed solution. Numerical results confirm that our proposed scheme not only dramatically improves the system throughput, but also boosts the system fairness while guaranteeing the Quality-of-Service levels of all D2D users and cellular users.
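As background for the proportional fairness objective named above, the following is a minimal sketch of the classic proportional-fair scheduling rule: in each slot, serve the user with the largest ratio of instantaneous rate to running average throughput. It is not the joint power control and scheduling scheme of the paper; the function name, the averaging window, the initial average, and the rates are all assumptions made for illustration.

```python
def pf_schedule(rates_per_slot, window=10.0, init=1e-3):
    """Classic proportional-fair rule over a sequence of per-slot rate vectors.

    rates_per_slot[t][k] is the hypothetical achievable rate of user k in slot t.
    Returns the list of served users and the final average throughputs.
    """
    num_users = len(rates_per_slot[0])
    avg = [init] * num_users  # small positive start avoids division by zero
    served = []
    for rates in rates_per_slot:
        # Serve the user with the best rate relative to its own history.
        chosen = max(range(num_users), key=lambda k: rates[k] / avg[k])
        served.append(chosen)
        # Exponentially weighted update of every user's average throughput.
        for k in range(num_users):
            got = rates[k] if k == chosen else 0.0
            avg[k] = (1 - 1 / window) * avg[k] + got / window
    return served, avg


# Illustrative constant rates: user 0 is always twice as fast as user 1.
rates_per_slot = [[2.0, 1.0]] * 20
served, avg = pf_schedule(rates_per_slot)
print(served)
```

Even though user 0 always has the higher rate, the rule still serves user 1, because a user's priority grows as its average throughput falls behind.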
Survey on the state-of-the-art in device-to-device communication: A resource allocation perspective
Device-to-device (D2D) communication takes advantage of the proximity between the communicating devices in order to achieve efficient resource utilization, improved throughput and energy efficiency, simultaneous serviceability, and reduced latency. One of the main characteristics of D2D communication is reuse of the frequency resource in order to improve the spectral efficiency of the system. Nevertheless, frequency reuse introduces significantly high interference levels, thus necessitating efficient resource allocation algorithms that can enable simultaneous communication sessions through effective channel and/or power allocation. This survey paper presents a comprehensive investigation of the state-of-the-art resource allocation algorithms in D2D communication underlaying cellular networks. The surveyed algorithms are evaluated based on heterogeneous parameters which constitute the elementary features of a resource allocation algorithm in the D2D paradigm. Additionally, in order to familiarize readers with the basic design of the surveyed resource allocation algorithms, a brief description of the mode of operation of each algorithm is presented. The surveyed algorithms are divided into four categories based on their technical doctrine, i.e., conventional optimization based, Non-Orthogonal Multiple Access (NOMA) based, game theory based, and machine learning based techniques. Towards the end, several open challenges are identified as future research directions in resource allocation for D2D communication.