4 research outputs found

    Reinforcement Learning for Joint Optimization of Multiple Rewards

    Reinforcement Learning (RL) algorithms such as DQN owe their success to Markov Decision Processes, and to the fact that maximizing the sum of rewards permits backward induction and reduces to the Bellman optimality equation. However, many real-world problems require optimizing an objective that is non-linear in the cumulative rewards, for which dynamic programming cannot be applied directly. For example, in a resource allocation problem, one of the objectives is to maximize long-term fairness among the users. We note that when a function of the sum of rewards is considered, the problem loses its Markov nature. This paper formalizes and addresses the problem of optimizing a non-linear function of the long-term average of rewards. We propose model-based and model-free algorithms to learn the policy, where the model-based policy is shown to achieve a regret of $\tilde{O}\left(KDSA\sqrt{\frac{A}{T}}\right)$ for $K$ users. Further, using fairness in cellular base-station scheduling and queueing system scheduling as examples, the proposed algorithm is shown to significantly outperform conventional RL approaches.
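    A minimal sketch of why a non-linear objective changes the optimal policy. It is not the paper's algorithm. It assumes a hypothetical two-user scheduler that serves user 0 a fraction p of the time, made-up per-slot rewards in RATES, and proportional fairness (sum of log average rewards) as the non-linear function; none of these values come from the abstract.

```python
import math

# Hypothetical per-slot rewards when user 0 or user 1 is served (assumed values).
RATES = (3.0, 1.0)


def average_rewards(p):
    """Long-term average reward per user when user 0 is served a fraction p of slots."""
    return (p * RATES[0], (1.0 - p) * RATES[1])


def linear_objective(p):
    """Conventional RL objective: the sum of average rewards."""
    return sum(average_rewards(p))


def fairness_objective(p):
    """Assumed non-linear objective: proportional fairness, the sum of log averages."""
    return sum(math.log(r) for r in average_rewards(p))


# Search over time-sharing fractions strictly between 0 and 1.
GRID = [i / 100 for i in range(1, 100)]
best_linear_p = max(GRID, key=linear_objective)
best_fair_p = max(GRID, key=fairness_objective)

print("sum-of-rewards optimum:", best_linear_p)
print("fairness optimum:", best_fair_p)
```

    Under these assumptions the linear objective pushes almost all slots to the stronger user, while the fairness objective splits the slots evenly, so a policy tuned for the sum of rewards is not optimal for the non-linear objective.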

    Power-spectrum trading for full-duplex D2D communications in cellular networks

    Device-to-device (D2D) communications allow two adjacent mobile terminals to transmit signals directly without going through base stations, and have been considered one of the key technologies for future mobile networks. As full-duplex (FD) communications can improve the performance (i.e., throughput and energy efficiency (EE)) of communications systems, they are commonly used in practical D2D communications scenarios. However, FD-enabled D2D communications also result in self-interference. To fully realize the potential benefits of FD-enabled D2D communications, an effective resource allocation mechanism is critical to avoid not only the self-interference of FD-enabled D2D communications but also the interference between D2D users (DUs) and cellular users (CUs). In this paper, we investigate the resource allocation issue for FD-enabled DUs and traditional CUs. Considering the asymmetry of energy and spectrum resources of DUs and CUs, we propose a power-spectrum trading mechanism to achieve mutual benefits for both types of users. A concave-convex procedure algorithm is employed to solve the power allocation optimization problem, and then a maximum weighted bipartite matching based method is proposed to select proper D2D pairs to maximize the overall system throughput. Numerical results show that the proposed scheme can remarkably improve the overall throughput and EE of FD-enabled D2D communications systems.
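    The matching step can be illustrated with a small sketch. This is not the paper's method: it assumes a hypothetical throughput matrix (rows are D2D pairs, columns are cellular users' channels, values invented for illustration) and finds the maximum weighted bipartite matching by brute force, which is only practical for tiny instances.

```python
from itertools import permutations


def max_weight_matching(weights):
    """Brute-force maximum weighted bipartite matching.

    weights[r][c] is the (hypothetical) throughput when row r is matched to
    column c. Requires len(weights) <= len(weights[0]). Returns the best total
    and a tuple giving the column assigned to each row.
    """
    n_rows, n_cols = len(weights), len(weights[0])
    best_total, best_assignment = float("-inf"), None
    # Try every way of giving each row a distinct column.
    for cols in permutations(range(n_cols), n_rows):
        total = sum(weights[r][c] for r, c in enumerate(cols))
        if total > best_total:
            best_total, best_assignment = total, cols
    return best_total, best_assignment


# Assumed example values, not taken from the paper.
throughput = [
    [4.0, 1.0, 3.0],
    [2.0, 0.5, 5.0],
    [3.0, 2.0, 2.0],
]
total, assignment = max_weight_matching(throughput)
print(total, assignment)
```

    For larger instances a polynomial-time solver such as the Hungarian algorithm would be the usual choice; the exhaustive search here is kept only because it makes the objective easy to see.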

    Resource Allocation for Underlay D2D Communication with Proportional Fairness

    © 1967-2012 IEEE. As an emerging paradigm, device-to-device (D2D) communication can complement and enhance the conventional cellular network by offering high spectral and energy efficiency. However, cochannel interference makes resource allocation very complex and challenging in underlay D2D communication networks. This paper proposes a novel joint power control and resource scheduling scheme to enhance both the network throughput and the users' fairness in underlay D2D communication networks. Unlike previous work in this area, our scheme aims at maximizing the sum of all users' proportional fairness functions, while simultaneously taking into account factors such as fairness, signal-to-interference-plus-noise ratio requirements, and severe interference. The proposed scheme offers a practical solution because it works for lengthy time slots, a realistic scenario for the underlay D2D communication system. We also take into consideration the time-varying nature of users' channel conditions in our proposed solution. Numerical results confirm that our proposed scheme not only dramatically improves the system throughput, but also boosts the system fairness while guaranteeing the Quality-of-Service levels of all D2D users and cellular users.
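    For readers unfamiliar with proportional fairness, the following sketch shows the textbook proportional-fair scheduling rule, not the joint power control and scheduling scheme proposed in the paper. The per-slot rates, the smoothing factor alpha, and the function name pf_schedule are all assumptions made for illustration.

```python
def pf_schedule(rates_per_slot, alpha=0.1, eps=1e-6):
    """Textbook proportional-fair scheduler (illustrative, hypothetical helper).

    rates_per_slot: list of tuples, one achievable rate per user in each slot.
    Each slot serves the user with the largest ratio of instantaneous rate to
    exponentially averaged throughput, then updates the averages.
    Returns the list of served users and the final average throughputs.
    """
    n_users = len(rates_per_slot[0])
    avg = [eps] * n_users  # small positive start avoids division by zero
    served = []
    for rates in rates_per_slot:
        chosen = max(range(n_users), key=lambda u: rates[u] / avg[u])
        served.append(chosen)
        for u in range(n_users):
            gained = rates[u] if u == chosen else 0.0
            avg[u] = (1.0 - alpha) * avg[u] + alpha * gained
    return served, avg


# Assumed constant channel: user 0 is three times faster than user 1.
slots = [(3.0, 1.0)] * 100
served, avg = pf_schedule(slots)
counts = [served.count(u) for u in range(2)]
print(counts, avg)
```

    With these assumed rates, a pure sum-throughput scheduler would serve only user 0, whereas the proportional-fair rule keeps serving the weaker user as well.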

    Survey on the state-of-the-art in device-to-device communication: A resource allocation perspective

    Device-to-device (D2D) communication takes advantage of the proximity between the communicating devices in order to achieve efficient resource utilization, improved throughput and energy efficiency, simultaneous serviceability, and reduced latency. One of the main characteristics of D2D communication is reuse of the frequency resource in order to improve the spectral efficiency of the system. Nevertheless, frequency reuse introduces significantly high interference levels, thus necessitating efficient resource allocation algorithms that can enable simultaneous communication sessions through effective channel and/or power allocation. This survey paper presents a comprehensive investigation of the state-of-the-art resource allocation algorithms in D2D communication underlaying cellular networks. The surveyed algorithms are evaluated based on heterogeneous parameters which constitute the elementary features of a resource allocation algorithm in the D2D paradigm. Additionally, in order to familiarize readers with the basic design of the surveyed resource allocation algorithms, a brief description of the mode of operation of each algorithm is presented. The surveyed algorithms are divided into four categories based on their technical doctrine, i.e., conventional optimization based, non-orthogonal multiple access (NOMA) based, game theory based, and machine learning based techniques. Towards the end, several open challenges are highlighted as future research directions in resource allocation for D2D communication.