
    Enhancing the performance of energy harvesting wireless communications using optimization and machine learning

    The motivation behind this thesis is to provide efficient solutions for energy harvesting communications. Firstly, an energy harvesting underlay cognitive radio relaying network is investigated, in which the secondary network harvests its energy. Closed-form expressions are derived for the transmission powers of the secondary source and relay that maximize the secondary network throughput. Secondly, a scenario that is practical in terms of the information available about the environment is investigated: a communications system whose source is capable of harvesting solar energy. Two cases are considered based on the availability of knowledge about the underlying processes. When this knowledge is available, an algorithm that exploits it is designed to maximize the expected throughput while reducing the complexity of traditional methods. When the knowledge about the underlying processes is unavailable, reinforcement learning is used instead. Thirdly, several reinforcement learning architectures are introduced, called selector-actor-critic, tuner-actor-critic, and estimator-selector-actor-critic. The selector-actor-critic architecture aims to increase the speed and efficiency of learning an optimal policy by approximating the most promising action at the current state. The tuner-actor-critic aims to improve the learning process by providing the actor with a more accurate estimate of the value function. The estimator-selector-actor-critic is introduced to support intelligent agents; it mimics the way rational humans analyze available information and make decisions. An energy harvesting communications system operating in an unknown environment is then evaluated when supported by the proposed architectures. Fourthly, a realistic energy harvesting communications system is investigated in which the state and action spaces of the underlying Markov decision process are continuous. Actor-critic is used to optimize the system performance: the critic uses a neural network to approximate the action-value function, and the actor uses policy gradient to optimize the policy's parameters to maximize the throughput.
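As a rough illustration of the actor-critic loop described in the abstract, the sketch below trains a Gaussian policy with a temporal-difference critic on a toy single-link energy harvesting problem. The battery dynamics, reward, and hyperparameters are invented for illustration, and a linear critic stands in for the thesis's neural network critic.

```python
import math
import random

random.seed(0)

# Toy EH link (assumed for illustration): battery state in [0, B_MAX],
# random harvesting each slot, reward = log(1 + p * h) for transmit power p
# and channel gain h.
B_MAX = 10.0
ALPHA_W, ALPHA_TH, GAMMA, SIGMA = 0.01, 0.001, 0.9, 0.5

w = [0.0, 0.0]      # linear critic: V(s) = w[0] + w[1] * s
theta = 0.3         # policy mean: a(s) = theta * s

def value(s):
    return w[0] + w[1] * s

battery = 5.0
for _ in range(5000):
    h = random.uniform(0.5, 1.5)
    mean = theta * battery
    action = random.gauss(mean, SIGMA)              # stochastic Gaussian policy
    power = max(0.0, min(action, battery))          # respect the energy constraint
    reward = math.log(1.0 + power * h)
    next_battery = min(B_MAX, battery - power + random.uniform(0.0, 2.0))
    delta = reward + GAMMA * value(next_battery) - value(battery)  # TD error
    w[0] += ALPHA_W * delta                         # critic: TD(0) update
    w[1] += ALPHA_W * delta * battery
    # actor: policy-gradient step, using the TD error as the advantage
    theta += ALPHA_TH * delta * (action - mean) / (SIGMA ** 2) * battery
    battery = next_battery
```

The actor's update is the score-function gradient of the Gaussian log-density, scaled by the TD error, which is the standard one-step actor-critic recipe.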

    Optimization and Learning Approaches for Energy Harvesting Wireless Communication Systems

    Emerging technologies such as the Internet of Things (IoT) and Industry 4.0 are now possible thanks to advances in wireless sensor networks. In such applications, the wireless communication nodes play a key role because they provide the connection between different sensors as well as the communication to the outside world. In general, these wireless communication nodes are battery operated. However, depending on the specific application, charging or replacing the batteries can be too expensive or even infeasible, e.g., when the nodes are located in remote locations or inside structures. Therefore, in order to provide sustainable service and to reduce operation expenses, energy harvesting (EH) has been considered a promising technology in which the nodes collect energy from the environment using natural or man-made energy sources such as solar or electromagnetic radiation. The idea behind EH is that the wireless communication nodes can recharge their batteries while in idle mode or while transmitting data to neighboring nodes. As a result, the lifetime of the wireless communication network is not limited by the availability of energy. The consideration of EH brings new challenges in the design of transmission policies. This is because, in addition to the fluctuating channel conditions and data arrival processes, the variability of the amount of energy available for communication must be taken into account. Moreover, the three processes, EH, data arrival and channel fading, should be jointly considered in order to achieve optimum performance. In this context, this dissertation contributes to the research on EH wireless communication networks by considering power allocation and resource allocation problems in four different scenarios, namely, EH point-to-point, EH two-hop, EH broadcast and EH multiple access, which are the fundamental constituents of more complicated networks.
Specifically, we determine the optimal allocation policies and the corresponding upper bounds on the achievable performance by considering offline approaches in which non-causal knowledge of the system dynamics, i.e., the EH, data arrival and channel fading processes, is assumed. Furthermore, we overcome this unrealistic assumption by developing novel learning approaches, based on reinforcement learning, under the practical assumption that only causal knowledge of the system dynamics is available. First, we focus on the EH point-to-point scenario, in which an EH transmitter sends data to a receiver. For this scenario, we formulate the power allocation problem for throughput maximization considering not only the transmit power, but also the energy consumed by the circuit. Adopting an offline approach, we characterize the optimum power allocation policy and exploit this analysis in the development of a learning approach. Specifically, we develop a novel learning algorithm for a realistic EH point-to-point scenario in which only causal knowledge of the system dynamics is assumed to be available. For the proposed learning algorithm, we exploit linear function approximation to cope with the infinite number of values the harvested energy, the incoming data and the channel coefficients can take. In particular, we propose four feature functions inspired by the characteristics of the problem and the insights gained from the offline approach. Through numerical simulations, we show that the proposed learning approach achieves a performance close to the offline optimum without requiring non-causal knowledge of the system dynamics. Moreover, it can achieve a performance up to 50% higher than that of reference learning schemes, such as Q-learning, which do not exploit the characteristics of the problem. Secondly, we investigate an EH two-hop scenario in which an EH transmitter communicates with a receiver via an EH relay.
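The abstract stays high-level about the linear function approximation, so here is a minimal sketch of the general technique: SARSA-style updates on a Q-function that is linear in a handful of hand-crafted features of (energy, data, channel, power). The four features below are hypothetical stand-ins, not the thesis's actual feature functions, and the system dynamics are invented.

```python
import math
import random

random.seed(1)

# Hypothetical feature functions of (energy e, data d, channel h, power p);
# the thesis proposes four problem-specific features, which these only mimic.
def features(e, d, h, p):
    thr = math.log(1.0 + p * h)        # throughput if power p is spent now
    return [1.0, thr, e - p, min(d, thr)]

def q_value(wts, e, d, h, p):
    return sum(wi * fi for wi, fi in zip(wts, features(e, d, h, p)))

def greedy_power(wts, e, d, h, levels=10):
    # discretize the continuous power choice only for the argmax over Q
    candidates = [e * k / levels for k in range(levels + 1)]
    return max(candidates, key=lambda p: q_value(wts, e, d, h, p))

wts = [0.0, 0.0, 0.0, 0.0]
ALPHA, GAMMA, EPS = 0.01, 0.9, 0.1
e, d = 5.0, 3.0
h = random.uniform(0.5, 1.5)
for _ in range(3000):
    # epsilon-greedy action selection over the continuous power range
    p = random.uniform(0.0, e) if random.random() < EPS else greedy_power(wts, e, d, h)
    r = min(d, math.log(1.0 + p * h))              # reward: data actually served
    e2 = min(10.0, e - p + random.uniform(0.0, 2.0))   # harvesting process
    d2 = min(6.0, d - r + random.uniform(0.0, 1.0))    # data arrival process
    h2 = random.uniform(0.5, 1.5)                      # channel fading process
    p2 = greedy_power(wts, e2, d2, h2)
    delta = r + GAMMA * q_value(wts, e2, d2, h2, p2) - q_value(wts, e, d, h, p)
    wts = [wi + ALPHA * delta * fi for wi, fi in zip(wts, features(e, d, h, p))]
    e, d, h = e2, d2, h2
```

Because Q is linear in the features, the update touches only four weights per slot, which is what lets the method cope with continuous harvested-energy, data and channel values.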
For this purpose, we consider the main relaying strategies, namely, decode-and-forward and amplify-and-forward. Furthermore, we consider both the transmit power and the energy consumed by the circuit in each of the EH nodes. For the EH decode-and-forward relay, we formulate the power allocation problem for throughput maximization and consider an offline approach to find the optimum power allocation policy. We show that the optimal power allocation policies of the two nodes, transmitter and relay, depend on each other. Additionally, following a learning approach, we investigate a more realistic scenario in which the EH transmitter and the EH decode-and-forward relay have only partial and causal knowledge about the system dynamics, i.e., each node has only causal knowledge about the EH, data arrival and channel fading processes associated with it. To this end, two novel learning algorithms are proposed which take into account whether or not the EH nodes cooperate with each other to improve their learning processes. For the cooperative case, we propose the inclusion of a signaling phase in which the EH nodes exchange their current parameters. Through numerical simulations, we show that by providing the nodes with a complete view of the system state in a signaling phase, a performance gain of up to 40% can be achieved compared to the case when no cooperation is considered. Following a similar procedure, we investigate the EH two-hop scenario with an EH amplify-and-forward relay. We show that the resulting power allocation problem for throughput maximization is non-convex. Consequently, we propose an offline approach based on a branch-and-bound algorithm tailored to the EH two-hop scenario to find the optimal power allocation policy. Additionally, a centralized learning algorithm is proposed for the realistic case in which only causal knowledge of the system dynamics is available.
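The branch-and-bound idea can be illustrated generically. The sketch below maximizes a non-convex scalar objective over an interval, pruning subintervals with a Lipschitz upper bound; it is a stand-in for the thesis's algorithm, which is tailored to the multi-slot amplify-and-forward power allocation. The toy objective and its Lipschitz constant are assumptions for the example.

```python
import heapq
import math

# Generic interval branch-and-bound for maximizing a non-convex 1-D objective
# f over [lo, hi], assuming a known Lipschitz constant L (|f'| <= L).
def branch_and_bound(f, lo, hi, L, tol=1e-3):
    def upper(a, b):
        # on [a, b]: f(x) <= f(midpoint) + L * (b - a) / 2
        return f((a + b) / 2.0) + L * (b - a) / 2.0

    best_x = (lo + hi) / 2.0
    best = f(best_x)
    heap = [(-upper(lo, hi), lo, hi)]        # max-heap on the upper bound
    while heap:
        neg_ub, a, b = heapq.heappop(heap)
        if -neg_ub <= best + tol:
            break                            # no remaining box can beat the incumbent
        mid = (a + b) / 2.0
        for a2, b2 in ((a, mid), (mid, b)):  # branch: split the box in two
            x = (a2 + b2) / 2.0
            v = f(x)
            if v > best:
                best, best_x = v, x          # tighter incumbent
            ub = upper(a2, b2)
            if ub > best + tol:              # bound: keep only promising boxes
                heapq.heappush(heap, (-ub, a2, b2))
    return best_x, best

# Toy non-convex "throughput" with two local peaks; |d/dp| <= 3.3 on [0, 4].
g = lambda p: math.sin(3.0 * p) + 0.3 * p
x, v = branch_and_bound(g, 0.0, 4.0, L=3.3)
```

Because the bound tightens as boxes shrink, the search provably returns a value within `tol` of the global maximum, which is the property that makes branch-and-bound attractive for the non-convex power allocation problem.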
The proposed learning approach exploits the fact that, with an amplify-and-forward relay, the communication between the transmitter and the receiver depends on a single effective channel, which is composed of the link between the transmitter and the relay, the relay gain and the channel from the relay to the receiver. By means of numerical simulations, we show that the proposed learning algorithm achieves a performance up to two times higher than that achieved by reference schemes. Additionally, the extension of the proposed approaches to EH multi-hop scenarios is discussed. Thirdly, an EH broadcast scenario in which an EH transmitter sends individual data to multiple receivers is studied. We show that the power allocation problem for throughput maximization in this scenario leads to a non-convex problem when an arbitrary number of receivers is considered. However, following an offline approach, we find the optimal power allocation policy for the special case of two receivers. Furthermore, inspired by the offline approach for two users, a novel learning approach is developed which does not pose any restriction on the number of receiver nodes. The proposed learning approach is a two-stage learning algorithm which separates the learning task into two subtasks: determining how much power to use in each time interval, and deciding how to split this selected power for the transmission of the individual data intended for each receiver. Through numerical simulations, we show that this separation of tasks leads to a performance up to 40% higher than the one achieved by standard learning techniques, especially for large numbers of receivers. Finally, an EH multiple access scenario is considered in which multiple EH transmitters communicate with a single receiver using multiple orthogonal resources. In this case, the focus is on the formulation of the resource allocation problem considering the EH processes at the different transmitters.
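A minimal version of the two-stage idea can be sketched with two bandit-style learners: one chooses the total transmit power, the other chooses how to split it across receivers, and both update on the same observed throughput. The power levels, split options, and channel model below are invented for illustration and are much simpler than the thesis's algorithm.

```python
import math
import random

random.seed(2)

N_RX = 4
POWER_LEVELS = [0.0, 1.0, 2.0, 4.0]            # stage 1: total transmit power
# stage 2: either send everything to one receiver, or split equally
SPLITS = [tuple(1.0 if i == j else 0.0 for i in range(N_RX)) for j in range(N_RX)]
SPLITS.append(tuple(1.0 / N_RX for _ in range(N_RX)))

q_power = [0.0] * len(POWER_LEVELS)
q_split = [0.0] * len(SPLITS)
ALPHA, EPS = 0.05, 0.2

def throughput(total, split, gains):
    # sum-rate over receivers for a given total power and split
    return sum(math.log(1.0 + total * s * g) for s, g in zip(split, gains))

for _ in range(4000):
    gains = [random.uniform(0.2, 1.0) for _ in range(N_RX)]
    if random.random() < EPS:                   # epsilon-greedy, stage 1
        i = random.randrange(len(POWER_LEVELS))
    else:
        i = max(range(len(POWER_LEVELS)), key=lambda k: q_power[k])
    if random.random() < EPS:                   # epsilon-greedy, stage 2
        j = random.randrange(len(SPLITS))
    else:
        j = max(range(len(SPLITS)), key=lambda k: q_split[k])
    r = throughput(POWER_LEVELS[i], SPLITS[j], gains)
    q_power[i] += ALPHA * (r - q_power[i])      # both learners see the same reward
    q_split[j] += ALPHA * (r - q_split[j])
```

Each learner searches a small space instead of the product space of all (power, split) pairs, which is the benefit the abstract attributes to separating the two subtasks.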
We show that the resulting resource allocation problem falls into the category of non-linear knapsack problems, which are known to be NP-hard. Therefore, we propose an offline approach based on dynamic programming to find the optimal solution. Furthermore, by exploiting the characteristics of the scenario, a novel learning approach is proposed which breaks the original resource allocation problem into smaller subproblems. As a result, it is able to handle the exponential growth of the space of possible solutions as the network size increases. Through numerical simulations, we show that, in contrast to conventional reinforcement learning algorithms, the proposed learning approach is able to find a throughput-maximizing resource allocation policy even when the network size is large. Furthermore, it achieves a performance up to 25% higher than that of the greedy policy that allocates the resources to the users with the best channel conditions. Additionally, in order to carry out a full assessment of the proposed learning algorithms, we provide convergence guarantees and a computational complexity analysis for all the developed learning approaches in the four considered scenarios.
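The dynamic program over the knapsack structure can be sketched as follows: a table indexed by (transmitters considered, resource blocks left), which is the standard recursion for non-linear knapsacks rather than the thesis's exact formulation. The log-throughput utility and channel gains are invented for the example.

```python
import math

# Non-linear knapsack DP (illustrative): assign K orthogonal resource blocks
# among transmitters, where giving k blocks to transmitter n yields utility
# utils[n][k]. best[n][k] = max total utility using the first n transmitters
# and k blocks.
def allocate(utils, K):
    N = len(utils)
    best = [[0.0] * (K + 1) for _ in range(N + 1)]
    choice = [[0] * (K + 1) for _ in range(N + 1)]
    for n in range(1, N + 1):
        for k in range(K + 1):
            for give in range(k + 1):
                v = best[n - 1][k - give] + utils[n - 1][give]
                if v > best[n][k]:
                    best[n][k] = v
                    choice[n][k] = give
    # backtrack the optimal per-transmitter assignment
    alloc, k = [0] * N, K
    for n in range(N, 0, -1):
        alloc[n - 1] = choice[n][k]
        k -= alloc[n - 1]
    return alloc, best[N][K]

# Example: log-throughput utility for 3 transmitters with different gains.
gains = [1.5, 0.8, 0.4]
K = 4
utils = [[math.log(1.0 + g * k) for k in range(K + 1)] for g in gains]
alloc, total = allocate(utils, K)
```

The DP runs in O(N * K^2) time, so it scales polynomially in the number of resources even though the space of all assignments grows exponentially with the network size.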