379 research outputs found

    Bayesian Optimization Enhanced Deep Reinforcement Learning for Trajectory Planning and Network Formation in Multi-UAV Networks

    In this paper, we employ multiple UAVs coordinated by a base station (BS) to help ground users (GUs) offload their sensing data. The UAVs can adapt their trajectories and network formation to expedite data transmission via multi-hop relaying. The trajectory planning aims to collect all GUs' data, while the UAVs' network formation optimizes the multi-hop UAV network topology to minimize energy consumption and transmission delay. The joint network formation and trajectory optimization is solved by a two-step iterative approach. First, we devise an adaptive network formation scheme that uses a heuristic algorithm to balance the UAVs' energy consumption and data queue size. Then, with the network formation fixed, the UAVs' trajectories are further optimized by multi-agent deep reinforcement learning without knowledge of the GUs' traffic demands and spatial distribution. To improve the learning efficiency, we further employ Bayesian optimization to estimate the UAVs' flying decisions based on historical trajectory points. This helps avoid inefficient action exploration and improves the convergence rate in model training. The simulation results reveal close spatial-temporal couplings between the UAVs' trajectory planning and network formation. Compared with several baselines, our solution can better exploit the UAVs' cooperation in data offloading, thus improving energy efficiency and delay performance. Comment: 15 pages, 10 figures, 2 algorithms

    Power allocation and energy cooperation for UAV-enabled MmWave networks: A Multi-Agent Deep Reinforcement Learning approach

    Unmanned Aerial Vehicle (UAV)-assisted cellular networks over the millimeter-wave (mmWave) frequency band can meet the requirements of a high data rate and flexible coverage in next-generation communication networks. However, higher propagation loss and the use of a large number of antennas in mmWave networks give rise to high energy consumption, and UAVs are constrained by their low-capacity onboard batteries. Energy harvesting (EH) is a viable solution to reduce the energy cost of UAV-enabled mmWave networks. However, the random nature of renewable energy makes it challenging to maintain robust connectivity in UAV-assisted terrestrial cellular networks. Energy cooperation allows UAVs to send their excess energy to other UAVs with reduced energy. In this paper, we propose a power allocation algorithm based on energy harvesting and energy cooperation to maximize the throughput of a UAV-assisted mmWave cellular network. Since there is channel-state uncertainty and the amount of harvested energy can be treated as a stochastic process, we propose a multi-agent deep reinforcement learning (DRL) algorithm named Multi-Agent Deep Deterministic Policy Gradient (MADDPG) to solve the renewable energy resource allocation problem for throughput maximization. The simulation results show that the proposed algorithm outperforms the Random Power (RP), Maximal Power (MP) and value-based Deep Q-Learning (DQL) algorithms in terms of network throughput. This work was supported by the Agencia Estatal de Investigación of Ministerio de Ciencia e Innovación of Spain under project PID2019-108713RB-C51 MCIN/AEI/10.13039/501100011033. Postprint (published version)

    Meta-Reinforcement Learning for Timely and Energy-efficient Data Collection in Solar-powered UAV-assisted IoT Networks

    Unmanned aerial vehicles (UAVs) have the potential to greatly aid Internet of Things (IoT) networks in mission-critical data collection, thanks to their flexibility and cost-effectiveness. However, challenges arise from the UAV's limited onboard energy and the unpredictable status updates from sensor nodes (SNs), which affect the freshness of collected data. In this paper, we investigate energy-efficient and timely data collection in IoT networks using a solar-powered UAV. Each SN generates status updates at stochastic intervals, while the UAV collects these status updates and subsequently transmits them to a central data center. Furthermore, the UAV harvests solar energy from the environment to keep its energy level above a predetermined threshold. To minimize both the average age of information (AoI) of the SNs and the energy consumption of the UAV, we jointly optimize the UAV trajectory, SN scheduling, and offloading strategy. We then formulate this problem as a Markov decision process (MDP) and propose a meta-reinforcement learning algorithm to enhance the generalization capability. Specifically, the compound-action deep reinforcement learning (CADRL) algorithm is proposed to handle the discrete decisions related to SN scheduling and the UAV's offloading policy, as well as the continuous control of UAV flight. Moreover, we incorporate meta-learning into CADRL to improve the adaptability of the learned policy to new tasks. To validate the effectiveness of our proposed algorithms, we conduct extensive simulations and demonstrate their superiority over other baseline algorithms.
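To make the age-of-information (AoI) objective in the abstract above concrete, here is a minimal sketch of the usual slotted AoI bookkeeping. It is not taken from the paper: the function names, the slot-based time model, and the `delivered` mapping are assumptions made for illustration only.

```python
def update_aoi(aoi, delivered, now):
    """Advance per-sensor AoI by one time slot (illustrative model).

    aoi:       list of current ages, one per sensor node (SN)
    delivered: dict mapping SN index -> generation time of the update
               that reached the data center in this slot (hypothetical input)
    now:       current slot index
    """
    new_aoi = []
    for i, age in enumerate(aoi):
        if i in delivered:
            # A fresh update arrived: age resets to the time since it was generated.
            new_aoi.append(now - delivered[i])
        else:
            # No update arrived: the stored information gets one slot older.
            new_aoi.append(age + 1)
    return new_aoi


def average_aoi(aoi):
    """Average AoI across sensor nodes, the quantity to be minimized."""
    return sum(aoi) / len(aoi)
```

For example, with ages `[3, 5, 2]`, delivering at slot 10 an update that SN 1 generated at slot 8 yields `[4, 2, 3]`, so the average AoI is 3.0.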

    A Hybrid Framework of Reinforcement Learning and Convex Optimization for UAV-Based Autonomous Metaverse Data Collection

    Unmanned aerial vehicles (UAVs) are promising for providing communication services due to their advantages in cost and mobility, especially in the context of the emerging Metaverse and Internet of Things (IoT). This paper considers a UAV-assisted Metaverse network, in which UAVs extend the coverage of the base station (BS) to collect the Metaverse data generated at roadside units (RSUs). Specifically, to improve the data collection efficiency, resource allocation and trajectory control are integrated into the system model. The time-dependent nature of the optimization problem makes it non-trivial to solve with traditional convex optimization methods. Based on the proposed UAV-assisted Metaverse network system model, we design a hybrid framework with reinforcement learning and convex optimization to cooperatively solve the time-sequential optimization problem. Simulation results show that the proposed framework is able to reduce the mission completion time with a given transmission power resource. Comment: This paper appears in IEEE Network magazine

    AI-based Radio and Computing Resource Allocation and Path Planning in NOMA NTNs: AoI Minimization under CSI Uncertainty

    In this paper, we develop a hierarchical aerial computing framework composed of a high-altitude platform (HAP) and unmanned aerial vehicles (UAVs) to compute the fully offloaded tasks of terrestrial mobile users, which are connected through uplink non-orthogonal multiple access (UL-NOMA). To better assess the freshness of information in computation-intensive applications, the age of information (AoI) criterion is considered. In particular, the problem is formulated to minimize the average AoI of users with elastic tasks by adjusting the UAVs' trajectories and the resource allocation on both the UAVs and the HAP, subject to channel state information (CSI) uncertainty and multiple resource constraints of the UAVs and the HAP. To solve this non-convex optimization problem, two methods, multi-agent deep deterministic policy gradient (MADDPG) and federated reinforcement learning (FRL), are proposed to design the UAVs' trajectories and obtain the channel, power, and CPU allocations. It is shown that task scheduling significantly reduces the average AoI, and this improvement is more pronounced for larger task sizes. In contrast, power allocation is shown to have only a marginal effect on the average AoI compared to using full transmission power for all users. Compared with traditional transmission schemes, the simulation results show that our scheduling scheme yields a substantial improvement in average AoI.

    Location prediction and trajectory optimization in multi-UAV application missions

    Unmanned aerial vehicles (a.k.a. drones) have a wide range of applications in, e.g., aerial surveillance, mapping, imaging, monitoring, maritime operations, parcel delivery, and disaster response management. Their operations require reliable networking environments and location-based services in air-to-air links with cooperative drones, or air-to-ground links in concert with ground control stations. When drones are equipped with high-resolution video cameras or sensors to gain environmental situation awareness through object detection/tracking, precise location prediction for individual drones or groups of drones at any given instant is critical for continuous guidance. The location predictions can then be used in trajectory optimization to achieve efficient operations (i.e., through effective resource utilization in terms of energy or network bandwidth consumption) and safe operations (i.e., through avoidance of obstacles or sudden landing) within application missions. In this thesis, we explain a diverse set of techniques for drone location prediction, position and velocity estimation, and trajectory optimization, involving: (i) Kalman Filtering techniques, and (ii) Machine Learning models such as reinforcement learning and deep reinforcement learning. These techniques enable the drones to follow intelligent paths and establish optimal trajectories while carrying out successful application missions under given resource and network constraints. We detail the techniques using two scenarios. The first scenario involves location-prediction-based intelligent packet transfer between drones in a disaster response scenario using various Kalman Filtering techniques. The second scenario involves learning-based trajectory optimization that uses various reinforcement learning models to maintain high video resolution and effective network performance in a civil application scenario such as aerial monitoring of persons/objects.
    We conclude with a list of open challenges and future work for intelligent path planning of drones using location prediction and trajectory optimization techniques. Includes bibliographical references.
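As an illustration of Kalman-filter-based location prediction of the kind the thesis abstract mentions, the following is a minimal sketch of a standard linear Kalman filter with a one-dimensional constant-velocity motion model. It is not the thesis's implementation: the motion model, the noise parameters `q` and `r`, and all function names are assumptions chosen for this example.

```python
def kf_predict(state, cov, dt, q):
    """Propagate (position, velocity) and its 2x2 covariance by dt seconds."""
    p, v = state
    (p00, p01), (p10, p11) = cov
    # State transition F = [[1, dt], [0, 1]]: position advances by v * dt.
    p_new = p + v * dt
    # Covariance: F P F^T + Q, with Q = q * I (assumed process noise).
    a00 = p00 + dt * p10
    a01 = p01 + dt * p11
    new_cov = ((a00 + dt * a01 + q, a01), (p10 + dt * p11, p11 + q))
    return (p_new, v), new_cov


def kf_update(state, cov, z, r):
    """Correct the estimate with a position measurement z (noise variance r)."""
    p, v = state
    (p00, p01), (p10, p11) = cov
    y = z - p                      # innovation (measurement residual)
    s = p00 + r                    # innovation variance, with H = [1, 0]
    k0, k1 = p00 / s, p10 / s      # Kalman gain
    new_state = (p + k0 * y, v + k1 * y)
    # Covariance: (I - K H) P
    new_cov = (((1 - k0) * p00, (1 - k0) * p01),
               (p10 - k1 * p00, p11 - k1 * p01))
    return new_state, new_cov


def track(measurements, dt=1.0, q=0.01, r=1.0):
    """Filter a sequence of position fixes, then predict the next position."""
    state, cov = (0.0, 0.0), ((100.0, 0.0), (0.0, 100.0))
    for z in measurements:
        state, cov = kf_predict(state, cov, dt, q)
        state, cov = kf_update(state, cov, z, r)
    predicted, _ = kf_predict(state, cov, dt, q)
    return state, predicted
```

Calling `track([2.0 * t for t in range(1, 31)])` on noise-free fixes from a drone moving at 2 m/s should give a velocity estimate close to 2 and a predicted next position close to 62. A real deployment would use a 3D state and noise parameters tuned to the sensors.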

    Joint Trajectory and Passive Beamforming Design for Intelligent Reflecting Surface-Aided UAV Communications: A Deep Reinforcement Learning Approach

    In this paper, the intelligent reflecting surface (IRS)-assisted unmanned aerial vehicle (UAV) communication system is studied, where a UAV is deployed to serve the user equipments (UEs) with the assistance of multiple IRSs mounted on several buildings to enhance the communication quality between the UAV and the UEs. We aim to maximize the overall weighted data rate and geographical fairness of all the UEs by jointly optimizing the UAV's trajectory and the phase shifts of the reflecting elements of the IRSs. Since the system is complex and the environment is dynamic, it is challenging to derive low-complexity algorithms using conventional optimization methods. To address this issue, we first propose a deep Q-network (DQN)-based low-complexity solution that discretizes the trajectory and phase shift, which is suitable for practical systems with discrete phase-shift control. Furthermore, we propose a deep deterministic policy gradient (DDPG)-based solution to tackle the case of continuous trajectory and phase shift design. The experimental results show that the proposed solutions achieve better performance than other traditional benchmarks. Comment: 12 pages, 13 figures
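The abstract above says the DQN-based solution discretizes the phase shift. One common way to do this, sketched below, is uniform quantization of each reflecting element's phase to 2^b levels so that every level maps to a discrete action index. The paper's actual discretization is not given here, so the bit-width parameter and the function names are assumptions.

```python
import math


def phase_levels(bits):
    """Return the 2**bits uniformly spaced phase shifts in [0, 2*pi)."""
    n = 2 ** bits
    return [2 * math.pi * k / n for k in range(n)]


def quantize_phase(theta, bits):
    """Map a continuous phase (radians) to the nearest discrete level.

    Returns (action_index, quantized_phase). The index could serve as one
    component of a discrete action for a value-based agent such as a DQN.
    """
    n = 2 ** bits
    step = 2 * math.pi / n
    # Wrap into [0, 2*pi), round to the nearest level, and wrap the index
    # so that phases just below 2*pi map back to level 0.
    k = round((theta % (2 * math.pi)) / step) % n
    return k, k * step
```

With 2-bit control, `quantize_phase(math.pi / 2 + 0.1, 2)` returns index 1 (a phase of pi/2), and a phase of 6.2 rad wraps around to index 0.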