
    Bayesian Optimization Enhanced Deep Reinforcement Learning for Trajectory Planning and Network Formation in Multi-UAV Networks

    In this paper, we employ multiple UAVs coordinated by a base station (BS) to help ground users (GUs) offload their sensing data. Different UAVs can adapt their trajectories and network formation to expedite data transmissions via multi-hop relaying. The trajectory planning aims to collect all GUs' data, while the UAVs' network formation optimizes the multi-hop UAV network topology to minimize energy consumption and transmission delay. The joint network formation and trajectory optimization problem is solved by a two-step iterative approach. First, we devise an adaptive network formation scheme that uses a heuristic algorithm to balance the UAVs' energy consumption and data queue sizes. Then, with the network formation fixed, the UAVs' trajectories are further optimized by multi-agent deep reinforcement learning without knowledge of the GUs' traffic demands and spatial distribution. To improve learning efficiency, we further employ Bayesian optimization to estimate the UAVs' flying decisions based on historical trajectory points. This helps avoid inefficient action explorations and improves the convergence rate of the model training. The simulation results reveal close spatial-temporal couplings between the UAVs' trajectory planning and network formation. Compared with several baselines, our solution better exploits the UAVs' cooperation in data offloading, thus improving energy efficiency and delay performance. Comment: 15 pages, 10 figures, 2 algorithms
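The abstract's idea of using Bayesian optimization over historical trajectory points to steer action exploration can be sketched with a Gaussian-process surrogate and an upper-confidence-bound (UCB) acquisition rule. This is a minimal illustration, not the paper's actual algorithm: the function names, the RBF kernel choice, and the discrete candidate-action set are all assumptions.

```python
import numpy as np

def rbf_kernel(A, B, length_scale=1.0):
    """Squared-exponential kernel between two sets of row vectors."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / length_scale ** 2)

def gp_ucb_action(history_pts, history_rewards, candidates, beta=2.0, noise=1e-3):
    """Fit a GP posterior to historical (trajectory point -> reward) pairs
    and pick the candidate flying decision with the highest UCB score,
    biasing exploration toward promising regions."""
    K = rbf_kernel(history_pts, history_pts) + noise * np.eye(len(history_pts))
    K_inv = np.linalg.inv(K)
    k_star = rbf_kernel(candidates, history_pts)           # (n_cand, n_hist)
    mu = k_star @ K_inv @ history_rewards                  # posterior mean
    var = 1.0 - np.einsum('ij,jk,ik->i', k_star, K_inv, k_star)
    ucb = mu + beta * np.sqrt(np.clip(var, 0.0, None))     # optimism bonus
    return candidates[np.argmax(ucb)]
```

With rewards that grow along one axis, the UCB rule prefers the candidate that extrapolates toward higher reward rather than a point already known to yield low reward, which is the intuition behind pruning inefficient action explorations.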

    Online Service Migration in Edge Computing with Incomplete Information: A Deep Recurrent Actor-Critic Method

    Multi-access Edge Computing (MEC) is an emerging computing paradigm that extends cloud computing to the network edge (e.g., base stations, MEC servers) to support resource-intensive applications on mobile devices. As a crucial problem in MEC, service migration needs to decide where to migrate user services to maintain high Quality-of-Service (QoS) when users roam between MEC servers with limited coverage and capacity. However, finding an optimal migration policy is intractable due to the highly dynamic MEC environment and user mobility. Many existing works make centralized migration decisions based on complete system-level information, which can be time-consuming and suffers from scalability issues as the number of mobile users rapidly increases. To address these challenges, we propose a new learning-driven method, namely Deep Recurrent Actor-Critic based service Migration (DRACM), which is user-centric and can make effective online migration decisions given incomplete system-level information. Specifically, the service migration problem is modeled as a Partially Observable Markov Decision Process (POMDP). To solve the POMDP, we design an encoder network that combines a Long Short-Term Memory (LSTM) and an embedding matrix for effective extraction of hidden information. We then propose a tailored off-policy actor-critic algorithm with a clipped surrogate objective for efficient training. Results from extensive experiments based on real-world mobility traces demonstrate that our method consistently outperforms both heuristic and state-of-the-art learning-driven algorithms, and achieves near-optimal results in various MEC scenarios.
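The "clipped surrogate objective" mentioned in the abstract is the PPO-style loss that caps how far an updated policy may move from the behavior policy in one off-policy step. A minimal sketch, assuming log-probabilities and advantage estimates are already computed (the function name and the default clip range `eps=0.2` are illustrative choices, not taken from the paper):

```python
import numpy as np

def clipped_surrogate_loss(logp_new, logp_old, advantages, eps=0.2):
    """PPO-style clipped surrogate objective: the importance ratio is
    clipped to [1 - eps, 1 + eps], so a single gradient step cannot
    exploit stale off-policy samples too aggressively."""
    ratio = np.exp(logp_new - logp_old)              # importance sampling ratio
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1 - eps, 1 + eps) * advantages
    # Pessimistic bound: take the smaller objective, negate for minimization.
    return -np.mean(np.minimum(unclipped, clipped))
```

When the new and old policies agree, the ratio is 1 and the loss reduces to the negated mean advantage; when the ratio exceeds `1 + eps` on a positive-advantage sample, the clipped branch caps the incentive to move further.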

    Intelligent and Secure Fog-Aided Internet of Drones

    The Internet of Drones (IoD), which uses drones as Internet of Things (IoT) devices, deploys several drones in the air to collect ground information and send it to the IoD gateway for further processing. Computing tasks are usually offloaded to the cloud data center for intensive processing. However, many IoD applications require real-time processing and event response (e.g., disaster response and virtual reality applications). Hence, data processing by the remote cloud may not satisfy strict latency requirements. Fog computing attaches fog nodes, which are equipped with computing, storage, and networking resources, to IoD gateways to take over a substantial amount of the computing tasks instead of performing all tasks in the remote cloud, thus enabling immediate service response. Fog-aided IoD supports future-event prediction and image classification via machine learning, where massive training data are collected by drones and analyzed in the fog node. However, the performance of IoD is greatly affected by drones' battery capacities. Also, aggregating all data in the fog node may incur huge network traffic and leak drone data privacy. To address the challenge of limited drone battery, the power control problem is first investigated in IoD for the data collection service to minimize the energy consumption of a drone while meeting the quality of service (QoS) requirements. A PowEr conTROL (PETROL) algorithm is then proposed to solve this problem and its convergence rate is derived. The task allocation (which distributes tasks to different fog nodes) and the flying control (which adjusts the drone's flying speed) are then jointly optimized to minimize the drone's journey completion time, constrained by the drone's battery capacity and task completion deadlines.
Considering the practical scenario in which future task information is difficult to obtain, an online algorithm is designed to provide task allocation and flying control strategies as the drone visits each location without knowledge of the future. The joint optimization of power control and energy harvesting control is also studied to determine each drone's transmission power and the transmitted energy from the charging station in the time-varying IoD network. The objective is to minimize the long-term average system energy cost, constrained by the drones' battery capacities and QoS requirements. A Markov Decision Process (MDP) is formulated to characterize the power and energy harvesting control process in time-varying IoD networks. A modified actor-critic reinforcement learning algorithm is then proposed to tackle the problem. To address the challenge of drone data privacy leakage, federated learning (FL) is proposed to preserve drone data privacy by performing local training on drones and sharing training model parameters with a fog node, without uploading the drones' raw data. However, drone privacy can still be divulged to ground eavesdroppers that wiretap and analyze uploaded parameters during the FL training process. The power control problem of all drones is hence investigated to maximize the FL system secrecy rate, constrained by drone battery capacities and QoS requirements (e.g., FL training time). This problem is formulated as a non-linear programming problem, and an algorithm is designed to obtain optimal solutions with low computational complexity. Extensive simulations demonstrate that all proposed algorithms outperform existing algorithms and can be implemented in an intelligent and secure fog-aided IoD network to improve system performance in terms of energy efficiency, QoS, and security.
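The federated learning scheme described above — local training on each drone, with only model parameters shared with the fog node — follows the FedAvg pattern. A minimal sketch under simplifying assumptions (a linear model with squared loss, plain gradient descent, and sample-count-weighted averaging; all function names and hyperparameters here are illustrative, not from the dissertation):

```python
import numpy as np

def local_update(weights, X, y, lr=0.05, epochs=5):
    """One drone's local training: a few gradient-descent steps on its own
    data (linear model, squared loss). Raw data never leaves the drone."""
    w = weights.copy()
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

def fedavg_round(global_w, drone_data):
    """One FL round: each drone trains locally on its private (X, y) data,
    then the fog node averages only the returned parameter vectors,
    weighted by each drone's local sample count."""
    updates, sizes = [], []
    for X, y in drone_data:
        updates.append(local_update(global_w, X, y))
        sizes.append(len(y))
    return np.average(updates, axis=0, weights=np.array(sizes, dtype=float))
```

Only the parameter vectors cross the air interface, which is exactly the channel an eavesdropper could wiretap — motivating the secrecy-rate-maximizing power control studied in the dissertation.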