2,327 research outputs found

    A Deep Reinforcement Learning-Based Framework for Content Caching

    Full text link
    Content caching at the edge nodes is a promising technique to reduce the data traffic in next-generation wireless networks. Inspired by the success of Deep Reinforcement Learning (DRL) in solving complicated control problems, this work presents a DRL-based framework with Wolpertinger architecture for content caching at the base station. The proposed framework is aimed at maximizing the long-term cache hit rate, and it requires no knowledge of the content popularity distribution. To evaluate the proposed framework, we compare the performance with other caching algorithms, including Least Recently Used (LRU), Least Frequently Used (LFU), and First-In First-Out (FIFO) caching strategies. Meanwhile, since the Wolpertinger architecture can effectively limit the action space size, we also compare the performance with Deep Q-Network to identify the impact of dropping a portion of the actions. Our results show that the proposed framework can achieve improved short-term cache hit rate and improved and stable long-term cache hit rate in comparison with LRU, LFU, and FIFO schemes. Additionally, the performance is shown to be competitive in comparison to Deep Q-learning, while the proposed framework can provide significant savings in runtime.Comment: 6 pages, 3 figure

    Learning-based Decision Making in Wireless Communications

    Get PDF
    Fueled by emerging applications and exponential increase in data traffic, wireless networks have recently grown significantly and become more complex. In such large-scale complex wireless networks, it is challenging and, oftentimes, infeasible for conventional optimization methods to quickly solve critical decision-making problems. With this motivation, in this thesis, machine learning methods are developed and utilized for obtaining optimal/near-optimal solutions for timely decision making in wireless networks. Content caching at the edge nodes is a promising technique to reduce the data traffic in next-generation wireless networks. In this context, we in the first part of the thesis study content caching at the wireless network edge using a deep reinforcement learning framework with Wolpertinger architecture. Initially, we develop a learning-based caching policy for a single base station aiming at maximizing the long-term cache hit rate. Then, we extend this study to a wireless communication network with multiple edge nodes. In particular, we propose deep actor-critic reinforcement learning based policies for both centralized and decentralized content caching. Next, with the purpose of making efficient use of limited spectral resources, we develop a deep actor-critic reinforcement learning based framework for dynamic multichannel access. We consider both a single-user case and a scenario in which multiple users attempt to access channels simultaneously. In the single-user model, in order to evaluate the performance of the proposed channel access policy and the framework\u27s tolerance against uncertainty, we explore different channel switching patterns and different switching probabilities. In the case of multiple users, we analyze the probabilities of each user accessing channels with favorable channel conditions and the probability of collision. Following the analysis of the proposed learning-based dynamic multichannel access policy, we consider adversarial attacks on it. In particular, we propose two adversarial policies, one based on feed-forward neural networks and the other based on deep reinforcement learning policies. Both attack strategies aim at minimizing the accuracy of a deep reinforcement learning based dynamic channel access agent, and we demonstrate and compare their performances. Next, anomaly detection as an active hypothesis test problem is studied. Specifically, we study deep reinforcement learning based active sequential testing for anomaly detection. We assume that there is an unknown number of abnormal processes at a time and the agent can only check with one sensor in each sampling step. To maximize the confidence level of the decision and minimize the stopping time concurrently, we propose a deep actor-critic reinforcement learning framework that can dynamically select the sensor based on the posterior probabilities. Separately, we also regard the detection of threshold crossing as an anomaly detection problem, and analyze it via hierarchical generative adversarial networks (GANs). In the final part of the thesis, to address state estimation and detection problems in the presence of noisy sensor observations and probing costs, we develop a soft actor-critic deep reinforcement learning framework. Moreover, considering Byzantine attacks, we design a GAN-based framework to identify the Byzantine sensors. To evaluate the proposed framework, we measure the performance in terms of detection accuracy, stopping time, and the total probing cost needed for detection

    Soft Actor-Critic Learning-Based Joint Computing, Pushing, and Caching Framework in MEC Networks

    Full text link
    To support future 6G mobile applications, the mobile edge computing (MEC) network needs to be jointly optimized for computing, pushing, and caching to reduce transmission load and computation cost. To achieve this, we propose a framework based on deep reinforcement learning that enables the dynamic orchestration of these three activities for the MEC network. The framework can implicitly predict user future requests using deep networks and push or cache the appropriate content to enhance performance. To address the curse of dimensionality resulting from considering three activities collectively, we adopt the soft actor-critic reinforcement learning in continuous space and design the action quantization and correction specifically to fit the discrete optimization problem. We conduct simulations in a single-user single-server MEC network setting and demonstrate that the proposed framework effectively decreases both transmission load and computing cost under various configurations of cache size and tolerable service delay
    • …
    corecore