Deep Ordinal Reinforcement Learning
Reinforcement learning usually makes use of numerical rewards, which have
convenient mathematical properties but also come with drawbacks and
difficulties. Using rewards on an ordinal scale (ordinal rewards) is an
alternative that has received more attention in recent years. In this paper, a general approach
to adapting reinforcement learning problems to the use of ordinal rewards is
presented and motivated. We show how to convert common reinforcement learning
algorithms to an ordinal variation by the example of Q-learning and introduce
Ordinal Deep Q-Networks, which adapt deep reinforcement learning to ordinal
rewards. Additionally, we run evaluations on problems provided by the OpenAI
Gym framework, showing that our ordinal variants exhibit a performance that is
comparable to the numerical variations for a number of problems. We also give
first evidence that our ordinal variant is able to produce better results for
problems with less engineered and simpler-to-design reward signals.

Comment: replaced figures for better visibility, added GitHub repository, more details about the source of experimental results, updated target value calculation for standard and ordinal Deep Q-Networks
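The core idea described above — replacing scalar Q-values with probability distributions over ordinal reward ranks and ranking actions by pairwise winning probabilities — could be sketched roughly as follows. This is a minimal illustrative sketch, not the paper's implementation: the three-rank reward scale, the Borda-style scoring, and the update rule that mixes the observed ordinal reward with the greedy next action's distribution are all assumptions.

```python
import random
from collections import defaultdict

N_RANKS = 3          # ordinal reward ranks, e.g. 0 = "bad", 1 = "ok", 2 = "good"
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1

# Instead of a scalar Q-value, keep a probability distribution over
# ordinal reward ranks for every (state, action) pair.
dist = defaultdict(lambda: [1.0 / N_RANKS] * N_RANKS)

def borda_score(state, actions):
    """Score each action by its average probability of 'winning' a pairwise
    comparison of ordinal distributions against the other actions."""
    scores = {}
    for a in actions:
        total = 0.0
        for b in actions:
            if a == b:
                continue
            pa, pb = dist[(state, a)], dist[(state, b)]
            # P(a's rank > b's rank) + 0.5 * P(tie)
            win = sum(pa[i] * pb[j] for i in range(N_RANKS)
                      for j in range(N_RANKS) if i > j)
            tie = sum(pa[i] * pb[i] for i in range(N_RANKS))
            total += win + 0.5 * tie
        scores[a] = total / max(len(actions) - 1, 1)
    return scores

def greedy_action(state, actions):
    scores = borda_score(state, actions)
    return max(scores, key=scores.get)

def select_action(state, actions):
    """Epsilon-greedy over Borda scores."""
    if random.random() < EPS:
        return random.choice(actions)
    return greedy_action(state, actions)

def update(state, action, ordinal_reward, next_state, actions):
    """Move the (state, action) distribution toward a mixture of the
    observed ordinal reward and the greedy next action's distribution."""
    target = [0.0] * N_RANKS
    target[ordinal_reward] = 1.0
    nxt = dist[(next_state, greedy_action(next_state, actions))]
    d = dist[(state, action)]
    for k in range(N_RANKS):
        d[k] += ALPHA * ((1 - GAMMA) * target[k] + GAMMA * nxt[k] - d[k])
```

Note that only the *order* of rewards is ever used: the pairwise comparison never adds or subtracts reward values, which is what makes the scheme applicable to reward signals that are easy to rank but hard to calibrate numerically.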
Federated Learning Assisted Deep Q-Learning for Joint Task Offloading and Fronthaul Segment Routing in Open RAN
Offloading computation-intensive tasks to edge clouds has become an efficient
way to support resource-constrained edge devices. However, task offloading
delay remains an issue, largely due to the limited capacity of the networks
between edge clouds and edge devices. In this paper, we consider task
offloading in the Open Radio Access Network (O-RAN), a new 5G RAN architecture
that allows the Open Central Unit (O-CU) to be co-located with the Open
Distributed Unit (O-DU) at the edge cloud for low-latency services. O-RAN
relies on a fronthaul network to connect O-RAN Radio Units (O-RUs) and the
edge clouds that host O-DUs.
Consequently, tasks are offloaded onto the edge clouds via wireless and
fronthaul networks \cite{10045045}, which requires routing. Since edge clouds
differ in their available computation resources and tasks differ in their
computation deadlines, an approach for distributing tasks across multiple edge
clouds is needed. Prior work has not addressed this joint problem of task offloading,
fronthaul routing, and edge computing. To this end, using segment routing,
O-RAN intelligent controllers, and multiple edge clouds, we formulate an
optimization problem to minimize offloading, fronthaul routing, and computation
delays in O-RAN. To solve this NP-hard problem, we use Deep Q-Learning
assisted by federated learning, with a reward function that reduces the Cost
of Delay (CoD). The simulation results show that our solution maximizes the
reward while minimizing the CoD.
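The two ingredients named in the abstract — a Deep Q-Learning reward built around the Cost of Delay, and federated learning across edge clouds — could be sketched as below. The reward shape, the deadline penalty constant, and all function names are hypothetical illustrations; the paper's actual CoD formulation and aggregation scheme may differ.

```python
import numpy as np

def cod_reward(offload_delay, route_delay, compute_delay, deadline):
    """Hypothetical reward: negative Cost of Delay (CoD). Total delay sums
    wireless offloading, fronthaul routing, and edge computation delays;
    missing the task's deadline incurs an extra penalty."""
    total = offload_delay + route_delay + compute_delay
    penalty = 10.0 if total > deadline else 0.0
    return -(total + penalty)

def fed_avg(local_weights, n_samples):
    """FedAvg-style aggregation: average each edge cloud's Q-network
    parameters layer by layer, weighted by local training sample counts."""
    total = sum(n_samples)
    return [sum(w * (n / total) for w, n in zip(layer, n_samples))
            for layer in zip(*local_weights)]

# Usage: three edge clouds, each with a toy 2-layer Q-network.
locals_ = [[np.ones((2, 2)) * v, np.ones(2) * v] for v in (1.0, 2.0, 3.0)]
gw = fed_avg(locals_, n_samples=[10, 20, 70])
# weighted mean of 1, 2, 3 with weights 0.1/0.2/0.7 = 2.6 in every entry
```

Periodically averaging the per-edge-cloud Q-networks lets each agent benefit from experience gathered at the other edge clouds without sharing raw task traces, which is the usual motivation for combining DQN with federated learning.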
A Deep Reinforcement Learning-Based Framework for Content Caching
Content caching at the edge nodes is a promising technique to reduce the data
traffic in next-generation wireless networks. Inspired by the success of Deep
Reinforcement Learning (DRL) in solving complicated control problems, this work
presents a DRL-based framework with Wolpertinger architecture for content
caching at the base station. The proposed framework is aimed at maximizing the
long-term cache hit rate, and it requires no knowledge of the content
popularity distribution. To evaluate the proposed framework, we compare the
performance with other caching algorithms, including Least Recently Used (LRU),
Least Frequently Used (LFU), and First-In First-Out (FIFO) caching strategies.
Meanwhile, since the Wolpertinger architecture can effectively limit the action
space size, we also compare the performance with Deep Q-Network to identify the
impact of dropping a portion of the actions. Our results show that the proposed
framework achieves a higher short-term cache hit rate and a higher, more
stable long-term cache hit rate than the LRU, LFU, and FIFO schemes.
Additionally, the performance is shown to be competitive with Deep Q-learning,
while the proposed framework provides significant savings in runtime.

Comment: 6 pages, 3 figures
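The Wolpertinger architecture referenced above limits the effective action space by having an actor emit a continuous proto-action, then evaluating only its k nearest discrete actions with the critic. A minimal sketch under assumed names — the random action embeddings and the linear stand-in critic are illustrative, not the paper's trained networks:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: each cacheable content item is a discrete action
# with an embedding; the critic scores (state, action) pairs.
N_ACTIONS, EMB_DIM, K = 1000, 8, 10
action_embeddings = rng.normal(size=(N_ACTIONS, EMB_DIM))

def critic(state, action_id):
    """Stand-in critic Q(s, a); in practice a trained neural network."""
    return float(state @ action_embeddings[action_id])

def wolpertinger_select(state, proto_action):
    """Wolpertinger action selection: find the k nearest discrete actions
    to the actor's continuous proto-action, then let the critic pick the
    best of those k. Only k candidates are evaluated instead of N_ACTIONS."""
    d = np.linalg.norm(action_embeddings - proto_action, axis=1)
    candidates = np.argpartition(d, K)[:K]   # k nearest actions (unordered)
    return max(candidates, key=lambda a: critic(state, a))

# Usage: the proto-action would come from the actor network.
state = rng.normal(size=EMB_DIM)
proto = rng.normal(size=EMB_DIM)
best = wolpertinger_select(state, proto)
```

Shrinking k trades runtime for decision quality, which is consistent with the abstract's comparison against a full Deep Q-Network that must score every action.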