GAN-powered Deep Distributional Reinforcement Learning for Resource Management in Network Slicing
Network slicing is a key technology in 5G communication systems. Its purpose
is to dynamically and efficiently allocate resources for diversified services
with distinct requirements over a common underlying physical infrastructure.
Therein, demand-aware resource allocation is of significant importance to
network slicing. In this paper, we consider a scenario that contains several
slices in a radio access network with base stations that share the same
physical resources (e.g., bandwidth or slots). We leverage deep reinforcement
learning (DRL) to solve this problem by considering the varying service demands
as the environment state and the allocated resources as the environment action.
To reduce the effects of the randomness and noise embedded in
the received service level agreement (SLA) satisfaction ratio (SSR) and
spectrum efficiency (SE), we first propose a generative adversarial
network-powered deep distributional Q network (GAN-DDQN) to learn the
action-value distribution driven by minimizing the discrepancy between the
estimated action-value distribution and the target action-value distribution.
We put forward a reward-clipping mechanism to stabilize GAN-DDQN training
against the effects of widely spanning utility values. We further
develop Dueling GAN-DDQN, which uses a specially designed dueling generator, to
learn the action-value distribution by estimating the state-value distribution
and the action advantage function. Finally, we verify the performance of the
proposed GAN-DDQN and Dueling GAN-DDQN algorithms through extensive
simulations.
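
The two stabilizing ideas named in the abstract above, reward clipping and a dueling decomposition of the action-value distribution, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names, the clipping bounds, and the mean-subtracted aggregation (the standard dueling form) are not taken from the paper, whose actual generator and clipping rule may differ.

```python
from statistics import mean


def clip_reward(utility, low=-1.0, high=1.0):
    """Clamp a raw utility value into [low, high] (hypothetical bounds)."""
    return max(low, min(high, utility))


def dueling_action_value_samples(state_value_samples, advantages):
    """Combine samples of the state-value distribution with per-action
    advantages, using the mean-subtracted dueling form:
    Q_i(s, a) = V_i(s) + (A(s, a) - mean over actions of A(s, .)).
    Returns one list of samples per action."""
    mean_adv = mean(advantages)
    return [
        [v + (adv - mean_adv) for v in state_value_samples]
        for adv in advantages
    ]


def greedy_action(action_value_samples):
    """Pick the action whose sampled action-value distribution has the
    largest mean."""
    means = [mean(samples) for samples in action_value_samples]
    return max(range(len(means)), key=means.__getitem__)
```

For instance, with state-value samples `[1.0, 3.0]` and advantages `[0.0, 2.0]`, the second action receives the samples `[2.0, 4.0]` and is selected by `greedy_action`.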
Deep Reinforcement Learning for Resource Management in Network Slicing
Network slicing has emerged as a new business for operators, allowing
them to sell customized slices to various tenants at different prices.
Providing better-performing and cost-efficient services through network slicing
involves challenging technical issues and urgently calls for intelligent
innovations that keep resource management consistent with users' activities
per slice. In that regard, deep reinforcement learning (DRL), which focuses on
how to interact with the environment by trying alternative actions and
reinforcing the tendency toward actions that produce more rewarding consequences, is
regarded as a promising solution. In this paper, after briefly reviewing the
fundamental concepts of DRL, we investigate the application of DRL in solving
some typical resource management scenarios for network slicing, which include
radio resource slicing and priority-based core network slicing, and demonstrate
the advantage of DRL over several competing schemes through extensive
simulations. Finally, we also discuss the possible challenges of applying DRL to
network slicing from a general perspective.
Comment: The manuscript has been accepted by IEEE Access in Nov. 201
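
The trial-and-reinforce loop described in the abstract above can be sketched with tabular Q-learning, which is a simplification: the paper uses deep networks, whereas this toy version stores action values in a plain table. The function names, the learning rate `alpha`, and the discount `gamma` are illustrative assumptions.

```python
import random


def epsilon_greedy(q_row, epsilon, rng):
    """With probability epsilon try a random action (exploration);
    otherwise take the action with the highest current estimate."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    return max(range(len(q_row)), key=q_row.__getitem__)


def q_update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """One Q-learning step: move Q(state, action) toward the bootstrapped
    target reward + gamma * max_a' Q(next_state, a')."""
    target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])
    return q[state][action]


# Toy usage: two states, two actions (e.g. two candidate allocations).
rng = random.Random(0)
q_table = [[0.0, 0.0], [1.0, 2.0]]
q_update(q_table, state=0, action=1, reward=1.0, next_state=1)
chosen = epsilon_greedy(q_table[0], epsilon=0.0, rng=rng)
```

Actions that lead to higher rewards accumulate larger table entries and are therefore chosen more often, which is the reinforcing tendency the abstract refers to.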
Learning and Management for Internet-of-Things: Accounting for Adaptivity and Scalability
Internet-of-Things (IoT) envisions an intelligent infrastructure of networked
smart devices offering task-specific monitoring and control services. The
unique features of IoT include extreme heterogeneity, a massive number of
devices, and unpredictable dynamics partially due to human interaction. These
call for foundational innovations in network design and management. Ideally, such
designs should allow efficient adaptation to changing environments, and low-cost
implementation scalable to a massive number of devices, subject to stringent
latency constraints. To this end, the overarching goal of this paper is to
outline a unified framework for online learning and management policies in IoT
through joint advances in communication, networking, learning, and
optimization. From the network architecture vantage point, the unified
framework leverages a promising fog architecture that enables smart devices to
have proximity access to cloud functionalities at the network edge, along the
cloud-to-things continuum. From the algorithmic perspective, key innovations
target online approaches adaptive to different degrees of nonstationarity in
IoT dynamics, and their scalable model-free implementation under limited
feedback that motivates blind or bandit approaches. The proposed framework
aspires to offer a stepping stone that leads to systematic designs and analysis
of task-specific learning and management schemes for IoT, along with a host of
new research directions to build on.
Comment: Submitted on June 15 to Proceeding of IEEE Special Issue on Adaptive
and Scalable Communication Network
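
As a rough illustration of the bandit setting mentioned in the abstract above, the sketch below updates only the estimate of the option that was actually tried, since that is the only feedback observed. The constant step size, which lets estimates track nonstationary rewards, and all names are assumptions for illustration rather than the framework proposed in the paper.

```python
def bandit_update(estimates, arm, reward, step=0.1):
    """Bandit feedback: only the chosen arm's reward is observed, so only
    its estimate moves. A constant step size weights recent rewards more
    heavily, which helps when the reward process drifts over time."""
    estimates[arm] += step * (reward - estimates[arm])
    return estimates


def best_arm(estimates):
    """Index of the arm with the highest current estimate."""
    return max(range(len(estimates)), key=estimates.__getitem__)
```

No model of the environment is needed here, which is what makes such updates cheap enough to run on a large number of devices.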