Neural combinatorial optimization beyond the TSP: Existing architectures under-represent graph structure
Recent years have seen growing interest in combining reinforcement learning with Graph Neural Network (GNN) architectures to learn to solve hard combinatorial optimization problems: given raw input data and an evaluator to guide the process, the idea is to automatically learn a policy that returns feasible, high-quality outputs. Recent works have shown promising results, but these were mainly evaluated on the travelling salesman problem (TSP) and similar abstract variants such as the Split Delivery Vehicle Routing Problem (SDVRP). In this paper, we analyze whether and how recent neural architectures can be applied to graph problems of practical importance. We thus set out to systematically "transfer" these architectures to the Power and Channel Allocation Problem (PCAP), which has practical relevance for, e.g., radio resource allocation in wireless networks. Our experimental results suggest that existing architectures (i) are still incapable of capturing graph structural features and (ii) are not suitable for problems where the actions on the graph change the graph attributes. On a positive note, we show that augmenting the structural representation of problems with Distance Encoding is a promising step toward the still-ambitious goal of learning multi-purpose autonomous solvers.
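The abstract above does not say how Distance Encoding is computed. As a rough illustration only, one common reading is to augment each node's features with its hop distance to a chosen set of target nodes. The sketch below assumes an unweighted graph given as an adjacency dictionary; the function names, the `max_dist` clipping, and the one-hot layout are hypothetical choices for illustration and are not taken from the paper.

```python
from collections import deque


def bfs_distances(adj, source):
    """Return hop distances from `source` to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def distance_encoding(adj, targets, max_dist=3):
    """Build one feature vector per node (hypothetical layout).

    For each target node, the node's hop distance is clipped to `max_dist`
    (unreachable nodes also map to `max_dist`) and one-hot encoded; the
    per-target one-hot vectors are concatenated.
    """
    per_target = [bfs_distances(adj, t) for t in targets]
    features = {}
    for node in adj:
        vec = []
        for dist in per_target:
            d = min(dist.get(node, max_dist), max_dist)
            one_hot = [0] * (max_dist + 1)
            one_hot[d] = 1
            vec.extend(one_hot)
        features[node] = vec
    return features


# Toy path graph 0 - 1 - 2 - 3, encoding distances to nodes 0 and 3.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
enc = distance_encoding(adj, targets=[0, 3])
```

In a GNN pipeline, vectors such as `enc[node]` would typically be concatenated with the original node attributes before message passing, so that nodes with identical local neighbourhoods but different positions relative to the targets receive different inputs.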
EdgeRIC: Empowering Realtime Intelligent Optimization and Control in NextG Networks
Radio Access Networks (RAN) are increasingly softwarized and accessible via
data-collection and control interfaces. RAN intelligent control (RIC) is an
approach to manage these interfaces at different timescales. In this paper, we
develop a RIC platform called RICworld, consisting of (i) EdgeRIC, which is
colocated, but decoupled from the RAN stack, and can access RAN and
application-level information to execute AI-optimized and other policies in
realtime (sub-millisecond) and (ii) DigitalTwin, a full-stack, trace-driven
emulator for training AI-based policies offline. We demonstrate that realtime
EdgeRIC operates as if embedded within the RAN stack and significantly
outperforms a cloud-based near-realtime RIC (> 15 ms latency) in terms of
attained throughput. We train AI-based policies on DigitalTwin, execute them on
EdgeRIC, and show that these policies are robust to channel dynamics, and
outperform queueing-model based policies by 5% to 25% on throughput and
application-level benchmarks in a variety of mobile environments. Comment: 16 pages, 15 figures
RLOps: Development Life-cycle of Reinforcement Learning Aided Open RAN
Radio access network (RAN) technologies continue to witness massive growth,
with Open RAN gaining the most recent momentum. In the O-RAN specifications,
the RAN intelligent controller (RIC) serves as an automation host. This article
introduces principles for machine learning (ML), in particular, reinforcement
learning (RL) relevant for the O-RAN stack. Furthermore, we review
state-of-the-art research in wireless networks and cast it onto the RAN
framework and the hierarchy of the O-RAN architecture. We provide a taxonomy of
the challenges faced by ML/RL models throughout the development life-cycle:
from the system specification to production deployment (data acquisition, model
design, testing and management, etc.). To address these challenges, we integrate
a set of existing MLOps principles with the characteristics unique to RL
agents. This paper discusses a systematic pipeline for life-cycle model
development, testing and validation, termed RLOps. We discuss all fundamental
parts of RLOps, which include: model specification, development and
distillation, production environment serving, operations monitoring,
safety/security and data engineering platform. Based on these principles, we
propose the best practices for RLOps to achieve an automated and reproducible
model development process. Comment: 17 pages, 6 figures
Strategic thinking under social influence: Scalability, stability and robustness of allocations
This paper studies the strategic behavior of a large number of game designers, together with the scalability, stability and robustness of their allocations in a large number of homogeneous coalitional games with transferable utilities (TU). For each TU game, the characteristic function is a continuous-time stochastic process. In each game, a game designer allocates revenues based on the extra reward that a coalition has received up to the current time and the extra reward that the same coalition has received in the other games. The approach is based on the theory of mean-field games with heterogeneous groups in a multi-population regime.