3 research outputs found
Self-play Learning Strategies for Resource Assignment in Open-RAN Networks
Open Radio Access Network (ORAN) is being developed with an aim to
democratise access and lower the cost of future mobile data networks,
supporting network services with various QoS requirements, such as massive IoT
and URLLC. In ORAN, network functionality is disaggregated into remote units
(RUs), distributed units (DUs) and central units (CUs), which allows flexible
software on Commercial-Off-The-Shelf (COTS) deployments. Furthermore, the
mapping of variable RU requirements to local mobile edge computing centres for
future centralized processing would significantly reduce the power consumption
in cellular networks. In this paper, we study the RU-DU resource assignment
problem in an ORAN system, modelled as a 2D bin packing problem. A deep
reinforcement learning-based self-play approach is proposed to achieve
efficient RU-DU resource management, with AlphaGo Zero inspired neural
Monte-Carlo Tree Search (MCTS). Experiments on a representative 2D bin packing
environment and on real site data show that the self-play learning strategy
achieves intelligent RU-DU resource assignment under different network
conditions.
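
To make the RU-DU assignment problem concrete, the following is a minimal sketch of a greedy first-fit-decreasing baseline. It is not the self-play neural MCTS method the abstract proposes. It assumes that "2D" means each RU has demands in two resource dimensions (labelled here, hypothetically, as compute and bandwidth) and that all DUs share the same capacity; the names `Demand` and `first_fit_decreasing` are illustrative only.

```python
from typing import List, Tuple

# Hypothetical two-dimensional demand of one RU: (compute, bandwidth).
Demand = Tuple[float, float]


def first_fit_decreasing(rus: List[Demand], du_capacity: Demand) -> List[List[int]]:
    """Greedy baseline: place each RU on the first DU that still has room
    in both resource dimensions, opening a new DU when none fits."""
    # Handle the largest RUs first (by total demand across both dimensions).
    order = sorted(range(len(rus)), key=lambda i: sum(rus[i]), reverse=True)
    loads: List[List[float]] = []      # current load per DU
    assignment: List[List[int]] = []   # RU indices per DU
    for i in order:
        compute, bandwidth = rus[i]
        if compute > du_capacity[0] or bandwidth > du_capacity[1]:
            raise ValueError(f"RU {i} does not fit on an empty DU")
        for d, load in enumerate(loads):
            fits = (load[0] + compute <= du_capacity[0]
                    and load[1] + bandwidth <= du_capacity[1])
            if fits:
                load[0] += compute
                load[1] += bandwidth
                assignment[d].append(i)
                break
        else:
            # No open DU can host this RU, so open a new one.
            loads.append([compute, bandwidth])
            assignment.append([i])
    return assignment


if __name__ == "__main__":
    demo_rus = [(4, 2), (3, 3), (2, 1), (1, 4)]
    print(first_fit_decreasing(demo_rus, (5, 5)))
```

A heuristic like this gives a reference point for the number of DUs used; a learned policy would be judged by how much it improves on such a baseline.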

First-Order Problem Solving through Neural MCTS based Reinforcement Learning
The formal semantics of an interpreted first-order logic (FOL) statement can
be given in Tarskian Semantics or a basically equivalent Game Semantics. The
latter maps the statement and the interpretation into a two-player semantic
game. Many combinatorial problems can be described using interpreted FOL
statements and can be mapped into a semantic game. Therefore, learning to play
a semantic game perfectly leads to the solution of a specific instance of a
combinatorial problem. We adapt the AlphaZero algorithm so that it becomes
better at learning to play semantic games that have different characteristics
than Go and Chess. We propose a general framework, Persephone, to map the FOL
description of a combinatorial problem to a semantic game so that it can be
solved through a neural MCTS based reinforcement learning algorithm. Our goal
for Persephone is to make it tabula-rasa, mapping a problem stated in
interpreted FOL to a solution without human intervention.
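
The link between an interpreted FOL statement and a two-player semantic game can be sketched as follows. This is not the Persephone framework: it is an exhaustive solver for statements in prenex form over a small finite domain, where the Falsifier moves at each universal quantifier and the Verifier at each existential one. The function name `verifier_wins` and the string encoding of quantifiers are assumptions made for illustration.

```python
from typing import Callable, Sequence, Tuple


def verifier_wins(prefix: Sequence[str],
                  domain: Sequence[int],
                  predicate: Callable[..., bool],
                  chosen: Tuple[int, ...] = ()) -> bool:
    """Return True if the Verifier has a winning strategy in the semantic
    game, which holds exactly when the quantified statement is true.

    prefix    -- quantifiers in order, each "forall" or "exists"
    domain    -- finite set of values every variable ranges over
    predicate -- quantifier-free matrix, called with one value per variable
    """
    if len(chosen) == len(prefix):
        # All variables are bound: the game ends and the matrix decides it.
        return predicate(*chosen)
    quantifier = prefix[len(chosen)]
    if quantifier not in ("forall", "exists"):
        raise ValueError(f"unknown quantifier: {quantifier}")
    outcomes = (verifier_wins(prefix, domain, predicate, chosen + (v,))
                for v in domain)
    # Verifier picks at "exists" (one good move is enough);
    # Falsifier picks at "forall" (every move must be survivable).
    return any(outcomes) if quantifier == "exists" else all(outcomes)


if __name__ == "__main__":
    domain = range(5)
    # forall x exists y: x + y is divisible by 5
    print(verifier_wins(["forall", "exists"], domain, lambda x, y: (x + y) % 5 == 0))
    # exists y forall x: x + y is divisible by 5
    print(verifier_wins(["exists", "forall"], domain, lambda y, x: (x + y) % 5 == 0))
```

Exhaustive search like this scales exponentially in the number of quantifiers, which is the motivation for replacing it with a learned, MCTS-guided player.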

(When) Is Truth-telling Favored in AI Debate?
For some problems, humans may not be able to accurately judge the goodness of
AI-proposed solutions. Irving et al. (2018) propose that in such cases, we may
use a debate between two AI systems to amplify the problem-solving capabilities
of a human judge. We introduce a mathematical framework that can model debates
of this type and propose that the quality of debate designs should be measured
by the accuracy of the most persuasive answer. We describe a simple instance of
the debate framework called feature debate and analyze the degree to which such
debates track the truth. We argue that despite being very simple, feature
debates nonetheless capture many aspects of practical debates such as the
incentives to confuse the judge or stall to prevent losing. We then outline how
these models should be generalized to analyze a wider range of debate
phenomena.
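
As a rough illustration of what a feature-revealing debate might look like, the toy below is a hypothetical simplification and not the paper's formal definition of feature debate. It assumes the world is a vector of features in [0, 1], the question is the mean of those features, debater A reveals the largest hidden feature each round, debater B reveals the smallest, and the judge fills in unrevealed features with a fixed prior mean. The function name and all parameters are illustrative.

```python
from typing import Dict, Sequence


def judge_estimate(world: Sequence[float], rounds: int, prior_mean: float = 0.5) -> float:
    """Judge's estimate of the mean feature value after a toy debate.

    Each round, debater A (arguing "high") reveals the largest hidden
    feature and debater B (arguing "low") reveals the smallest. Hidden
    features are assumed to equal prior_mean.
    """
    hidden = set(range(len(world)))
    revealed: Dict[int, float] = {}
    for _ in range(rounds):
        for pick in (max, min):  # A moves first, then B
            if not hidden:
                break
            i = pick(hidden, key=lambda j: world[j])
            hidden.remove(i)
            revealed[i] = world[i]
    total = sum(revealed.values()) + prior_mean * len(hidden)
    return total / len(world)


if __name__ == "__main__":
    world = [0.9, 0.1, 0.8, 0.2, 0.7, 0.6]
    truth = sum(world) / len(world)
    for r in range(4):
        print(r, abs(judge_estimate(world, r) - truth))
```

In this toy, the judge's error reaches zero once every feature has been revealed; with fewer rounds, the estimate depends on which features the debaters chose to show, which is the kind of effect a measure of debate quality has to capture.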