Beam selection for joint transmission in cell-free massive multiple-input
multiple-output systems suffers from extremely high training overhead and
computational complexity. Traffic-aware quality-of-service requirements further
complicate the beam selection problem. To address these issues, we propose a
traffic-aware hierarchical beam selection scheme that operates on two timescales.
On the long timescale, the central processing unit collects wide-beam responses
from base stations (BSs) to predict the power profile in the narrow beam space
with a convolutional neural network, based on which the cascaded multiple-BS
beam space is carefully pruned. On the short timescale, we introduce a
centralized reinforcement learning (RL) algorithm that selects beams to maximize
the delay satisfaction rate over multiple consecutive time slots.
Moreover, we put forward three distributed algorithms, namely hierarchical
distributed Lyapunov optimization, fully distributed RL, and centralized
training with decentralized execution of RL, to achieve better scalability and
a better tradeoff between performance and execution signaling overhead.
Numerical results demonstrate that the proposed schemes
significantly reduce both model training cost and beam training overhead and
more readily meet user-specific delay requirements than existing methods.

Comment: 13 pages, 11 figures; part of this work has been accepted by the IEEE
International Conference on Wireless Communications and Signal Processing
(WCSP) 202