The design of beamforming for downlink multi-user massive multiple-input
multiple-output (MIMO) systems relies on accurate downlink channel state
information (CSI) at the transmitter (CSIT). In practice, however, it is
difficult for the base station (BS) to obtain perfect CSIT, owing to user
mobility and the feedback delay between CSI acquisition and downlink data
transmission. Hence, robust beamforming under imperfect CSIT is needed. In this paper, considering multiple antennas at
all nodes (base station and user terminals), we develop a multi-agent deep
reinforcement learning (DRL) framework for massive MIMO under imperfect CSIT,
where the transmit and receive beamforming are jointly designed to maximize the
average information rate of all users. Leveraging this DRL-based framework,
interference management is explored, and three DRL-based schemes, namely the
distributed-learning-distributed-processing scheme, the
partial-distributed-learning-distributed-processing scheme, and the
central-learning-distributed-processing scheme, are proposed and analyzed. This
paper \textrm{1)} shows that the DRL-based strategies outperform both a
random-action strategy and the delay-sensitive sample-and-hold (SAH)
strategy, and achieve over 90% of the information rate of two selected
benchmarks at lower complexity, namely zero-forcing channel-inversion (ZF-CI)
with perfect CSIT and the greedy beam selection strategy, and \textrm{2)}
demonstrates the inherent robustness of the proposed designs in the presence
of user mobility.

Comment: submitted for publication