Intelligent Interactive Beam Training for Millimeter Wave Communications
Millimeter wave communications, equipped with large-scale antenna arrays, can provide Gbps data rates by exploiting abundant spectrum resources. However, the use of a large number of antennas with narrow beams incurs a large overhead in obtaining channel state information (CSI) via beam training, especially for fast-changing channels. To reduce beam training overhead, in this paper we develop an interactive learning design paradigm (ILDP) that makes full use of the domain knowledge of wireless communications (WCs) and the adaptive learning ability of machine learning (ML). Specifically, the ILDP is realized via deep reinforcement learning (DRL), yielding DRL-ILDP, and consists of a communication model (CM) module and an adaptive learning (AL) module that work in an interactive manner. We then exploit the DRL-ILDP to design efficient beam training algorithms for both multi-user and user-centric cooperative communications. The proposed DRL-ILDP-based algorithms offer three advantages. First, the ILDP takes full advantage of existing WC models and methods. Second, the ILDP integrates powerful ML elements, which facilitates extracting the relevant statistical and probabilistic information from the environment. Third, via the interaction between the CM and AL modules, the algorithms can collect samples and extract information in real time and thus adapt to ever-changing environments. Simulation results demonstrate the effectiveness and superiority of the designed algorithms.
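The interactive probe-and-adapt idea behind learned beam training can be sketched, in a much-simplified form, as a multi-armed-bandit loop over a beam codebook. This toy is a stand-in for the abstract's DRL-ILDP, not the paper's algorithm; the codebook size, SNR values, and epsilon-greedy policy are all illustrative assumptions:

```python
import numpy as np

def beam_training(snr_means, episodes=2000, eps=0.1, seed=0):
    """Epsilon-greedy beam selection over a fixed codebook.

    snr_means: hypothetical mean SNR (reward) per candidate beam.
    Returns the running reward estimate (Q-value) per beam.
    """
    rng = np.random.default_rng(seed)
    n_beams = len(snr_means)
    q = np.zeros(n_beams)       # reward estimate per beam
    counts = np.zeros(n_beams)  # times each beam was probed
    for _ in range(episodes):
        if rng.random() < eps:            # explore: probe a random beam
            b = int(rng.integers(n_beams))
        else:                             # exploit: current best estimate
            b = int(np.argmax(q))
        reward = snr_means[b] + rng.normal(0, 0.5)  # noisy SNR measurement
        counts[b] += 1
        q[b] += (reward - q[b]) / counts[b]         # incremental mean update
    return q

# Toy codebook of 8 beams; beam 5 has the strongest mean SNR.
means = np.array([1.0, 2.0, 1.5, 3.0, 2.5, 6.0, 2.0, 1.0])
q = beam_training(means)
print(int(np.argmax(q)))  # expected to recover beam 5
```

The point of the sketch is the interaction loop: the "CM side" supplies the codebook and the SNR measurement, while the "AL side" updates its estimates from each probe, mirroring the CM/AL interplay the abstract describes.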
Emergent Quantized Communication
The field of emergent communication aims to understand the characteristics of
communication as it emerges from artificial agents solving tasks that require
information exchange. Communication with discrete messages is considered a
desired characteristic, for both scientific and applied reasons. However,
training a multi-agent system with discrete communication is not
straightforward, requiring either reinforcement learning algorithms or relaxing
the discreteness requirement via a continuous approximation such as the
Gumbel-softmax. Both these solutions result in poor performance compared to
fully continuous communication. In this work, we propose an alternative
approach to achieve discrete communication -- quantization of communicated
messages. Using message quantization allows us to train the model end-to-end,
achieving superior performance in multiple setups. Moreover, quantization is a
natural framework that runs the gamut from continuous to discrete
communication. Thus, it sets the ground for a broader view of multi-agent
communication in the deep learning era.
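The quantization idea can be illustrated with a minimal NumPy sketch: round each continuous message coordinate to the nearest point of a uniform codebook. In actual end-to-end training this forward rounding would be paired with a straight-through estimator in the backward pass; the function name and codebook here are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def quantize(x, levels=4):
    """Uniformly quantize values in [0, 1] to `levels` discrete points.

    During training, the backward pass would use a straight-through
    estimator (gradient of the identity) so the model remains
    end-to-end differentiable; only the forward pass is shown here.
    """
    x = np.clip(x, 0.0, 1.0)
    step = 1.0 / (levels - 1)
    return np.round(x / step) * step

msg = np.array([0.05, 0.3, 0.62, 0.95])
print(quantize(msg, levels=4))  # snapped to the codebook {0, 1/3, 2/3, 1}
```

Varying `levels` is what "runs the gamut": with many levels the channel is nearly continuous, while with two levels the messages become binary.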
Optimal Complexity in Non-Convex Decentralized Learning over Time-Varying Networks
Decentralized optimization with time-varying networks is an emerging paradigm
in machine learning. It substantially reduces communication overhead in large-scale
deep training and is more robust in wireless scenarios, especially when nodes
are moving. Federated learning can also be regarded as decentralized
optimization with time-varying communication patterns alternating between
global averaging and local updates.
While numerous studies exist to clarify its theoretical limits and develop
efficient algorithms, it remains unclear what the optimal complexity is for
non-convex decentralized stochastic optimization over time-varying networks.
The main difficulties lie in how to gauge the effectiveness when transmitting
messages between two nodes via time-varying communications, and how to
establish the lower bound when the network size is fixed (which is a
prerequisite in stochastic optimization). This paper resolves these challenges
and establishes the first lower bound on complexity. We also develop a new
decentralized algorithm that nearly attains the lower bound, showing both the
tightness of the lower bound and the optimality of our algorithm.
Comment: Accepted by the 14th Annual Workshop on Optimization for Machine
Learning. arXiv admin note: text overlap with arXiv:2210.0786
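The core primitive in this setting can be sketched as gossip averaging over a time-varying topology: each node repeatedly mixes its local value with a neighbor that changes every round, and all nodes converge to the global average with no central server. This is an illustrative toy, not the paper's optimal algorithm, and the shifting-ring schedule is an assumed example of a time-varying network:

```python
import numpy as np

def gossip(x0, rounds=200):
    """Decentralized averaging over a time-varying ring.

    In round t, node i averages with neighbor (i + 1 + t) % n, so the
    communication graph changes every round. Each round's mixing matrix
    is doubly stochastic, which preserves the global mean while driving
    all local values toward consensus.
    """
    x = np.array(x0, dtype=float)
    n = len(x)
    for t in range(rounds):
        y = x.copy()
        for i in range(n):
            j = (i + 1 + t) % n          # time-varying neighbor
            y[i] = 0.5 * (x[i] + x[j])   # pairwise mixing step
        x = y
    return x

vals = gossip([1.0, 5.0, 3.0, 7.0])
print(vals, vals.mean())  # all entries approach the global mean 4.0
```

A full decentralized SGD method would interleave such mixing rounds with local stochastic gradient steps; the lower bound in question concerns how many of each are unavoidable.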
Distributed Training Large-Scale Deep Architectures
Scale of data and scale of computation infrastructures together enable the
current deep learning renaissance. However, training large-scale deep
architectures demands both algorithmic improvement and careful system
configuration. In this paper, we focus on employing the system approach to
speed up large-scale training. Via lessons learned from our routine
benchmarking effort, we first identify bottlenecks and overheads that hinder
data parallelism. We then devise guidelines that help practitioners to
configure an effective system and fine-tune parameters to achieve desired
speedup. Specifically, we develop a procedure for setting minibatch size and
choosing computation algorithms. We also derive lemmas for determining the
quantity of key components such as the number of GPUs and parameter servers.
Experiments and examples show that these guidelines help effectively speed up
large-scale deep learning training.
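One widely used rule of thumb related to setting the minibatch size in data parallelism is the linear learning-rate scaling heuristic. The sketch below illustrates that common practice, not the specific lemmas derived in the paper; the function name and baseline values are hypothetical:

```python
def scaled_lr(base_lr, base_batch, n_gpus, per_gpu_batch):
    """Linear-scaling heuristic: when the global minibatch grows by a
    factor k, scale the learning rate by k as well. This is a common
    data-parallel rule of thumb, offered here only as an illustration.
    """
    global_batch = n_gpus * per_gpu_batch
    return base_lr * global_batch / base_batch

# e.g. moving from 1 GPU x 256 samples to 8 GPUs x 256 samples
print(scaled_lr(0.1, 256, 8, 256))  # scales the learning rate 8x, to ~0.8
```

Heuristics like this are exactly the kind of knob such guidelines help practitioners tune jointly with the number of GPUs and parameter servers.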
- …