Boosting Learning for LDPC Codes to Improve the Error-Floor Performance
Low-density parity-check (LDPC) codes have been successfully commercialized
in communication systems due to their strong error correction capabilities and
simple decoding process. However, the error-floor phenomenon of LDPC codes, in
which the error rate stops decreasing rapidly at a certain level, presents
challenges for achieving extremely low error rates and deploying LDPC codes in
scenarios demanding ultra-high reliability. In this work, we propose training
methods for neural min-sum (NMS) decoders to eliminate the error-floor effect.
First, by leveraging the boosting learning technique of ensemble networks, we
divide the decoding network into two neural decoders and train the post decoder
to specialize in the words that the first decoder fails to correct. Second, to
address the vanishing gradient issue in training, we
introduce a block-wise training schedule that locally trains a block of weights
while retraining the preceding block. Lastly, we show that assigning different
weights to unsatisfied check nodes effectively lowers the error-floor with a
minimal number of weights. By applying these training methods to standard LDPC
codes, we achieve better error-floor performance than other decoding methods.
The proposed NMS decoder, optimized solely through novel training methods
without additional modules, can be integrated into existing LDPC decoders
without incurring extra hardware costs. The source code is available at
https://github.com/ghy1228/LDPC_Error_Floor.
Comment: 17 pages, 10 figures
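To make the boosting-learning step concrete, the following PyTorch sketch trains a post decoder only on the words a frozen first decoder fails to correct. The NMSDecoder module, the all-zero-codeword toy data, and all hyperparameters are hypothetical stand-ins for illustration, not the paper's implementation.

    import torch
    import torch.nn as nn

    class NMSDecoder(nn.Module):
        """Toy stand-in for a neural min-sum decoder block with learnable weights."""
        def __init__(self, n_vars: int):
            super().__init__()
            self.layer = nn.Linear(n_vars, n_vars)  # placeholder for weighted NMS iterations

        def forward(self, llr: torch.Tensor) -> torch.Tensor:
            return self.layer(llr)  # returns updated soft values (LLRs)

    def hard_decision(soft: torch.Tensor) -> torch.Tensor:
        return (soft < 0).float()  # bit = 1 when the LLR is negative

    n_vars, batch = 32, 64
    decoder1, decoder2 = NMSDecoder(n_vars), NMSDecoder(n_vars)  # base and post decoders
    opt = torch.optim.Adam(decoder2.parameters(), lr=1e-3)
    loss_fn = nn.BCEWithLogitsLoss()

    for step in range(100):
        # Hypothetical data: all-zero codeword with Gaussian noise on the channel LLRs.
        tx_bits = torch.zeros(batch, n_vars)
        llr = 2.0 + torch.randn(batch, n_vars)  # positive mean encodes bit 0

        # Stage 1: run the frozen, pre-trained first decoder.
        with torch.no_grad():
            soft1 = decoder1(llr)
            failed = (hard_decision(soft1) != tx_bits).any(dim=1)
        if not failed.any():
            continue

        # Stage 2 (boosting): train the post decoder only on uncorrected words.
        soft2 = decoder2(soft1[failed])
        loss = loss_fn(-soft2, tx_bits[failed])  # P(bit=1) = sigmoid(-LLR)
        opt.zero_grad()
        loss.backward()
        opt.step()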
Online Meta-Learning For Hybrid Model-Based Deep Receivers
Recent years have witnessed growing interest in the application of deep
neural networks (DNNs) for receiver design, which can potentially be applied in
complex environments without relying on knowledge of the channel model.
However, the dynamic nature of communication channels often leads to rapid
distribution shifts, which may require periodic retraining. This paper
formulates a data-efficient two-stage training method that facilitates rapid
online adaptation. Our training mechanism uses a predictive meta-learning
scheme to train rapidly from data corresponding to both current and past
channel realizations. Our method is applicable to any DNN-based receiver and
does not require transmission of new pilot data for
training. To illustrate the proposed approach, we study DNN-aided receivers
that utilize an interpretable model-based architecture, and introduce a modular
training strategy based on predictive meta-learning. We demonstrate our
techniques in simulations on a synthetic linear channel, a synthetic non-linear
channel, and a COST 2100 channel. Our results demonstrate that the proposed
online training scheme allows receivers to outperform previous techniques based
on self-supervision and joint learning by a margin of up to 2.5 dB in coded bit
error rate in rapidly varying scenarios.
Comment: arXiv admin note: text overlap with arXiv:2103.1348
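To illustrate the predictive meta-learning mechanism, the sketch below performs a first-order MAML-style update: the receiver is adapted on data from channel realization t, and the meta-parameters are updated so that the adapted weights perform well on realization t+1. The receiver architecture, the drifting-channel data generator, and the hyperparameters are hypothetical placeholders, not the paper's model.

    import copy
    import torch
    import torch.nn as nn

    receiver = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 2))
    meta_opt = torch.optim.Adam(receiver.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    inner_lr = 1e-2

    def channel_batch(t: int, n: int = 64):
        """Hypothetical stand-in for (received symbols, transmitted bits) at time t."""
        x = torch.randn(n, 8) * (1.0 + 0.1 * t)  # drifting channel statistics
        y = torch.randint(0, 2, (n,))
        return x, y

    for t in range(10):  # stream of past channel realizations
        support_x, support_y = channel_batch(t)      # adapt on realization t...
        query_x, query_y = channel_batch(t + 1)      # ...to do well on t+1 (predictive)

        # Inner loop: one SGD step on a copy of the receiver (first-order MAML).
        fast = copy.deepcopy(receiver)
        inner_loss = loss_fn(fast(support_x), support_y)
        grads = torch.autograd.grad(inner_loss, fast.parameters())
        with torch.no_grad():
            for p, g in zip(fast.parameters(), grads):
                p -= inner_lr * g

        # Outer loop: evaluate the adapted weights on the *next* realization and
        # copy the resulting gradients back onto the meta-parameters.
        outer_loss = loss_fn(fast(query_x), query_y)
        outer_grads = torch.autograd.grad(outer_loss, fast.parameters())
        meta_opt.zero_grad()
        for p, g in zip(receiver.parameters(), outer_grads):
            p.grad = g.clone()
        meta_opt.step()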
Example-based Hypernetworks for Out-of-Distribution Generalization
As Natural Language Processing (NLP) algorithms continually achieve new
milestones, out-of-distribution generalization remains a significant challenge.
This paper addresses multi-source adaptation for unfamiliar domains: we
leverage labeled data from multiple source domains to generalize to target
domains that are unknown at training time. Our framework employs
example-based Hypernetwork adaptation: a T5 encoder-decoder initially generates
a unique signature from an input example, embedding it within the source
domains' semantic space. This signature is subsequently utilized by a
Hypernetwork to generate the task classifier's weights. We evaluate our method
on two tasks, sentiment classification and natural language inference, across
29 adaptation scenarios, where it outperforms established algorithms. In an
advanced version, the signature also enriches the input example's
representation. We also compare our finetuned architecture to few-shot GPT-3,
demonstrating its effectiveness in essential use cases. To our knowledge, this
is the first application of Hypernetworks to adaptation for unknown domains.
Comment: First two authors contributed equally to this work. Our code and data
are available at: https://github.com/TomerVolk/Hyper-PAD
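Finally, a minimal sketch of the example-based Hypernetwork idea: an encoder maps each input example to a signature, and a hypernetwork maps that signature to the weights of a per-example linear classifier. The mean-pooled embedding encoder and every dimension below are toy stand-ins for the paper's T5-based architecture.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class ExampleHypernet(nn.Module):
        def __init__(self, vocab=1000, d_model=64, d_sig=16, n_classes=3):
            super().__init__()
            self.embed = nn.Embedding(vocab, d_model)      # toy encoder (stands in for T5)
            self.to_signature = nn.Linear(d_model, d_sig)  # per-example signature
            # Hypernetwork emits the classifier's weight matrix and bias.
            self.hyper = nn.Linear(d_sig, n_classes * d_model + n_classes)
            self.d_model, self.n_classes = d_model, n_classes

        def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
            h = self.embed(token_ids).mean(dim=1)          # (batch, d_model)
            sig = torch.tanh(self.to_signature(h))         # (batch, d_sig)
            params = self.hyper(sig)                       # per-example classifier weights
            w = params[:, : self.n_classes * self.d_model]
            b = params[:, self.n_classes * self.d_model :]
            w = w.view(-1, self.n_classes, self.d_model)
            # Apply each example's generated classifier to its own representation.
            return torch.einsum("bd,bcd->bc", h, w) + b

    model = ExampleHypernet()
    tokens = torch.randint(0, 1000, (4, 12))               # hypothetical token batch
    logits = model(tokens)                                 # (4, 3) class scores
    loss = F.cross_entropy(logits, torch.tensor([0, 1, 2, 0]))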