LEARN Codes: Inventing Low-latency Codes via Recurrent Neural Networks
Designing channel codes under low-latency constraints is one of the most
demanding requirements in 5G standards. However, a sharp characterization of
the performance of traditional codes is available only in the large
block-length limit. Guided by such asymptotic analysis, traditional code
designs require large block lengths, and hence high latency, to achieve the
desired error rate.
Tail-biting convolutional codes and other recent state-of-the-art short block
codes, while promising reduced latency, are neither robust to channel-mismatch
nor adaptive to varying channel conditions. When the codes designed for one
channel (e.g.,~Additive White Gaussian Noise (AWGN) channel) are used for
another (e.g.,~non-AWGN channels), heuristics are necessary to achieve
non-trivial performance.
In this paper, we first propose an end-to-end learned neural code, obtained
by jointly designing a Recurrent Neural Network (RNN) based encoder and
decoder. This code outperforms the canonical convolutional code in the
block-coding setting. We then leverage this experience to propose a new class of codes
under low-latency constraints, which we call Low-latency Efficient Adaptive
Robust Neural (LEARN) codes. These codes outperform state-of-the-art
low-latency codes and exhibit robustness and adaptivity properties. LEARN codes
show the potential to design new versatile and universal codes for future
communications via tools of modern deep learning coupled with communication
engineering insights.
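As a point of reference for the channel model above, the AWGN channel over which such codes are evaluated can be simulated in a few lines. This is a minimal sketch using BPSK modulation and hard-decision detection; the function names are illustrative, not from the paper:

```python
import numpy as np

def awgn_channel(bits, snr_db, rng):
    """Send BPSK symbols (0 -> +1, 1 -> -1) through an AWGN channel."""
    x = 1.0 - 2.0 * bits
    sigma = 10.0 ** (-snr_db / 20.0)  # noise std for the given SNR in dB
    return x + sigma * rng.standard_normal(x.shape)

rng = np.random.default_rng(0)
bits = rng.integers(0, 2, size=100_000)
y = awgn_channel(bits, snr_db=6.0, rng=rng)
decoded = (y < 0).astype(int)   # hard-decision demodulation
ber = np.mean(decoded != bits)  # uncoded bit error rate
```

Replacing the Gaussian noise with, e.g., bursty or heavy-tailed noise is precisely the channel-mismatch scenario the abstract refers to.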
MIST: A Novel Training Strategy for Low-latency Scalable Neural Net Decoders
In this paper, we propose a low-latency, robust, and scalable neural-net-based
decoder for convolutional and low-density parity-check (LDPC) coding schemes.
The proposed decoders are demonstrated to have bit error rate (BER) and block
error rate (BLER) performance on par with state-of-the-art neural-net-based
decoders while achieving more than 8 times higher decoding speed. The
enhanced decoding speed is due to the use of a convolutional neural network
(CNN), as opposed to the recurrent neural network (RNN) used in the best-known
neural-net-based decoders. This contradicts the existing doctrine that only
RNN-based decoders can provide performance close to optimal. The key
ingredient of our
approach is a novel Mixed-SNR Independent Samples based Training (MIST), which
allows for training of the CNN with only 1% of the possible datawords, even for block
length as high as 1000. The proposed decoder is robust as, once trained, the
same decoder can be used for a wide range of SNR values. Finally, in the
presence of channel outages, the proposed decoders outperform the best-known
decoders, viz. the unquantized Viterbi decoder for convolutional codes and
belief propagation for LDPC codes. This gives the CNN decoder a significant advantage
in 5G millimeter-wave systems, where channel outages are prevalent.
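The core of the MIST idea, as described, is that each training sample pairs an independently drawn dataword with an independently drawn SNR. A minimal sketch of such a batch generator (the names and the BPSK/AWGN setup are illustrative assumptions, not the paper's code):

```python
import numpy as np

def mist_batch(batch_size, block_len, snr_range_db, rng):
    """Mixed-SNR, independent-samples batch: each row is a fresh random
    dataword observed at its own randomly drawn SNR."""
    bits = rng.integers(0, 2, size=(batch_size, block_len))
    snr_db = rng.uniform(snr_range_db[0], snr_range_db[1],
                         size=(batch_size, 1))
    x = 1.0 - 2.0 * bits                      # BPSK modulation
    sigma = 10.0 ** (-snr_db / 20.0)          # per-sample noise level
    y = x + sigma * rng.standard_normal(x.shape)
    return y, bits                            # noisy input, target labels

rng = np.random.default_rng(1)
y, bits = mist_batch(32, 1000, (0.0, 8.0), rng)
```

Because every sample is drawn independently, a decoder trained this way sees only a vanishing fraction of the possible datawords at block length 1000, which is the regime the abstract describes.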
MMFNet: A Multi-modality MRI Fusion Network for Segmentation of Nasopharyngeal Carcinoma
Segmentation of nasopharyngeal carcinoma (NPC) from Magnetic Resonance Images
(MRI) is a crucial prerequisite for NPC radiotherapy. However, manual
segmentation of NPC is time-consuming and labor-intensive. Additionally,
single-modality MRI generally cannot provide enough information for its
accurate delineation. Therefore, a multi-modality MRI fusion network (MMFNet)
based on three modalities of MRI (T1, T2 and contrast-enhanced T1) is proposed
to complete accurate segmentation of NPC. The backbone of MMFNet is designed as
a multi-encoder-based network, consisting of several encoders to capture
modality-specific features and one single decoder to fuse them and obtain
high-level features for NPC segmentation. A fusion block is presented to
effectively fuse features from multi-modality MRI. It first recalibrates the
low-level features captured by the modality-specific encoders to highlight
both informative features and regions of interest, and then fuses the weighted
features through a residual fusion block to keep a balance between the fused
features and the high-level features from the decoder. Moreover, a training
strategy named self-transfer, which utilizes pre-trained modality-specific
encoders to initialize the multi-encoder-based network, is proposed to fully
mine the information from the different modalities of MRI. The proposed method
based on multi-modality MRI can effectively segment NPC, and its advantages
are validated by extensive experiments.
Comment: 34 pages, 12 figures
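The recalibrate-then-fuse step described above can be sketched as follows. This is a simplified stand-in, assuming SE-style channel attention (global average pooling plus a sigmoid gate) and summation with a residual term; the actual fusion block is learned and more elaborate:

```python
import numpy as np

def recalibrate(feat):
    """Channel-wise gating: weight each channel by a sigmoid of its global
    average, highlighting informative channels (an SE-style sketch)."""
    gate = 1.0 / (1.0 + np.exp(-feat.mean(axis=(0, 1), keepdims=True)))
    return feat * gate

def fuse(modality_feats):
    """Recalibrate each modality's low-level features, sum them, and add a
    residual of the raw average to keep a balance with unweighted features."""
    cal = [recalibrate(f) for f in modality_feats]
    return np.sum(cal, axis=0) + np.mean(modality_feats, axis=0)

rng = np.random.default_rng(2)
t1, t2, t1c = (rng.standard_normal((16, 16, 8)) for _ in range(3))  # H, W, C
fused = fuse([t1, t2, t1c])  # one feature map combining all three modalities
```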
ESFNet: Efficient Network for Building Extraction from High-Resolution Aerial Images
Building footprint extraction from high-resolution aerial images is always an
essential part of urban dynamic monitoring, planning and management. It has
also been a challenging task in remote sensing research. In recent years, deep
neural networks have made great achievement in improving accuracy of building
extraction from remote sensing imagery. However, most existing approaches
require a large number of parameters and floating-point operations to reach
high accuracy, which leads to high memory consumption and low inference speed.
In this paper, we propose a novel efficient network named ESFNet, which
employs a separable factorized residual block and dilated convolutions,
trading a slight loss in accuracy for low computational cost and memory
consumption. Our ESFNet obtains a better trade-off between accuracy and
efficiency: it can run at over 100 FPS on a single Tesla V100, and requires 6x
fewer FLOPs and 18x fewer parameters than the
state-of-the-art real-time architecture ERFNet while preserving similar
accuracy without any additional context module, post-processing and pre-trained
scheme. We evaluated our networks on WHU Building Dataset and compared it with
other state-of-the-art architectures. The results and comprehensive analysis
show that our network benefits efficient remote sensing research, and the idea
can be further extended to other areas. The code is publicly available at
https://github.com/mrluin/ESFNet-Pytorch
Comment: 10 pages, 3 figures, 4 tables. Accepted for IEEE Access
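The parameter savings from a separable factorized residual block can be checked with simple arithmetic. This is a back-of-the-envelope sketch, assuming one plausible factorization (two 1-D depthwise filters plus a pointwise convolution); the exact block layout in ESFNet may differ:

```python
def standard_conv_params(c_in, c_out, k=3):
    """Parameters of a standard k x k convolution (bias omitted)."""
    return c_in * c_out * k * k

def separable_factorized_params(c, k=3):
    """Depthwise k x 1 and 1 x k factorized convolutions followed by a
    1 x 1 pointwise convolution (bias omitted)."""
    depthwise = c * k + c * k   # two 1-D depthwise filters per channel
    pointwise = c * c           # 1 x 1 convolution mixes channels
    return depthwise + pointwise

std = standard_conv_params(128, 128)    # 147,456 parameters
eff = separable_factorized_params(128)  #  17,152 parameters
ratio = std / eff                       # roughly 8.6x fewer parameters
```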
Deepcode: Feedback Codes via Deep Learning
The design of codes for communicating reliably over a statistically
well-defined channel is an important endeavor involving deep mathematical research
and wide-ranging practical applications. In this work, we present the first
family of codes obtained via deep learning, which significantly beats
state-of-the-art codes designed over several decades of research. The
communication channel under consideration is the Gaussian noise channel with
feedback, whose study was initiated by Shannon; feedback is known theoretically
to improve reliability of communication, but no practical codes that do so have
ever been successfully constructed.
We break this logjam by integrating information theoretic insights
harmoniously with recurrent-neural-network based encoders and decoders to
create novel codes that outperform known codes by 3 orders of magnitude in
reliability. We also demonstrate several desirable properties of the codes: (a)
generalization to larger block lengths, (b) composability with known codes, (c)
adaptation to practical constraints. This result also has broader ramifications
for coding theory: even when the channel has a clear mathematical model, deep
learning methodologies, when combined with channel-specific
information-theoretic insights, can potentially beat state-of-the-art codes
constructed over decades of mathematical research.
Comment: 24 pages, 20 figures
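To see why feedback helps, consider this toy two-use scheme with noiseless feedback (a classical Schalkwijk-Kailath-style illustration, not Deepcode's learned encoder): after the first channel use, the encoder learns exactly what the receiver saw and spends the second use cancelling the observed noise.

```python
import numpy as np

rng = np.random.default_rng(3)
sigma = 0.8                  # channel noise level
bit = 1
x = 1.0 - 2.0 * bit          # BPSK symbol for the bit

n1 = sigma * rng.standard_normal()
y1 = x + n1                  # first channel use
# Noiseless feedback: the encoder observes y1 and hence knows n1 = y1 - x.
n2 = sigma * rng.standard_normal()
y2 = -n1 + n2                # second use transmits a correction
combined = y1 + y2           # equals x + n2: the first noise term cancels
decoded = int(combined < 0)
```

The combined observation carries only one noise term instead of two, so reliability improves with every feedback round; Deepcode learns a far more sophisticated version of this exchange.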
RedNet: Residual Encoder-Decoder Network for indoor RGB-D Semantic Segmentation
Indoor semantic segmentation has always been a difficult task in computer
vision. In this paper, we propose an RGB-D residual encoder-decoder
architecture, named RedNet, for indoor RGB-D semantic segmentation. In RedNet,
the residual module is applied to both the encoder and decoder as the basic
building block, and skip connections are used to pass spatial features from
the encoder to the decoder. In order to incorporate the depth information
of the scene, a fusion structure is constructed, which makes inference on RGB
image and depth image separately, and fuses their features over several layers.
In order to efficiently optimize the network's parameters, we propose a
`pyramid supervision' training scheme, which applies supervised learning over
different layers in the decoder to cope with the problem of vanishing
gradients. Experimental results show that the proposed RedNet (ResNet-50)
achieves a state-of-the-art mIoU of 47.8% on the SUN RGB-D benchmark dataset.
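The pyramid supervision scheme can be sketched as a loss averaged over decoder outputs at several resolutions, each compared against a correspondingly downsampled label map. This is a simplified numpy version under assumed shapes; the actual training uses learned side outputs:

```python
import numpy as np

def pyramid_supervision_loss(preds, label):
    """preds: list of (H_i, W_i, C) class-score maps from different decoder
    layers; label: (H, W) integer ground truth. Each prediction is supervised
    against a nearest-neighbour downsampled copy of the label."""
    total = 0.0
    for p in preds:
        h, w, _ = p.shape
        rows = np.arange(h) * label.shape[0] // h   # nearest-neighbour
        cols = np.arange(w) * label.shape[1] // w   # downsampling indices
        lab = label[rows][:, cols]
        # per-pixel softmax cross-entropy against the downsampled label
        e = np.exp(p - p.max(axis=-1, keepdims=True))
        logp = np.log(e / e.sum(axis=-1, keepdims=True))
        total += -np.mean(np.take_along_axis(logp, lab[..., None], axis=-1))
    return total / len(preds)

rng = np.random.default_rng(4)
label = rng.integers(0, 5, size=(32, 32))
preds = [rng.standard_normal((s, s, 5)) for s in (8, 16, 32)]
loss = pyramid_supervision_loss(preds, label)
```

Supervising the intermediate layers directly gives every decoder stage its own gradient signal, which is how the scheme mitigates vanishing gradients.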
Label Refinement Network for Coarse-to-Fine Semantic Segmentation
We consider the problem of semantic image segmentation using deep
convolutional neural networks. We propose a novel network architecture called
the label refinement network that predicts segmentation labels in a
coarse-to-fine fashion at several resolutions. The segmentation labels at a
coarse resolution are used together with convolutional features to obtain finer
resolution segmentation labels. We define loss functions at several stages in
the network to provide supervision at different stages. Our experimental
results on several standard datasets demonstrate that the proposed model
provides an effective way of producing pixel-wise dense image labeling.
Comment: 9 pages
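The coarse-to-fine refinement step can be sketched as: upsample the coarse label scores, then combine them with convolutional features at the finer resolution to produce refined scores. The linear refinement head below is a hypothetical stand-in for the paper's learned layers:

```python
import numpy as np

def refine(coarse_scores, fine_features, w):
    """coarse_scores: (H, W, C) class scores at the coarse resolution;
    fine_features: (2H, 2W, F) conv features at the finer resolution;
    w: (F, C) hypothetical linear head mapping features to score updates."""
    up = coarse_scores.repeat(2, axis=0).repeat(2, axis=1)  # nearest upsample
    return up + fine_features @ w   # fuse features with upsampled labels

rng = np.random.default_rng(5)
coarse = rng.standard_normal((8, 8, 4))        # coarse-stage predictions
feats = rng.standard_normal((16, 16, 32))      # finer-resolution features
w = 0.1 * rng.standard_normal((32, 4))
refined = refine(coarse, feats, w)             # (16, 16, 4) refined scores
```

Chaining this step over several resolutions, with a loss at each stage, yields the coarse-to-fine prediction cascade the abstract describes.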
Deep Convolutional Framelets: A General Deep Learning Framework for Inverse Problems
Recently, deep learning approaches with various network architectures have
achieved significant performance improvement over existing iterative
reconstruction methods in various imaging problems. However, it is still
unclear why these deep learning architectures work for specific inverse
problems. To address this issue, here we show that the long-sought missing
link is convolution framelets, which represent a signal by convolving local
and non-local bases. Convolution framelets were originally developed to
generalize the theory of low-rank Hankel matrix approaches for inverse
problems, and this paper further extends the idea so that we can obtain a deep
neural network using multilayer convolution framelets with perfect
reconstruction (PR) under the rectified linear unit (ReLU) nonlinearity. Our
analysis also shows that the popular deep network components such as residual
block, redundant filter channels, and concatenated ReLU (CReLU) do indeed help
to achieve the PR, while the pooling and unpooling layers should be augmented
with high-pass branches to meet the PR condition. Moreover, by changing the
number of filter channels and bias, we can control the shrinkage behaviors of
the neural network. This discovery leads us to propose a novel theory for deep
convolutional framelets neural network. Using numerical experiments with
various inverse problems, we demonstrate that our deep convolutional framelets
network shows consistent improvement over existing deep architectures. This
discovery suggests that the success of deep learning comes not from some
magical power of a black box, but from the power of a novel signal
representation that combines a non-local basis with a data-driven local basis,
which is indeed a natural extension of classical signal processing theory.
Comment: This will appear in SIAM Journal on Imaging Sciences
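The key identity behind this framework is easy to state. In a sketch of the usual framelet convention (the paper's exact notation may differ): with $\mathbb{H}_d(f)$ the Hankel matrix of the signal $f$, a non-local basis $\Phi$, and a local basis $\Psi$, the framelet coefficients are

```latex
C = \Phi^{\top} \mathbb{H}_d(f)\, \Psi ,
```

and perfect reconstruction holds whenever both bases form tight frames:

```latex
\Phi\Phi^{\top} = I, \quad \Psi\Psi^{\top} = I
\;\Longrightarrow\;
\mathbb{H}_d(f) = \Phi\, C\, \Psi^{\top} .
```

The paper's contribution is showing that cascading this decomposition through ReLU nonlinearities can still satisfy the PR condition, provided components such as residual blocks, redundant channels, and high-pass branches are in place.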
DeepTurbo: Deep Turbo Decoder
Present-day communication systems routinely use codes that approach the
channel capacity when coupled with a computationally efficient decoder.
However, the decoder is typically designed for the Gaussian noise channel and
is known to be sub-optimal for non-Gaussian noise distribution. Deep learning
methods offer a new approach for designing decoders that can be trained and
tailored for arbitrary channel statistics. We focus on Turbo codes and propose
DeepTurbo, a novel deep learning based architecture for Turbo decoding.
The standard Turbo decoder (Turbo) iteratively applies the
Bahl-Cocke-Jelinek-Raviv (BCJR) algorithm with an interleaver in the middle. A
neural architecture for Turbo decoding, termed NeuralBCJR, was proposed
recently. There, the key idea is to create a module that imitates the BCJR
algorithm using supervised learning, and to use the interleaver architecture
along with this module, which is then fine-tuned using end-to-end training.
However, knowledge of the BCJR algorithm is required to design such an
architecture, which also constrains the resulting learned decoder. Here we
remedy this requirement and propose a fully end-to-end trained neural decoder,
the Deep Turbo Decoder (DeepTurbo). With a novel learnable decoder structure
and training methodology, DeepTurbo achieves superior performance under both
AWGN and non-AWGN settings compared to the other two decoders, Turbo and
NeuralBCJR. Furthermore, among all three, DeepTurbo exhibits the lowest error
floor.
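The skeleton shared by all three decoders discussed here is the iterative exchange of extrinsic log-likelihood ratios (LLRs) between two soft-in soft-out (SISO) modules through an interleaver. A minimal sketch with a pluggable `siso` function (classically BCJR; a learned network in NeuralBCJR/DeepTurbo; the trivial stand-in below exists only to exercise the loop, it is not a real decoder):

```python
import numpy as np

def turbo_decode(llr_sys, llr_p1, llr_p2, perm, siso, n_iter=6):
    """llr_sys: systematic LLRs; llr_p1/llr_p2: parity LLRs for the two
    constituent codes; perm: interleaver permutation; siso(a, p) returns
    posterior LLRs given combined a-priori/systematic LLRs a and parity p."""
    inv = np.argsort(perm)        # de-interleaver
    ext = np.zeros_like(llr_sys)  # extrinsic information, decoder 2 -> 1
    for _ in range(n_iter):
        post1 = siso(llr_sys + ext, llr_p1)
        ext1 = post1 - (llr_sys + ext)       # extrinsic, decoder 1 -> 2
        in2 = llr_sys[perm] + ext1[perm]     # interleave before decoder 2
        post2 = siso(in2, llr_p2)
        ext = (post2 - in2)[inv]             # de-interleave, feed back to 1
    return post2[inv]             # final posterior LLRs, natural order

rng = np.random.default_rng(6)
n = 100
perm = rng.permutation(n)
toy_siso = lambda a, p: a + 0.5 * p  # placeholder, NOT a real BCJR module
out = turbo_decode(rng.standard_normal(n), rng.standard_normal(n),
                   rng.standard_normal(n), perm, toy_siso)
```

DeepTurbo's departure, per the abstract, is to learn this whole loop end to end rather than imitating the BCJR module inside it.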
LEDNet: A Lightweight Encoder-Decoder Network for Real-Time Semantic Segmentation
The extensive computational burden limits the usage of CNNs in mobile devices
for dense estimation tasks. In this paper, we present a lightweight network to
address this problem, namely LEDNet, which employs an asymmetric
encoder-decoder architecture for the task of real-time semantic segmentation.
More specifically,
the encoder adopts a ResNet as backbone network, where two new operations,
channel split and shuffle, are utilized in each residual block to greatly
reduce computation cost while maintaining higher segmentation accuracy. On the
other hand, an attention pyramid network (APN) is employed in the decoder to
further lighten the entire network's complexity. Our model has fewer than 1M
parameters and is able to run at over 71 FPS on a single GTX 1080Ti GPU.
Comprehensive experiments demonstrate that our approach achieves
state-of-the-art results in terms of the speed-accuracy trade-off on the
Cityscapes dataset.
Comment: 5 pages, 3 figures, 3 tables, accepted at IEEE ICIP 2019
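The channel split and shuffle operations used inside each residual block can be written down directly. This is a numpy sketch on HWC tensors (the network itself operates on NCHW feature maps):

```python
import numpy as np

def channel_split(x):
    """Split the channel dimension in half; each residual branch then
    processes only half the channels, roughly halving the computation."""
    c = x.shape[-1] // 2
    return x[..., :c], x[..., c:]

def channel_shuffle(x, groups=2):
    """Interleave channels across groups so information mixes between the
    two branches after they are concatenated back together."""
    h, w, c = x.shape
    return (x.reshape(h, w, groups, c // groups)
             .swapaxes(-1, -2)
             .reshape(h, w, c))

x = np.arange(8, dtype=float).reshape(1, 1, 8)
a, b = channel_split(x)     # channels 0-3 and channels 4-7
mixed = channel_shuffle(x)  # channels reordered to 0,4,1,5,2,6,3,7
```

Without the shuffle, the two split branches would never exchange information; the cheap permutation restores cross-channel mixing at zero parameter cost.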