On Path Memory in List Successive Cancellation Decoder of Polar Codes
Polar codes are a breakthrough in coding theory. Using list successive cancellation decoding with a large list size L, polar codes can achieve excellent error correction performance. The L partially decoded vectors are stored in the path memory and updated according to the results of list management. In state-of-the-art designs, the memories are implemented with registers, and a large crossbar is used to copy partially decoded vectors from one block of memory to another during the update. These architectures become quite area-costly when the code length and list size are large. To solve this problem, we propose two optimization schemes for the path memory in this work. First, a folded path memory architecture is presented to reduce the area cost. Second, we present a scheme in which the path memory is removed from the architecture entirely. Experimental results show that these schemes effectively reduce the area of the path memory.
Comment: 5 pages, 6 figures, 2 tables
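For orientation, here is a minimal Python sketch of the update step this abstract targets; it is a software analogue, not the paper's register-and-crossbar hardware or its proposed folded/memory-free schemes. After list management ranks the candidate paths, each surviving path inherits its parent's partially decoded vector, and it is this wholesale copy whose cost grows with both the code length N and the list size L.

```python
import numpy as np

def update_path_memory(path_mem, survivors):
    """One list-management update of the path memory (illustrative).

    path_mem : (L, N) array, one partially decoded vector per path.
    survivors: for each of the L surviving paths, the index of the
               parent path it was forked from.
    """
    new_mem = np.empty_like(path_mem)
    for l, parent in enumerate(survivors):
        new_mem[l] = path_mem[parent]  # full-vector copy: in hardware,
                                       # this is the large crossbar
    return new_mem
```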
Low Complexity Belief Propagation Polar Code Decoders
Since their invention, polar codes have received a great deal of attention because of their capacity-achieving performance and low encoding and decoding complexity. Successive cancellation decoding (SCD) and belief propagation decoding (BPD) are two of the most popular approaches for decoding polar codes. SCD achieves good error-correcting performance and is less computationally expensive than BPD. However, SCD suffers from long latency and low throughput due to the serial nature of the successive cancellation algorithm. BPD is parallel in nature and hence more attractive for high-throughput applications. However, since it is iterative, its latency and energy dissipation increase linearly with the number of iterations. In this work, we borrow the idea of SCD and propose a novel scheme based on sub-factor-graph freezing to reduce both the average number of computations and the average number of iterations required by BPD, which directly translates into lower latency and energy dissipation. Simulation results show that the proposed scheme has no performance degradation and achieves a significant reduction in computational complexity over existing methods.
Comment: 6 pages
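The control flow described above can be pictured with a heavily simplified skeleton. The `update`/`decisions_fixed`/`hard_decision` interface below is hypothetical, and the paper's actual SC-inspired freezing criterion is more involved than this sketch suggests.

```python
def bp_decode_with_freezing(subgraphs, max_iters=60):
    """Sketch of BPD with sub-factor-graph freezing (assumed interface).

    Each element of `subgraphs` is assumed to expose: update() for one
    round of message passing, decisions_fixed() to report whether its
    local bit decisions can be declared final, and hard_decision() to
    return the decoded bits. These names are hypothetical.
    """
    frozen = [False] * len(subgraphs)
    for _ in range(max_iters):
        for i, g in enumerate(subgraphs):
            if frozen[i]:
                continue              # frozen: no computation spent here
            g.update()                # one message-passing round
            if g.decisions_fixed():
                frozen[i] = True      # freeze this sub-factor-graph
        if all(frozen):
            break                     # early stop: fewer average iterations
    return [g.hard_decision() for g in subgraphs]
```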
A Two-staged Adaptive Successive Cancellation List Decoding for Polar Codes
Polar codes achieve outstanding error correction performance when using successive cancellation list (SCL) decoding with a cyclic redundancy check (CRC). A larger list size brings better decoding performance and is essential for practical applications such as 5G communication networks. However, the decoding speed of SCL decreases as the list size grows. Adaptive SCL (ASCL) decoding can greatly enhance the decoding speed, but its decoding latency varies from codeword to codeword, so ASCL is not a good choice for hardware-based applications. In this paper, a hardware-friendly two-staged adaptive SCL (TA-SCL) decoding algorithm is proposed such that a constant input data rate is supported even when the list size differs across codewords. A mathematical model based on a Markov chain is derived to explore the bounds of its decoding performance. Simulation results show that the throughput of TA-SCL is tripled under good channel conditions with negligible performance degradation and hardware overhead.
Comment: 5 pages, 7 figures, 1 table. Accepted by ISCAS 201
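For context, the baseline ASCL control flow that TA-SCL reorganizes looks roughly like the sketch below, where `scl_decode` and `crc_ok` are assumed helpers and the list sizes are illustrative. The slow pass is what makes per-codeword latency data-dependent; the paper's two-stage structure pipelines the passes so the input rate stays constant.

```python
def adaptive_scl(llrs, scl_decode, crc_ok, L_small=2, L_large=32):
    """Baseline adaptive SCL (illustrative, not the TA-SCL pipeline)."""
    candidate = scl_decode(llrs, L_small)   # fast pass: small list
    if crc_ok(candidate):
        return candidate                    # most codewords stop here
    return scl_decode(llrs, L_large)        # slow pass: only on CRC failure
```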
Accelerating Large Kernel Convolutions with Nested Winograd Transformation
Recent literature has shown that convolutional neural networks (CNNs) with large kernels outperform vision transformers (ViTs) and CNNs with stacked small kernels in many computer vision tasks, such as object detection and image restoration. The Winograd transformation reduces the number of repetitive multiplications in convolution and is widely supported by commercial AI processors. Researchers have proposed accelerating large kernel convolutions by linearly decomposing them into many small kernel convolutions and then sequentially accelerating each small kernel convolution with the Winograd algorithm. This work proposes a nested Winograd algorithm that iteratively decomposes a large kernel convolution into small kernel convolutions and proves it more effective than the linear-decomposition Winograd algorithm. Experiments show that, compared to the linear-decomposition Winograd algorithm, the proposed algorithm reduces the total number of multiplications by 1.4 to 10.5 times for computing 4x4 to 31x31 convolutions.
Comment: published; see https://ieeexplore.ieee.org/document/1032193
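A back-of-the-envelope multiplication count illustrates why nesting helps. The model below assumes F(2x2, 3x3) tiles and kernel sizes that are powers of 3, and it ignores boundary and transform-overhead effects, so it is a sketch rather than the paper's derivation.

```python
from math import ceil

def direct(K):
    """Direct 2D convolution: K*K multiplications per output pixel."""
    return K * K

def linear_winograd(K, m=2, r=3):
    """Linear decomposition: split the KxK kernel into ceil(K/r)^2 tiles
    of size rxr, each accelerated with F(mxm, rxr), which needs
    (m+r-1)^2 multiplications per m*m outputs."""
    tiles = ceil(K / r) ** 2
    return tiles * (m + r - 1) ** 2 / (m * m)

def nested_winograd(levels, m=2, r=3):
    """Nested decomposition: a kernel of size r**levels is treated as an
    rxr convolution of rxr convolutions, applying F(mxm, rxr) at every
    level, so the per-output savings multiply instead of adding up."""
    per_level = (m + r - 1) ** 2 / (m * m)   # 4 mults/output for F(2x2,3x3)
    return per_level ** levels

# 9x9 kernel: 81 mults/output direct, 36 linear, 16 nested.
print(direct(9), linear_winograd(9), nested_winograd(2))
```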
How Robust is Federated Learning to Communication Error? A Comparison Study Between Uplink and Downlink Channels
Because of its privacy-preserving capability, federated learning (FL) has attracted significant attention from both academia and industry. However, when FL is implemented over wireless networks, it is not clear how much communication error it can tolerate. This paper investigates the robustness of FL to uplink and downlink communication errors. Our theoretical analysis reveals that the robustness depends on two critical parameters: the number of clients and the numerical range of the model parameters. It is also shown that uplink communication in FL can tolerate a higher bit error rate (BER) than downlink communication, and this difference is quantified by a proposed formula. The findings and theoretical analyses are further validated by extensive experiments.
Comment: Submitted to IEEE for possible publication
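The uplink/downlink asymmetry can be probed with a toy simulation. The bit-error model below (independent flips in the IEEE-754 representation) and all sizes are illustrative assumptions; the paper's formula, not this sketch, is the authoritative comparison.

```python
import numpy as np

def flip_bits(vec, ber, rng):
    """Flip each bit of a float32 vector independently with prob. ber
    (simple illustrative bit-error model)."""
    bits = np.ascontiguousarray(vec, dtype=np.float32).view(np.uint32)
    for b in range(32):
        mask = (rng.random(bits.shape) < ber).astype(np.uint32) << np.uint32(b)
        bits = bits ^ mask
    return bits.view(np.float32)

rng = np.random.default_rng(0)
n_clients, dim, ber = 50, 10_000, 1e-5
updates = [rng.normal(size=dim).astype(np.float32) for _ in range(n_clients)]
clean = np.mean(updates, axis=0)

# Uplink: each client's update is corrupted independently, then averaged,
# so aggregation over n_clients dilutes the error energy.
uplink = np.mean([flip_bits(u, ber, rng) for u in updates], axis=0)

# Downlink: the broadcast model is corrupted once and every client gets
# the same corrupted copy; nothing averages the error out.
downlink = flip_bits(clean, ber, rng)

print("uplink distortion:  ", np.linalg.norm(uplink - clean))
print("downlink distortion:", np.linalg.norm(downlink - clean))
```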