125 research outputs found
Algorithm Development and VLSI Implementation of Energy Efficient Decoders of Polar Codes
With their low error-floor performance, polar codes have attracted significant attention as a potential standard error correction code (ECC) for future communication and data storage systems. However, the VLSI implementation complexity of polar code decoders is largely determined by the serial nature of the decoding. This dissertation presents optimal decoder architectures for polar codes and addresses several structural properties of polar codes and key properties of decoding algorithms that were not dealt with in prior research. The underlying concept of the proposed architectures is a paradigm that simplifies and schedules the computations such that hardware is simplified, latency is minimized, and bandwidth is maximized.
In pursuit of the above, throughput centric successive cancellation (TCSC) and overlapping path list successive cancellation (OPLSC) VLSI architectures and express journey BP (XJBP) decoders for the polar codes are presented.
An arbitrary polar code can be decomposed into a set of shorter polar codes with special characteristics; these shorter codes are referred to as constituent polar codes. By exploiting the homogeneity between the decoding processes of different constituent polar codes, TCSC reduces the decoding latency of the SC decoder by 60% for codes of length n = 1024. The error correction performance of SC decoding is inferior to that of list successive cancellation (LSC) decoding. The LSC decoding algorithm delivers the most reliable decoding results; however, it consumes the most hardware resources and decoding cycles. Instead of using multiple instances of decoding cores as in conventional LSC decoders, the OPLSC architecture uses a single SC decoder. The computations of each path in the LSC are arranged to occupy the decoder hardware stages serially in a streamlined fashion, which yields a significant reduction in hardware complexity. The OPLSC decoder achieves about a 1.4 times improvement in hardware efficiency compared with traditional LSC decoders. Hardware-efficient VLSI architectures for the TCSC and OPLSC polar code decoders are also introduced.
Decoders based on the SC or LSC algorithms suffer from high latency and limited throughput because of their serial decoding nature. An alternative approach to decoding polar codes is the belief propagation (BP) algorithm. In the BP algorithm, a graph, usually referred to as a factor graph, is set up to guide how beliefs are propagated and refined. BP decoding can run in parallel and thereby achieve much higher throughput. The XJBP decoder facilitates belief propagation by exploiting the specific constituent codes that exist in the conventional factor graph, which results in an express journey (XJ) decoder. Compared with the conventional BP decoding algorithm for polar codes, the proposed decoder reduces the computational complexity by about 40.6%. This enables an energy-efficient hardware implementation. To further explore the hardware consumption of the proposed XJBP decoder, the scheduling of computations is modeled and analyzed in this dissertation. Optimal scheduling plans are developed for different hardware scenarios. A novel memory-distributed micro-architecture of the XJBP decoder is proposed and analyzed to solve the potential memory access problems of the proposed scheduling strategy. Register-transfer level (RTL) models of the XJBP decoder are set up for comparison with other state-of-the-art BP decoders. The results show that the power efficiency of BP decoders is improved by about 3 times.
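The decomposition of a polar code into shorter constituent codes follows from the recursive structure of the polar transform. As a minimal sketch (not taken from the dissertation), the function below applies the transform x = u F^{⊗n} over GF(2) with the standard kernel F = [[1, 0], [1, 1]]. The name `polar_encode` and the natural (non-bit-reversed) index ordering are assumptions made for illustration; frozen-bit selection and all decoder logic are omitted.

```python
def polar_encode(u):
    """Apply the polar transform x = u * F^(kron n) over GF(2), F = [[1, 0], [1, 1]].

    Assumes natural (non-bit-reversed) ordering; len(u) must be a power of two.
    """
    n = len(u)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    x = list(u)
    step = 1
    while step < n:
        # Each stage combines pairs of length-`step` sub-blocks; the sub-blocks
        # are themselves transforms of shorter (constituent) input vectors.
        for start in range(0, n, 2 * step):
            for i in range(start, start + step):
                x[i] ^= x[i + step]  # upper half becomes XOR of the two halves
        step *= 2
    return x
```

Because F squared is the identity over GF(2), applying `polar_encode` twice returns the original vector, which gives a quick sanity check for the sketch.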
Distributed Data Aggregation for Sparse Recovery in Wireless Sensor Networks
We consider the approximate sparse recovery problem in Wireless Sensor Networks (WSNs) using Compressed Sensing/Compressive Sampling (CS). The goal is to recover the n-dimensional data values by querying only a subset of the sensors, based on some linear projection of the sensor readings. To solve this problem, a two-tiered sampling model is considered and a novel distributed compressive sparse sampling (DCSS) algorithm is proposed, based on a sparse binary CS measurement matrix. In the two-tiered sampling model, each sensor first samples the environment independently. Then the fusion center (FC), acting as a pseudo-sensor, samples the sensor network to select a subset of sensors that respond directly to the FC for data recovery. The sparse binary matrix is designed using an unbalanced expander graph, which achieves state-of-the-art performance for CS schemes. This binary matrix can be interpreted as a sensor selection matrix, whose fairness is analyzed. Extensive experiments on both synthetic and real data sets show that by querying only the minimum number of sensors using the DCSS algorithm, the CS recovery accuracy can be as good as that of dense measurement matrices (e.g., Gaussian, Fourier Scrambles). We also show that the sparse binary measurement matrix works well on compressible data, giving the recovery result closest to the known best k-term approximation. The recovery is robust against noisy measurements. The sparsity and binary properties of the measurement matrix contribute, to a great extent, to the reduction of the in-network communication cost as well as the computational burden.
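To make the idea of a sparse binary measurement matrix concrete, the sketch below builds a matrix with exactly d ones per column and computes the measurements y = Ax. It is an illustration under stated assumptions, not the DCSS algorithm: the function names, the seed, and the use of uniformly random column supports (a common stand-in for an explicit unbalanced expander construction) are all hypothetical choices, and no recovery step is shown.

```python
import random


def sparse_binary_matrix(m, n, d, seed=0):
    """Return the column supports of an m x n binary matrix with d ones per column.

    Assumption: supports are drawn uniformly at random; an explicit expander
    construction is not attempted here.
    """
    rng = random.Random(seed)
    return [sorted(rng.sample(range(m), d)) for _ in range(n)]


def measure(cols, x, m):
    """Compute y = A x, where A[i][j] = 1 exactly when i is in cols[j]."""
    y = [0.0] * m
    for j, rows in enumerate(cols):
        if x[j] != 0:
            for i in rows:
                y[i] += x[j]  # each nonzero reading touches only d measurements
    return y
```

Each measurement is a plain sum of a few sensor readings, which is why the sparsity and binary structure reduce both communication and computation.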
Multispectral Image Compression Based on DSC Combined with CCSDS-IDC
A remote sensing multispectral image compression encoder requires low complexity, high robustness, and high performance because it usually works on a satellite, where resources such as power, memory, and processing capacity are limited. For multispectral images, compression algorithms based on 3D transforms (such as 3D DWT and 3D DCT) are too complex to be implemented in space missions. In this paper, we propose a compression algorithm for multispectral images based on distributed source coding (DSC) combined with the image data compression (IDC) approach recommended by CCSDS, which has low complexity, high robustness, and high performance. First, each band is sparsely represented by DWT to obtain wavelet coefficients. Then, the wavelet coefficients are encoded by a bit plane encoder (BPE). Finally, the BPE is deeply coupled with the Slepian-Wolf (SW) DSC strategy based on QC-LDPC to remove the residual redundancy between adjacent bands. A series of multispectral images is used to test our algorithm. Experimental results show that the proposed DSC combined with CCSDS-IDC (DSC-CCSDS) algorithm has better compression performance than traditional compression approaches.
Cellular, Wide-Area, and Non-Terrestrial IoT: A Survey on 5G Advances and the Road Towards 6G
The next wave of wireless technologies is proliferating in connecting things
among themselves as well as to humans. In the era of the Internet of things
(IoT), billions of sensors, machines, vehicles, drones, and robots will be
connected, making the world around us smarter. The IoT will encompass devices
that must wirelessly communicate a diverse set of data gathered from the
environment for myriad new applications. The ultimate goal is to extract
insights from this data and develop solutions that improve quality of life and
generate new revenue. Providing large-scale, long-lasting, reliable, and near
real-time connectivity is the major challenge in enabling a smart connected
world. This paper provides a comprehensive survey on existing and emerging
communication solutions for serving IoT applications in the context of
cellular, wide-area, as well as non-terrestrial networks. Specifically,
wireless technology enhancements for providing IoT access in fifth-generation
(5G) and beyond cellular networks, and communication networks over the
unlicensed spectrum are presented. Aligned with the main key performance
indicators of 5G and beyond 5G networks, we investigate solutions and standards
that enable energy efficiency, reliability, low latency, and scalability
(connection density) of current and future IoT networks. The solutions include
grant-free access and channel coding for short-packet communications,
non-orthogonal multiple access, and on-device intelligence. Further, a vision
of new paradigm shifts in communication networks in the 2030s is provided, and
the integration of the associated new technologies like artificial
intelligence, non-terrestrial networks, and new spectra is elaborated. Finally,
future research directions toward beyond 5G IoT networks are pointed out. Comment: Submitted for review to IEEE CS&
Semantic and effective communications
Shannon and Weaver categorized communications into three levels of problems: the technical problem, which tries to answer the question "how accurately can the symbols of communication be transmitted?"; the semantic problem, which asks the question "how precisely do the transmitted symbols convey the desired meaning?"; the effectiveness problem, which strives to answer the question "how effectively does the received meaning affect conduct in the desired way?". Traditionally, communication technologies mainly addressed the technical problem, ignoring the semantics or the effectiveness problems.
Recently, there has been increasing interest to address the higher level semantic and effectiveness problems, with proposals ranging from semantic to goal oriented communications. In this thesis, we propose to formulate the semantic problem as a joint source-channel coding (JSCC) problem and the effectiveness problem as a multi-agent partially observable Markov decision process (MA-POMDP). As such, for the semantic problem, we propose DeepWiVe, the first-ever end-to-end JSCC video transmission scheme that leverages the power of deep neural networks (DNNs) to directly map video signals to channel symbols, combining video compression, channel coding, and modulation steps into a single neural transform. We also further show that it is possible to use predefined constellation designs as well as secure the physical layer communication against eavesdroppers for deep learning (DL) driven JSCC schemes, making such schemes much more viable for deployment in the real world.
For the effectiveness problem, we propose a novel formulation by considering multiple agents communicating over a noisy channel in order to achieve better coordination and cooperation in a multi-agent reinforcement learning (MARL) framework. Specifically, we consider a MA-POMDP, in which the agents, in addition to interacting with the environment, can also communicate with each other over a noisy communication channel. The noisy communication channel is considered explicitly as part of the dynamics of the environment, and the message each agent sends is part of the action that the agent can take. As a result, the agents learn not only to collaborate with each other but also to communicate "effectively" over a noisy channel. Moreover, we show that this framework generalizes both the semantic and technical problems. In both instances, we show that the resultant communication scheme is superior to one where the communication is considered separately from the underlying semantics or goal of the problem.
DISTRIBUTED LEARNING ALGORITHMS: COMMUNICATION EFFICIENCY AND ERROR RESILIENCE
In modern-day machine learning applications such as self-driving cars, recommender systems, robotics, and genetics, the size of the training data has grown to the point that it has become essential to design distributed learning algorithms. A general framework for distributed learning is data parallelism, where the data is distributed among the worker machines for parallel processing and computation to speed up learning. With billions of devices such as cellphones and computers, the data is inherently distributed and stored locally on the users' devices. Learning in this setup is popularly known as Federated Learning. The speed-up due to the distributed framework is hindered by some fundamental problems such as straggler workers, the communication bottleneck due to high communication overhead between the workers and the central server, and adversarial failure, popularly known as Byzantine failure. In this thesis, we study and develop distributed algorithms that are error resilient and communication efficient.
First, we address the problem of straggler workers, where learning is delayed by slow workers in the distributed setup. To mitigate the effect of the stragglers, we employ an LDPC (low density parity check) code to encode the data and implement the gradient descent algorithm in the distributed setup. Second, we present a family of vector quantization schemes, vqSGD (vector quantized Stochastic Gradient Descent), that provides an asymptotic reduction in the communication cost with convergence guarantees in first order distributed optimization. We also show that vqSGD provides a strong privacy guarantee. Third, we address the problem of Byzantine failure together with communication efficiency in the first order gradient descent algorithm. We consider a generic class of approximate compressors for communication efficiency and employ a simple norm based thresholding scheme to make the learning algorithm robust to Byzantine failures. We establish the statistical error rate for non-convex smooth loss. Moreover, we analyze the compressed gradient descent algorithm with error feedback in a distributed setting and in the presence of Byzantine worker machines. Fourth, we employ the generic class of approximate compressors to develop a communication efficient second order Newton-type algorithm and provide the rate of convergence for smooth objectives. Fifth, we propose COMRADE (COMmunication-efficient and Robust Approximate Distributed nEwton), an iterative second order algorithm that is communication efficient as well as robust against Byzantine failures. Sixth, we propose a distributed cubic-regularized Newton algorithm that can escape saddle points effectively for non-convex loss functions and find a local minimum. Furthermore, the proposed algorithm can resist the attack of the Byzantine machines, which may create fake local minima near the saddle points of the loss function, also known as the saddle-point attack.
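The norm based thresholding idea can be illustrated with a short sketch. The abstract does not give the exact rule, so the version below is an assumption: the server discards the fraction `beta` of worker gradients with the largest Euclidean norm and averages the rest. The function name `robust_aggregate` and the parameter `beta` are hypothetical, and compression and error feedback are left out.

```python
def robust_aggregate(gradients, beta):
    """Average worker gradients after dropping the fraction beta with largest L2 norm.

    Assumption: this trimming rule is one plausible reading of norm based
    thresholding, not necessarily the rule analyzed in the thesis.
    """
    if not gradients:
        raise ValueError("need at least one gradient")
    if not 0 <= beta < 1:
        raise ValueError("beta must lie in [0, 1)")
    keep = len(gradients) - int(beta * len(gradients))
    # Sorting by squared norm gives the same order as sorting by norm.
    ranked = sorted(gradients, key=lambda g: sum(v * v for v in g))
    kept = ranked[:keep]
    dim = len(kept[0])
    return [sum(g[k] for g in kept) / len(kept) for k in range(dim)]
```

With three honest workers reporting `[1.0, 1.0]` and one Byzantine worker reporting `[1000.0, -1000.0]`, setting `beta = 0.25` removes the outlier before averaging, whereas `beta = 0` reduces to the plain mean and is dominated by it.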
On the Effectiveness of Video Recolouring as an Uplink-model Video Coding Technique
For decades, conventional video compression formats have advanced via incremental improvements with
each subsequent standard achieving better rate-distortion (RD) efficiency at the cost of increased encoder
complexity compared to its predecessors. Design efforts have been driven by common multi-media use cases
such as video-on-demand, teleconferencing, and video streaming, where the most important requirements are
low bandwidth and low video playback latency. Meeting these requirements involves the use of
computationally expensive block-matching algorithms which produce excellent compression rates and quick decoding
times.
However, emerging use cases such as Wireless Video Sensor Networks, remote surveillance, and mobile
video present new technical challenges in video compression. In these scenarios, the video capture and
encoding devices are often power-constrained and have limited computational resources available, while the
decoder devices have abundant resources and access to a dedicated power source. To address these use cases,
codecs must be power-aware and offer a reasonable trade-off between video quality, bitrate, and encoder
complexity. Balancing these constraints requires a complete rethinking of video compression technology.
The uplink video-coding model represents a new paradigm to address these low-power use cases, providing
the ability to redistribute computational complexity by offloading the motion estimation and compensation
steps from encoder to decoder. Distributed Video Coding (DVC) follows this uplink model of video codec
design, and maintains high quality video reconstruction through innovative channel coding techniques. The
field of DVC is still early in its development, with many open problems waiting to be solved, and no defined
video compression or distribution standards. Due to the experimental nature of the field, most DVC codecs
to date have focused on encoding and decoding the Luma plane only, which produces grayscale reconstructed
videos.
In this thesis, a technique called “video recolouring” is examined as an alternative to DVC. Video
recolouring exploits the temporal redundancies between colour planes, reducing video bitrate by removing Chroma
information from specific frames and then recolouring them at the decoder.
A novel video recolouring algorithm called Motion-Compensated Recolouring (MCR) is proposed, which
uses block motion estimation and bi-directional weighted motion-compensation to reconstruct Chroma planes
at the decoder. MCR is used to enhance a conventional base-layer codec, and shown to reduce bitrate by
up to 16% with only a slight decrease in objective quality. MCR also outperforms other video recolouring
algorithms in terms of objective video quality, demonstrating up to 2 dB PSNR improvement in some cases.
Rank Minimization over Finite Fields: Fundamental Limits and Coding-Theoretic Interpretations
This paper establishes information-theoretic limits in estimating a finite
field low-rank matrix given random linear measurements of it. These linear
measurements are obtained by taking inner products of the low-rank matrix with
random sensing matrices. Necessary and sufficient conditions on the number of
measurements required are provided. It is shown that these conditions are sharp
and the minimum-rank decoder is asymptotically optimal. The reliability
function of this decoder is also derived by appealing to de Caen's lower bound
on the probability of a union. The sufficient condition also holds when the
sensing matrices are sparse - a scenario that may be amenable to efficient
decoding. More precisely, it is shown that if the $n \times n$ sensing matrices
contain, on average, $\Omega(n \log n)$ entries, the number of measurements
required is the same as that when the sensing matrices are dense and contain
entries drawn uniformly at random from the field. Analogies are drawn between
the above results and rank-metric codes in the coding theory literature. In
fact, we are also strongly motivated by understanding when minimum rank
distance decoding of random rank-metric codes succeeds. To this end, we derive
distance properties of equiprobable and sparse rank-metric codes. These
distance properties provide a precise geometric interpretation of the fact that
the sparse ensemble requires as few measurements as the dense one. Finally, we
provide a non-exhaustive procedure to search for the unknown low-rank matrix. Comment: Accepted to the IEEE Transactions on Information Theory; Presented at
IEEE International Symposium on Information Theory (ISIT) 201
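A toy version of the minimum-rank decoder may help fix ideas. The sketch below works over GF(2) and searches exhaustively, so it is only feasible for very small matrices and is explicitly not the non-exhaustive procedure mentioned in the abstract. The function names and the restriction to the binary field are assumptions made for illustration.

```python
from itertools import product


def rank_gf2(rows):
    """Rank over GF(2) of a matrix given as a list of 0/1 rows."""
    basis = []
    for row in rows:
        v = int("".join(map(str, row)), 2)
        for b in basis:
            v = min(v, v ^ b)  # clear the leading bit of b from v if present
        if v:
            basis.append(v)
    return len(basis)


def inner_gf2(H, X):
    """Inner product <H, X> = sum of H[i][j] * X[i][j], reduced mod 2."""
    return sum(h * x for hr, xr in zip(H, X) for h, x in zip(hr, xr)) % 2


def min_rank_decode(sensing, y, n):
    """Exhaustively search n x n binary matrices consistent with y for minimum rank."""
    best = None
    for bits in product((0, 1), repeat=n * n):
        X = [list(bits[r * n:(r + 1) * n]) for r in range(n)]
        if all(inner_gf2(H, X) == yi for H, yi in zip(sensing, y)):
            if best is None or rank_gf2(X) < rank_gf2(best):
                best = X
    return best
```

For example, three measurements that reveal all entries of a 2 x 2 matrix except the bottom-right one leave two consistent candidates, and the decoder returns the one of lower rank.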