17 research outputs found

    System-on-chip Computing and Interconnection Architectures for Telecommunications and Signal Processing

    Get PDF
    This dissertation proposes novel architectures and design techniques targeting SoC building blocks for telecommunications and signal processing applications. Hardware implementation of Low-Density Parity-Check (LDPC) decoders is approached at both the algorithmic and the architectural level. LDPC codes are a promising coding scheme for future communication standards thanks to their outstanding error correction performance. This work proposes a methodology for analyzing the effects of finite-precision arithmetic on error correction performance and hardware complexity; the methodology is employed throughout to co-design the decoder. First, a low-complexity check node based on the P-output decoding principle is designed and characterized on a CMOS standard-cell library. Results demonstrate an implementation loss below 0.2 dB down to a BER of 10^{-8} and complexity savings of up to 59% with respect to other recent works in the literature. High-throughput and low-latency requirements are addressed with modified single-phase decoding schedules. A new "memory-aware" schedule is proposed that requires as little as 20% of the memory of traditional two-phase flooding decoding; in addition, throughput is doubled and logic complexity is reduced by 12%. These advantages are traded off against error correction performance, making the solution attractive only for long codes, such as those adopted in the DVB-S2 standard. The "layered decoding" principle is extended to codes not specifically conceived for this technique. The proposed architectures exhibit complexity savings on the order of 40% in both area and power consumption, while the implementation loss is smaller than 0.05 dB.

    Most modern communication standards employ Orthogonal Frequency Division Multiplexing (OFDM) as part of their physical layer. The core of OFDM is the Fast Fourier Transform (FFT) and its inverse, in charge of symbol (de)modulation. Requirements on throughput and energy efficiency call for hardware FFT implementations, while the ubiquity of the FFT suggests the design of parametric, re-configurable and re-usable IP hardware macrocells. In this context, this thesis describes an FFT/IFFT core compiler particularly suited for the implementation of OFDM communication systems. The tool employs an accuracy-driven configuration engine which automatically profiles the internal arithmetic and generates a core with the minimum operand bit-widths and thus minimum circuit complexity. The engine performs a closed-loop optimization over three different internal arithmetic models (fixed-point, block floating-point and convergent block floating-point), using the numerical accuracy budget given by the user as a reference point. The flexibility and re-usability of the proposed macrocell are illustrated through several case studies covering current state-of-the-art OFDM communication standards (WLAN, WMAN, xDSL, DVB-T/H, DAB and UWB). Implementation results are presented for two deep sub-micron standard-cell libraries (65 and 90 nm) and for commercially available FPGA devices. Compared with other FFT core compilers, the proposed environment produces macrocells with lower circuit complexity and the same system-level performance (throughput, transform size and numerical accuracy).
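    As a rough illustration of such an accuracy-driven engine, the sketch below searches for the minimum operand bit-width that meets a user-supplied SQNR budget. It is a minimal Python model under assumed names (`quantize`, `min_bitwidth_fft` are inventions for this sketch); the actual compiler profiles three arithmetic models and quantizes every internal stage, not just the input.

```python
import numpy as np

def quantize(x, frac_bits):
    """Round a signal onto a fixed-point grid with `frac_bits` fractional bits."""
    scale = 2.0 ** frac_bits
    return np.round(x * scale) / scale

def sqnr_db(reference, approx):
    """Signal-to-quantization-noise ratio in dB against a golden model."""
    noise_power = np.sum(np.abs(reference - approx) ** 2) + 1e-30
    return 10.0 * np.log10(np.sum(np.abs(reference) ** 2) / noise_power)

def min_bitwidth_fft(signal, target_sqnr_db, max_bits=32):
    """Closed-loop search: grow the bit-width until the quantized FFT meets
    the accuracy budget, then return the minimum width that passed."""
    reference = np.fft.fft(signal)  # double-precision golden model
    for bits in range(4, max_bits + 1):
        # Crude stand-in for a true fixed-point FFT model, which would also
        # quantize the twiddle factors and every butterfly output.
        approx = np.fft.fft(quantize(signal, bits))
        if sqnr_db(reference, approx) >= target_sqnr_db:
            return bits
    return max_bits

rng = np.random.default_rng(0)
x = rng.standard_normal(1024) + 1j * rng.standard_normal(1024)
print(min_bitwidth_fft(x, target_sqnr_db=60.0))
```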
The link enables skew constraint looseness in the clock tree synthesis, frequency speed-up, power consumption reduction and faster back-end turnarounds. The proposed architecture reaches a maximum clock frequency of 1 GHz on 65 nm low-leakage CMOS standard-cells library. In a complex test case with a full-blown NoC infrastructure, the link overhead is only 3% of chip area and 0.5% of leakage power consumption. Finally, a new methodology, named metacoding, is proposed. Metacoding generates correct-by-construction technology independent RTL codebases for NoC building blocks. The RTL coding phase is abstracted and modeled with an Object Oriented framework, integrated within a commercial tool for IP packaging (Synopsys CoreTools suite). Compared with traditional coding styles based on pre-processor directives, metacoding produces 65% smaller codebases and reduces the configurations to verify up to three orders of magnitude
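    A hypothetical sketch of the metacoding idea follows: RTL is emitted by an object model, so only the configurations actually instantiated ever exist as code, and each is correct by construction rather than carved out of one codebase with pre-processor directives. The `Fifo` class and its Verilog skeleton are illustrative inventions, not the thesis' CoreTools-based framework.

```python
# Illustration only: generate parameterized RTL from an object model instead
# of maintaining one #ifdef-riddled source covering every configuration.
class Fifo:
    def __init__(self, name, width, depth):
        self.name, self.width, self.depth = name, width, depth

    def emit_verilog(self):
        aw = max(1, (self.depth - 1).bit_length())  # address width for `depth` slots
        return (
            f"module {self.name} #(parameter W={self.width}, D={self.depth})\n"
            f"  (input clk, input rst_n, input wr, input rd,\n"
            f"   input  [W-1:0] din, output [W-1:0] dout);\n"
            f"  reg [W-1:0] mem [0:D-1];\n"
            f"  reg [{aw}:0] wptr, rptr;\n"
            f"  // ... read/write logic elided ...\n"
            f"endmodule\n"
        )

# Emit the one buffer configuration this NoC instance actually needs.
print(Fifo("noc_link_buf", width=32, depth=4).emit_verilog())
```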

    Architectures for Code-based Post-Quantum Cryptography

    Get PDF
    The abstract is in the attachment.

    Low Complexity LDPC Code Decoders for Next Generation Standards

    Full text link

    Low-Power Low-Density Parity-Check (LDPC) Code Decoder Design Using Dynamic Voltage and Frequency Scaling

    Get PDF
    This thesis presents a low-power LDPC decoder design based on speculative scheduling of the energy necessary to decode dynamically varying data frames in both block-fading and general AWGN channels. A model of a memory-efficient, low-power, high-throughput multi-rate array LDPC decoder, together with its FPGA implementation results, is first presented. Then, I propose a decoding scheme that provides constant-time decoding and thus facilitates real-time applications where a guaranteed data rate is required. It pre-analyzes each received data frame to estimate the maximum number of iterations necessary for frame convergence. The results are then used to dynamically adjust the decoder frequency and switch between multiple voltage levels, thereby minimizing energy use. This is in contrast to conventional fixed-iteration decoding schemes that operate at a fixed voltage level regardless of the quality of the received data. Analysis shows that the proposed decoding scheme is widely applicable to both the two-phase message-passing (TPMP) and the turbo decoding message-passing (TDMP) decoding algorithms in block-fading channels, and that it is independent of the specific LDPC decoder architecture. A decoder architecture utilizing our recently published multi-rate decoding architecture for general AWGN channels is also presented. The result of this thesis is a decoder design scheme that provides a judicious trade-off between power consumption and coding gain.
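    The scheduling idea can be captured in a few lines: given a per-frame estimate of the iterations needed, pick the slowest (and lowest-voltage) operating point that still meets the frame deadline. This is a minimal sketch; the operating points and cycle count below are illustrative numbers, not values from the thesis.

```python
# Hypothetical (voltage V, frequency MHz) pairs, sorted from lowest power up.
OPERATING_POINTS = [
    (0.8, 100),
    (1.0, 200),
    (1.2, 400),
]
CYCLES_PER_ITERATION = 2000  # assumed decoder cycles per decoding iteration

def pick_operating_point(est_iterations, deadline_us):
    """Return the lowest-power (V, f) pair that finishes the estimated
    number of iterations before the frame deadline."""
    for volt, freq_mhz in OPERATING_POINTS:
        decode_time_us = est_iterations * CYCLES_PER_ITERATION / freq_mhz
        if decode_time_us <= deadline_us:
            return volt, freq_mhz
    return OPERATING_POINTS[-1]  # worst case: run as fast as possible

# A clean frame predicted to converge in 3 iterations with a 100 us budget
# can be decoded at the lowest voltage/frequency point.
print(pick_operating_point(est_iterations=3, deadline_us=100.0))
```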

    VLSI decoding architectures: flexibility, robustness and performance

    Get PDF
    Stemming from previous studies on flexible LDPC decoders, this thesis work has mainly focused on the development of flexible turbo and LDPC decoder designs, and on narrowing the power, area and speed gap they may present with respect to dedicated solutions. Additional studies have been carried out in the fields of improved code performance and of decoder resiliency to hardware errors. The first chapter gathers several main contributions in the design and implementation of flexible channel decoders. The first part concerns the design of a Network-on-Chip (NoC) serving as the interconnection network for a partially parallel LDPC decoder. A best-fit NoC architecture is designed, and a complete multi-standard turbo/LDPC decoder is designed and implemented. Every time the code is changed, the decoder must be reconfigured. A number of variables influence the duration of the reconfiguration process, from the codes involved down to decoder design choices. These are taken into account in the flexible decoder design, and novel traffic reduction and optimization methods are implemented. In the second chapter, a study on the early stopping of iterations for LDPC decoders is presented. The energy expenditure of any LDPC decoder is directly linked to the iterative nature of the decoding algorithm. We propose an innovative multi-standard early stopping criterion for LDPC decoders that observes the evolution of simple metrics and relies on on-the-fly threshold computation. Its effectiveness is evaluated against existing techniques both in terms of saved iterations and, after implementation, in terms of actual energy savings. The third chapter portrays a study of the resilience of LDPC decoders under the effect of memory errors. Given that the purpose of channel decoders is to correct errors, LDPC decoders are intrinsically characterized by a certain degree of resistance to hardware faults. This characteristic, together with the soft nature of the stored values, means that LDPC decoders are affected differently depending on the meaning of the wrong bits: ad-hoc error protection techniques, like the Unequal Error Protection devised in this chapter, can consequently be applied to different bits according to their significance. In the fourth chapter, the serial concatenation of LDPC and turbo codes is presented. The concatenated FEC targets very high error correction capabilities, joining the performance of turbo codes at low SNR with that of LDPC codes at high SNR, and outperforming both current deep-space FEC schemes and concatenation-based FECs. A unified decoder for the concatenated scheme is subsequently proposed.
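    For context, the skeleton shared by most early-stopping schemes looks like the sketch below: stop as soon as the syndrome is zero (a valid codeword was found), or abort when a simple metric stops improving. This is only the common skeleton; the thesis' actual criterion tracks its own metrics against thresholds computed on the fly. `update_llrs` stands in for one iteration of any message-passing decoder.

```python
import numpy as np

def decode_with_early_stop(H, llr, update_llrs, max_iter=50, stall_window=3):
    """Generic iteration-stopping skeleton for an LDPC decoder.

    H           : parity-check matrix as a 0/1 numpy array
    llr         : channel log-likelihood ratios, updated in place each iteration
    update_llrs : callable performing one decoding iteration (stand-in)
    """
    history = []
    hard = (llr < 0).astype(np.uint8)
    for it in range(max_iter):
        llr = update_llrs(H, llr)              # one message-passing iteration
        hard = (llr < 0).astype(np.uint8)      # hard-decision bits
        unsatisfied = int(np.sum(H.dot(hard) % 2))
        if unsatisfied == 0:                   # zero syndrome: stop early, success
            return hard, it + 1
        history.append(unsatisfied)
        # Metric stalled for `stall_window` iterations: give up early to save energy.
        if len(history) >= stall_window and len(set(history[-stall_window:])) == 1:
            break
    return hard, max_iter
```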

    Area and energy efficient VLSI architectures for low-density parity-check decoders using an on-the-fly computation

    Get PDF
    The VLSI implementation complexity of a low-density parity-check (LDPC) decoder is largely influenced by the interconnect and storage requirements. This dissertation presents decoder architectures for regular and irregular LDPC codes that provide substantial gains over existing academic and commercial implementations. Several structured properties of LDPC codes and decoding algorithms are observed and used to construct hardware implementations with reduced processing complexity. The proposed architectures utilize an on-the-fly computation paradigm which permits scheduling of the computations in a way that reduces the memory requirements and re-computations. Using this paradigm, run-time configurable and multi-rate VLSI architectures for rate-compatible array LDPC codes and irregular block LDPC codes are designed. Rate-compatible array codes are considered for DSL applications. Irregular block LDPC codes are proposed for IEEE 802.16e, IEEE 802.11n, and IEEE 802.20. When compared with a recent implementation of an 802.11n LDPC decoder, the proposed decoder reduces logic complexity by 6.45x and memory complexity by 2x for a given data throughput. When compared to the latest reported multi-rate decoders, this design achieves around 5.5x better area efficiency and 2.6x better energy efficiency for a given data throughput. The numbers are normalized to a 180 nm CMOS process. Properly designed array codes have low error floors and meet the requirements of magnetic recording channels and other applications that need several Gbit/s of data throughput. A high-throughput, fixed-code architecture for array LDPC codes has also been designed. No modification to the code is performed, as this can result in high error floors. This parallel decoder architecture has no routing congestion and is scalable to longer block lengths. When compared to the latest fixed-code parallel decoders in the literature, this design achieves around 36x better area efficiency and 3x better energy efficiency for a given data throughput, again normalized to a 180 nm CMOS process. In summary, the design and analysis details of the proposed architectures are described in this dissertation, and results from extensive simulation and VHDL verification on FPGA and ASIC design platforms are also presented.
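    The memory saving behind on-the-fly computation is easiest to see in a layered min-sum check-row update: rather than storing one message per edge, the row state is compressed to the two smallest magnitudes, the position of the smallest, and the aggregate sign, and each message is regenerated when needed. The sketch below is a simplified floating-point illustration of this common scheme, not the dissertation's exact architecture (which adds fixed-point arithmetic and offset/scaling corrections).

```python
import numpy as np

def layered_minsum_row(llr, row_cols, old_msgs):
    """One check-row update of layered min-sum decoding with compressed row
    state (min1, min2, min1 position, aggregate sign) instead of per-edge storage."""
    # Variable-node step: remove the old check-to-variable messages.
    q = llr[row_cols] - old_msgs
    signs = np.sign(q)
    signs[signs == 0] = 1.0
    mags = np.abs(q)
    order = np.argsort(mags)
    min1_idx = order[0]                       # position of the smallest magnitude
    min1, min2 = mags[order[0]], mags[order[1]]
    sign_prod = np.prod(signs)                # aggregate sign of the whole row
    # Regenerate each check-to-variable message from the compressed state:
    # magnitude is min1, except min2 for the edge that produced min1.
    new_msgs = np.where(np.arange(len(row_cols)) == min1_idx, min2, min1)
    new_msgs *= sign_prod * signs             # excludes each edge's own sign
    llr[row_cols] = q + new_msgs              # write back the updated LLRs
    return new_msgs

# Toy example: one check node over 4 variables, all-zero previous messages.
llr = np.array([1.5, -0.2, 0.8, -2.0, 0.3])
print(layered_minsum_row(llr, row_cols=np.array([0, 1, 3, 4]), old_msgs=np.zeros(4)))
```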

    Private Streaming with Convolutional Codes

    Full text link
    Recently, information-theoretic private information retrieval (PIR) from coded storage systems has gained a lot of attention, and a general star product PIR scheme was proposed. In this paper, the star product scheme is adapted, with appropriate modifications, to the case of private (e.g., video) streaming. It is assumed that the files to be streamed are stored on n servers in coded form, and the streaming is carried out via a convolutional code. The star product scheme is defined for this special case, and various properties are analyzed for two channel models related to straggling and Byzantine servers, both in the baseline case and with colluding servers. The achieved PIR rates for the given models are derived and, for the cases where the capacity is known, the first model is shown to be asymptotically optimal when the number of stripes in a file is large. The second scheme introduced in this work is shown to be the equivalent of block convolutional codes in the PIR setting. For the Byzantine server model, it is shown to outperform the trivial scheme of downloading stripes of the desired file separately, without memory.
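    As standard background (not this paper's specific construction), the star product underlying such schemes is the component-wise (Schur) product of codewords, extended linearly:

```latex
% Star (Schur) product of two linear codes of length n:
\[
  \mathcal{C} \star \mathcal{D}
  \;=\;
  \operatorname{span}\bigl\{\, c \star d \;:\; c \in \mathcal{C},\ d \in \mathcal{D} \,\bigr\},
  \qquad
  c \star d = (c_1 d_1,\, c_2 d_2,\, \dots,\, c_n d_n).
\]
% For generalized Reed-Solomon storage and query codes of dimensions k and t,
% \dim(\mathcal{C} \star \mathcal{D}) \le \min(n, k + t - 1), which is what
% bounds the download cost, and hence the rate, of star product PIR.
```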

    Secure and Privacy-preserving Data Sharing in the Cloud based on Lossless Image Coding

    Get PDF
    Image and video processing in the encrypted domain has recently emerged as a promising research area for tackling privacy-related data processing issues. In particular, reversible data hiding in the encrypted domain has been suggested as a solution for storing and managing digital images securely in the cloud while preserving their confidentiality. However, although efficiency has been claimed for reversible data hiding techniques in encrypted images (RDHEI), reported results show that the cloud service provider cannot add more than 1 bit per pixel (bpp) of additional data to the stored images. This paper highlights this weakness of RDHEI as an approach to secure and privacy-preserving cloud computing. In particular, we propose a new, simple, and efficient approach that offers the same level of data security and confidentiality in the cloud without the process of reversible data hiding. The proposed idea is to compress the image with a lossless image coder in order to create space before encryption. This space is then filled with a randomly generated sequence and combined with an encrypted version of the compressed bit stream to form a full-resolution encrypted image in the pixel domain. The cloud service provider uses the room created in the encrypted image to add additional data, producing an encrypted image containing the additional data in a similar fashion. Assessed with the lossless Embedded Block Coding with Optimized Truncation (EBCOT) algorithm on natural images, the proposed scheme exceeds a capacity of 3 bpp of additional data while maintaining data security and confidentiality.
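    The pipeline can be sketched in a few lines: compress losslessly, fill the freed space with random bits, encrypt the whole buffer, and let the provider later overwrite the random filler. In this minimal sketch, zlib stands in for EBCOT and a one-time-pad XOR stands in for a real stream cipher; `encrypt_with_room` is an invented name for illustration.

```python
import os
import zlib

def encrypt_with_room(image_bytes, key_stream):
    """Compress-then-encrypt sketch: lossless coding shrinks the image, the
    freed space is filled with random bytes, and the full-size buffer is
    encrypted, leaving room the cloud provider can overwrite with its data."""
    compressed = zlib.compress(image_bytes, level=9)
    room = len(image_bytes) - len(compressed)      # space created by coding
    filler = os.urandom(room)                      # random sequence fills the room
    plain = compressed + filler                    # same size as the original image
    cipher = bytes(p ^ k for p, k in zip(plain, key_stream))
    return cipher, len(compressed), room

image = bytes(range(256)) * 64                     # toy, highly compressible "image"
key = os.urandom(len(image))
cipher, clen, room = encrypt_with_room(image, key)
print(f"compressed to {clen} bytes; {room} bytes of embedding room "
      f"({8 * room / len(image):.2f} bits per byte of image)")
```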

    Information-theoretic security under computational, bandwidth, and randomization constraints

    Get PDF
    The objective of the proposed research is to develop and analyze coding schemes for information-theoretic security that could bridge the gap between theory and practice. We focus on two fundamental models for information-theoretic security: secret-key generation for a source model and secure communication over the wire-tap channel. Many results for these models only prove the existence of codes, and few attempts have been made to design practical schemes. The schemes we propose should account for practical constraints. Specifically, we formulate the following constraints to avoid oversimplifying the problems: (1) computationally bounded legitimate users, so that we do not rely solely on proofs showing the existence of codes with complexity exponential in the block length; (2) a rate-limited public communication channel for the secret-key generation model, to account for bandwidth constraints; (3) a non-uniform and rate-limited source of randomness at the encoder for the wire-tap channel model, since a perfectly uniform and rate-unlimited source of randomness may be an expensive resource. Our work focuses on developing schemes for secret-key generation and the wire-tap channel that satisfy subsets of the aforementioned constraints.
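    As standard background for the second model (a known result, not this work's contribution), the secrecy capacity of the wire-tap channel is given by the Csiszár-Körner formula:

```latex
% Secrecy capacity of a wire-tap channel with legitimate channel p(y|x)
% and eavesdropper channel p(z|x), with auxiliary variable V -- X -- (Y,Z):
\[
  C_s \;=\; \max_{p(v)\,p(x|v)} \bigl[\, I(V;Y) - I(V;Z) \,\bigr].
\]
% The classical random-binning proofs of this result are exactly the
% non-constructive, complexity-unconstrained codes that the constraints
% formulated above are meant to replace with practical schemes.
```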