32 research outputs found

    Low power low-density parity-checking (ldpc) codes decoder design using dynamic voltage and frequency scaling

    Get PDF
    This thesis presents a low-power LDPC decoder design based on speculative scheduling of energy necessary to decode dynamically varying data frame in both block-fading channels and general AWGN channels. A model of a memory-efficient low-power high-throughput multi-rate array LDPC decoder as well as its FPGA implementa- tion results is first presented. Then, I propose a decoding scheme that provides the feature of constant-time decoding and thus facilitates real-time applications where guaranteed data rate is required. It pre-analyzes each received data frame to estimate the maximum number of necessary iterations for frame convergence. The results are then used to dynamically adjust decoder frequency and switch between multiple-voltage levels; thereby energy use is minimized. This is in contrast to the conventional fixed-iteration decoding schemes that operate at a fixed voltage level regardless of the quality of data received. Analysis shows that the proposed decoding scheme is widely applicable for both two-phase message-passing (TPMP) decoding algorithm and turbo decoding message passing (TDMP) decoding algorithm in block fading channels, and it is independent of the specific LDPC decoder architecture. A decoder architecture utilizing our recently published multi-rate decoding architecture for general AWGN channels is also presented. The result of this thesis is a decoder design scheme that provides a judicious trade-off between power consumption and coding gain

    Low-Power Embedded Design Solutions and Low-Latency On-Chip Interconnect Architecture for System-On-Chip Design

    Get PDF
    This dissertation presents three design solutions to support several key system-on-chip (SoC) issues to achieve low-power and high performance. These are: 1) joint source and channel decoding (JSCD) schemes for low-power SoCs used in portable multimedia systems, 2) efficient on-chip interconnect architecture for massive multimedia data streaming on multiprocessor SoCs (MPSoCs), and 3) data processing architecture for low-power SoCs in distributed sensor network (DSS) systems and its implementation. The first part includes a low-power embedded low density parity check code (LDPC) - H.264 joint decoding architecture to lower the baseband energy consumption of a channel decoder using joint source decoding and dynamic voltage and frequency scaling (DVFS). A low-power multiple-input multiple-output (MIMO) and H.264 video joint detector/decoder design that minimizes energy for portable, wireless embedded systems is also designed. In the second part, a link-level quality of service (QoS) scheme using unequal error protection (UEP) for low-power network-on-chip (NoC) and low latency on-chip network designs for MPSoCs is proposed. This part contains WaveSync, a low-latency focused network-on-chip architecture for globally-asynchronous locally-synchronous (GALS) designs and a simultaneous dual-path routing (SDPR) scheme utilizing path diversity present in typical mesh topology network-on-chips. SDPR is akin to having a higher link width but without the significant hardware overhead associated with simple bus width scaling. The last part shows data processing unit designs for embedded SoCs. We propose a data processing and control logic design for a new radiation detection sensor system generating data at or above Peta-bits-per-second level. Implementation results show that the intended clock rate is achieved within the power target of less than 200mW. We also present a digital signal processing (DSP) accelerator supporting configurable MAC, FFT, FIR, and 3-D cross product operations for embedded SoCs. It consumes 12.35mW along with 0.167mm2 area at 333MHz

    Low-Power and Error-Resilient VLSI Circuits and Systems.

    Full text link
    Efficient low-power operation is critically important for the success of the next-generation signal processing applications. Device and supply voltage have been continuously scaled to meet a more constrained power envelope, but scaling has created resiliency challenges, including increasing timing faults and soft errors. Our research aims at designing low-power and robust circuits and systems for signal processing by drawing circuit, architecture, and algorithm approaches. To gain an insight into the system faults due to supply voltage reduction, we researched the two primary effects that determine the minimum supply voltage (VMIN) in Intel’s tri-gate CMOS technology, namely process variations and gate-dielectric soft breakdown. We determined that voltage scaling increases the timing window that sequential circuits are vulnerable. Thus, we proposed a new hold-time violation metric to define hold-time VMIN, which has been adopted as a new design standard. Device scaling increases soft errors which affect circuit reliability. Through extensive soft error characterization using two 65nm CMOS test chips, we studied the soft error mechanisms and its dependence on supply voltage and clock frequency. This study laid the foundation of the first 65nm DSP chip design for a NASA spaceflight project. To mitigate such random errors, we proposed a new confidence-driven architecture that effectively enhances the error resiliency of deeply scaled CMOS and post-CMOS circuits. Designing low-power resilient systems can effectively leverage application-specific algorithmic approaches. To explore design opportunities in the algorithmic domain, we demonstrate an application-specific detection and decoding processor for multiple-input multiple-output (MIMO) wireless communication. To enhance the receive error rate for a robust wireless communication, we designed a joint detection and decoding technique by enclosing detection and decoding in an iterative loop to enhance both interference cancellation and error reduction. A proof-of-concept chip design was fabricated for the next-generation 4x4 256QAM MIMO systems. Through algorithm-architecture optimizations and low-power circuit techniques, our design achieves significant improvements in throughput, energy efficiency and error rate, paving the way for future developments in this area.PhDElectrical EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/110323/1/uchchen_1.pd

    ERASER: Towards Adaptive Leakage Suppression for Fault-Tolerant Quantum Computing

    Full text link
    Quantum error correction (QEC) codes can tolerate hardware errors by encoding fault-tolerant logical qubits using redundant physical qubits and detecting errors using parity checks. Leakage errors occur in quantum systems when a qubit leaves its computational basis and enters higher energy states. These errors severely limit the performance of QEC due to two reasons. First, they lead to erroneous parity checks that obfuscate the accurate detection of errors. Second, the leakage spreads to other qubits and creates a pathway for more errors over time. Prior works tolerate leakage errors by using leakage reduction circuits (LRCs) that modify the parity check circuitry of QEC codes. Unfortunately, naively using LRCs always throughout a program is sub-optimal because LRCs incur additional two-qubit operations that (1) facilitate leakage transport, and (2) serve as new sources of errors. Ideally, LRCs should only be used if leakage occurs, so that errors from both leakage as well as additional LRC operations are simultaneously minimized. However, identifying leakage errors in real-time is challenging. To enable the robust and efficient usage of LRCs, we propose ERASER that speculates the subset of qubits that may have leaked and only uses LRCs for those qubits. Our studies show that the majority of leakage errors typically impact the parity checks. We leverage this insight to identify the leaked qubits by analyzing the patterns in the failed parity checks. We propose ERASER+M that enhances ERASER by detecting leakage more accurately using qubit measurement protocols that can classify qubits into 0,1|0\rangle, |1\rangle and L|L\rangle states. ERASER and ERASER+M improve the logical error rate by up to 4.3×4.3\times and 23×23\times respectively compared to always using LRC.Comment: Will appear in the International Symposium on Microarchitecture (MICRO) 2023 in Octobe

    Intertwined results on linear codes and Galois geometries

    Get PDF

    Fault-tolerant satellite computing with modern semiconductors

    Get PDF
    Miniaturized satellites enable a variety space missions which were in the past infeasible, impractical or uneconomical with traditionally-designed heavier spacecraft. Especially CubeSats can be launched and manufactured rapidly at low cost from commercial components, even in academic environments. However, due to their low reliability and brief lifetime, they are usually not considered suitable for life- and safety-critical services, complex multi-phased solar-system-exploration missions, and missions with a longer duration. Commercial electronics are key to satellite miniaturization, but also responsible for their low reliability: Until 2019, there existed no reliable or fault-tolerant computer architectures suitable for very small satellites. To overcome this deficit, a novel on-board-computer architecture is described in this thesis.Robustness is assured without resorting to radiation hardening, but through software measures implemented within a robust-by-design multiprocessor-system-on-chip. This fault-tolerant architecture is component-wise simple and can dynamically adapt to changing performance requirements throughout a mission. It can support graceful aging by exploiting FPGA-reconfiguration and mixed-criticality.  Experimentally, we achieve 1.94W power consumption at 300Mhz with a Xilinx Kintex Ultrascale+ proof-of-concept, which is well within the powerbudget range of current 2U CubeSats. To our knowledge, this is the first COTS-based, reproducible on-board-computer architecture that can offer strong fault coverage even for small CubeSats.European Space AgencyComputer Systems, Imagery and Medi

    Semantics-Based Cache-Side-Channel Quantification in Cryptographic Implementations

    Get PDF
    Performance has been and will continue to be a key criterion in the development of computer systems for a long time. To speed up Central Processing Units (CPUs), micro-architectural components like, e.g., caches and instruction pipelines have been developed. While caches are indispensable from a performance perspective, they also introduce a security risk. If the interaction of a software implementation with a cache differs depending on the data processed by the software, an attacker who observes this interaction can deduce information about the processed data. If the dependence is unintentional, it is called a cache side channel. Cache side channels have been exploited to recover entire secret keys from numerous cryptographic implementations. There are ways to mitigate the leakage of secret information like, e.g., crypto keys through cache side channels. However, such mitigations come at the cost of performance loss, because they cancel out the performance benefits of caching either selectively or completely. That is, there is a security-performance trade-off that is inherent in the mitigation of cache-side-channel leakage. This security-performance trade-off can only be navigated in an informed fashion if reliable quantitative information on the cache-side-channel security of an implementation is available. Quantitative security guarantees can be computed based on program analyses. However, the existing analyses either do not consider caches, do not provide quantitative guarantees across all side-channel output values, or are only applicable to a limited range of crypto implementations. In this thesis, we propose a suite of program analyses that can provide quantitative security guarantees in the form of reliable upper bounds on the cache-side-channel leakage of a variety of real-world cryptographic implementations. Technically, our program analyses are based on a combination of information theory and abstract interpretation. The distinguishing feature of each analysis is the underlying abstraction of the execution environment and program semantics. Our first program analysis is based on an abstraction that captures the state of a CPU with a regular Arithmetic Logic Unit (ALU) during the execution of x86 instructions. In particular, our abstraction captures two status flags that are used, e.g., during the execution of different AES implementations. Our analysis is capable of computing quantitative cache-side-channel security guarantees for off-the-shelf AES implementations from multiple popular libraries. In a comparative study, we clarify the security impact of design choices in these implementations. For instance, we find that the number and size of lookup tables used for just the last transformation round of AES already has a significant impact on the guarantees for the entire implementation. Our second program analysis is based on an abstraction that captures the execution of additional x86 instructions, including instructions that process larger operands. This abstraction can be used to quantify the leakage of crypto implementations that are based on large parameters. For instance, the lattice-based signature scheme ring-TESLA has a maximum key size of 49152 bit. With our analysis, we successfully computed leakage bounds for the implementation of ring-TESLA. These bounds lead to the detection of multiple vulnerabilities that might be exploited to break the entire signature scheme. As a result, mitigations were integrated into the implementations of ring-TESLA and qTESLA, before the latter was submitted to the NIST PQC standardization. Our third program analysis is based on an abstraction that captures the state of a CPU with an ALU and a Floating-Point Unit. It can be used to compute leakage bounds for crypto implementations that rely on floating-point instructions, e.g., to compute probabilities. The software used in Quantum Key Distribution (QKD), e.g., heavily relies on probabilities to perform error correction. With our analysis, we computed leakage bounds for a QKD implementation and detected a vulnerability that might leak the entire secret key. We proposed a mitigation and verified its effectiveness using our analysis. In the new version of the implementation, which is used at the TU Darmstadt Department of Physics, our mitigation is already integrated. Finally, we broaden the scope to side channels that arise from the combination of caching and instruction pipelining. Such side channels are exploited, e.g., by the Spectre-PHT attack. The fourth program analysis in our suite is, to our knowledge, the first ever program analysis that computes reliable quantitative security guarantees with respect to such side channels
    corecore