1,013 research outputs found

    Approaches to multiprocessor error recovery using an on-chip interconnect subsystem

    Get PDF
    For future multicores, a dedicated interconnect subsystem for on-chip monitors was found to be highly beneficial in terms of scalability, performance and area. In this thesis, such a monitor network (MNoC) is used for multicores to support selective error identification and recovery and maintain target chip reliability in the context of dynamic voltage and frequency scaling (DVFS). A selective shared memory multiprocessor recovery is performed using MNoC in which, when an error is detected, only the group of processors sharing an application with the affected processors are recovered. Although the use of DVFS in contemporary multicores provides significant protection from unpredictable thermal events, a potential side effect can be an increased processor exposure to soft errors. To address this issue, a flexible fault prevention and recovery mechanism has been developed to selectively enable a small amount of per-core dual modular redundancy (DMR) in response to increased vulnerability, as measured by the processor architectural vulnerability factor (AVF). Our new algorithm for DMR deployment aims to provide a stable effective soft error rate (SER) by using DMR in response to DVFS caused by thermal events. The algorithm is implemented in real-time on the multicore using MNoC and controller which evaluates thermal information and multicore performance statistics in addition to error information. DVFS experiments with a multicore simulator using standard benchmarks show an average 6% improvement in overall power consumption and a stable SER by using selective DMR versus continuous DMR deployment

    Parallel and Distributed Computing

    Get PDF
    The 14 chapters presented in this book cover a wide variety of representative works ranging from hardware design to application development. Particularly, the topics that are addressed are programmable and reconfigurable devices and systems, dependability of GPUs (General Purpose Units), network topologies, cache coherence protocols, resource allocation, scheduling algorithms, peertopeer networks, largescale network simulation, and parallel routines and algorithms. In this way, the articles included in this book constitute an excellent reference for engineers and researchers who have particular interests in each of these topics in parallel and distributed computing

    Embedded dynamic programming networks for networks-on-chip

    Get PDF
    PhD ThesisRelentless technology downscaling and recent technological advancements in three dimensional integrated circuit (3D-IC) provide a promising prospect to realize heterogeneous system-on-chip (SoC) and homogeneous chip multiprocessor (CMP) based on the networks-onchip (NoCs) paradigm with augmented scalability, modularity and performance. In many cases in such systems, scheduling and managing communication resources are the major design and implementation challenges instead of the computing resources. Past research efforts were mainly focused on complex design-time or simple heuristic run-time approaches to deal with the on-chip network resource management with only local or partial information about the network. This could yield poor communication resource utilizations and amortize the benefits of the emerging technologies and design methods. Thus, the provision for efficient run-time resource management in large-scale on-chip systems becomes critical. This thesis proposes a design methodology for a novel run-time resource management infrastructure that can be realized efficiently using a distributed architecture, which closely couples with the distributed NoC infrastructure. The proposed infrastructure exploits the global information and status of the network to optimize and manage the on-chip communication resources at run-time. There are four major contributions in this thesis. First, it presents a novel deadlock detection method that utilizes run-time transitive closure (TC) computation to discover the existence of deadlock-equivalence sets, which imply loops of requests in NoCs. This detection scheme, TC-network, guarantees the discovery of all true-deadlocks without false alarms in contrast to state-of-the-art approximation and heuristic approaches. Second, it investigates the advantages of implementing future on-chip systems using three dimensional (3D) integration and presents the design, fabrication and testing results of a TC-network implemented in a fully stacked three-layer 3D architecture using a through-silicon via (TSV) complementary metal-oxide semiconductor (CMOS) technology. Testing results demonstrate the effectiveness of such a TC-network for deadlock detection with minimal computational delay in a large-scale network. Third, it introduces an adaptive strategy to effectively diffuse heat throughout the three dimensional network-on-chip (3D-NoC) geometry. This strategy employs a dynamic programming technique to select and optimize the direction of data manoeuvre in NoC. It leads to a tool, which is based on the accurate HotSpot thermal model and SystemC cycle accurate model, to simulate the thermal system and evaluate the proposed approach. Fourth, it presents a new dynamic programming-based run-time thermal management (DPRTM) system, including reactive and proactive schemes, to effectively diffuse heat throughout NoC-based CMPs by routing packets through the coolest paths, when the temperature does not exceed chip’s thermal limit. When the thermal limit is exceeded, throttling is employed to mitigate heat in the chip and DPRTM changes its course to avoid throttled paths and to minimize the impact of throttling on chip performance. This thesis enables a new avenue to explore a novel run-time resource management infrastructure for NoCs, in which new methodologies and concepts are proposed to enhance the on-chip networks for future large-scale 3D integration.Iraqi Ministry of Higher Education and Scientific Research (MOHESR)

    Center for Space Microelectronics Technology 1988-1989 technical report

    Get PDF
    The 1988 to 1989 Technical Report of the JPL Center for Space Microelectronics Technology summarizes the technical accomplishments, publications, presentations, and patents of the center. Listed are 321 publications, 282 presentations, and 140 new technology reports and patents

    A Comprehensive Study of the Hardware Trojan and Side-Channel Attacks in Three-Dimensional (3D) Integrated Circuits (ICs)

    Get PDF
    Three-dimensional (3D) integration is emerging as promising techniques for high-performance and low-power integrated circuit (IC, a.k.a. chip) design. As 3D chips require more manufacturing phases than conventional planar ICs, more fabrication foundries are involved in the supply chain of 3D ICs. Due to the globalized semiconductor business model, the extended IC supply chain could incur more security challenges on maintaining the integrity, confidentiality, and reliability of integrated circuits and systems. In this work, we analyze the potential security threats induced by the integration techniques for 3D ICs and propose effective attack detection and mitigation methods. More specifically, we first propose a comprehensive characterization for 3D hardware Trojans in the 3D stacking structure. Practical experiment based quantitative analyses have been performed to assess the impact of 3D Trojans on computing systems. Our analysis shows that advanced attackers could exploit the limitation of the most recent 3D IC testing standard IEEE Standard 1838 to bypass the tier-level testing and successfully implement a powerful TSV-Trojan in 3D chips. We propose an enhancement for IEEE Standard 1838 to facilitate the Trojan detection on two neighboring tiers simultaneously. Next, we develop two 3D Trojan detection methods. The proposed frequency-based Trojan-activity identification (FTAI) method can differentiate the frequency changes induced by Trojans from those caused by process variation noise, outperforming the existing time-domain Trojan detection approaches by 38% in Trojan detection rate. Our invariance checking based Trojan detection method leverages the invariance among the 3D communication infrastructure, 3D network-on-chips (NoCs), to tackle the cross-tier 3D hardware Trojans, achieving a Trojan detection rate of over 94%. Furthermore, this work investigates another type of common security threat, side-channel attacks. We first propose to group the supply voltages of different 3D tiers temporally to drive the crypto unit implemented in 3D ICs such that the noise in power distribution network (PDN) can be induced to obfuscate the original power traces and thus mitigates correlation power analysis (CPA) attacks. Furthermore, we study the side-channel attack on the logic locking mechanism in monolithic 3D ICs and propose a logic-cone conjunction (LCC) method and a configuration guideline for the transistor-level logic locking to strengthen its resilience against CPA attacks

    Product assurance technology for procuring reliable, radiation-hard, custom LSI/VLSI electronics

    Get PDF
    Advanced measurement methods using microelectronic test chips are described. These chips are intended to be used in acquiring the data needed to qualify Application Specific Integrated Circuits (ASIC's) for space use. Efforts were focused on developing the technology for obtaining custom IC's from CMOS/bulk silicon foundries. A series of test chips were developed: a parametric test strip, a fault chip, a set of reliability chips, and the CRRES (Combined Release and Radiation Effects Satellite) chip, a test circuit for monitoring space radiation effects. The technical accomplishments of the effort include: (1) development of a fault chip that contains a set of test structures used to evaluate the density of various process-induced defects; (2) development of new test structures and testing techniques for measuring gate-oxide capacitance, gate-overlap capacitance, and propagation delay; (3) development of a set of reliability chips that are used to evaluate failure mechanisms in CMOS/bulk: interconnect and contact electromigration and time-dependent dielectric breakdown; (4) development of MOSFET parameter extraction procedures for evaluating subthreshold characteristics; (5) evaluation of test chips and test strips on the second CRRES wafer run; (6) two dedicated fabrication runs for the CRRES chip flight parts; and (7) publication of two papers: one on the split-cross bridge resistor and another on asymmetrical SRAM (static random access memory) cells for single-event upset analysis

    3D heterogeneous sensor system on a chip for defense and security applications

    Full text link

    GraphR: Accelerating Graph Processing Using ReRAM

    Full text link
    This paper presents GRAPHR, the first ReRAM-based graph processing accelerator. GRAPHR follows the principle of near-data processing and explores the opportunity of performing massive parallel analog operations with low hardware and energy cost. The analog computation is suit- able for graph processing because: 1) The algorithms are iterative and could inherently tolerate the imprecision; 2) Both probability calculation (e.g., PageRank and Collaborative Filtering) and typical graph algorithms involving integers (e.g., BFS/SSSP) are resilient to errors. The key insight of GRAPHR is that if a vertex program of a graph algorithm can be expressed in sparse matrix vector multiplication (SpMV), it can be efficiently performed by ReRAM crossbar. We show that this assumption is generally true for a large set of graph algorithms. GRAPHR is a novel accelerator architecture consisting of two components: memory ReRAM and graph engine (GE). The core graph computations are performed in sparse matrix format in GEs (ReRAM crossbars). The vector/matrix-based graph computation is not new, but ReRAM offers the unique opportunity to realize the massive parallelism with unprecedented energy efficiency and low hardware cost. With small subgraphs processed by GEs, the gain of performing parallel operations overshadows the wastes due to sparsity. The experiment results show that GRAPHR achieves a 16.01x (up to 132.67x) speedup and a 33.82x energy saving on geometric mean compared to a CPU baseline system. Com- pared to GPU, GRAPHR achieves 1.69x to 2.19x speedup and consumes 4.77x to 8.91x less energy. GRAPHR gains a speedup of 1.16x to 4.12x, and is 3.67x to 10.96x more energy efficiency compared to PIM-based architecture.Comment: Accepted to HPCA 201
    • …
    corecore