1,013 research outputs found
Approaches to multiprocessor error recovery using an on-chip interconnect subsystem
For future multicores, a dedicated interconnect subsystem for on-chip monitors was found to be highly beneficial in terms of scalability, performance and area. In this thesis, such a monitor network (MNoC) is used for multicores to support selective error identification and recovery and maintain target chip reliability in the context of dynamic voltage and frequency scaling (DVFS). A selective shared memory multiprocessor recovery is performed using MNoC in which, when an error is detected, only the group of processors sharing an application with the affected processors are recovered. Although the use of DVFS in contemporary multicores provides significant protection from unpredictable thermal events, a potential side effect can be an increased processor exposure to soft errors. To address this issue, a flexible fault prevention and recovery mechanism has been developed to selectively enable a small amount of per-core dual modular redundancy (DMR) in response to increased vulnerability, as measured by the processor architectural vulnerability factor (AVF). Our new algorithm for DMR deployment aims to provide a stable effective soft error rate (SER) by using DMR in response to DVFS caused by thermal events. The algorithm is implemented in real-time on the multicore using MNoC and controller which evaluates thermal information and multicore performance statistics in addition to error information. DVFS experiments with a multicore simulator using standard benchmarks show an average 6% improvement in overall power consumption and a stable SER by using selective DMR versus continuous DMR deployment
Parallel and Distributed Computing
The 14 chapters presented in this book cover a wide variety of representative works ranging from hardware design to application development. Particularly, the topics that are addressed are programmable and reconfigurable devices and systems, dependability of GPUs (General Purpose Units), network topologies, cache coherence protocols, resource allocation, scheduling algorithms, peertopeer networks, largescale network simulation, and parallel routines and algorithms. In this way, the articles included in this book constitute an excellent reference for engineers and researchers who have particular interests in each of these topics in parallel and distributed computing
Embedded dynamic programming networks for networks-on-chip
PhD ThesisRelentless technology downscaling and recent technological advancements
in three dimensional integrated circuit (3D-IC) provide a promising
prospect to realize heterogeneous system-on-chip (SoC) and homogeneous
chip multiprocessor (CMP) based on the networks-onchip
(NoCs) paradigm with augmented scalability, modularity and
performance. In many cases in such systems, scheduling and managing
communication resources are the major design and implementation
challenges instead of the computing resources. Past research
efforts were mainly focused on complex design-time or simple heuristic
run-time approaches to deal with the on-chip network resource
management with only local or partial information about the network.
This could yield poor communication resource utilizations and amortize
the benefits of the emerging technologies and design methods.
Thus, the provision for efficient run-time resource management in
large-scale on-chip systems becomes critical. This thesis proposes a
design methodology for a novel run-time resource management infrastructure
that can be realized efficiently using a distributed architecture,
which closely couples with the distributed NoC infrastructure. The
proposed infrastructure exploits the global information and status
of the network to optimize and manage the on-chip communication
resources at run-time.
There are four major contributions in this thesis. First, it presents a
novel deadlock detection method that utilizes run-time transitive closure
(TC) computation to discover the existence of deadlock-equivalence
sets, which imply loops of requests in NoCs. This detection scheme,
TC-network, guarantees the discovery of all true-deadlocks without
false alarms in contrast to state-of-the-art approximation and heuristic
approaches. Second, it investigates the advantages of implementing
future on-chip systems using three dimensional (3D) integration and
presents the design, fabrication and testing results of a TC-network
implemented in a fully stacked three-layer 3D architecture using a
through-silicon via (TSV) complementary metal-oxide semiconductor
(CMOS) technology. Testing results demonstrate the effectiveness
of such a TC-network for deadlock detection with minimal computational
delay in a large-scale network. Third, it introduces an adaptive
strategy to effectively diffuse heat throughout the three dimensional
network-on-chip (3D-NoC) geometry. This strategy employs a dynamic
programming technique to select and optimize the direction of data
manoeuvre in NoC. It leads to a tool, which is based on the accurate
HotSpot thermal model and SystemC cycle accurate model, to simulate
the thermal system and evaluate the proposed approach. Fourth, it
presents a new dynamic programming-based run-time thermal management
(DPRTM) system, including reactive and proactive schemes, to
effectively diffuse heat throughout NoC-based CMPs by routing packets
through the coolest paths, when the temperature does not exceed
chip’s thermal limit. When the thermal limit is exceeded, throttling is
employed to mitigate heat in the chip and DPRTM changes its course
to avoid throttled paths and to minimize the impact of throttling on
chip performance.
This thesis enables a new avenue to explore a novel run-time resource
management infrastructure for NoCs, in which new methodologies
and concepts are proposed to enhance the on-chip networks for
future large-scale 3D integration.Iraqi Ministry of Higher Education and Scientific Research (MOHESR)
Center for Space Microelectronics Technology 1988-1989 technical report
The 1988 to 1989 Technical Report of the JPL Center for Space Microelectronics Technology summarizes the technical accomplishments, publications, presentations, and patents of the center. Listed are 321 publications, 282 presentations, and 140 new technology reports and patents
A Comprehensive Study of the Hardware Trojan and Side-Channel Attacks in Three-Dimensional (3D) Integrated Circuits (ICs)
Three-dimensional (3D) integration is emerging as promising techniques for high-performance and low-power integrated circuit (IC, a.k.a. chip) design. As 3D chips require more manufacturing phases than conventional planar ICs, more fabrication foundries are involved in the supply chain of 3D ICs. Due to the globalized semiconductor business model, the extended IC supply chain could incur more security challenges on maintaining the integrity, confidentiality, and reliability of integrated circuits and systems. In this work, we analyze the potential security threats induced by the integration techniques for 3D ICs and propose effective attack detection and mitigation methods. More specifically, we first propose a comprehensive characterization for 3D hardware Trojans in the 3D stacking structure. Practical experiment based quantitative analyses have been performed to assess the impact of 3D Trojans on computing systems. Our analysis shows that advanced attackers could exploit the limitation of the most recent 3D IC testing standard IEEE Standard 1838 to bypass the tier-level testing and successfully implement a powerful TSV-Trojan in 3D chips. We propose an enhancement for IEEE Standard 1838 to facilitate the Trojan detection on two neighboring tiers simultaneously. Next, we develop two 3D Trojan detection methods. The proposed frequency-based Trojan-activity identification (FTAI) method can differentiate the frequency changes induced by Trojans from those caused by process variation noise, outperforming the existing time-domain Trojan detection approaches by 38% in Trojan detection rate. Our invariance checking based Trojan detection method leverages the invariance among the 3D communication infrastructure, 3D network-on-chips (NoCs), to tackle the cross-tier 3D hardware Trojans, achieving a Trojan detection rate of over 94%. Furthermore, this work investigates another type of common security threat, side-channel attacks. We first propose to group the supply voltages of different 3D tiers temporally to drive the crypto unit implemented in 3D ICs such that the noise in power distribution network (PDN) can be induced to obfuscate the original power traces and thus mitigates correlation power analysis (CPA) attacks. Furthermore, we study the side-channel attack on the logic locking mechanism in monolithic 3D ICs and propose a logic-cone conjunction (LCC) method and a configuration guideline for the transistor-level logic locking to strengthen its resilience against CPA attacks
Product assurance technology for procuring reliable, radiation-hard, custom LSI/VLSI electronics
Advanced measurement methods using microelectronic test chips are described. These chips are intended to be used in acquiring the data needed to qualify Application Specific Integrated Circuits (ASIC's) for space use. Efforts were focused on developing the technology for obtaining custom IC's from CMOS/bulk silicon foundries. A series of test chips were developed: a parametric test strip, a fault chip, a set of reliability chips, and the CRRES (Combined Release and Radiation Effects Satellite) chip, a test circuit for monitoring space radiation effects. The technical accomplishments of the effort include: (1) development of a fault chip that contains a set of test structures used to evaluate the density of various process-induced defects; (2) development of new test structures and testing techniques for measuring gate-oxide capacitance, gate-overlap capacitance, and propagation delay; (3) development of a set of reliability chips that are used to evaluate failure mechanisms in CMOS/bulk: interconnect and contact electromigration and time-dependent dielectric breakdown; (4) development of MOSFET parameter extraction procedures for evaluating subthreshold characteristics; (5) evaluation of test chips and test strips on the second CRRES wafer run; (6) two dedicated fabrication runs for the CRRES chip flight parts; and (7) publication of two papers: one on the split-cross bridge resistor and another on asymmetrical SRAM (static random access memory) cells for single-event upset analysis
GraphR: Accelerating Graph Processing Using ReRAM
This paper presents GRAPHR, the first ReRAM-based graph processing
accelerator. GRAPHR follows the principle of near-data processing and explores
the opportunity of performing massive parallel analog operations with low
hardware and energy cost. The analog computation is suit- able for graph
processing because: 1) The algorithms are iterative and could inherently
tolerate the imprecision; 2) Both probability calculation (e.g., PageRank and
Collaborative Filtering) and typical graph algorithms involving integers (e.g.,
BFS/SSSP) are resilient to errors. The key insight of GRAPHR is that if a
vertex program of a graph algorithm can be expressed in sparse matrix vector
multiplication (SpMV), it can be efficiently performed by ReRAM crossbar. We
show that this assumption is generally true for a large set of graph
algorithms. GRAPHR is a novel accelerator architecture consisting of two
components: memory ReRAM and graph engine (GE). The core graph computations are
performed in sparse matrix format in GEs (ReRAM crossbars). The
vector/matrix-based graph computation is not new, but ReRAM offers the unique
opportunity to realize the massive parallelism with unprecedented energy
efficiency and low hardware cost. With small subgraphs processed by GEs, the
gain of performing parallel operations overshadows the wastes due to sparsity.
The experiment results show that GRAPHR achieves a 16.01x (up to 132.67x)
speedup and a 33.82x energy saving on geometric mean compared to a CPU baseline
system. Com- pared to GPU, GRAPHR achieves 1.69x to 2.19x speedup and consumes
4.77x to 8.91x less energy. GRAPHR gains a speedup of 1.16x to 4.12x, and is
3.67x to 10.96x more energy efficiency compared to PIM-based architecture.Comment: Accepted to HPCA 201
- …