284 research outputs found
Using embedded hardware monitor cores in critical computer systems
The integration of FPGA devices in many different architectures and services
makes monitoring and real time detection of errors an important concern in FPGA
system design. A monitor is a tool, or a set of tools, that facilitate analytic
measurements in observing a given system. The goal of these observations is
usually the performance analysis and optimisation, or the surveillance of the system.
However, System-on-Chip (SoC) based designs leave few points to attach external
tools such as logic analysers. Thus, an embedded error detection core that allows
observation of critical system nodes (such as processor cores and buses) should
enforce the operation of the FPGA-based system, in order to prevent system
failures. The core should not interfere with system performance and must ensure
timely detection of errors.
This thesis is an investigation onto how a robust hardware-monitoring module
can be efficiently integrated in a target PCI board (with FPGA-based application processing
features) which is part of a critical computing system. [Continues.
Advanced information processing system: The Army fault tolerant architecture conceptual study. Volume 2: Army fault tolerant architecture design and analysis
Described here is the Army Fault Tolerant Architecture (AFTA) hardware architecture and components and the operating system. The architectural and operational theory of the AFTA Fault Tolerant Data Bus is discussed. The test and maintenance strategy developed for use in fielded AFTA installations is presented. An approach to be used in reducing the probability of AFTA failure due to common mode faults is described. Analytical models for AFTA performance, reliability, availability, life cycle cost, weight, power, and volume are developed. An approach is presented for using VHSIC Hardware Description Language (VHDL) to describe and design AFTA's developmental hardware. A plan is described for verifying and validating key AFTA concepts during the Dem/Val phase. Analytical models and partial mission requirements are used to generate AFTA configurations for the TF/TA/NOE and Ground Vehicle missions
Decompose and Conquer: Addressing Evasive Errors in Systems on Chip
Modern computer chips comprise many components, including microprocessor cores, memory modules, on-chip networks, and accelerators. Such system-on-chip (SoC) designs are deployed in a variety of computing devices: from internet-of-things, to smartphones, to personal computers, to data centers. In this dissertation, we discuss evasive errors in SoC designs and how these errors can be addressed efficiently. In particular, we focus on two types of errors: design bugs and permanent faults. Design bugs originate from the limited amount of time allowed for design verification and validation. Thus, they are often found in functional features that are rarely activated. Complete functional verification, which can eliminate design bugs, is extremely time-consuming, thus impractical in modern complex SoC designs. Permanent faults are caused by failures of fragile transistors in nano-scale semiconductor manufacturing processes. Indeed, weak transistors may wear out unexpectedly within the lifespan of the design. Hardware structures that reduce the occurrence of permanent faults incur significant silicon area or performance overheads, thus they are infeasible for most cost-sensitive SoC designs.
To tackle and overcome these evasive errors efficiently, we propose to leverage the principle of decomposition to lower the complexity of the software analysis or the hardware structures involved. To this end, we present several decomposition techniques, specific to major SoC components. We first focus on microprocessor cores, by presenting a lightweight bug-masking analysis that decomposes a program into individual instructions to identify if a design bug would be masked by the program's execution. We then move to memory subsystems: there, we offer an efficient memory consistency testing framework to detect buggy memory-ordering behaviors, which decomposes the memory-ordering graph into small components based on incremental differences. We also propose a microarchitectural patching solution for memory subsystem bugs, which augments each core node with a small distributed programmable logic, instead of including a global patching module. In the context of on-chip networks, we propose two routing reconfiguration algorithms that bypass faulty network resources. The first computes short-term routes in a distributed fashion, localized to the fault region. The second decomposes application-aware routing computation into simple routing rules so to quickly find deadlock-free, application-optimized routes in a fault-ridden network. Finally, we consider general accelerator modules in SoC designs. When a system includes many accelerators, there are a variety of interactions among them that must be verified to catch buggy interactions. To this end, we decompose such inter-module communication into basic interaction elements, which can be reassembled into new, interesting tests.
Overall, we show that the decomposition of complex software algorithms and hardware structures can significantly reduce overheads: up to three orders of magnitude in the bug-masking analysis and the application-aware routing, approximately 50 times in the routing reconfiguration latency, and 5 times on average in the memory-ordering graph checking. These overhead reductions come with losses in error coverage: 23% undetected bug-masking incidents, 39% non-patchable memory bugs, and occasionally we overlook rare patterns of multiple faults. In this dissertation, we discuss the ideas and their trade-offs, and present future research directions.PHDComputer Science & EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttps://deepblue.lib.umich.edu/bitstream/2027.42/147637/1/doowon_1.pd
The 1991 3rd NASA Symposium on VLSI Design
Papers from the symposium are presented from the following sessions: (1) featured presentations 1; (2) very large scale integration (VLSI) circuit design; (3) VLSI architecture 1; (4) featured presentations 2; (5) neural networks; (6) VLSI architectures 2; (7) featured presentations 3; (8) verification 1; (9) analog design; (10) verification 2; (11) design innovations 1; (12) asynchronous design; and (13) design innovations 2
VLSI design of configurable low-power coarse-grained array architecture
Biomedical signal acquisition from in- or on-body sensors often requires local (on-node) low-level pre-processing before the data are sent to a remote node for aggregation and further processing. Local processing is required for many different operations, which include signal cleanup (noise removal), sensor calibration, event detection and data compression. In this environment, processing is subject to aggressive energy consumption restrictions, while often operating under real-time requirements. These conflicting requirements impose the use of dedicated circuits addressing a very specific task or the use of domain-specific customization to obtain significant gains in power efficiency. However, economic and time-to-market constraints often make the development or use of application-specific platforms very risky.One way to address these challenges is to develop a sensor node with a general-purpose architecture combining a low-power, low-performance general microprocessor or micro-controller with a coarse-grained reconfigurable array (CGRA) acting as an accelerator. A CGRA consists of a fixed number of processing units (e.g., ALUs) whose function and interconnections are determined by some configuration data.The objective of this work is to create an RTL-level description of a low-power CGRA of ALUs and produce a low-power VLSI (standard cell) implementation, that supports power-saving features.The CGRA implementation should use as few resources as possible and fully exploit the intended operation environment. The design will be evaluated with a set of simple signal processing task
Driving the Network-on-Chip Revolution to Remove the Interconnect Bottleneck in Nanoscale Multi-Processor Systems-on-Chip
The sustained demand for faster, more powerful chips has been met by the
availability of chip manufacturing processes allowing for the integration of increasing
numbers of computation units onto a single die. The resulting outcome,
especially in the embedded domain, has often been called SYSTEM-ON-CHIP
(SoC) or MULTI-PROCESSOR SYSTEM-ON-CHIP (MP-SoC).
MPSoC design brings to the foreground a large number of challenges, one of
the most prominent of which is the design of the chip interconnection. With a
number of on-chip blocks presently ranging in the tens, and quickly approaching
the hundreds, the novel issue of how to best provide on-chip communication
resources is clearly felt.
NETWORKS-ON-CHIPS (NoCs) are the most comprehensive and scalable
answer to this design concern. By bringing large-scale networking concepts to
the on-chip domain, they guarantee a structured answer to present and future
communication requirements. The point-to-point connection and packet switching
paradigms they involve are also of great help in minimizing wiring overhead
and physical routing issues. However, as with any technology of recent inception,
NoC design is still an evolving discipline. Several main areas of interest
require deep investigation for NoCs to become viable solutions:
• The design of the NoC architecture needs to strike the best tradeoff among
performance, features and the tight area and power constraints of the onchip
domain.
• Simulation and verification infrastructure must be put in place to explore,
validate and optimize the NoC performance.
• NoCs offer a huge design space, thanks to their extreme customizability in
terms of topology and architectural parameters. Design tools are needed
to prune this space and pick the best solutions.
• Even more so given their global, distributed nature, it is essential to evaluate
the physical implementation of NoCs to evaluate their suitability for
next-generation designs and their area and power costs.
This dissertation performs a design space exploration of network-on-chip architectures,
in order to point-out the trade-offs associated with the design of
each individual network building blocks and with the design of network topology
overall. The design space exploration is preceded by a comparative analysis
of state-of-the-art interconnect fabrics with themselves and with early networkon-
chip prototypes. The ultimate objective is to point out the key advantages
that NoC realizations provide with respect to state-of-the-art communication
infrastructures and to point out the challenges that lie ahead in order to make
this new interconnect technology come true. Among these latter, technologyrelated
challenges are emerging that call for dedicated design techniques at all
levels of the design hierarchy. In particular, leakage power dissipation, containment
of process variations and of their effects. The achievement of the above
objectives was enabled by means of a NoC simulation environment for cycleaccurate
modelling and simulation and by means of a back-end facility for the
study of NoC physical implementation effects. Overall, all the results provided
by this work have been validated on actual silicon layout
Automated and Reliable Low-Complexity SoC Design Methodology for EEG Artefacts Removal
EEG is a non-invasive tool for neurodevelopmental disorder diagnosis (NDD) and treatment. However,
EEG signal is mixed with other biological signals including Ocular and Muscular artefacts making
it difficult to extract the diagnostic features. Therefore, the contaminated EEG channels are often discarded
by the medical practitioners which may result in less accurate diagnosis. Independent Component
Analysis (ICA) and wavelet-based algorithms require reference electrodes, which will create discomfort
to the patient/children and cause hindrance to the diagnosis of the NDD and Brain Computer Interface
(BCI). Therefore, it would be ideal if these artefacts can be removed real time and on hardware platform
in an automated fashion and denoised EEG can be used for online diagnosis in a pervasive personalised
healthcare environment without the need of any reference electrode. In this thesis we propose a reliable,
robust and automated methodology to solve the aforementioned problem and its subsequent hardware
implementation results are also presented. 100 EEG data from Physionet, Klinik fur Epileptologie, Universitat
Bonn, Germany, Caltech EEG databases and 3 EEG data from 3 subjects from University of
Southampton, UK have been studied and nine exhaustive case studies comprising of real and simulated
data have been formulated and tested. The performance of the proposed methodology is measured in
terms of correlation, regression and R-square statistics and the respective values lie above 80%, 79% and
65% with the gain in hardware complexity of 64.28% and hardware delay 53.58% compared to state-ofthe
art approach. We believe the proposed methodology would be useful in next generation of pervasive
healthcare for BCI and NDD diagnosis and treatment
- …