22,534 research outputs found

    System analysis and integration studies for a 15-micron horizon radiance measurement experiment

    Get PDF
    Systems analysis and integration studies for 15-micron horizon radiance measurement experimen

    On-Line Dependability Enhancement of Multiprocessor SoCs by Resource Management

    Get PDF
    This paper describes a new approach towards dependable design of homogeneous multi-processor SoCs in an example satellite-navigation application. First, the NoC dependability is functionally verified via embedded software. Then the Xentium processor tiles are periodically verified via on-line self-testing techniques, by using a new IIP Dependability Manager. Based on the Dependability Manager results, faulty tiles are electronically excluded and replaced by fault-free spare tiles via on-line resource management. This integrated approach enables fast electronic fault detection/diagnosis and repair, and hence a high system availability. The dependability application runs in parallel with the actual application, resulting in a very dependable system. All parts have been verified by simulation

    From FPGA to ASIC: A RISC-V processor experience

    Get PDF
    This work document a correct design flow using these tools in the Lagarto RISC- V Processor and the RTL design considerations that must be taken into account, to move from a design for FPGA to design for ASIC

    The ReaxFF reactive force-field : development, applications and future directions

    Get PDF
    The reactive force-field (ReaxFF) interatomic potential is a powerful computational tool for exploring, developing and optimizing material properties. Methods based on the principles of quantum mechanics (QM), while offering valuable theoretical guidance at the electronic level, are often too computationally intense for simulations that consider the full dynamic evolution of a system. Alternatively, empirical interatomic potentials that are based on classical principles require significantly fewer computational resources, which enables simulations to better describe dynamic processes over longer timeframes and on larger scales. Such methods, however, typically require a predefined connectivity between atoms, precluding simulations that involve reactive events. The ReaxFF method was developed to help bridge this gap. Approaching the gap from the classical side, ReaxFF casts the empirical interatomic potential within a bond-order formalism, thus implicitly describing chemical bonding without expensive QM calculations. This article provides an overview of the development, application, and future directions of the ReaxFF method

    A Test Vector Minimization Algorithm Based On Delta Debugging For Post-Silicon Validation Of Pcie Rootport

    Get PDF
    In silicon hardware design, such as designing PCIe devices, design verification is an essential part of the design process, whereby the devices are subjected to a series of tests that verify the functionality. However, manual debugging is still widely used in post-silicon validation and is a major bottleneck in the validation process. The reason is a large number of tests vectors have to be analyzed, and this slows process down. To solve the problem, a test vector minimizer algorithm is proposed to eliminate redundant test vectors that do not contribute to reproduction of a test failure, hence, improving the debug throughput. The proposed methodology is inspired by the Delta Debugging algorithm which is has been used in automated software debugging but not in post-silicon hardware debugging. The minimizer operates on the principle of binary partitioning of the test vectors, and iteratively testing each subset (or complement of set) on a post-silicon System-Under-Test (SUT), to identify and eliminate redundant test vectors. Test results using test vector sets containing deliberately introduced erroneous test vectors show that the minimizer is able to isolate the erroneous test vectors. In test cases containing up to 10,000 test vectors, the minimizer requires about 16ns per test vector in the test case when only one erroneous test vector is present. In a test case with 1000 vectors including erroneous vectors, the same minimizer requires about 140μs per erroneous test vector that is injected. Thus, the minimizer’s CPU consumption is significantly smaller than the typical amount of time of a test running on SUT. The factors that significantly impact the performance of the algorithm are number of erroneous test vectors and distribution (spacing) of the erroneous vectors. The effect of total number of test vectors and position of the erroneous vectors are relatively minor compared to the other two. The minimization algorithm therefore was most effective for cases where there are only a few erroneous test vectors, with large number of test vectors in the set

    DeSyRe: on-Demand System Reliability

    No full text
    The DeSyRe project builds on-demand adaptive and reliable Systems-on-Chips (SoCs). As fabrication technology scales down, chips are becoming less reliable, thereby incurring increased power and performance costs for fault tolerance. To make matters worse, power density is becoming a significant limiting factor in SoC design, in general. In the face of such changes in the technological landscape, current solutions for fault tolerance are expected to introduce excessive overheads in future systems. Moreover, attempting to design and manufacture a totally defect and fault-free system, would impact heavily, even prohibitively, the design, manufacturing, and testing costs, as well as the system performance and power consumption. In this context, DeSyRe delivers a new generation of systems that are reliable by design at well-balanced power, performance, and design costs. In our attempt to reduce the overheads of fault-tolerance, only a small fraction of the chip is built to be fault-free. This fault-free part is then employed to manage the remaining fault-prone resources of the SoC. The DeSyRe framework is applied to two medical systems with high safety requirements (measured using the IEC 61508 functional safety standard) and tight power and performance constraints

    The STAR MAPS-based PiXeL detector

    Get PDF
    The PiXeL detector (PXL) for the Heavy Flavor Tracker (HFT) of the STAR experiment at RHIC is the first application of the state-of-the-art thin Monolithic Active Pixel Sensors (MAPS) technology in a collider environment. Custom built pixel sensors, their readout electronics and the detector mechanical structure are described in detail. Selected detector design aspects and production steps are presented. The detector operations during the three years of data taking (2014-2016) and the overall performance exceeding the design specifications are discussed in the conclusive sections of this paper

    Decompose and Conquer: Addressing Evasive Errors in Systems on Chip

    Full text link
    Modern computer chips comprise many components, including microprocessor cores, memory modules, on-chip networks, and accelerators. Such system-on-chip (SoC) designs are deployed in a variety of computing devices: from internet-of-things, to smartphones, to personal computers, to data centers. In this dissertation, we discuss evasive errors in SoC designs and how these errors can be addressed efficiently. In particular, we focus on two types of errors: design bugs and permanent faults. Design bugs originate from the limited amount of time allowed for design verification and validation. Thus, they are often found in functional features that are rarely activated. Complete functional verification, which can eliminate design bugs, is extremely time-consuming, thus impractical in modern complex SoC designs. Permanent faults are caused by failures of fragile transistors in nano-scale semiconductor manufacturing processes. Indeed, weak transistors may wear out unexpectedly within the lifespan of the design. Hardware structures that reduce the occurrence of permanent faults incur significant silicon area or performance overheads, thus they are infeasible for most cost-sensitive SoC designs. To tackle and overcome these evasive errors efficiently, we propose to leverage the principle of decomposition to lower the complexity of the software analysis or the hardware structures involved. To this end, we present several decomposition techniques, specific to major SoC components. We first focus on microprocessor cores, by presenting a lightweight bug-masking analysis that decomposes a program into individual instructions to identify if a design bug would be masked by the program's execution. We then move to memory subsystems: there, we offer an efficient memory consistency testing framework to detect buggy memory-ordering behaviors, which decomposes the memory-ordering graph into small components based on incremental differences. We also propose a microarchitectural patching solution for memory subsystem bugs, which augments each core node with a small distributed programmable logic, instead of including a global patching module. In the context of on-chip networks, we propose two routing reconfiguration algorithms that bypass faulty network resources. The first computes short-term routes in a distributed fashion, localized to the fault region. The second decomposes application-aware routing computation into simple routing rules so to quickly find deadlock-free, application-optimized routes in a fault-ridden network. Finally, we consider general accelerator modules in SoC designs. When a system includes many accelerators, there are a variety of interactions among them that must be verified to catch buggy interactions. To this end, we decompose such inter-module communication into basic interaction elements, which can be reassembled into new, interesting tests. Overall, we show that the decomposition of complex software algorithms and hardware structures can significantly reduce overheads: up to three orders of magnitude in the bug-masking analysis and the application-aware routing, approximately 50 times in the routing reconfiguration latency, and 5 times on average in the memory-ordering graph checking. These overhead reductions come with losses in error coverage: 23% undetected bug-masking incidents, 39% non-patchable memory bugs, and occasionally we overlook rare patterns of multiple faults. In this dissertation, we discuss the ideas and their trade-offs, and present future research directions.PHDComputer Science & EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttps://deepblue.lib.umich.edu/bitstream/2027.42/147637/1/doowon_1.pd

    Avoiding core's DUE & SDC via acoustic wave detectors and tailored error containment and recovery

    Get PDF
    The trend of downsizing transistors and operating voltage scaling has made the processor chip more sensitive against radiation phenomena making soft errors an important challenge. New reliability techniques for handling soft errors in the logic and memories that allow meeting the desired failures-in-time (FIT) target are key to keep harnessing the benefits of Moore's law. The failure to scale the soft error rate caused by particle strikes, may soon limit the total number of cores that one may have running at the same time. This paper proposes a light-weight and scalable architecture to eliminate silent data corruption errors (SDC) and detected unrecoverable errors (DUE) of a core. The architecture uses acoustic wave detectors for error detection. We propose to recover by confining the errors in the cache hierarchy, allowing us to deal with the relatively long detection latencies. Our results show that the proposed mechanism protects the whole core (logic, latches and memory arrays) incurring performance overhead as low as 0.60%. © 2014 IEEE.Peer ReviewedPostprint (author's final draft
    corecore