1,361 research outputs found

    RITSim: distributed systemC simulation

    Get PDF
    Parallel or distributed simulation is becoming more than a novel way to speedup design evaluation; it is becoming necessary for simulating modern processors in a reasonable timeframe. As architectural features become faster, smaller, and more complex, designers are interested in obtaining detailed and accurate performance and power estimations. Uniprocessor simulators may not be able to meet such demands. The RITSim project uses SystemC to model a processor microarchitecture and memory subsystem in great detail. SystemC is a C++ library built on a discrete-event simulation kernel. Many projects have successfully implemented parallel discrete-event simulation (PDES) frameworks to distribute simulation among several hosts. The field promises significant simulation speedup, possibly leading to faster turnaround time in design space exploration and commercial production. However, parallel implementation of such simulators is not an easy task. It requires modification of the simulation kernel for effective partitioning and synchronization. This thesis explores PDES techniques and presents a distributed version of the SystemC simulation environment. With minimal user interaction, SystemC models can executed on a cluster of workstations using a message-passing library such as the Message Passing Interface (MPI). The implementation is designed for transparency; distribution and synchronization happen with little intervention by the model author. Modification of SystemC is fashioned to promote maintainability with future releases. Furthermore, only freely available libraries are used for maximum flexibility and portability

    Automated Debugging Methodology for FPGA-based Systems

    Get PDF
    Electronic devices make up a vital part of our lives. These are seen from mobiles, laptops, computers, home automation, etc. to name a few. The modern designs constitute billions of transistors. However, with this evolution, ensuring that the devices fulfill the designer’s expectation under variable conditions has also become a great challenge. This requires a lot of design time and effort. Whenever an error is encountered, the process is re-started. Hence, it is desired to minimize the number of spins required to achieve an error-free product, as each spin results in loss of time and effort. Software-based simulation systems present the main technique to ensure the verification of the design before fabrication. However, few design errors (bugs) are likely to escape the simulation process. Such bugs subsequently appear during the post-silicon phase. Finding such bugs is time-consuming due to inherent invisibility of the hardware. Instead of software simulation of the design in the pre-silicon phase, post-silicon techniques permit the designers to verify the functionality through the physical implementations of the design. The main benefit of the methodology is that the implemented design in the post-silicon phase runs many order-of-magnitude faster than its counterpart in pre-silicon. This allows the designers to validate their design more exhaustively. This thesis presents five main contributions to enable a fast and automated debugging solution for reconfigurable hardware. During the research work, we used an obstacle avoidance system for robotic vehicles as a use case to illustrate how to apply the proposed debugging solution in practical environments. The first contribution presents a debugging system capable of providing a lossless trace of debugging data which permits a cycle-accurate replay. This methodology ensures capturing permanent as well as intermittent errors in the implemented design. The contribution also describes a solution to enhance hardware observability. It is proposed to utilize processor-configurable concentration networks, employ debug data compression to transmit the data more efficiently, and partially reconfiguring the debugging system at run-time to save the time required for design re-compilation as well as preserve the timing closure. The second contribution presents a solution for communication-centric designs. Furthermore, solutions for designs with multi-clock domains are also discussed. The third contribution presents a priority-based signal selection methodology to identify the signals which can be more helpful during the debugging process. A connectivity generation tool is also presented which can map the identified signals to the debugging system. The fourth contribution presents an automated error detection solution which can help in capturing the permanent as well as intermittent errors without continuous monitoring of debugging data. The proposed solution works for designs even in the absence of golden reference. The fifth contribution proposes to use artificial intelligence for post-silicon debugging. We presented a novel idea of using a recurrent neural network for debugging when a golden reference is present for training the network. Furthermore, the idea was also extended to designs where golden reference is not present

    Desynchronization: Synthesis of asynchronous circuits from synchronous specifications

    Get PDF
    Asynchronous implementation techniques, which measure logic delays at run time and activate registers accordingly, are inherently more robust than their synchronous counterparts, which estimate worst-case delays at design time, and constrain the clock cycle accordingly. De-synchronization is a new paradigm to automate the design of asynchronous circuits from synchronous specifications, thus permitting widespread adoption of asynchronicity, without requiring special design skills or tools. In this paper, we first of all study different protocols for de-synchronization and formally prove their correctness, using techniques originally developed for distributed deployment of synchronous language specifications. We also provide a taxonomy of existing protocols for asynchronous latch controllers, covering in particular the four-phase handshake protocols devised in the literature for micro-pipelines. We then propose a new controller which exhibits provably maximal concurrency, and analyze the performance of desynchronized circuits with respect to the original synchronous optimized implementation. We finally prove the feasibility and effectiveness of our approach, by showing its application to a set of real designs, including a complete implementation of the DLX microprocessor architectur

    Ono: an open platform for social robotics

    Get PDF
    In recent times, the focal point of research in robotics has shifted from industrial ro- bots toward robots that interact with humans in an intuitive and safe manner. This evolution has resulted in the subfield of social robotics, which pertains to robots that function in a human environment and that can communicate with humans in an int- uitive way, e.g. with facial expressions. Social robots have the potential to impact many different aspects of our lives, but one particularly promising application is the use of robots in therapy, such as the treatment of children with autism. Unfortunately, many of the existing social robots are neither suited for practical use in therapy nor for large scale studies, mainly because they are expensive, one-of-a-kind robots that are hard to modify to suit a specific need. We created Ono, a social robotics platform, to tackle these issues. Ono is composed entirely from off-the-shelf components and cheap materials, and can be built at a local FabLab at the fraction of the cost of other robots. Ono is also entirely open source and the modular design further encourages modification and reuse of parts of the platform

    Manticore: Hardware-Accelerated RTL Simulation with Static Bulk-Synchronous Parallelism

    Full text link
    The demise of Moore's Law and Dennard Scaling has revived interest in specialized computer architectures and accelerators. Verification and testing of this hardware heavily uses cycle-accurate simulation of register-transfer-level (RTL) designs. The best software RTL simulators can simulate designs at 1--1000~kHz, i.e., more than three orders of magnitude slower than hardware. Faster simulation can increase productivity by speeding design iterations and permitting more exhaustive exploration. One possibility is to use parallelism as RTL exposes considerable fine-grain concurrency. However, state-of-the-art RTL simulators generally perform best when single-threaded since modern processors cannot effectively exploit fine-grain parallelism. This work presents Manticore: a parallel computer designed to accelerate RTL simulation. Manticore uses a static bulk-synchronous parallel (BSP) execution model to eliminate runtime synchronization barriers among many simple processors. Manticore relies entirely on its compiler to schedule resources and communication. Because RTL code is practically free of long divergent execution paths, static scheduling is feasible. Communication and synchronization no longer incur runtime overhead, enabling efficient fine-grain parallelism. Moreover, static scheduling dramatically simplifies the physical implementation, significantly increasing the potential parallelism on a chip. Our 225-core FPGA prototype running at 475 MHz outperforms a state-of-the-art RTL simulator on an Intel Xeon processor running at ≈\approx 3.3 GHz by up to 27.9×\times (geomean 5.3×\times) in nine Verilog benchmarks

    A Cross-level Verification Methodology for Digital IPs Augmented with Embedded Timing Monitors

    Get PDF
    Smart systems are characterized by the integration in a single device of multi-domain subsystems of different technological domains, namely, analog, digital, discrete and power devices, MEMS, and power sources. Such challenges, emerging from the heterogeneous nature of the whole system, combined with the traditional challenges of digital design, directly impact on performance and on propagation delay of digital components. This article proposes a design approach to enhance the RTL model of a given digital component for the integration in smart systems with the automatic insertion of delay sensors, which can detect and correct timing failures. The article then proposes a methodology to verify such added features at system level. The augmented model is abstracted to SystemC TLM, which is automatically injected with mutants (i.e., code mutations) to emulate delays and timing failures. The resulting TLM model is finally simulated to identify timing failures and to verify the correctness of the inserted delay monitors. Experimental results demonstrate the applicability of the proposed design and verification methodology, thanks to an efficient sensor-aware abstraction methodology, by applying the flow to three complex case studies

    Fault-based Analysis of Industrial Cyber-Physical Systems

    Get PDF
    The fourth industrial revolution called Industry 4.0 tries to bridge the gap between traditional Electronic Design Automation (EDA) technologies and the necessity of innovating in many indus- trial fields, e.g., automotive, avionic, and manufacturing. This complex digitalization process in- volves every industrial facility and comprises the transformation of methodologies, techniques, and tools to improve the efficiency of every industrial process. The enhancement of functional safety in Industry 4.0 applications needs to exploit the studies related to model-based and data-driven anal- yses of the deployed Industrial Cyber-Physical System (ICPS). Modeling an ICPS is possible at different abstraction levels, relying on the physical details included in the model and necessary to describe specific system behaviors. However, it is extremely complicated because an ICPS is com- posed of heterogeneous components related to different physical domains, e.g., digital, electrical, and mechanical. In addition, it is also necessary to consider not only nominal behaviors but even faulty behaviors to perform more specific analyses, e.g., predictive maintenance of specific assets. Nevertheless, these faulty data are usually not present or not available directly from the industrial machinery. To overcome these limitations, constructing a virtual model of an ICPS extended with different classes of faults enables the characterization of faulty behaviors of the system influenced by different faults. In literature, these topics are addressed with non-uniformly approaches and with the absence of standardized and automatic methodologies for describing and simulating faults in the different domains composing an ICPS. This thesis attempts to overcome these state-of-the-art gaps by proposing novel methodologies, techniques, and tools to: model and simulate analog and multi-domain systems; abstract low-level models to higher-level behavioral models; and monitor industrial systems based on the Industrial Internet of Things (IIOT) paradigm. Specifically, the proposed contributions involve the exten- sion of state-of-the-art fault injection practices to improve the ICPSs safety, the development of frameworks for safety operations automatization, and the definition of a monitoring framework for ICPSs. Overall, fault injection in analog and digital models is the state of the practice to en- sure functional safety, as mentioned in the ISO 26262 standard specific for the automotive field. Starting from state-of-the-art defects defined for analog descriptions, new defects are proposed to enhance the IEEE P2427 draft standard for analog defect modeling and coverage. Moreover, dif- ferent techniques to abstract a transistor-level model to a behavioral model are proposed to speed up the simulation of faulty circuits. Therefore, unlike the electrical domain, there is no extensive use of fault injection techniques in the mechanical one. Thus, extending the fault injection to the mechanical and thermal fields allows for supporting the definition and evaluation of more reliable safety mechanisms. Hence, a taxonomy of mechanical faults is derived from the electrical domain by exploiting the physical analogies. Furthermore, specific tools are built for automatically instru- menting different descriptions with multi-domain faults. The entire work is proposed as a basis for supporting the creation of increasingly resilient and secure ICPS that need to preserve functional safety in any operating context
    • …
    corecore