246 research outputs found

    Reusing RTL assertion checkers for verification of SystemC TLM models

    Get PDF
    The recent trend towards system-level design gives rise to new challenges for reusing existing RTL intellectual properties (IPs) and their verification environment in TLM. While techniques and tools to abstract RTL IPs into TLM models have begun to appear, the problem of reusing, at TLM, a verification environment originally developed for an RTL IP is still under-explored, particularly when ABV is adopted. Some frameworks have been proposed to deal with ABV at TLM, but they assume a top-down design and verification flow, where assertions are defined ex-novo at TLM level. In contrast, the reuse of existing assertions in an RTL-to-TLM bottom-up design flow has not been analyzed yet, except by using transactors to create a mixed simulation between the TLM design and the RTL checkers corresponding to the assertions. However, the use of transactors may lead to longer verification time due to the need of developing and verifying the transactors themselves. Moreover, the simulation time is negatively affected by the presence of transactors, which slow down the simulation at the speed of the slowest parts (i.e., RTL checkers). This article proposes an alternative methodology that does not require transactors for reusing assertions, originally defined for a given RTL IP, in order to verify the corresponding TLM model. Experimental results have been conducted on benchmarks with different characteristics and complexity to show the applicability and the efficacy of the proposed methodology

    Exploiting Hardware Abstraction for Parallel Programming Framework: Platform and Multitasking

    Get PDF
    With the help of the parallelism provided by the fine-grained architecture, hardware accelerators on Field Programmable Gate Arrays (FPGAs) can significantly improve the performance of many applications. However, designers are required to have excellent hardware programming skills and unique optimization techniques to explore the potential of FPGA resources fully. Intermediate frameworks above hardware circuits are proposed to improve either performance or productivity by leveraging parallel programming models beyond the multi-core era. In this work, we propose the PolyPC (Polymorphic Parallel Computing) framework, which targets enhancing productivity without losing performance. It helps designers develop parallelized applications and implement them on FPGAs. The PolyPC framework implements a custom hardware platform, on which programs written in an OpenCL-like programming model can launch. Additionally, the PolyPC framework extends vendor-provided tools to provide a complete development environment including intermediate software framework, and automatic system builders. Designers\u27 programs can be either synthesized as hardware processing elements (PEs) or compiled to executable files running on software PEs. Benefiting from nontrivial features of re-loadable PEs, and independent group-level schedulers, the multitasking is enabled for both software and hardware PEs to improve the efficiency of utilizing hardware resources. The PolyPC framework is evaluated regarding performance, area efficiency, and multitasking. The results show a maximum 66 times speedup over a dual-core ARM processor and 1043 times speedup over a high-performance MicroBlaze with 125 times of area efficiency. It delivers a significant improvement in response time to high-priority tasks with the priority-aware scheduling. Overheads of multitasking are evaluated to analyze trade-offs. With the help of the design flow, the OpenCL application programs are converted into executables through the front-end source-to-source transformation and back-end synthesis/compilation to run on PEs, and the framework is generated from users\u27 specifications

    Generation of Application Specific Hardware Extensions for Hybrid Architectures: The Development of PIRANHA - A GCC Plugin for High-Level-Synthesis

    Get PDF
    Architectures combining a field programmable gate array (FPGA) and a general-purpose processor on a single chip became increasingly popular in recent years. On the one hand, such hybrid architectures facilitate the use of application specific hardware accelerators that improve the performance of the software on the host processor. On the other hand, it obliges system designers to handle the whole process of hardware/software co-design. The complexity of this process is still one of the main reasons, that hinders the widespread use of hybrid architectures. Thus, an automated process that aids programmers with the hardware/software partitioning and the generation of application specific accelerators is an important issue. The method presented in this thesis neither requires restrictions of the used high-level-language nor special source code annotations. Usually, this is an entry barrier for programmers without deeper understanding of the underlying hardware platform. This thesis introduces a seamless programming flow that allows generating hardware accelerators for unrestricted, legacy C code. The implementation consists of a GCC plugin that automatically identifies application hot-spots and generates hardware accelerators accordingly. Apart from the accelerator implementation in a hardware description language, the compiler plugin provides the generation of a host processor interfaces and, if necessary, a prototypical integration with the host operating system. An evaluation with typical embedded applications shows general benefits of the approach, but also reveals limiting factors that hamper possible performance improvements

    Desynchronization: Synthesis of asynchronous circuits from synchronous specifications

    Get PDF
    Asynchronous implementation techniques, which measure logic delays at run time and activate registers accordingly, are inherently more robust than their synchronous counterparts, which estimate worst-case delays at design time, and constrain the clock cycle accordingly. De-synchronization is a new paradigm to automate the design of asynchronous circuits from synchronous specifications, thus permitting widespread adoption of asynchronicity, without requiring special design skills or tools. In this paper, we first of all study different protocols for de-synchronization and formally prove their correctness, using techniques originally developed for distributed deployment of synchronous language specifications. We also provide a taxonomy of existing protocols for asynchronous latch controllers, covering in particular the four-phase handshake protocols devised in the literature for micro-pipelines. We then propose a new controller which exhibits provably maximal concurrency, and analyze the performance of desynchronized circuits with respect to the original synchronous optimized implementation. We finally prove the feasibility and effectiveness of our approach, by showing its application to a set of real designs, including a complete implementation of the DLX microprocessor architectur

    Automated Exploration of the ASIC Design Space for Minimum Power-Delay-Area Product at the Register Transfer Level

    Get PDF
    Exploring the integrated circuit design space for minimum power-delay-area (PDA) product can be time-consuming and tedious, especially when the target standard-cell library has hundreds of options. In this dissertation, heuristic algorithms that automate this process have been developed, implemented and validated at the reg- ister transfer level. In some cases, the PDA product was 1.9 times better than the initial baseline solution. The parallel search algorithm exhibited 9x speed up when executed on 10 machines simultaneously. These two new methods also characterize the design space for the given RTL code by generating power-delay-area points in addition to the minimum PDA point in case the designer wishes to select a different solution that is a tradeoff among these metrics. As a final step, these two search algorithms are integrated into a fully automated ASIC design flow

    On the Reuse of RTL assertions in Systemc TLM Verification

    Get PDF
    Reuse of existing and already verified intellectual property (IP) models is a key strategy to cope with the com- plexity of designing modern system-on-chips (SoC)s under ever stringent time-to-market requirements. In particular, the recent trend towards system-level design and transaction level modeling (TLM) gives rise to new challenges for reusing existing RTL IPs and their verification environment in TLM-based design flows. While techniques and tools to abstract RTL IPs into TLM models have begun to appear, the problem of reusing, at TLM, a verification environment originally developed for an RTL IP is still underexplored, particularly when assertion-based verification (ABV) is adopted. Some techniques and frameworks have been proposed to deal with ABV at TLM, but they assume a top-down design and verification flow, where assertions are defined ex-novo at TLM level. In contrast, the reuse of existing assertions in an RTL-to-TLM bottom-up design flow has not been analyzed yet. This paper proposes a methodology to reuse assertions originally defined for a given RTL IP, to verify the corresponding TLM model. Experimental results have been conducted on benchmarks of different characteristics and complexity to show the applicability and the efficacy of the proposed methodology

    Correct synthesis and integration of compiler-generated function units

    Get PDF
    PhD ThesisComputer architectures can use custom logic in addition to general pur- pose processors to improve performance for a variety of applications. The use of custom logic allows greater parallelism for some algorithms. While conventional CPUs typically operate on words, ne-grained custom logic can improve e ciency for many bit level operations. The commodi ca- tion of eld programmable devices, particularly FPGAs, has improved the viability of using custom logic in an architecture. This thesis introduces an approach to reasoning about the correctness of compilers that generate custom logic that can be synthesized to provide hardware acceleration for a given application. Compiler intermediate representations (IRs) and transformations that are relevant to genera- tion of custom logic are presented. Architectures may vary in the way that custom logic is incorporated, and suitable abstractions are used in order that the results apply to compilation for a variety of the design parameters that are introduced by the use of custom logic
    • …
    corecore