8,280 research outputs found

    Efficient hardware debugging using parameterized FPGA reconfiguration

    Get PDF
    Functional errors and bugs inadvertently introduced at the RTL stage of the design process are responsible for the largest fraction of silicon IC re-spins. Thus, comprehensive func- tional verification is the key to reduce development costs and to deliver a product in time. The increasing demands for verification led to an increase in FPGA-based tools that perform emulation. These tools can run at much higher operating frequencies and achieve higher coverage than simulation. However, an important pitfall of the FPGA tools is that they suffer from limited internal signal observability, as only a small and preselected set of signals is guided towards (embedded) trace buffers and observed. This paper proposes a dynamically reconfigurable network of multiplexers that significantly enhance the visibility of internal signals. It allows the designer to dynamically change the small set of internal signals to be observed, virtually enlarging the set of observed signals significantly. These multiplexers occupy minimal space, as they are implemented by the FPGA’s routing infrastructure

    A Comprehensive Workflow for General-Purpose Neural Modeling with Highly Configurable Neuromorphic Hardware Systems

    Full text link
    In this paper we present a methodological framework that meets novel requirements emerging from upcoming types of accelerated and highly configurable neuromorphic hardware systems. We describe in detail a device with 45 million programmable and dynamic synapses that is currently under development, and we sketch the conceptual challenges that arise from taking this platform into operation. More specifically, we aim at the establishment of this neuromorphic system as a flexible and neuroscientifically valuable modeling tool that can be used by non-hardware-experts. We consider various functional aspects to be crucial for this purpose, and we introduce a consistent workflow with detailed descriptions of all involved modules that implement the suggested steps: The integration of the hardware interface into the simulator-independent model description language PyNN; a fully automated translation between the PyNN domain and appropriate hardware configurations; an executable specification of the future neuromorphic system that can be seamlessly integrated into this biology-to-hardware mapping process as a test bench for all software layers and possible hardware design modifications; an evaluation scheme that deploys models from a dedicated benchmark library, compares the results generated by virtual or prototype hardware devices with reference software simulations and analyzes the differences. The integration of these components into one hardware-software workflow provides an ecosystem for ongoing preparative studies that support the hardware design process and represents the basis for the maturity of the model-to-hardware mapping software. The functionality and flexibility of the latter is proven with a variety of experimental results

    Evaluating dynamic partial reconfiguration in the integer pipeline of a FPGA-based opensource processor

    Get PDF

    Efficient Implementation on Low-Cost SoC-FPGAs of TLSv1.2 Protocol with ECC_AES Support for Secure IoT Coordinators

    Get PDF
    Security management for IoT applications is a critical research field, especially when taking into account the performance variation over the very different IoT devices. In this paper, we present high-performance client/server coordinators on low-cost SoC-FPGA devices for secure IoT data collection. Security is ensured by using the Transport Layer Security (TLS) protocol based on the TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256 cipher suite. The hardware architecture of the proposed coordinators is based on SW/HW co-design, implementing within the hardware accelerator core Elliptic Curve Scalar Multiplication (ECSM), which is the core operation of Elliptic Curve Cryptosystems (ECC). Meanwhile, the control of the overall TLS scheme is performed in software by an ARM Cortex-A9 microprocessor. In fact, the implementation of the ECC accelerator core around an ARM microprocessor allows not only the improvement of ECSM execution but also the performance enhancement of the overall cryptosystem. The integration of the ARM processor enables to exploit the possibility of embedded Linux features for high system flexibility. As a result, the proposed ECC accelerator requires limited area, with only 3395 LUTs on the Zynq device used to perform high-speed, 233-bit ECSMs in 413 ”s, with a 50 MHz clock. Moreover, the generation of a 384-bit TLS handshake secret key between client and server coordinators requires 67.5 ms on a low cost Zynq 7Z007S device

    A scalable hardware and software control apparatus for experiments with hybrid quantum systems

    Get PDF
    Modern experiments with fundamental quantum systems - like ultracold atoms, trapped ions, single photons - are managed by a control system formed by a number of input/output electronic channels governed by a computer. In hybrid quantum systems, where two or more quantum systems are combined and made to interact, establishing an efficient control system is particularly challenging due to the higher complexity, especially when each single quantum system is characterized by a different timescale. Here we present a new control apparatus specifically designed to efficiently manage hybrid quantum systems. The apparatus is formed by a network of fast communicating Field Programmable Gate Arrays (FPGAs), the action of which is administrated by a software. Both hardware and software share the same tree-like structure, which ensures a full scalability of the control apparatus. In the hardware, a master board acts on a number of slave boards, each of which is equipped with an FPGA that locally drives analog and digital input/output channels and radiofrequency (RF) outputs up to 400 MHz. The software is designed to be a general platform for managing both commercial and home-made instruments in a user-friendly and intuitive Graphical User Interface (GUI). The architecture ensures that complex control protocols can be carried out, such as performing of concurrent commands loops by acting on different channels, the generation of multi-variable error functions and the implementation of self-optimization procedures. Although designed for managing experiments with hybrid quantum systems, in particular with atom-ion mixtures, this control apparatus can in principle be used in any experiment in atomic, molecular, and optical physics.Comment: 10 pages, 12 figure

    Smart technologies for effective reconfiguration: the FASTER approach

    Get PDF
    Current and future computing systems increasingly require that their functionality stays flexible after the system is operational, in order to cope with changing user requirements and improvements in system features, i.e. changing protocols and data-coding standards, evolving demands for support of different user applications, and newly emerging applications in communication, computing and consumer electronics. Therefore, extending the functionality and the lifetime of products requires the addition of new functionality to track and satisfy the customers needs and market and technology trends. Many contemporary products along with the software part incorporate hardware accelerators for reasons of performance and power efficiency. While adaptivity of software is straightforward, adaptation of the hardware to changing requirements constitutes a challenging problem requiring delicate solutions. The FASTER (Facilitating Analysis and Synthesis Technologies for Effective Reconfiguration) project aims at introducing a complete methodology to allow designers to easily implement a system specification on a platform which includes a general purpose processor combined with multiple accelerators running on an FPGA, taking as input a high-level description and fully exploiting, both at design time and at run time, the capabilities of partial dynamic reconfiguration. The goal is that for selected application domains, the FASTER toolchain will be able to reduce the design and verification time of complex reconfigurable systems providing additional novel verification features that are not available in existing tool flows

    The Octopus switch

    Get PDF
    This chapter1 discusses the interconnection architecture of the Mobile Digital Companion. The approach to build a low-power handheld multimedia computer presented here is to have autonomous, reconfigurable modules such as network, video and audio devices, interconnected by a switch rather than by a bus, and to offload as much as work as possible from the CPU to programmable modules placed in the data streams. Thus, communication between components is not broadcast over a bus but delivered exactly where it is needed, work is carried out where the data passes through, bypassing the memory. The amount of buffering is minimised, and if it is required at all, it is placed right on the data path, where it is needed. A reconfigurable internal communication network switch called Octopus exploits locality of reference and eliminates wasteful data copies. The switch is implemented as a simplified ATM switch and provides Quality of Service guarantees and enough bandwidth for multimedia applications. We have built a testbed of the architecture, of which we will present performance and energy consumption characteristics

    Empowering parallel computing with field programmable gate arrays

    Get PDF
    After more than 30 years, reconïŹgurable computing has grown from a concept to a mature ïŹeld of science and technology. The cornerstone of this evolution is the ïŹeld programmable gate array, a building block enabling the conïŹguration of a custom hardware architecture. The departure from static von Neumannlike architectures opens the way to eliminate the instruction overhead and to optimize the execution speed and power consumption. FPGAs now live in a growing ecosystem of development tools, enabling software programmers to map algorithms directly onto hardware. Applications abound in many directions, including data centers, IoT, AI, image processing and space exploration. The increasing success of FPGAs is largely due to an improved toolchain with solid high-level synthesis support as well as a better integration with processor and memory systems. On the other hand, long compile times and complex design exploration remain areas for improvement. In this paper we address the evolution of FPGAs towards advanced multi-functional accelerators, discuss different programming models and their HLS language implementations, as well as high-performance tuning of FPGAs integrated into a heterogeneous platform. We pinpoint fallacies and pitfalls, and identify opportunities for language enhancements and architectural reïŹnements
    • 

    corecore