16 research outputs found

    Generic low power reconfigurable distributed arithmetic processor

    Higher performance, lower cost, ever-smaller integrated circuit components, and higher chip packaging density are ongoing goals of the microelectronics and computer industry. As these goals are achieved, however, power consumption and flexibility are increasingly becoming bottlenecks that must be addressed in Very Large-Scale Integration (VLSI) design. Modern systems require more energy to deliver the computational capability that growing requirements demand, and those requirements drive changes of standards not only in audio and video broadcasting but also in communication, such as wireless connections and network protocols. High flexibility and low power consumption pull in opposite directions, yet combining them in one system is the ultimate goal of designers. This dissertation presents a generic, domain-specific, low-power reconfigurable processor for distributed arithmetic algorithms. The processor is highly efficient in area, power and delay, approaching the performance of an ASIC design while retaining the flexibility of programmable platforms. The architecture supports the typical distributed arithmetic algorithms found in most still-picture compression and video conferencing standards, and can also implement other distributed arithmetic algorithms found in digital signal processing, telecommunication protocols and automatic control. The processor includes a simple reconfigurable low-power control unit with good area, power and timing performance. Because the architecture is generic, it is applicable to any small or medium-sized finite state machine; such machines serve as control units that implement complex system behaviour and are found in almost all engineering disciplines.
Furthermore, to map target applications efficiently onto the proposed architecture, a new algorithm is introduced that searches for the best set of common sharing terms, keeping the area and power consumption of the implementation low. A software implementation of this algorithm is presented; it can be used not only for the architecture proposed in this dissertation but also for any implementation of adder-based distributed arithmetic algorithms. In addition, several low-power design techniques are applied in the architecture, notably an unsymmetrical design style that covers unsymmetrical interconnection arrangement, unsymmetrical PTB selection and unsymmetrical mapping of basic computing units. These techniques yield considerable power savings, and it is believed that they can be extended to other low-power designs and architectures. The processor presented in this dissertation can be used to implement complex, high-performance distributed arithmetic algorithms for communication and image processing applications at low area and power cost compared with traditional methods.
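
As background for the abstract above, the following is a minimal Python sketch of the general idea behind distributed arithmetic: an inner product with fixed coefficients is computed bit-serially with table lookups, shifts and additions instead of multipliers. It shows the classic lookup-table formulation for unsigned inputs, not the adder-based variant or the common-sharing-terms search that the dissertation describes. The function names `build_lut` and `da_inner_product` are hypothetical and chosen only for illustration.

```python
def build_lut(coeffs):
    """Precompute, for every K-bit address, the sum of the selected coefficients.

    Bit i of the address decides whether coeffs[i] is included in the sum.
    """
    k = len(coeffs)
    return [
        sum(c for i, c in enumerate(coeffs) if (addr >> i) & 1)
        for addr in range(1 << k)
    ]


def da_inner_product(coeffs, xs, bits):
    """Compute sum(coeffs[i] * xs[i]) for unsigned inputs of width `bits`.

    Illustrative assumption: inputs are non-negative integers below 2**bits.
    """
    lut = build_lut(coeffs)
    acc = 0
    for b in range(bits):
        # Build the table address from bit b of every input word.
        addr = 0
        for i, x in enumerate(xs):
            addr |= ((x >> b) & 1) << i
        # Shift-and-accumulate replaces the multiplications.
        acc += lut[addr] << b
    return acc


coeffs = [3, -1, 4, 2]
xs = [5, 7, 0, 9]
print(da_inner_product(coeffs, xs, bits=4))
print(sum(c * x for c, x in zip(coeffs, xs)))  # direct computation for comparison
```

The table grows as 2^K with the number of coefficients K, which is one reason hardware designs split large tables or use adder-based formulations instead.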

    Embedded electronic systems driven by run-time reconfigurable hardware

    This doctoral thesis addresses the design of embedded electronic systems based on run-time reconfigurable hardware technology, available through SRAM-based FPGA/SoC devices, with the aim of helping to improve people's quality of life. The work investigates the system architecture and the reconfiguration engine that gives the FPGA the capability of dynamic partial reconfiguration. Using hardware/software co-design, a given application is partitioned into processing tasks that are multiplexed in time and space, which optimizes its physical implementation (silicon area, processing time, complexity, flexibility, functional density, cost and power consumption) compared with alternatives based on static hardware (MCU, DSP, GPU, ASSP, ASIC, etc.). The design flow of this technology is evaluated by prototyping several engineering applications (control systems, mathematical coprocessors, complex image processors, etc.), showing a level of maturity high enough for industrial exploitation.

    Single event upset hardened embedded domain specific reconfigurable architecture


    Evaluation of Alternative Field Buses for Lighting Control Applications


    Programmable flexible cores for SoC applications

    Master's thesis. Electrical and Computer Engineering. Faculdade de Engenharia, Universidade do Porto. 200

    Towards the development of flexible, reliable, reconfigurable, and high-performance imaging systems

    Current FPGAs can implement large systems because of the high density of reconfigurable logic resources on a single chip. FPGAs combine flexibility and high performance in one platform, in contrast to other platforms such as General-Purpose Processors (GPPs) and Application-Specific Integrated Circuits (ASICs). The flexibility of modern FPGAs is further enhanced by Dynamic Partial Reconfiguration (DPR), which allows the functionality of part of the system to be changed while other parts keep running. These features have made FPGAs an important platform for digital image processing: they meet the need for platforms that execute imaging tasks efficiently and reliably, with low power, high performance and high flexibility. As image processing accelerators, FPGAs outperform most current solutions. Current FPGA solutions can load the computationally demanding parts of an imaging application onto dedicated reconfigurable hardware accelerators while the other parts run on the traditional solution, increasing system performance. Moreover, DPR further enhances the flexibility of image processing by swapping accelerators in and out at run time. Because FPGAs are sensitive to radiation and extreme conditions, fault mitigation techniques are needed to let imaging applications operate in harsh environments. The aim of this thesis is to present a platform for efficient implementations of imaging tasks. The research uses FPGAs as the key component of this platform and uses DPR to increase performance and flexibility, to reduce power dissipation and to widen the range of possible imaging applications. In this context, it proposes using FPGAs to accelerate the stages of the Image Processing Pipeline (IPP), the core part of most imaging devices.
The thesis introduces a number of novel concepts. The first is the use of the FPGA hardware environment and the DPR feature to increase parallelism and achieve high flexibility; this concept also increases performance and reduces power consumption and area utilisation. Based on it, the thesis presents two implementations: the Adams Hamilton demosaicing algorithm for camera colour interpolation, which exploits FPGA parallelism to outperform equivalent implementations, and Automatic White Balance (AWB), another IPP stage, which employs DPR to demonstrate the same benefits. A second novel concept, presented in chapter 6, uses DPR to develop a flexible imaging system that requires less logic and can be implemented in small FPGAs; the system can serve as a template for any imaging application. The thesis also discusses a reliable version of the imaging system that adopts scrubbing, Built-In Self-Test (BIST) and Triple Modular Redundancy (TMR) to detect and correct errors using the Internal Configuration Access Port (ICAP) primitive. These techniques exploit the datapath-based nature of the implemented imaging system to improve its overall reliability. Finally, the thesis proposes integrating the imaging system with the Robust Reliable Reconfigurable Real-Time Heterogeneous Operating System (R4THOS) to get the best out of the system. The proposal shows that the DPR imaging system is suitable for use in the core system of autonomous cars because of its flexibility. These works have been presented in a number of publications, listed in section 1.3 of the thesis.
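
The Triple Modular Redundancy mentioned in the abstract above rests on majority voting across three replicas of the same logic. The Python sketch below illustrates only that voting principle on integer words; it is not the thesis's FPGA implementation, and it does not model scrubbing, BIST or the ICAP. The names `tmr_vote` and `tmr_disagreement` are hypothetical.

```python
def tmr_vote(a, b, c):
    """Bitwise majority of three redundant words.

    Each output bit takes the value held by at least two of the replicas,
    so a fault confined to one replica is masked.
    """
    return (a & b) | (a & c) | (b & c)


def tmr_disagreement(a, b, c):
    """Bit mask of positions where some replica differs from the voted value.

    A non-zero mask signals that a replica is faulty and may need repair.
    """
    voted = tmr_vote(a, b, c)
    return (a ^ voted) | (b ^ voted) | (c ^ voted)


# One replica has bit 3 flipped; the vote still returns the correct word.
good = 0b1011
faulty = good ^ 0b1000
print(bin(tmr_vote(good, good, faulty)))
print(bin(tmr_disagreement(good, good, faulty)))
```

Voting masks a fault in a single replica but does not remove it, which is why TMR is commonly paired with a repair mechanism such as the scrubbing named in the abstract.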

    Development of FPGA-based High-Speed serial links for High Energy Physics Experiments

    High Energy Physics (HEP) experiments generate high volumes of data that must be transferred over long distances, so data readout requires reliable, high-speed links. Over the years, serial links (especially optical ones) have been preferred over parallel links because of their very high bandwidth. High-speed serial links are now commonly used in the Trigger and Data Acquisition (TDAQ) systems of HEP experiments, not only for data transfer but also for the distribution of trigger and control signals. Examples of their wide use can be found at CERN, where each of the four large experiments at the Large Hadron Collider (LHC) uses a large number of serial links in its readout system. Also at the LHC, the Timing, Trigger and Control (TTC) system, which broadcasts the timing signals from the LHC machine to the experiments, uses optical serial links to distribute signals over kilometres of distance (the LHC circumference is 27 km). For the LHC upgrades, physical-layer components and protocol chips (ASICs) have also been designed and are now under development: the Versatile Link and the GBT protocol (and ASICs), whose distinguishing feature is their radiation hardness. This PhD project responds to the needs of HEP experiments by developing: - a high-speed self-adapting serial link that can easily be used in different application fields; - the serial interface of a readout board in the end-cap region of the ATLAS experiment at the LHC; - the interface board for the barrel readout system of the ATLAS experiment. The last two projects both required the development of fixed-latency, high-speed serial links. To take advantage of the flexibility, re-programmability and system integration of SRAM-based Field Programmable Gate Arrays (FPGAs), their embedded serializer-deserializer (SERDES) modules were chosen for the development of the links.
However, as a drawback, FPGA-embedded SERDESes are typically designed for applications that do not require deterministic latency. An accurate study of their architecture was therefore necessary to find a configuration and a clocking scheme that guarantee a deterministic transmission delay in data transfers. The frequency-agile, auto-adaptive serial link is capable of analyzing the incoming data stream by scanning the Unit Interval, and of finding the highest transmission line rate compatible with a given tolerated Bit Error Ratio (BER). It uses a new feature (RX eye margin analysis) of the RX side of the Xilinx 7 series FPGA high-speed transceivers (GTX/GTH) to measure and display the receiver eye margin after the equalizer. When the eye scan functionality is running, an additional sampler is activated in the GTX. It acquires a new sample (Offset Sample) with programmable horizontal and vertical offsets from the data sample point (Data Sample) used in standard operation. An eye scan measurement run is performed by acquiring a large number of Data Samples (from tens of thousands to 10^14 or more) and counting the number of times the Offset Sample differs in value from the Data Sample; this number is often called the Error Count. The BER at a specific vertical and horizontal offset is the ratio between the Error Count and the Sample Count. Repeating the eye scan measurement for each horizontal and vertical offset in the Unit Interval (or in part of it) produces a 2-D BER map, usually called the Statistical Eye. The auto-adaptive serial link is designed around an FPGA-embedded microprocessor, which drives the programmable ports of the GTX to perform a 2-D eye scan and takes care of reconfiguring the GTX parameters so as to fully benefit from the available link bandwidth.
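
The BER computation described above (Error Count divided by Sample Count at each offset, repeated over a grid of offsets) can be sketched in a few lines of Python. This is only an illustration of the arithmetic under stated assumptions: the counts are made-up numbers, the grid is indexed as rows of vertical offsets and columns of horizontal offsets, and the helper names `ber_map` and `open_points` are hypothetical. It does not reproduce the Xilinx eye-scan code or any transceiver register access.

```python
def ber_map(error_counts, sample_counts):
    """BER at each (vertical, horizontal) offset: Error Count / Sample Count.

    Offsets with no samples are reported as None rather than guessed.
    """
    return [
        [err / n if n > 0 else None for err, n in zip(err_row, n_row)]
        for err_row, n_row in zip(error_counts, sample_counts)
    ]


def open_points(ber, tolerated_ber):
    """Count the offsets whose measured BER does not exceed the tolerance."""
    return sum(
        1
        for row in ber
        for value in row
        if value is not None and value <= tolerated_ber
    )


# Made-up counts for a 2 x 3 grid of offsets, for illustration only.
errors = [[50, 0, 40], [10, 0, 20]]
samples = [[100, 100, 100], [100, 100, 100]]

eye = ber_map(errors, samples)
print(eye)
print(open_points(eye, tolerated_ber=0.1))
```

A measured BER of zero only means that no errors were seen in the acquired samples, so the number of Data Samples bounds how small a BER the scan can resolve.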
Xilinx provides a standalone tool for performing the Eye Scan Analysis on the receiver side of the GTX/GTH transceiver, using the MicroBlaze Micro Controller System macro; the toolkit also includes the Eye Scan algorithm (provided as C code). Moreover, Xilinx supplies the hardware source files for implementing a link based on the XAUI protocol, in which the GTXs are arranged in a loopback configuration. The original contribution of this work is the build-up, design and optimization of a full architecture, on top of the basic Xilinx tool, which: - drives the programmable ports of the GTX in order to modify the line rate of the link; - runs consecutive eye scans at various line rates; - analyses the results of the different scans in order to find the maximum line rate sustainable by the link; - manages the synchronization between the transmitter and the receiver of the link, which is needed at each line rate change. The application can be deployed as a monitoring tool in HEP experiments to remotely monitor a transmission system or detect issues in the serial link physical layer. Possible applications include many of the experiments at the Large Hadron Collider (LHC) at CERN, which make intensive use of different serial links, both for the transmission of TTC signals and for trigger and data readout. In addition, the solution can easily be adapted to a wide range of other frameworks: it can be used on top of any existing user link, since it has no specific requirement regarding link specification or protocol. The other two serial interfaces developed in this project belong to the framework of the ATLAS experiment. ATLAS is one of the four detectors installed on the LHC proton-proton collider built at CERN. The LHC was designed to collide two opposing particle beams at an energy of 14 TeV and to reach a luminosity of 10^34 cm^-2 s^-1. In order to reach the design parameters, the LHC system will be upgraded in several phases.
In order to take advantage of the improved LHC operation, the ATLAS detector must be upgraded on the same schedule as the LHC. The main focus of the Phase-I ATLAS upgrade (to be completed by 2018) is the Level-1 trigger, where upgrades are planned for both the muon and the calorimeter trigger systems. In particular, for the end-cap region of the muon spectrometer, the installation of a new set of precision tracking and trigger detectors, called the ‘New Small Wheels’ (NSW), was approved. The NSW will be instrumented with micro-mesh gaseous structure detectors (MM) and small-strip Thin Gap Chambers (sTGC). These detectors will address two issues of particular importance at high luminosity: the high rate of fake high-pT Level-1 muon triggers, and the high L1 muon rate with the current momentum threshold. The new detectors require new electronics, in particular new trigger electronics for both the MM and the sTGC. I was involved in the development of the serial interface of the FPGA-based sTGC trigger board, which uses information from the coarse sTGC readout pads. The sTGC pad trigger board receives serial data from 24 front-end chips at 4.8 Gb/s. On the board, data are deserialised, aligned and analysed by the trigger algorithm. The trigger logic processes the data and chooses two candidates at each Bunch Crossing. The result is then serialised and used for selective fine-grained strip readout. I developed the pad trigger board interface logic. The data format from the front-end chips has been agreed upon and defines the requirements on the receiver and decoding logic. There are 24 output lines, and the data are 8B/10B encoded. While the receiver uses the Xilinx Kintex-7 GTX transceivers, the output lines are driven by double data rate (DDR) shift registers at 640 Mb/s. A fixed latency in the sTGC trigger chain was guaranteed through the implementation and configuration of all serialisers and deserialisers.
In order to test the project, I also developed a simple microprocessor-based protocol for accessing the board via a terminal (RS-232). A demonstrator board is now being developed. Another Phase-I Level-1 trigger upgrade is a new Muon to Central Trigger Processor Interface (MUCTPI). The MUCTPI receives muon candidate information from each of the muon detectors, selects muon candidates and sends them to the Central Trigger Processor (CTP). In the first runs of ATLAS, the L1 barrel trigger candidate data were transferred to the MUCTPI via copper cables. To cope with the trigger upgrade, serial optical links are necessary. The optical links will provide a much higher bandwidth (up to 6.4 Gb/s), which will be used to transfer additional information from the sector logic modules, for example data for more than two muon candidates. They will also provide a lower transmission latency. I developed the interface board between the new MUCTPI and the Resistive Plate Chambers (RPC) muon trigger, using the Xilinx Artix-7 FPGA GTP transceivers. I was responsible for the feasibility study of the new serial optical transmitter and for the logic for the new data format. Here too, fixed latency was a requirement.

    New Hardware Architecture for Low-Cost Functional Test Systems: Applications to HDMI generation

    Development of a new hardware architecture for test equipment used in functional test machines for PCBs. Development of a proof-of-concept prototype for HDMI generation.