939 research outputs found

    Experimental Evaluation and Comparison of Time-Multiplexed Multi-FPGA Routing Architectures

    Emulating large, complex designs requires multi-FPGA systems (MFS). However, inter-FPGA communication is constrained by limited interconnect capacity, a consequence of the limited number of FPGA input/output (I/O) pins. Serializing parallel signals onto a single trace effectively addresses this pin limitation. Besides the multiplexing scheme and the multiplexing ratio (the number of inter-FPGA signals per trace), the choice of MFS routing architecture also affects the critical path latency. The routing architecture of an MFS is the interconnection pattern of FPGAs, fixed wires and/or programmable interconnect chips. The performance of existing MFS routing architectures is also limited by the choice of off-chip interface. In this dissertation we propose novel 2D and 3D latency-optimized time-multiplexed MFS routing architectures. We use a rigorous experimental approach and real sequential benchmark circuits to evaluate and compare the proposed and existing MFS routing architectures. This research provides new insight into the benefits of using an off-chip optical interface and three-dimensional MFS routing architectures. Vertical stacking results in shorter off-chip links, which improves the overall system frequency and has the additional advantage of a smaller footprint. The proposed 3D architectures employ serialized interconnect between intra-plane and inter-plane FPGAs to address the pin limitation problem. Additionally, all off-chip links are replaced by optical fibers, which improved latency and resulted in a faster MFS. Results indicate that exploiting the third dimension provides latency and area improvements over 2D MFS. We also propose latency-optimized planar 2D MFS architectures in which electrical interconnections are replaced by an optical interface with the same spatial distribution.
Performance evaluation and comparison showed that the proposed architectures reduce critical path delay and improve system frequency compared with conventional MFS. We also experimentally evaluated and compared the system performance of three inter-FPGA communication schemes (Logic Multiplexing, SERDES and MGT) in conjunction with two routing architectures (Completely Connected Graph (CCG) and TORUS). Experimental results showed that SERDES attained a higher maximum frequency than the other two schemes. However, for very high multiplexing ratios, the performance of SERDES and MGT became comparable.
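
The multiplexing ratio defined in the abstract above (the number of inter-FPGA signals per trace) can be illustrated with a small calculation. This is only a sketch: the function name `traces_needed` and the example numbers are hypothetical and do not come from the dissertation.

```python
def traces_needed(num_signals: int, mux_ratio: int) -> int:
    """Physical traces required to carry num_signals inter-FPGA signals
    when mux_ratio signals are serialized onto each trace."""
    if num_signals < 0:
        raise ValueError("num_signals must be non-negative")
    if mux_ratio < 1:
        raise ValueError("mux_ratio must be at least 1")
    # Ceiling division: a partially filled trace still occupies one I/O pin.
    return (num_signals + mux_ratio - 1) // mux_ratio


if __name__ == "__main__":
    # Hypothetical cut of 1000 signals between two FPGAs.
    for ratio in (1, 4, 8, 16):
        print(ratio, traces_needed(1000, ratio))
```

A higher ratio saves pins, but each trace then needs more time slots per emulation cycle, which is the latency trade-off the abstract evaluates.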

    System-on-chip Computing and Interconnection Architectures for Telecommunications and Signal Processing

    This dissertation proposes novel architectures and design techniques targeting SoC building blocks for telecommunications and signal processing applications. Hardware implementation of Low-Density Parity-Check decoders is approached at both the algorithmic and the architectural level. Low-Density Parity-Check codes are a promising coding scheme for future communication standards due to their outstanding error correction performance. This work proposes a methodology for analyzing the effects of finite-precision arithmetic on error correction performance and hardware complexity. The methodology is employed throughout the co-design of the decoder. First, a low-complexity check node based on the P-output decoding principle is designed and characterized on a CMOS standard-cell library. Results demonstrate an implementation loss below 0.2 dB down to a BER of 10^{-8} and a complexity saving of up to 59% with respect to other works in the recent literature. High-throughput and low-latency issues are addressed with modified single-phase decoding schedules. A new "memory-aware" schedule is proposed that requires as little as 20% of the memory of traditional two-phase flooding decoding. Additionally, throughput is doubled and logic complexity is reduced by 12%. These advantages are traded off against error correction performance, which makes the solution attractive only for long codes such as those adopted in the DVB-S2 standard. The "layered decoding" principle is extended to codes not specifically conceived for this technique. The proposed architectures exhibit complexity savings on the order of 40% in both area and power consumption, while the implementation loss is smaller than 0.05 dB. Most modern communication standards employ Orthogonal Frequency Division Multiplexing (OFDM) as part of their physical layer. The core of OFDM is the Fast Fourier Transform (FFT) and its inverse, which are in charge of symbol (de)modulation.
Requirements on throughput and energy efficiency call for a hardware implementation of the FFT, while the ubiquity of the FFT suggests the design of parametric, reconfigurable and reusable IP hardware macrocells. In this context, this thesis describes an FFT/IFFT core compiler particularly suited to the implementation of OFDM communication systems. The tool employs an accuracy-driven configuration engine which automatically profiles the internal arithmetic and generates a core with minimum operand bit-width and thus minimum circuit complexity. The engine performs a closed-loop optimization over three different internal arithmetic models (fixed-point, block floating-point and convergent block floating-point), using the numerical accuracy budget given by the user as a reference point. The flexibility and reusability of the proposed macrocell are illustrated through several case studies which encompass all current state-of-the-art OFDM communication standards (WLAN, WMAN, xDSL, DVB-T/H, DAB and UWB). Implementation results are presented for two deep sub-micron standard-cell libraries (65 and 90 nm) and for commercially available FPGA devices. Compared with other FFT core compilers, the proposed environment produces macrocells with lower circuit complexity and the same system-level performance (throughput, transform size and numerical accuracy). The final part of this dissertation focuses on the Network-on-Chip (NoC) design paradigm, whose goal is to build scalable communication infrastructures connecting hundreds of cores. A low-complexity link architecture for mesochronous on-chip communication is discussed. The link relaxes skew constraints in clock tree synthesis and enables frequency speed-up, reduced power consumption and faster back-end turnaround. The proposed architecture reaches a maximum clock frequency of 1 GHz on a 65 nm low-leakage CMOS standard-cell library.
In a complex test case with a full-blown NoC infrastructure, the link overhead is only 3% of chip area and 0.5% of leakage power consumption. Finally, a new methodology, named metacoding, is proposed. Metacoding generates correct-by-construction, technology-independent RTL codebases for NoC building blocks. The RTL coding phase is abstracted and modeled with an object-oriented framework, integrated within a commercial tool for IP packaging (Synopsys CoreTools suite). Compared with traditional coding styles based on pre-processor directives, metacoding produces 65% smaller codebases and reduces the number of configurations to verify by up to three orders of magnitude.
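
Block floating-point, one of the three arithmetic models named in the abstract above, can be sketched generically as follows. This is not the thesis's configuration engine, and it does not cover the convergent variant. It only shows the basic idea of one shared exponent per block with fixed-width integer mantissas. The function names and the 8-bit width are assumptions chosen for illustration.

```python
import math


def bfp_quantize(block, mantissa_bits):
    """Quantize a block of floats to signed integer mantissas that share
    one exponent. Returns (mantissas, exponent), value ~ m * 2**exponent."""
    peak = max(abs(x) for x in block)
    if peak == 0:
        return [0] * len(block), 0
    # frexp gives peak = f * 2**e with 0.5 <= f < 1, so every |x| <= 2**e.
    _, e = math.frexp(peak)
    # Scale so the block fits in a signed mantissa of mantissa_bits bits.
    shift = mantissa_bits - 1 - e
    hi = 2 ** (mantissa_bits - 1) - 1
    lo = -hi - 1
    mantissas = [max(lo, min(hi, round(math.ldexp(x, shift)))) for x in block]
    return mantissas, -shift


def bfp_dequantize(mantissas, exponent):
    """Reconstruct floats from mantissas and the shared exponent."""
    return [math.ldexp(m, exponent) for m in mantissas]


if __name__ == "__main__":
    m, e = bfp_quantize([0.5, -0.25, 0.125], 8)
    print(m, e, bfp_dequantize(m, e))
```

Fewer mantissa bits mean narrower operands and less hardware, at the cost of larger quantization error for the small values in a block. The abstract describes automating that trade-off against a user-supplied accuracy budget.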

    Application of novel technologies for the development of next generation MR compatible PET inserts

    Multimodal imaging integrating Positron Emission Tomography and Magnetic Resonance Imaging (PET/MRI) has recognized advantages over other available combinations, allowing both functional and structural information to be acquired with very high precision and repeatability. However, it has yet to be adopted as the standard for experimental and clinical applications, for a variety of reasons mainly related to system cost and flexibility. A promising existing approach, silicon photodetector-based MR-compatible PET inserts consisting of very thin PET devices that can be inserted into the MRI bore, has been pioneered, but it has not disrupted the market as expected. Existing technological solutions that can make this type of insert lighter, more cost-effective and more adaptable to the application need to be researched further. In this context, we expand the study of sub-surface laser engraving (SSLE) for scintillators used in PET. By acquiring, measuring and calibrating an SSLE setup, we study the effect of different engraving configurations on the detection characteristics of the scintillation light at the photosensors. We demonstrate that, apart from cost-effectiveness and ease of application, SSLE-treated scintillators have similar spatial resolution and superior sensitivity and packing fraction compared with standard pixelated arrays, allowing shorter crystals to be used. Design flexibility is benchmarked, and the adoption of a honeycomb architecture is proposed for its geometrical advantages. Furthermore, a variety of depth-of-interaction (DoI) designs are engraved and studied, greatly enhancing applicability in small field-of-view tomographs such as the intended inserts. To meet this need, a novel approach to multi-layer DoI characterization has been developed and is demonstrated. Apart from crystal treatment, considerations on signal transmission and processing are addressed.
A double time-over-threshold (ToT) method is proposed, using the statistics of noise in order to enhance precision. This method is tested, and linearity results demonstrate its applicability to multiplexed readout designs. A study of analog optical wireless communication (aOWC) techniques is also performed and proof-of-concept results are presented. Finally, a ToT readout firmware architecture, intended for low-cost FPGAs, has been developed and is described. By addressing the potential development, applicability and merits of a range of transdisciplinary solutions, we demonstrate that with these techniques it is possible to construct lighter, smaller, lower-consumption, cost-effective MRI-compatible PET inserts. Those designs can make PET/MRI multimodality the dominant clinical and experimental imaging approach, enhancing researcher and physician insight into the mysteries of life.
Abstract (translated from Spanish): The multimodal combination of Positron Emission Tomography with Magnetic Resonance Imaging (PET/MRI) has clear advantages over other currently available multimodal techniques, given its ability to record functional and structural information with high precision and repeatability. However, the technique has not yet fully entered clinical practice, largely because of its high cost. Research aimed at improving the development of MRI-compatible PET inserts based on silicon photodetectors, although intense and a source of ingenious solutions, has not yet found the solutions that industry needs. Nevertheless, there are still unexplored options that could help this type of insert evolve toward lighter, cheaper and better-performing devices. This thesis takes a closer look at sub-surface laser engraving (SSLE) for the design of the scintillator crystals used in PET systems. To that end we characterized, measured and calibrated an SSLE procedure, and then studied the effect of the different engraving configurations on the detector specifications. We show that, in addition to the cost-effectiveness and ease of use of this technique, SSLE scintillators have equivalent spatial resolution and superior sensitivity and packing fraction compared with conventional scintillator arrays, which makes it possible to use shorter crystals to achieve the same sensitivity. These designs also allow the depth of interaction (DoI) to be measured, which facilitates their use in small-radius tomographs such as preclinical systems, dedicated systems (head or breast) or MRI inserts. Besides working on the treatment of the scintillation crystal, we considered new approaches to signal processing and transmission. We propose an innovative double time-over-threshold (ToT) measurement method that incorporates an evaluation of the noise statistics in order to improve precision. The method has been validated, and the results demonstrate its feasibility even with sets of multiplexed signals. A study of analog optical wireless communication (aOWC) techniques led to a new proposal for transmitting the signals of the PET detector inserted in the gantry to an external signal processor, a technique that has been validated in a demonstrator. Finally, a new ToT signal analysis architecture, implemented in firmware on low-cost FPGAs, has been proposed and demonstrated. The conception and development of these ideas, together with the evaluation of the merits of the different proposed solutions, show that with these techniques it is possible to build PET inserts compatible with MRI systems that are lighter and more compact, with reduced power consumption and lower cost.
This helps the PET/MRI multimodal technique enter clinical practice, improving the understanding that physicians and researchers can reach in their study of the mysteries of life. Doctoral program: Programa Oficial de Doctorado en Ingeniería Eléctrica, Electrónica y Automática. Committee: President: Andrés Santos Lleó; Secretary: Luis Hernández Corporales; Member: Giancarlo Sportell
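
The time-over-threshold (ToT) readout mentioned in the abstract above can be illustrated with a basic single-threshold measurement on a sampled pulse. This sketch does not reproduce the thesis's double-ToT method or its use of noise statistics, which the abstract does not specify. The function name, the linear interpolation of the crossings and the sample values are all assumptions.

```python
def time_over_threshold(samples, dt, threshold):
    """Duration for which the first pulse in `samples` (spaced dt apart)
    stays at or above `threshold`; None if no complete pulse is found.
    Crossing times are estimated by linear interpolation between samples."""
    start = None
    for i in range(1, len(samples)):
        prev, cur = samples[i - 1], samples[i]
        if start is None and prev < threshold <= cur:
            # Rising edge: interpolate where the signal reaches the threshold.
            frac = (threshold - prev) / (cur - prev)
            start = (i - 1 + frac) * dt
        elif start is not None and prev >= threshold > cur:
            # Falling edge: interpolate where the signal drops below it.
            frac = (prev - threshold) / (prev - cur)
            return (i - 1 + frac) * dt - start
    return None


if __name__ == "__main__":
    pulse = [0.0, 1.0, 2.0, 1.0, 0.0]  # hypothetical triangular pulse
    print(time_over_threshold(pulse, dt=1.0, threshold=0.5))
```

Because the pulse width above a fixed threshold grows with pulse amplitude, ToT lets a comparator and a timer stand in for an ADC, which is why it suits low-cost FPGA readout.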

    Embedded electronic systems driven by run-time reconfigurable hardware

    Abstract: This doctoral thesis addresses the design of embedded electronic systems based on run-time reconfigurable hardware technology (available through SRAM-based FPGA/SoC devices), with the aim of helping to enhance people's quality of life. The work investigates the system architecture and the reconfiguration engine that gives the FPGA the capability of dynamic partial reconfiguration, in order to synthesize, by means of hardware/software co-design, a given application partitioned into processing tasks that are multiplexed in time and space. This optimizes its physical implementation (silicon area, processing time, complexity, flexibility, functional density, cost and power consumption) in comparison with alternatives based on static hardware (MCU, DSP, GPU, ASSP, ASIC, etc.). The design flow of this technology is evaluated through the prototyping of several engineering applications (control systems, mathematical coprocessors, complex image processors, etc.), showing a level of maturity high enough for its exploitation in industry.
Resumen (translated from Spanish): This doctoral thesis covers the design of embedded electronic systems based on dynamically reconfigurable hardware technology (available through SRAM-based FPGA/SoC programmable logic devices) that contribute to improving society's quality of life. It investigates the architecture of the system and of the reconfiguration engine that gives the FPGA the capability of dynamic partial reconfiguration of its programmable resources, in order to synthesize, through hardware/software co-design, a given application partitioned into tasks multiplexed in time and space, thereby optimizing its physical implementation (silicon area, processing time, complexity, flexibility, functional density, cost and dissipated power) compared with alternatives based on static hardware (MCU, DSP, GPU, ASSP, ASIC, etc.). The design flow of this technology is evaluated through the prototyping of several engineering applications (control systems, arithmetic coprocessors, image processors, etc.), showing a level of maturity already viable for its exploitation in industry.
Resum (translated from Catalan): This doctoral thesis is oriented to the design of embedded electronic systems based on dynamically reconfigurable hardware technology (available through SRAM-based FPGA/SoC programmable logic devices) that contribute to improving society's quality of life. It investigates the architecture of the system and of the reconfiguration engine that gives the FPGA the capability of dynamic partial reconfiguration of its programmable resources, with the aim of synthesizing, through hardware/software co-design, a given application partitioned into tasks multiplexed in time and space, thereby optimizing its physical implementation (silicon area, processing time, complexity, flexibility, functional density, cost and dissipated power) compared with alternatives based on static hardware (MCU, DSP, GPU, ASSP, ASIC, etc.). The design flow of this technology is evaluated through the prototyping of several engineering applications (control systems, arithmetic coprocessors, image processors, etc.), demonstrating a level of maturity already viable for its exploitation in industry.

    Techniques for low-overhead dynamic partial reconfiguration of FPGAs

    FPGA dynamic and partial reconfiguration : a survey of architectures, methods, and applications

    Dynamic and partial reconfiguration are key differentiating capabilities of field programmable gate arrays (FPGAs). While they have been studied extensively in academic literature, they find limited use in deployed systems. We review FPGA reconfiguration, looking at architectures built for the purpose, and the properties of modern commercial architectures. We then investigate design flows, and identify the key challenges in making reconfigurable FPGA systems easier to design. Finally, we look at applications where reconfiguration has found use, as well as proposing new areas where this capability places FPGAs in a unique position for adoption

    Hardware design and CAD for processor-based logic emulation systems.
