939 research outputs found
Experimental Evaluation and Comparison of Time-Multiplexed Multi-FPGA Routing Architectures
Emulating large, complex designs requires multi-FPGA systems (MFS). However, inter-FPGA communication is limited by interconnect capacity, because each FPGA has a limited number of input/output (I/O) pins. Serializing parallel signals onto a single trace effectively addresses this pin limitation. Besides the multiplexing scheme and the multiplexing ratio (the number of inter-FPGA signals per trace), the choice of MFS routing architecture also affects the critical path latency. The routing architecture of an MFS is the interconnection pattern of FPGAs, fixed wires and/or programmable interconnect chips. The performance of existing MFS routing architectures is also limited by the choice of off-chip interface. In this dissertation we proposed novel 2D and 3D latency-optimized time-multiplexed MFS routing architectures. We used a rigorous experimental approach and real sequential benchmark circuits to evaluate and compare the proposed and existing MFS routing architectures. This research provides new insight into the benefits of using an off-chip optical interface and three-dimensional MFS routing architectures. Vertical stacking results in shorter off-chip links, which improves the overall system frequency, with the additional advantage of a smaller footprint. The proposed 3D architectures employed serialized interconnect between intra-plane and inter-plane FPGAs to address the pin limitation problem. Additionally, all off-chip links were replaced by optical fibers, which improved latency and resulted in a faster MFS. Results indicated that exploiting the third dimension provided latency and area improvements compared to 2D MFS. We also proposed latency-optimized planar 2D MFS architectures in which electrical interconnections are replaced by an optical interface in the same spatial distribution.
Performance evaluation and comparison showed that the proposed architectures reduce critical path delay and improve system frequency compared to conventional MFS. We also experimentally evaluated and compared the system performance of three inter-FPGA communication schemes, namely Logic Multiplexing, SERDES and MGT, in conjunction with two routing architectures, namely Completely Connected Graph (CCG) and TORUS. Experimental results showed that SERDES attained a higher maximum frequency than the other two schemes. However, for very high multiplexing ratios, the performance of SERDES and MGT became comparable.
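The multiplexing ratio defined in this abstract (inter-FPGA signals per trace) lends itself to a small piece of arithmetic. The following is only a sketch: the function names and the signal and pin counts are hypothetical and do not come from the dissertation. It also assumes every trace uses exactly one I/O pin and ignores pins needed for clocks, control or differential signalling.

```python
def traces_needed(num_signals: int, mux_ratio: int) -> int:
    """Physical traces required when up to mux_ratio signals share one trace."""
    if num_signals < 0 or mux_ratio < 1:
        raise ValueError("num_signals must be >= 0 and mux_ratio >= 1")
    # Integer ceiling division: a partially filled trace still costs one trace.
    return -(-num_signals // mux_ratio)


def min_mux_ratio(num_signals: int, available_pins: int) -> int:
    """Smallest multiplexing ratio that fits num_signals into available_pins."""
    if available_pins < 1:
        raise ValueError("available_pins must be >= 1")
    # A ratio of 1 means no multiplexing is needed at all.
    return max(1, -(-num_signals // available_pins))


# Hypothetical numbers, purely for illustration.
print(traces_needed(1000, 8))    # 125
print(min_mux_ratio(1000, 200))  # 5
```

A higher ratio saves pins but, as the abstract notes, it also influences the critical path latency, so the smallest ratio that fits is a natural starting point rather than a final design choice.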
System-on-chip Computing and Interconnection Architectures for Telecommunications and Signal Processing
This dissertation proposes novel architectures and design techniques targeting SoC building blocks for telecommunications and signal processing applications.
Hardware implementation of Low-Density Parity-Check decoders is approached at both the algorithmic and the architecture level. Low-Density Parity-Check codes are a promising coding scheme for future communication standards due to their outstanding error correction performance.
This work proposes a methodology for analyzing the effects of finite-precision arithmetic on error correction performance and hardware complexity. The methodology is employed throughout for co-designing the decoder. First, a low-complexity check node based on the P-output decoding principle is designed and characterized on a CMOS standard-cells library. Results demonstrate an implementation loss below 0.2 dB down to a BER of 10^{-8} and a complexity saving of up to 59% with respect to other works in recent literature. High-throughput and low-latency issues are addressed with modified single-phase decoding schedules. A new "memory-aware" schedule is proposed that requires as little as 20% of the memory of traditional two-phase flooding decoding. Additionally, throughput is doubled and logic complexity is reduced by 12%. These advantages are traded off against error correction performance, which makes the solution attractive only for long codes, such as those adopted in the DVB-S2 standard. The "layered decoding" principle is extended to codes not specifically conceived for this technique. The proposed architectures exhibit complexity savings on the order of 40% in both area and power consumption, while implementation loss is smaller than 0.05 dB.
Most modern communication standards employ Orthogonal Frequency Division Multiplexing (OFDM) as part of their physical layer. The core of OFDM is the Fast Fourier Transform (FFT) and its inverse, which are in charge of symbol (de)modulation. Requirements on throughput and energy efficiency call for hardware implementation of the FFT, while its ubiquity suggests the design of parametric, re-configurable and re-usable IP hardware macrocells. In this context, this thesis describes an FFT/IFFT core compiler particularly suited to the implementation of OFDM communication systems. The tool employs an accuracy-driven configuration engine which automatically profiles the internal arithmetic and generates a core with minimum operand bit-width and thus minimum circuit complexity. The engine performs a closed-loop optimization over three different internal arithmetic models (fixed-point, block floating-point and convergent block floating-point), using the numerical accuracy budget given by the user as a reference point. The flexibility and re-usability of the proposed macrocell are illustrated through several case studies which encompass all current state-of-the-art OFDM communication standards (WLAN, WMAN, xDSL, DVB-T/H, DAB and UWB). Implementation results are presented for two deep sub-micron standard-cells libraries (65 and 90 nm) and commercially available FPGA devices. Compared with other FFT core compilers, the proposed environment produces macrocells with lower circuit complexity and the same system-level performance (throughput, transform size and numerical accuracy).
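The idea of an accuracy-driven search for the minimum operand bit-width can be illustrated with a toy model. The sketch below is not the compiler described in the thesis. It assumes a plain radix-2 FFT, models only fixed-point rounding of the fractional part after each butterfly (no overflow, no block floating-point), and uses hypothetical names (`quantize`, `sqnr_db`, `min_frac_bits`). The accuracy budget is expressed as a signal-to-quantization-noise ratio in dB, which is also an assumption.

```python
import cmath
import math


def quantize(z: complex, frac_bits: int) -> complex:
    """Round real and imaginary parts to a grid of 2**-frac_bits."""
    scale = 1 << frac_bits
    return complex(round(z.real * scale) / scale, round(z.imag * scale) / scale)


def fft(x, frac_bits=None):
    """Recursive radix-2 FFT; len(x) must be a power of two.

    If frac_bits is given, every butterfly output is rounded, which mimics
    a fixed-point datapath with that many fractional bits.
    """
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], frac_bits)
    odd = fft(x[1::2], frac_bits)
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        a, b = even[k] + t, even[k] - t
        if frac_bits is not None:
            a, b = quantize(a, frac_bits), quantize(b, frac_bits)
        out[k], out[k + n // 2] = a, b
    return out


def sqnr_db(x, frac_bits: int) -> float:
    """Signal-to-quantization-noise ratio of the rounded FFT, in dB."""
    ref = fft(x)
    approx = fft(x, frac_bits)
    signal = sum(abs(r) ** 2 for r in ref)
    noise = sum(abs(r - a) ** 2 for r, a in zip(ref, approx))
    return math.inf if noise == 0 else 10 * math.log10(signal / noise)


def min_frac_bits(x, budget_db: float, max_bits: int = 32) -> int:
    """Closed-loop search: the smallest bit-width that meets the budget."""
    for bits in range(1, max_bits + 1):
        if sqnr_db(x, bits) >= budget_db:
            return bits
    raise ValueError("accuracy budget not reachable within max_bits")


# Hypothetical test vector and budget.
x = [complex(math.cos(0.3 * n), math.sin(0.7 * n)) for n in range(64)]
print(min_frac_bits(x, 60.0))
```

A real configuration engine would repeat such a loop for each arithmetic model and pick the cheapest one that meets the budget. The linear search is used here only because it makes the "minimum" property obvious.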
The final part of this dissertation focuses on the Network-on-Chip (NoC) design paradigm, whose goal is to build scalable communication infrastructures connecting hundreds of cores. A low-complexity link architecture for mesochronous on-chip communication is discussed. The link relaxes skew constraints in clock tree synthesis and enables higher clock frequency, lower power consumption and faster back-end turnarounds. The proposed architecture reaches a maximum clock frequency of 1 GHz on a 65 nm low-leakage CMOS standard-cells library. In a complex test case with a full-blown NoC infrastructure, the link overhead is only 3% of chip area and 0.5% of leakage power consumption.
Finally, a new methodology, named metacoding, is proposed. Metacoding generates correct-by-construction, technology-independent RTL codebases for NoC building blocks. The RTL coding phase is abstracted and modeled with an object-oriented framework, integrated within a commercial tool for IP packaging (Synopsys CoreTools suite). Compared with traditional coding styles based on pre-processor directives, metacoding produces 65% smaller codebases and reduces the number of configurations to verify by up to three orders of magnitude.
Application of novel technologies for the development of next generation MR compatible PET inserts
Multimodal imaging that integrates Positron Emission Tomography and Magnetic
Resonance Imaging (PET/MRI) has clear advantages over other available
combinations, allowing both functional and structural information to be acquired with
very high precision and repeatability. However, it has yet to be adopted as the standard
for experimental and clinical applications, for a variety of reasons mainly related to
system cost and flexibility. A promising existing approach, silicon photodetector-based
MR-compatible PET inserts consisting of very thin PET devices that can be placed in the
MRI bore, has been pioneered but has not disrupted the market as expected. Existing
technological solutions that can make this type of insert lighter, more cost-effective and
more adaptable to the application need to be researched further.
In this context, we extend the study of sub-surface laser engraving (SSLE) for
scintillators used in PET. By acquiring, measuring and calibrating an SSLE
setup, we study the effect of different engraving configurations on how the
photosensors detect the scintillation light. We demonstrate that, apart
from cost-effectiveness and ease of application, SSLE-treated scintillators have similar
spatial resolution and superior sensitivity and packing fraction compared to standard
pixelated arrays, allowing shorter crystals to be used. Flexibility of design is
benchmarked, and adoption of a honeycomb architecture is proposed because of its
geometrical advantages. Furthermore, a variety of depth-of-interaction (DoI) designs are
engraved and studied, greatly enhancing applicability in small field-of-view tomographs
such as the intended inserts. To meet this need, a novel approach for multi-layer DoI
characterization has been developed and is demonstrated.
Apart from crystal treatment, considerations on signal transmission and processing are
addressed. A double time-over-threshold (ToT) method is proposed, using the statistics of
noise in order to enhance precision. This method is tested and linearity results
demonstrate applicability for multiplexed readout designs. A study on analog optical
wireless communication (aOWC) techniques is also performed, and proof-of-concept
results are presented. Finally, a ToT readout firmware architecture, intended for low-cost
FPGAs, has been developed and is described.
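For readers unfamiliar with time-over-threshold, the sketch below shows what a ToT measurement at two thresholds means on a sampled pulse. The abstract does not describe how the two measurements are combined or how the noise statistics are used, so none of that is modelled here. The function name, the triangular test pulse and the threshold values are all hypothetical, and crossings are located by simple linear interpolation between samples.

```python
def time_over_threshold(samples, dt, threshold):
    """Time the pulse spends above threshold, in the units of dt.

    Measures from the first upward crossing to the next downward crossing,
    interpolating linearly between neighbouring samples. Returns 0.0 if the
    pulse never completes such a crossing pair.
    """
    start = None
    for i in range(1, len(samples)):
        prev, cur = samples[i - 1], samples[i]
        if start is None and prev < threshold <= cur:
            # Upward crossing between sample i-1 and sample i.
            start = (i - 1 + (threshold - prev) / (cur - prev)) * dt
        elif start is not None and prev >= threshold > cur:
            # Downward crossing between sample i-1 and sample i.
            end = (i - 1 + (prev - threshold) / (prev - cur)) * dt
            return end - start
    return 0.0


# Hypothetical triangular pulse sampled once per time unit.
pulse = [0, 1, 2, 3, 4, 3, 2, 1, 0]
low = time_over_threshold(pulse, 1.0, 1.0)   # 6.0
high = time_over_threshold(pulse, 1.0, 2.0)  # 4.0
print(low, high)
```

Two thresholds give two widths per pulse instead of one, which is extra information about the pulse shape. How the proposed method exploits it is described in the thesis itself, not here.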
By addressing the potential development, applicability and merits of a range of
transdisciplinary solutions, we demonstrate that with these techniques it is possible to
construct lighter, smaller, lower-power, cost-effective MRI-compatible PET inserts.
These designs can make PET/MRI multimodality the dominant clinical and experimental
imaging approach, enhancing researcher and physician insight into the mysteries of life.
The multimodal combination of Positron Emission Tomography and Magnetic
Resonance Imaging (PET/MRI) has clear advantages over other currently available
multimodal techniques, given its ability to record functional and structural
information with high precision and repeatability. However, the technique has not yet
fully entered clinical practice, largely because of its high cost. Research aimed at
improving the development of silicon photodetector-based, MRI-compatible PET inserts,
although intense and a source of ingenious solutions, has not yet found the solutions
that industry needs. Nevertheless, there are still unexplored options that could help
this type of insert evolve, yielding lighter, cheaper devices with better performance.
This thesis deepens the study of sub-surface laser engraving (SSLE) for the design of
the scintillation crystals used in PET systems. To that end we characterized, measured
and calibrated an SSLE procedure, and then studied the effect of the different engraving
configurations on the detector specifications. We show that, in addition to the
cost-effectiveness and ease of use of this technique, SSLE scintillators have equivalent
spatial resolution and higher sensitivity and packing fraction than conventional
scintillator arrays, which makes it possible to use shorter crystals to achieve the same
sensitivity. These designs also allow the depth of interaction (DoI) to be measured,
which facilitates their use in small-radius tomographs such as preclinical systems,
dedicated systems (head or breast) or MRI inserts.
In addition to working on the treatment of the scintillation crystal, we considered new
approaches to signal processing and transmission. We propose an innovative double
time-over-threshold (ToT) measurement method that incorporates an evaluation of the
noise statistics in order to improve precision. The method has been validated, and the
results demonstrate its feasibility even with sets of multiplexed signals. A study of
analog optical wireless communication (aOWC) techniques has led to a new proposal for
transmitting the signals of the PET detector inserted in the gantry to an external
signal processor, a technique that has been validated in a demonstrator. Finally, a new
ToT signal analysis architecture implemented in firmware on low-cost FPGAs has been
proposed and demonstrated.
The conception and development of these ideas, together with the evaluation of the
merits of the different proposed solutions, demonstrate that with these techniques it is
possible to build PET inserts compatible with MRI systems that will be lighter and more
compact, with reduced power consumption and lower cost. This helps the PET/MRI
multimodal technique to enter clinical practice, improving the understanding that
physicians and researchers can reach in their study of the mysteries of life.
Programa Oficial de Doctorado en Ingeniería Eléctrica, Electrónica y Automática
Committee chair: Andrés Santos Lleó. Secretary: Luis Hernández Corporales. Member: Giancarlo Sportell
Embedded electronic systems driven by run-time reconfigurable hardware
Abstract
This doctoral thesis addresses the design of embedded electronic systems based on run-time reconfigurable hardware technology (available through SRAM-based FPGA/SoC devices), with the aim of helping to improve people's quality of life. The work investigates the system architecture and the reconfiguration engine that gives the FPGA the capability of dynamic partial reconfiguration, in order to synthesize, by means of hardware/software co-design, a given application partitioned into processing tasks that are multiplexed in time and space. This optimizes its physical implementation (silicon area, processing time, complexity, flexibility, functional density, cost and power consumption) in comparison with other alternatives based on static hardware (MCU, DSP, GPU, ASSP, ASIC, etc.). The design flow of this technology is evaluated through the prototyping of several engineering applications (control systems, mathematical coprocessors, complex image processors, etc.), showing a level of maturity high enough for its exploitation in industry.
Abstract (translated from Spanish)
This doctoral thesis covers the design of embedded electronic systems based on dynamically reconfigurable hardware technology (available through SRAM FPGA/SoC programmable logic devices) that contribute to improving society's quality of life. It investigates the architecture of the system and of the reconfiguration engine that gives the FPGA the capability of dynamic partial reconfiguration of its programmable resources, in order to synthesize, through hardware/software co-design, a given application partitioned into tasks multiplexed in time and space, thereby optimizing its physical implementation (silicon area, processing time, complexity, flexibility, functional density, cost and dissipated power) compared with other alternatives based on static hardware (MCU, DSP, GPU, ASSP, ASIC, etc.). The design flow of this technology is evaluated through the prototyping of several engineering applications (control systems, arithmetic coprocessors, image processors, etc.), showing a level of maturity that is already viable for its exploitation in industry.
Abstract (translated from Catalan)
This doctoral thesis is oriented toward the design of embedded electronic systems based on dynamically reconfigurable hardware technology (available through SRAM FPGA/SoC programmable logic devices) that contribute to improving society's quality of life. It investigates the architecture of the system and of the reconfiguration engine that gives the FPGA the capability of dynamic partial reconfiguration of its programmable resources, with the aim of synthesizing, through hardware/software co-design, a given application partitioned into tasks multiplexed in time and space, thereby optimizing its physical implementation (silicon area, processing time, complexity, flexibility, functional density, cost and dissipated power) compared with other alternatives based on static hardware (MCU, DSP, GPU, ASSP, ASIC, etc.). The design flow of this technology is evaluated through the prototyping of several engineering applications (control systems, arithmetic coprocessors, image processors, etc.), demonstrating a level of maturity that is already viable for its exploitation in industry.
FPGA dynamic and partial reconfiguration : a survey of architectures, methods, and applications
Dynamic and partial reconfiguration are key differentiating capabilities of field programmable gate arrays (FPGAs). While they have been studied extensively in academic literature, they find limited use in deployed systems. We review FPGA reconfiguration, looking at architectures built for the purpose and at the properties of modern commercial architectures. We then investigate design flows and identify the key challenges in making reconfigurable FPGA systems easier to design. Finally, we look at applications where reconfiguration has found use, and propose new areas where this capability places FPGAs in a unique position for adoption.