FASTER: Facilitating Analysis and Synthesis Technologies for Effective Reconfiguration
The FASTER (Facilitating Analysis and Synthesis Technologies for Effective Reconfiguration) EU FP7 project aims to ease the design and implementation of dynamically changing hardware systems. Our motivation stems from the promise reconfigurable systems hold for achieving high performance and extending product functionality and lifetime via the addition of new features that operate at hardware speed. However, designing a changing hardware system is both challenging and time-consuming. FASTER facilitates the use of reconfigurable technology by providing a complete methodology that enables designers to easily specify, analyze, implement and verify applications on platforms with general-purpose processors and acceleration modules implemented in the latest reconfigurable technology. Our tool-chain supports both coarse- and fine-grain FPGA reconfiguration, while during execution a flexible run-time system manages the reconfigurable resources. We target three applications from different domains. We explore the way each application benefits from reconfiguration, and then we assess them and the FASTER tools in terms of performance, area consumption and accuracy of analysis.
Application of novel technologies for the development of next generation MR compatible PET inserts
Multimodal imaging integrating Positron Emission Tomography and Magnetic
Resonance Imaging (PET/MRI) has professed advantages as compared to other available
combinations, allowing both functional and structural information to be acquired with
very high precision and repeatability. However, it has yet to be adopted as the standard
for experimental and clinical applications, due to a variety of reasons mainly related to
system cost and flexibility. A promising existing approach, silicon photodetector-based
MR-compatible PET inserts consisting of very thin PET devices that can be inserted into the
MRI bore, has been pioneered but has not disrupted the market as expected. Existing
technological solutions that can make this type of insert lighter, more cost-effective and
more adaptable to the application need to be researched further.
In this context, we expand the study of sub-surface laser engraving (SSLE) for
scintillators used for PET. By acquiring, measuring and calibrating an SSLE
setup, we study the effect of different engraving configurations on the detection
characteristics of the scintillation light by the photosensors. We demonstrate that, apart
from cost-effectiveness and ease of application, SSLE-treated scintillators have similar
spatial resolution and superior sensitivity and packing fraction compared to standard
pixelated arrays, allowing for shorter crystals to be used. Flexibility of design is
benchmarked and adoption of honeycomb architecture due to geometrical advantages is
proposed. Furthermore, a variety of depth-of-interaction (DoI) designs are engraved and
studied, greatly enhancing applicability in small field-of-view tomographs, such as the
intended inserts. To adapt to this need, a novel approach for multi-layer DoI
characterization has been developed and is demonstrated.
Apart from crystal treatment, considerations on signal transmission and processing are
addressed. A double time-over-threshold (ToT) method is proposed, using the statistics of
noise in order to enhance precision. This method is tested and linearity results
demonstrate applicability for multiplexed readout designs. A study on analog optical
wireless communication (aOWC) techniques is also performed and proof-of-concept
results are presented. Finally, a ToT readout firmware architecture, intended for low-cost
FPGAs, has been developed and is described.
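As a rough illustration of the time-over-threshold idea mentioned above, the sketch
below counts how many clock ticks a sampled pulse spends above each of two thresholds,
much as a comparator-gated counter in an FPGA might. It is not the method proposed in
the thesis: the noise-statistics step is not modeled, and the function names, the
threshold values and the clock period are all hypothetical.

```python
def time_over_threshold(samples, threshold, clock_period_ns):
    """Time (ns) a sampled pulse spends above `threshold`.

    Assumes one pulse per acquisition window and a fixed sampling clock,
    so the result is simply (number of samples above threshold) * period.
    """
    ticks = sum(1 for s in samples if s > threshold)
    return ticks * clock_period_ns


def dual_tot(samples, low, high, clock_period_ns):
    """Return the ToT measured at a low and at a high threshold (hypothetical helper)."""
    return (
        time_over_threshold(samples, low, clock_period_ns),
        time_over_threshold(samples, high, clock_period_ns),
    )


# Toy pulse sampled every 2 ns (illustrative values only).
pulse = [0, 1, 3, 5, 3, 1, 0]
tot_low, tot_high = dual_tot(pulse, low=2, high=4, clock_period_ns=2.0)
```

Two widths taken at different levels give two points on the pulse shape, which is why
a second threshold can help relate pulse width to pulse amplitude.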
By addressing the potential development, applicability and merits of a range of
transdisciplinary solutions, we demonstrate that with these techniques it is possible to
construct lighter, smaller, lower consumption, cost-effective MRI compatible PET inserts.
Those designs can make PET/MRI multimodality the dominant clinical and experimental
imaging approach, enhancing researcher and physician insight into the mysteries of life.
The multimodal combination of Positron Emission Tomography and Magnetic Resonance
Imaging (PET/MRI) has clear advantages over other currently available multimodal
techniques, given its ability to record functional and structural information with high
precision and repeatability. However, the technique has yet to enter clinical practice,
largely because of its high cost. Research aimed at improving the development of
MRI-compatible PET inserts based on silicon photodetectors, although intense and a source
of ingenious solutions, has not yet found the solutions that industry needs. However,
there are still unexplored options that could help this type of insert evolve into
lighter, cheaper and better-performing devices.
This thesis delves into the study of sub-surface laser engraving (SSLE) for the design of
the scintillator crystals used in PET systems. To this end we have characterized, measured
and calibrated an SSLE procedure, and then studied the effect that the different engraving
configurations have on the detector specifications. We demonstrate that, in addition to
the cost-effectiveness and ease of use of this technique, SSLE scintillators have
equivalent spatial resolution and superior sensitivity and packing fraction compared to
conventional scintillation arrays, which makes it possible to use shorter crystals to
achieve the same sensitivity. These designs also allow the depth of interaction (DoI) to
be measured, which facilitates their use in small-radius tomographs such as preclinical
systems, dedicated systems (head or breast) or MRI inserts.
In addition to working on the treatment of the scintillation crystal, we have considered
new approaches to signal processing and transmission. We propose an innovative double
time-over-threshold (ToT) measurement method that integrates an evaluation of the noise
statistics in order to improve precision. The method has been validated and the results
demonstrate its feasibility even for sets of multiplexed signals. A study of analog
optical wireless communication (aOWC) techniques has led to a new proposal for
communicating the signals of the PET detector inserted in the gantry to an external
signal processor, a technique that has been validated in a demonstrator. Finally, a new
ToT signal analysis architecture implemented in firmware on low-cost FPGAs has been
proposed and demonstrated.
The conception and development of these ideas, as well as the evaluation of the merits of
the different proposed solutions, demonstrate that with these techniques it is possible
to build PET inserts compatible with MRI systems that will be lighter and more compact,
with reduced power consumption and lower cost. This contributes to the multimodal PET/MRI
technique entering clinical practice, improving the understanding that physicians and
researchers can reach in their study of the mysteries of life.
Programa Oficial de Doctorado en Ingeniería Eléctrica, Electrónica y Automática
Chair: Andrés Santos Lleó. Secretary: Luis Hernández Corporales. Member: Giancarlo Sportell
Late-bound code generation
Each time a function or method is invoked during the execution of a program, a stream of instructions is issued to some underlying hardware platform. But exactly what underlying hardware, and which instructions, is usually left implicit. However, in certain situations it becomes important to control these decisions. For example, particular problems can only be solved in real-time when scheduled on specialised accelerators, such as graphics coprocessors or computing clusters.
We introduce a novel operator for hygienically reifying the behaviour of a runtime function instance as a syntactic fragment, in a language which may in general differ from the source function definition. Translation and optimisation are performed by recursively invoked, dynamically dispatched code generators. Side-effecting operations are permitted, and their ordering is preserved.
We compare our operator with other techniques for pragmatic control, observing that the use of our operator supports lifting arbitrary mutable objects, and neither requires rewriting sections of the source program in a multi-level language, nor interferes with the interface to individual software components. Due to its lack of interference at the abstraction level at which software is composed, we believe that our approach poses a significantly lower barrier to practical adoption than current methods.
The practical efficacy of our operator is demonstrated by using it to offload the user interface rendering of a smartphone application to an FPGA coprocessor, including both statically and procedurally defined user interface components. The generated pipeline is an application-specific, statically scheduled processor-per-primitive rendering pipeline, suitable for place-and-route style optimisation.
To demonstrate the compatibility of our operator with existing languages, we show how it may be defined within the Python programming language. We introduce a transformation for weakening mutable to immutable named bindings, termed let-weakening, to solve the problem of propagating information pertaining to named variables between modular code generating units. Open Acces
Acceleration Techniques for Sparse Recovery Based Plane-wave Decomposition of a Sound Field
Plane-wave decomposition by sparse recovery is a reliable and accurate technique that can be used for source localization, beamforming, etc. In this work, we introduce techniques to accelerate plane-wave decomposition by sparse recovery. The method consists of two main algorithms: the spherical Fourier transformation (SFT) and sparse recovery. Of the two, sparse recovery is the more computationally intensive. We implement the SFT on an FPGA and the sparse recovery on a multithreaded computing platform, so that the multithreaded platform can be fully utilized for the sparse recovery. In addition, implementing the SFT on an FPGA helps to flexibly integrate the microphones and improves the portability of the microphone array. For implementing the SFT on an FPGA, we develop a scalable FPGA design model that enables the quick design of the SFT architecture on FPGAs. The model considers the number of microphones, the number of SFT channels and the cost of the FPGA, and outputs the design of a resource-optimized and cost-effective FPGA architecture. We then investigate the performance of the sparse recovery algorithm executed on various multithreaded computing platforms (i.e., chip-multiprocessor, multiprocessor, GPU, manycore). Finally, we investigate the influence of modifying the dictionary size on the computational performance and the accuracy of the sparse recovery algorithms. We introduce novel sparse-recovery techniques which use non-uniform dictionaries to improve the performance of the sparse recovery on a parallel architecture.
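To make the sparse-recovery step more concrete, here is a minimal sketch of matching pursuit, one simple greedy sparse-recovery algorithm. The abstract does not say which solver the authors use, so this is a stand-in rather than their method. The dictionary here is a tiny real-valued toy (in plane-wave decomposition each atom would correspond to a candidate direction), and the names `matching_pursuit`, `atoms` and `max_atoms` are hypothetical.

```python
import math


def matching_pursuit(atoms, y, max_atoms, tol=1e-9):
    """Greedy sparse recovery: explain `y` with a few unit-norm dictionary atoms.

    atoms: list of unit-norm vectors (the dictionary columns).
    Returns ({atom_index: coefficient}, residual).
    """
    residual = list(y)
    coeffs = {}
    for _ in range(max_atoms):
        # Correlate every atom with the current residual.
        corr = [sum(a * r for a, r in zip(atom, residual)) for atom in atoms]
        best = max(range(len(atoms)), key=lambda k: abs(corr[k]))
        if abs(corr[best]) <= tol:
            break  # nothing left to explain
        coeffs[best] = coeffs.get(best, 0.0) + corr[best]
        # Remove the selected atom's contribution from the residual.
        residual = [r - corr[best] * a for r, a in zip(residual, atoms[best])]
    return coeffs, residual


s = 1.0 / math.sqrt(2.0)
atoms = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [s, s, 0.0],
    [0.0, s, s],
]
y = [0.0, 3.0, 1.0]  # built from atoms 1 and 2 only
coeffs, residual = matching_pursuit(atoms, y, max_atoms=3)
```

Each iteration costs one correlation per atom, so the work grows with the dictionary size. That is the trade-off between computational cost and accuracy the abstract refers to when it discusses modifying the dictionary.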
A Hierarchical Architectural Framework for Securing Unmanned Aerial Systems
Unmanned Aerial Systems (UAS) are becoming more widely used in the new era of evolving technology, with increasing performance and decreasing size, weight, and cost. A UAS equipped with a Flight Control System (FCS) that can be used to fly semi- or fully autonomously is a prime example of a Cyber-Physical and Safety-Critical system. Current Cyber-Physical defenses against malicious attacks are structured around security standards for best practices involving the development of protocols and the digital software implementation. Thus far, few attempts have been made to embed security into the architecture of the system, considering security as a holistic problem. Therefore, a Hierarchical, Embedded, Cyber Attack Detection (HECAD) framework is developed to provide security in a holistic manner, providing resiliency against cyber-attacks as well as introducing strategies for mitigating and dealing with component failures. Traversing the hardware/software barrier, HECAD provides detection of malicious faults at the hardware and software level; it is verified through the development of an FPGA implementation and tested using a UAS FCS.
OpenMPD: A Low-Level Presentation Engine for Multimodal Particle-Based Displays
Phased arrays of transducers have been quickly evolving in terms of software and hardware with applications in haptics (acoustic vibrations), display (levitation), and audio. Most recently, Multimodal Particle-based Displays (MPDs) have even demonstrated volumetric content that can be seen, heard, and felt simultaneously, without additional instrumentation. However, current software tools only support individual modalities and they do not address the integration and exploitation of the multi-modal potential of MPDs. This is because there is no standardized presentation pipeline tackling the challenges related to presenting this kind of multi-modal content (e.g., multi-modal support, multi-rate synchronization at 10 kHz, visual rendering or synchronization and continuity). This article presents OpenMPD, a low-level presentation engine that deals with these challenges and allows structured exploitation of any type of MPD content (i.e., visual, tactile, audio). We characterize OpenMPD’s performance and illustrate how it can be integrated into higher-level development tools (i.e., Unity game engine). We then illustrate its ability to enable novel presentation capabilities, such as support of multiple MPD contents, dexterous manipulations of fast-moving particles, or novel swept-volume MPD content.
Real Time 3-D Graphics Processing Hardware Design using Field-Programmable Gate Arrays.
Three-dimensional graphics processing requires many complex algebraic and matrix-based operations to be performed in real time. In the early stages of graphics processing, such tasks were delegated to a Central Processing Unit (CPU). Over time, as more complex graphics rendering was demanded, CPU solutions became inadequate. To meet this demand, custom hardware solutions that take advantage of pipelining and massive parallelism became preferable to CPU software-based solutions. This fact has led to the many custom hardware solutions that are available today. Since real-time graphics processing requires extremely high performance, hardware solutions using Application Specific Integrated Circuits (ASICs) are the standard within the industry. While ASICs are a more than adequate solution for implementing high-performance custom hardware, the design, implementation and testing of ASIC-based designs are becoming cost prohibitive due to the massive up-front verification effort needed as well as the cost of fixing design defects. Field Programmable Gate Arrays (FPGAs) provide an alternative to the ASIC design flow. More importantly, in recent years FPGA technology has begun to improve in performance to the point where ASIC and FPGA performance has become comparable. In addition, FPGAs address many of the issues of the ASIC design flow. The ability to reconfigure FPGAs reduces the up-front verification effort and allows design defects to be fixed easily. This thesis demonstrates that a 3-D graphics processor implementation on an FPGA is feasible by implementing both a two-dimensional and a three-dimensional graphics processor prototype. Using a Xilinx Virtex 5 ML506 FPGA development kit, a fully functional wireframe graphics rendering engine is implemented using VHDL and Xilinx's development tools. A VHDL testbench was designed to verify that the graphics engine works functionally.
This is followed by synthesizing the design onto real hardware and developing test applications to verify the functionality and performance of the design. This thesis provides the groundwork for pushing forward the use of FPGA technology in graphics processing applications.
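For readers unfamiliar with what a wireframe engine computes, the sketch below shows the core arithmetic in software: each 3-D vertex is projected onto a 2-D image plane by a perspective divide, and each edge becomes a 2-D line segment to draw. It is a generic pinhole-projection illustration and not the thesis's VHDL design; the function names and the `focal_length` parameter are hypothetical.

```python
def project(vertex, focal_length=1.0):
    """Perspective-project a camera-space point (x, y, z) onto the image plane."""
    x, y, z = vertex
    if z <= 0:
        raise ValueError("vertex must lie in front of the camera (z > 0)")
    return (focal_length * x / z, focal_length * y / z)


def wireframe(vertices, edges, focal_length=1.0):
    """Return the 2-D line segments for a wireframe model.

    vertices: list of (x, y, z) points; edges: list of (i, j) index pairs.
    """
    points = [project(v, focal_length) for v in vertices]
    return [(points[i], points[j]) for i, j in edges]


# A single edge between two vertices (illustrative values only).
segments = wireframe([(0.0, 0.0, 1.0), (2.0, 4.0, 2.0)], [(0, 1)])
```

Every vertex is processed independently and with the same few multiplications and divisions, which is what makes this kind of workload a natural fit for pipelined, parallel hardware.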
Compiling and optimizing spreadsheets for FPGA and multicore execution
Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2007. "September 2007." Includes bibliographical references (p. 102-104). A major barrier to developing systems on multicore and FPGA chips is the lack of an easy-to-use development environment. This thesis presents the RhoZeta spreadsheet compiler and Catalyst optimization system for programming multiprocessors and FPGAs. Any spreadsheet frontend may be extended to work with RhoZeta's multiple interpreters and behavioral abstraction mechanisms. RhoZeta synchronizes a variety of cell interpreters acting on a global memory space. RhoZeta can also compile a group of cells to multithreaded C or Verilog. The result is an easy-to-use interface for programming multicore microprocessors and FPGAs. A spreadsheet environment presents parallelism and locality issues of modern hardware directly to the user and allows for a simple global memory synchronization model. Catalyst is a spreadsheet graph rewriting system based on performing behaviorally invariant guarded atomic actions while a system is being interpreted by RhoZeta. A number of optimization macros were developed to perform speculation, resource sharing and propagation of static assignments through a circuit. Parallelization of a 64-bit serial leading-zero-counter is demonstrated with Catalyst. Fault tolerance macros were also developed in Catalyst to protect against dynamic faults and to offset costs associated with testing semiconductors for static defects. A model for partitioning, placing and profiling spreadsheet execution in a heterogeneous hardware environment is also discussed. The RhoZeta system has been used to design several multithreaded and FPGA applications including a RISC emulator and a MIDI controlled modular synthesizer. by Amir Hirsch. M.Eng
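The leading-zero-counter example can be illustrated in software. The sketch below contrasts a serial bit-by-bit counter with a divide-and-conquer version whose two halves are independent and could therefore be evaluated in parallel in hardware. It shows the general idea only and is not Catalyst's rewriting procedure; the function names are hypothetical and `width` is assumed to be a power of two.

```python
def clz_serial(x, width=64):
    """Count leading zeros by scanning from the most significant bit down."""
    count = 0
    for bit in range(width - 1, -1, -1):
        if (x >> bit) & 1:
            break
        count += 1
    return count


def clz_tree(x, width=64):
    """Count leading zeros by recursively splitting the word in half.

    In hardware both halves can be counted at the same time and the
    result selected with a multiplexer, giving log2(width) levels.
    """
    if width == 1:
        return 1 - (x & 1)
    half = width // 2
    hi = x >> half
    lo = x & ((1 << half) - 1)
    if hi:
        return clz_tree(hi, half)
    return half + clz_tree(lo, half)
```

The serial version needs up to 64 dependent steps for a 64-bit word, while the tree version needs only six levels of selection.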