20 research outputs found

    High level compilation for gate reconfigurable architectures

    Get PDF
    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2001.Includes bibliographical references (p. 205-215).A continuing exponential increase in the number of programmable elements is turning management of gate-reconfigurable architectures as "glue logic" into an intractable problem; it is past time to raise this abstraction level. The physical hardware in gate-reconfigurable architectures is all low level - individual wires, bit-level functions, and single bit registers - hence one should look to the fetch-decode-execute machinery of traditional computers for higher level abstractions. Ordinary computers have machine-level architectural mechanisms that interpret instructions - instructions that are generated by a high-level compiler. Efficiently moving up to the next abstraction level requires leveraging these mechanisms without introducing the overhead of machine-level interpretation. In this dissertation, I solve this fundamental problem by specializing architectural mechanisms with respect to input programs. This solution is the key to efficient compilation of high-level programs to gate reconfigurable architectures. My approach to specialization includes several novel techniques. I develop, with others, extensive bitwidth analyses that apply to registers, pointers, and arrays. I use pointer analysis and memory disambiguation to target devices with blocks of embedded memory. My approach to memory parallelization generates a spatial hierarchy that enables easier-to-synthesize logic state machines with smaller circuits and no long wires.(cont.) My space-time scheduling approach integrates the techniques of high-level synthesis with the static routing concepts developed for single-chip multiprocessors. Using DeepC, a prototype compiler demonstrating my thesis, I compile a new benchmark suite to Xilinx Virtex FPGAs. Resulting performance is comparable to a custom MIPS processor, with smaller area (40 percent on average), higher evaluation speeds (2.4x), and lower energy (18x) and energy-delay (45x). Specialization of advanced mechanisms results in additional speedup, scaling with hardware area, at the expense of power. For comparison, I also target IBM's standard cell SA-27E process and the RAW microprocessor. Results include sensitivity analysis to the different mechanisms specialized and a grand comparison between alternate targets.by Jonathan William Babb.Ph.D

    The Customizable Virtual FPGA: Generation, System Integration and Configuration of Application-Specific Heterogeneous FPGA Architectures

    Get PDF
    In den vergangenen drei Jahrzehnten wurde die Entwicklung von Field Programmable Gate Arrays (FPGAs) stark von Moore’s Gesetz, Prozesstechnologie (Skalierung) und kommerziellen Märkten beeinflusst. State-of-the-Art FPGAs bewegen sich einerseits dem Allzweck näher, aber andererseits, da FPGAs immer mehr traditionelle Domänen der Anwendungsspezifischen integrierten Schaltungen (ASICs) ersetzt haben, steigen die Effizienzerwartungen. Mit dem Ende der Dennard-Skalierung können Effizienzsteigerungen nicht mehr auf Technologie-Skalierung allein zurückgreifen. Diese Facetten und Trends in Richtung rekonfigurierbarer System-on-Chips (SoCs) und neuen Low-Power-Anwendungen wie Cyber Physical Systems und Internet of Things erfordern eine bessere Anpassung der Ziel-FPGAs. Neben den Trends für den Mainstream-Einsatz von FPGAs in Produkten des täglichen Bedarfs und Services wird es vor allem bei den jüngsten Entwicklungen, FPGAs in Rechenzentren und Cloud-Services einzusetzen, notwendig sein, eine sofortige Portabilität von Applikationen über aktuelle und zukünftige FPGA-Geräte hinweg zu gewährleisten. In diesem Zusammenhang kann die Hardware-Virtualisierung ein nahtloses Mittel für Plattformunabhängigkeit und Portabilität sein. Ehrlich gesagt stehen die Zwecke der Anpassung und der Virtualisierung eigentlich in einem Konfliktfeld, da die Anpassung für die Effizienzsteigerung vorgesehen ist, während jedoch die Virtualisierung zusätzlichen Flächenaufwand hinzufügt. Die Virtualisierung profitiert aber nicht nur von der Anpassung, sondern fügt auch mehr Flexibilität hinzu, da die Architektur jederzeit verändert werden kann. Diese Besonderheit kann für adaptive Systeme ausgenutzt werden. Sowohl die Anpassung als auch die Virtualisierung von FPGA-Architekturen wurden in der Industrie bisher kaum adressiert. Trotz einiger existierenden akademischen Werke können diese Techniken noch als unerforscht betrachtet werden und sind aufstrebende Forschungsgebiete. Das Hauptziel dieser Arbeit ist die Generierung von FPGA-Architekturen, die auf eine effiziente Anpassung an die Applikation zugeschnitten sind. Im Gegensatz zum üblichen Ansatz mit kommerziellen FPGAs, bei denen die FPGA-Architektur als gegeben betrachtet wird und die Applikation auf die vorhandenen Ressourcen abgebildet wird, folgt diese Arbeit einem neuen Paradigma, in dem die Applikation oder Applikationsklasse fest steht und die Zielarchitektur auf die effiziente Anpassung an die Applikation zugeschnitten ist. Dies resultiert in angepassten anwendungsspezifischen FPGAs. Die drei Säulen dieser Arbeit sind die Aspekte der Virtualisierung, der Anpassung und des Frameworks. Das zentrale Element ist eine weitgehend parametrierbare virtuelle FPGA-Architektur, die V-FPGA genannt wird, wobei sie als primäres Ziel auf jeden kommerziellen FPGA abgebildet werden kann, während Anwendungen auf der virtuellen Schicht ausgeführt werden. Dies sorgt für Portabilität und Migration auch auf Bitstream-Ebene, da die Spezifikation der virtuellen Schicht bestehen bleibt, während die physische Plattform ausgetauscht werden kann. Darüber hinaus wird diese Technik genutzt, um eine dynamische und partielle Rekonfiguration auf Plattformen zu ermöglichen, die sie nicht nativ unterstützen. Neben der Virtualisierung soll die V-FPGA-Architektur auch als eingebettetes FPGA in ein ASIC integriert werden, das effiziente und dennoch flexible System-on-Chip-Lösungen bietet. Daher werden Zieltechnologie-Abbildungs-Methoden sowohl für Virtualisierung als auch für die physikalische Umsetzung adressiert und ein Beispiel für die physikalische Umsetzung in einem 45 nm Standardzellen Ansatz aufgezeigt. Die hochflexible V-FPGA-Architektur kann mit mehr als 20 Parametern angepasst werden, darunter LUT-Grösse, Clustering, 3D-Stacking, Routing-Struktur und vieles mehr. Die Auswirkungen der Parameter auf Fläche und Leistung der Architektur werden untersucht und eine umfangreiche Analyse von über 1400 Benchmarkläufen zeigt eine hohe Parameterempfindlichkeit bei Abweichungen bis zu ±95, 9% in der Fläche und ±78, 1% in der Leistung, was die hohe Bedeutung von Anpassung für Effizienz aufzeigt. Um die Parameter systematisch an die Bedürfnisse der Applikation anzupassen, wird eine parametrische Entwurfsraum-Explorationsmethode auf der Basis geeigneter Flächen- und Zeitmodellen vorgeschlagen. Eine Herausforderung von angepassten Architekturen ist der Entwurfsaufwand und die Notwendigkeit für angepasste Werkzeuge. Daher umfasst diese Arbeit ein Framework für die Architekturgenerierung, die Entwurfsraumexploration, die Anwendungsabbildung und die Evaluation. Vor allem ist der V-FPGA in einem vollständig synthetisierbaren generischen Very High Speed Integrated Circuit Hardware Description Language (VHDL) Code konzipiert, der sehr flexibel ist und die Notwendigkeit für externe Codegeneratoren eliminiert. Systementwickler können von verschiedenen Arten von generischen SoC-Architekturvorlagen profitieren, um die Entwicklungszeit zu reduzieren. Alle notwendigen Konstruktionsschritte für die Applikationsentwicklung und -abbildung auf den V-FPGA werden durch einen Tool-Flow für Entwurfsautomatisierung unterstützt, der eine Sammlung von vorhandenen kommerziellen und akademischen Werkzeugen ausnutzt, die durch geeignete Modelle angepasst und durch ein neues Werkzeug namens V-FPGA-Explorer ergänzt werden. Dieses neue Tool fungiert nicht nur als Back-End-Tool für die Anwendungsabbildung auf dem V-FPGA sondern ist auch ein grafischer Konfigurations- und Layout-Editor, ein Bitstream-Generator, ein Architekturdatei-Generator für die Place & Route Tools, ein Script-Generator und ein Testbenchgenerator. Eine Besonderheit ist die Unterstützung der Just-in-Time-Kompilierung mit schnellen Algorithmen für die In-System Anwendungsabbildung. Die Arbeit schliesst mit einigen Anwendungsfällen aus den Bereichen industrielle Prozessautomatisierung, medizinische Bildgebung, adaptive Systeme und Lehre ab, in denen der V-FPGA eingesetzt wird

    Memory hierarchy and data communication in heterogeneous reconfigurable SoCs

    Get PDF
    The miniaturization race in the hardware industry aiming at continuous increasing of transistor density on a die does not bring respective application performance improvements any more. One of the most promising alternatives is to exploit a heterogeneous nature of common applications in hardware. Supported by reconfigurable computation, which has already proved its efficiency in accelerating data intensive applications, this concept promises a breakthrough in contemporary technology development. Memory organization in such heterogeneous reconfigurable architectures becomes very critical. Two primary aspects introduce a sophisticated trade-off. On the one hand, a memory subsystem should provide well organized distributed data structure and guarantee the required data bandwidth. On the other hand, it should hide the heterogeneous hardware structure from the end-user, in order to support feasible high-level programmability of the system. This thesis work explores the heterogeneous reconfigurable hardware architectures and presents possible solutions to cope the problem of memory organization and data structure. By the example of the MORPHEUS heterogeneous platform, the discussion follows the complete design cycle, starting from decision making and justification, until hardware realization. Particular emphasis is made on the methods to support high system performance, meet application requirements, and provide a user-friendly programmer interface. As a result, the research introduces a complete heterogeneous platform enhanced with a hierarchical memory organization, which copes with its task by means of separating computation from communication, providing reconfigurable engines with computation and configuration data, and unification of heterogeneous computational devices using local storage buffers. It is distinguished from the related solutions by distributed data-flow organization, specifically engineered mechanisms to operate with data on local domains, particular communication infrastructure based on Network-on-Chip, and thorough methods to prevent computation and communication stalls. In addition, a novel advanced technique to accelerate memory access was developed and implemented

    Preliminary multicore architecture for Introspective Computing

    Get PDF
    Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2007.This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.Includes bibliographical references (p. 243-245).This thesis creates a framework for Introspective Computing. Introspective Computing is a computing paradigm characterized by self-aware software. Self-aware software systems use hardware mechanisms to observe an application's execution so that they may adapt execution to improve performance, reduce power consumption, or balance user-defined fitness criteria over time-varying conditions in a system environment. We dub our framework Partner Cores. The Partner Cores framework builds upon tiled multicore architectures [11, 10, 25, 9], closely coupling cores such that one may be used to observe and optimize execution in another. Partner cores incrementally collect and analyze execution traces from code cores then exploit knowledge of the hardware to optimize execution. This thesis develops a tiled architecture for the Partner Cores framework that we dub Evolve. Evolve provides a versatile substrate upon which software may coordinate core partnerships and various forms of parallelism. To do so, Evolve augments a basic tiled architecture with introspection hardware and programmable functional units. Partner Cores software systems on the Evolve hardware may follow the style of helper threading [13, 12, 6] or utilize the programmable functional units in each core to evolve application-specific coprocessor engines. This thesis work develops two Partner Cores software systems: the Dynamic Partner-Assisted Branch Predictor and the Introspective L2 Memory System (IL2). The branch predictor employs a partner core as a coprocessor engine for general dynamic branch prediction in a corresponding code core. The IL2 retasks the silicon resources of partner cores as banks of an on-chip, distributed, software L2 cache for code cores.(cont.) The IL2 employs aggressive, application-specific prefetchers for minimizing cache miss penalties and DRAM power consumption. Our results and future work show that the branch predictor is able to sustain prediction for code core branch frequencies as high as one every 7 instructions with no degradation in accuracy; updated prediction directions are available in a low minimum of 20-21 instructions. For the IL2, we develop a pixel block prefetcher for the image data structure used in a JPEG encoder benchmark and show that a 50% improvement in absolute performance is attainable.by Jonathan M. Eastep.S.M

    Hardware-software codesign in a high-level synthesis environment

    Get PDF
    Interfacing hardware-oriented high-level synthesis to software development is a computationally hard problem for which no general solution exists. Under special conditions, the hardware-software codesign (system-level synthesis) problem may be analyzed with traditional tools and efficient heuristics. This dissertation introduces a new alternative to the currently used heuristic methods. The new approach combines the results of top-down hardware development with existing basic hardware units (bottom-up libraries) and compiler generation tools. The optimization goal is to maximize operating frequency or minimize cost with reasonable tradeoffs in other properties. The dissertation research provides a unified approach to hardware-software codesign. The improvements over previously existing design methodologies are presented in the frame-work of an academic CAD environment (PIPE). This CAD environment implements a sufficient subset of functions of commercial microelectronics CAD packages. The results may be generalized for other general-purpose algorithms or environments. Reference benchmarks are used to validate the new approach. Most of the well-known benchmarks are based on discrete-time numerical simulations, digital filtering applications, and cryptography (an emerging field in benchmarking). As there is a need for high-performance applications, an additional requirement for this dissertation is to investigate pipelined hardware-software systems\u27 performance and design methods. The results demonstrate that the quality of existing heuristics does not change in the enhanced, hardware-software environment

    New Techniques for On-line Testing and Fault Mitigation in GPUs

    Get PDF
    L'abstract è presente nell'allegato / the abstract is in the attachmen

    Reconfigurable Computing Based on Commercial FPGAs. Solutions for the Design and Implementation of Partially Reconfigurable Systems = Computación reconfigurable basada en FPGAs comerciales. Soluciones para el diseño e implementación de sistemas parcialmente reconfigurables.

    Get PDF
    Esta tesis doctoral está enmarcada en el campo de investigación de la computación reconfigurable. Este campo ha experimentado un crecimiento abrumador en los últimos años como resultado de la evolución de los dispositivos reconfigurables, donde las Field Programmable Gate Arrays (FPGAs) son el máximo exponente desde el punto de vista comercial. De forma tradicional las empresas de electrónica han seleccionado las FPGAs como prototipos iníciales de productos de altas prestaciones. Luego el sistema final es integrado en Application Specific Integrated Circuits (ASICs) que se producen en grandes volúmenes perimiendo amortiza su alto coste de diseño y producción y aprovechando la ventaja del bajo coste por unidad. Por otro lado, los DSPs (Digital Signal Processing) y los microprocesadores han sido preferidos por su bajo coste ante las FPGAs el campo de los dispositivos con menores requisitos de cómputo. En los últimos años, este panorama está sufriendo una serie de cambios. Ahora el mercado busca mas soluciones “reconfigurables” ya que permiten reducir el tiempo de salida del producto al mercado (time-to-market), aumentar el tiempo del producto en el mercado (time-in-market) y además cubren los amplios requisitos de cómputo. El cambio que se observa, se debe a que los dispositivos programables han evolucionado de simples estructuras programables a complejas plataformas reconfigurables. Las FPGAs del estado de la técnica han alcanzado un grado de integración muy alto y además ahora contienen, dentro de su arquitectura programable, microprocesadores y lógica específica de procesamiento digital de señal. Otro factor sumamente importante para el cambio es que las FPGAs permiten el diseño de dispositivos cuyo hardware pueden ser adaptado, o actualizado, una vez que el producto ya esta entregado e instalado, obteniendo así una flexibilidad en el hardware comparable con la del software, donde la actualización postventa de los sistemas es una práctica muy explotada de cara a la reducción de costes y la salida rápida al mercado. Por otro lado, y sobre todo en el ámbito académico, existen dispositivos reconfigurables con distinta granularidad que permites alcanzar altas prestaciones en comparación con las FPGAs comerciales de grano fino (comparable con la de los ASICs), pero están restringidas a una aplicación o grupo de aplicaciones. A pesar de que los dispositivos reconfigurables propietarios ofrecen muchas ventajas, esta opción ha sido descartada en la presente tesis debido a que, desde el punto de vista industrial requieren, aparte del diseño del ASIC reconfigurable, el desarrollo de un entorno de diseño completo. Todo esto conlleva a un elevado coste de recursos, además del alejamiento de las propuestas de la industria. La presente tesis se ha centrado en proporcionar soluciones para dispositivos comerciales, FPGAs de grano fino, con la finalidad de aprovechar las herramientas existentes y mantener las soluciones propuestas lo más cerca posible de la industria. Los dispositivos reconfigurables proporcionan diversos métodos de reconfiguración, siendo el más atractivo la reconfiguración parcial y dinámica, ya que permite adaptar el dispositivo sin interrumpir su funcionamiento y crear dispositivos auto-adaptables. Este tipo de reconfiguración será el objeto de estudio de la tesis doctoral. La reconfiguración parcial permite tener una serie de tareas hardware (módulos que se ubican en la estructura reconfigurable) ejecutándose paralelamente en la FPGA y sustituir un bloque por otro, dependiendo de las necesidades del sistema, sin alterar el funcionamiento del resto de bloques. Esta idea básica en teoría brinda la flexibilidad del software al hardware, que combinado con su paralelismo implícito hace del sistema reconfigurable una potente herramienta que puede dar pie a la creación de sistemas adaptables o incluso autoadaptativos, supercomputadores reconfigurables y hardware bio-inspirado entre otros. Por otro lado, a pesar que algunos proveedores de FPGAs permiten la reconfiguración parcial, el uso de esta técnica aún está restringido al ámbito académico y a sistemas muy básicos. El trabajo de investigación descrito dentro de la presente tesis doctoral ha tenido por objeto el estudio de diversos aspectos de los sistemas parcialmente reconfigurables, la identificación de las principales deficiencias de las soluciones existentes y la propuesta de nuevas soluciones originales. Como resultado del estudio del estado del arte se ha visto que las soluciones existentes son poco flexibles y la escalabilidad de los sistemas que se pueden diseñar es reducida. Por ello las propuestas originales de esta tesis tienen como objetivo permitir el diseño e implementación de sistemas parcialmente reconfigurables con alta escalabilidad y flexibilidad. La tesis principal del trabajo de investigación ha sido basada en la idea que para obtener una mayor flexibilidad de los sistemas se debe desligar el diseño del sistema reconfigurable del diseño de los cores que serán consumidos por dicho sistema. La tesis doctoral ha contribuido proponiendo mejores soluciones a nivel de arquitectura, flujos de diseño y herramientas que han permitido el diseño e implementación de diversos sistemas parcialmente reconfigurables con distinto grado de flexibilidad y escalabilidad. La flexibilidad y la escalabilidad son términos que en los sistemas reconfigurables se pueden asociar a diversos aspectos. Dentro de esta tesis la flexibilidad está asociada principalmente a la diversidad de cores o tareas hardware que pueden ser consumidos o integrados en un sistema ya definido, mientras que la escalabilidad está referida al número de cores que pueden coexistir en el sistema y ser reconfigurados independientemente. Para poder diseñar sistemas flexibles y escalables, estas características deben estar cubiertas en distintos niveles. Más en detalle dentro de la presente tesis, desde el punto de vista de la arquitectura, la flexibilidad está cubierta por la posibilidad de posicionar libremente cores en una arquitectura escalable predefinida. Desde el punto de vista del sistema, la flexibilidad está reflejada por la posibilidad de no sólo de modificar o reconfigurar un core del sistema hardware, sino también de modificar las comunicaciones internas del mismo. Desde el punto de vista del dispositivo, la flexibilidad está garantizada por la transparencia en el proceso de reconfiguración. Por último, la flexibilidad en el proceso de diseño está cubierta por la definición de herramientas y flujos de diseño que permiten por un lado desligar el diseño del sistema reconfigurable del diseño de los cores para el sistema, y por otro lado que diseñadores sin conocimientos detallados de reconfiguración parcial puedan diseñar cores. Dentro de la tesis doctoral se presentan cuatro dispositivos reconfigurable integrados en distintos entornos y con distinto grado de flexibilidad que corresponde al grado de aprovechamiento de las aportaciones originales de la tesis. Las principales aportaciones de la tesis doctoral, relacionadas a cada uno de los aspectos mencionados en el párrafo anterior, y tratados en distintas partes de la tesis se resumen a continuación destacando en la medida de lo posible las diferencias con respecto al estado del arte: Se ha definido una metodología de diseño de Arquitecturas Virtuales (abstracción de la arquitectura física de la FPGA que incluye la distribución de los recursos programables en slots y la forma de interconexión de los slots). La metodología, propuesta originalmente en esta tesis, permite el diseño de sistemas reconfigurables con alta flexibilidad y escalabilidad comparadas con el estado del arte. Una solución a la adaptación de las comunicaciones internas en los sistemas reconfigurables llamada DRNoC (Dynamic Reconfigurable NoC). La solución original abarca diversos aspectos e incluye la definición de una arquitectura de interconexión para los sistemas reconfigurables basada en redes en chip (Network on Chip - NoC), la definición de métodos de reconfiguración y el direccionamiento interno del sistema, y de forma más específica para las comunicaciones basadas en redes, la definición de un formato de tramas y la arquitectura de los enrutadores. La principal diferencia de la solución propuesta con el estado del arte es que DRNoC no restringe la comunicación únicamente a NoCs y permite la definición de cualquier tipo de esquema de comunicación (NoC, punto a punto, punto a multipunto, bus, o una combinación de las anteriores) y además, permite que varios esquemas de comunicación coexistan en el mismo sistema y que funcionen de forma independiente. De esta forma la solución propuesta brinda una mayor flexibilidad que las ya existentes. Se ha propuesto una solución para la manipulación de los ficheros de configuración para las FPGA del tipo Virtex II/Pro que es la más completa comparada con el estado del arte. Asimismo, una serie de herramientas que permiten la generación y extracción de cores para sistemas reconfigurables basados en FPGA Virtex II que ha sido la primera solución existente para estas FPGA. Un flujo de diseño para cores basado en plantillas que permite el diseño de cores hardware sin ser un experto en reconfiguración parcial y sin conocer los detalles del sistema final en el que se implementará el core. El diseño, implementación y prueba de un sistema parcialmente reconfigurable basado en FPGAs comerciales de grano fino para redes de sensores. La primera aproximación existente en el estado del arte al uso de los sistemas parcialmente reconfigurables en las redes de sensores. La integración de un sistema reconfigurable en un entorno cliente-servidor que incluye un original sistema de control de la reconfiguración. Una solución para la depuración de los sistemas reconfigurables. Un sistema de emulación y prototipado rápido de las comunicaciones dentro de un chip basado originalmente en la idea de la reutilización de cores hardware por medio de la técnica de reconfiguración parcial. Como conclusión global del trabajo de investigación realizado cabe destacar que la presente tesis ha dado lugar a la creación y consolidación de una línea de investigación en el grupo de electrónica digital del Centro de Electrónica Industrial que actualmente se encuentra entre las más activas y de mayor importancia. Además, el trabajo de investigación y la divulgación de las aportaciones originales han permitido que el centro de investigación pase a formar parte del estado del arte de los sistemas parcialmente reconfigurables. The thesis is enclosed in the research area of reconfigurable computing which, in the last years, has experienced a remarkable growth as a result of the impressive evolution of reconfigurable devices. In this area, Field Programmable Gate Arrays (FPGAs) are the most outstanding representative from the commercial point of view. Traditionally FPGAs have been used for prototyping, in previous to the final Application Specific Integrated Circuit (ASIC) design stages. However, the interest in the integration of FPGAs in final products has been growing in the last years. FPGAs are preferred for small production volumes, where the ASIC masks high cost is unaffordable and also in products where time-to-market is a priority, and waiting for a complete ASIC design cycle is not desirable. State of the art FPGAs are highly integrated electronic circuits, composed of tens of millions of system gates, with competitive speed, performance and configurability. These devices have evolved from simple gate arrays to complex platforms that include embedded memory, multipliers and even microprocessors and digital signal processing elements. Additionally, the fine grain nature of the reconfigurable arrays, make FPGAs suitable for a broad set of application domains. On the other side, and mostly in the academic community, there are custom reconfigurable devices with different granularity levels that permit to achieve higher performance, compared to commercial FPGAs, but for a certain application domain. Although there are very good solutions in the academic state of the art, their main drawback from the industry point of view is that they require specific design environments and also, that the efforts and resources needed for designing such solutions are very high. This thesis work is focused on providing solutions that target commercial fine grain reconfigurable devices, FPGAs, in order to take advantage of existing tools and to keep the proposed solutions closer to the industry. Today FPGAs provide different reconfiguration options. Among them, the most challenging one is partial reconfiguration. This feature has special interest, as it permits system updates on the fly once the device is deployed, without the need of stopping it and without theoretical loss of performance. Partial reconfiguration is also an attractive feature because it permits to allocate different tasks/cores running in parallel in the device and change them on the fly as needed without disturbing other tasks/cores. This basic idea, brings software-like flexibility to hardware which, in combination with its inherited parallelism, opens the door for a broad amount of possibilities and applications, like runtime adaptive super-computing, adaptive embedded software ii accelerators, bio-inspired, self-reconfigurable and self-arrangeable systems. However, even though some commercial FPGAs provide partial reconfiguration features, its utilization is still in its early stages and it is not well supported by FPGA vendors, making its exploitation in real electronic systems very difficult. Therefore, there are several academic groups working to provide alternative solutions for the design and implementation of partially reconfigurable systems based on commercial FPGAs that intend to stimulate their integration and use in the industry. This research work intents to study different aspects of partially reconfigurable system on-chip and contribute with flexibility improvements. The main idea that will be followed along the thesis is that the design of reconfigurable systems will be considered an independent process from the design of cores that will be consumed by the system. This approach involves the design of flexible and scalable partial runtime reconfigurable systems, where most of the thesis contribution will be focused. More in detail, this thesis will contribute to improve architecture solutions, design tools and design flows of partially reconfigurable systems for commercial FPGAs and provide systems with higher flexibility and scalability. Flexibility and scalability in a reconfigurable system are terms that can be related to several aspects. In this thesis flexibility is mainly related to the diversity of tasks or cores a system can consume, while scalability is connected to the number of cores that can run in parallel and be independently reconfigured. Flexibility and scalability have to be covered by the system at different levels and the work presented in this thesis will contribute in all the specific levels. More in detail, from an architectural point of view, flexibility is reflected by the possibility of freely loading tasks or cores in a defined, scalable architecture. From the system point of view, flexibility is related to the possibility of modifying not only the system functionality by loading different tasks, but also to adapt the on-chip communications. From the device point of view, flexibility is reflected by the reconfiguration process transparency and, from the design point of view, it is oriented to the definition of design tools and flows that will permit, as far as possible, non specialized designers to design cores for a partially reconfigurable system and without knowing the system details. All the original proposed solutions, in each individual aspect, will be compared with the state of the art and complete systems solutions will be designed and will be integrated in different application domains in order to validate the thesis proposals. In order to achieve better understanding of the thesis and to facilitate the comparison with some, selected, related work, the thesis structure is not traditional. Instead of including a state of the art and a result Chapter, each Chapter is focused on a specific aspect of partially reconfigurable system design and includes state of the art and result sections. The first chapter, Chapter 1, introduces the main concepts to be used in the thesis. Chapter 2 is focused on reconfigurable systems architectures, contributing with architecture solutions and a design method. Chapter 3 proposes a solution that enhances the features of the architectures defined in Chapter 2, and provides more flexibility to the entire system by extending reconfiguration to the on-chip communication. Chapter 4 is related to the design flows and tools, where contributions are made in both aspects and the proposed solutions are compared with the state of the Abstract and Thesis Organization iii art. Complete systems, with different independency levels, are presented in Chapter 5 in order to validate the thesis contributions. Conclusions, a summary of contributions and the future work are included in Chapter 6. A more detailed description of each Chapter content is presented below: Chapter 1 provides an introduction to the reconfigurable systems based on FPGAs topic, by first defining the place of FPGAs in the electronic industry and afterwards, introducing the main concepts to be used along the thesis. Although the focus is put on commercial reconfigurable devices, some custom reconfigurable systems are also described in order to have a complete view of the options in reconfigurable devices. The Chapter discuses the thesis main topic, related to partial runtime reconfigurable systems, highlighting its main advantages and disadvantages and, introducing some of the approaches to be followed in this thesis. The main term introduced in this Chapter, associated to reconfigurable systems architectures, is ”Virtual Architecture”. The term defines the architecture of the partially reconfigurable system and how the different regions it is composed of are interconnected. A brief summary of the thesis main goals is included at the end of the Chapter. The main topic of Chapter 2 is related to reconfigurable systems architectures design. The Chapter includes a specific state of the art section that reviews some existing architecture solutions. After that, a general method for virtual architectures design, an original thesis contribution, is presented in detail. Afterwards, the method is applied to the design of general one dimensional (1D) and two dimensional (2D) architectures for Xilinx Virtex II/Pro FPGAs and, as an example, following the specific steps of the method, two 1D, bus based, architectures are designed. The architecture buses are compared with two state of the art solutions in terms of area and performance in the results section of the same Chapter. Chapter 3 is focused on reconfigurable systems on-chip communication issues, where the need of adaptability is the main topic. Again, a state of the art description of some 2D reconfigurable systems is presented at the beginning of the Chapter and, afterwards, an original solution, called Dynamic Reconfigurable NoC (DRNoC), is proposed. This solution covers different aspects. First, an architecture oriented to support adaptability in the on-chip communications is originally proposed. The architecture is mapped to a Virtex II FPGA by modifying a virtual architecture from the general ones presented in Chapter 2. Second, two types of reconfigurations that span through different levels of the OSI communication model are originally proposed. Third, a set of Network on Chip models, focused on the communication adaptability are designed and/or adapted and presented in the Chapter, along with an original NoC packet format and router architecture. These models are mapped to the DRNoC architecture and implementation cost parameters are defined and used to evaluate different implementation options. Regarding the architecture reconfigurability, it is important to remark that along the entire Chapter, intermediate test of possible partial reconfigurations and test results are included. At the end of the Chapter, the proposed architecture is compared with the state of the art using a set of structural parameters taken from a reference work and complemented with others defined in the Chapter. Chapter 4, focuses on the design tools and flows for partially reconfigurable systems. Again, an overview of the state of the art is included at the beginning of the Chapter. Abstract and Thesis Organization iv Afterwards, an original software solution for Virtex II configuration files (bitstreams) manipulations is presented. The first part of the solution is a study of the Virtex II/Pro FPGA bitstream format, used to define a set of equations for accessing a specific bitstream resource (at register or block level). Based on these equations, a set of tools for bitstream manipulation that target resource restricted devices are originally presented. Also, a design flow, based on systems and virtual architectures templates, which permits a straightforward core design by non partial reconfiguration experts and without knowing the system details, is originally proposed. In Chapter 5 four reconfigurable systems with different flexibility level, which corresponds to the level of the thesis proposals exploitation, are presented. The selected application domains attempt to demonstrate different advantages of the use of partial runtime reconfigurable systems and therefore are mainly a proof of concept. The first domain belongs to the wireless sensor networks, where t
    corecore