71 research outputs found

    FPGA-Based Real-Time Embedded System for RISS/GPS Integrated Navigation

    Navigation algorithms that integrate measurements from multi-sensor systems overcome the problems that arise from using GPS navigation systems in standalone mode. Algorithms that integrate the data from a 2D low-cost reduced inertial sensor system (RISS), consisting of a gyroscope and an odometer or wheel encoders, with a GPS receiver via a Kalman filter have proved to provide a more consistent and reliable navigation solution than standalone GPS receivers. This integration has also been shown to be beneficial in GPS-denied environments such as urban canyons and tunnels. The main objective of this paper is to narrow the idea-to-implementation gap that follows algorithm development by realizing a low-cost real-time embedded navigation system capable of computing the data-fused positioning solution. The role of the developed system is to synchronize the measurements from the three sensors relative to the pulse-per-second signal generated by the GPS receiver, after which the navigation algorithm is applied to the synchronized measurements to compute the navigation solution in real time. Employing a customizable soft-core processor on an FPGA at the core of the navigation system provided the flexibility for communicating with the various sensors and the computation capability required by the Kalman filter integration algorithm.
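
The fusion idea in this abstract can be illustrated with a minimal sketch. The abstract does not give the filter's state vector or models, so everything below is an assumption made for illustration: a flat 2D pose, heading measured clockwise from north, and a scalar Kalman update applied to a single position axis. The function names `riss_predict` and `kalman_update` are hypothetical.

```python
import math


def riss_predict(pose, gyro_rate, odo_speed, dt):
    """Dead-reckon a 2D pose (east, north, heading) over one time step.

    Assumption: heading is in radians, measured clockwise from north.
    """
    east, north, heading = pose
    heading += gyro_rate * dt                    # integrate gyroscope yaw rate
    east += odo_speed * math.sin(heading) * dt   # project odometer speed east
    north += odo_speed * math.cos(heading) * dt  # project odometer speed north
    return east, north, heading


def kalman_update(estimate, variance, measurement, meas_variance):
    """Scalar Kalman measurement update for one position axis."""
    gain = variance / (variance + meas_variance)
    estimate = estimate + gain * (measurement - estimate)
    variance = (1.0 - gain) * variance
    return estimate, variance


# Drive north at 10 m/s for 1 s, then blend in a GPS fix on the north axis.
east, north, heading = riss_predict((0.0, 0.0, 0.0), 0.0, 10.0, 1.0)
north, north_var = kalman_update(north, 4.0, 12.0, 4.0)
```

With equal prediction and measurement variances the gain is 0.5, so the corrected estimate lands halfway between the dead-reckoned position and the GPS fix. During a GPS outage only `riss_predict` would run, which is why the gyroscope and odometer keep the solution alive in tunnels.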

    High-speed dynamic partial reconfiguration for field programmable gate arrays

    With dynamically and partially reconfigurable designs, reconfiguration must complete quickly enough that it is not the limiting factor in the process. Therefore, the communication between the source of configuration and the configurable unit must be made as fast as possible. The aim of this work is to use an embedded controller internal to the FPGA to control the reconfiguration process and obtain the maximum speed at which reconfiguration can occur with current FPGA technology. The use of Direct Memory Access (DMA) driven operations instead of the current arbitrated bus architectures yielded a 30% increase in the speed of reconfiguration compared to other methods such as OPB_HWICAP and PLB_HWICAP [1]. Interrupt-driven partial reconfiguration was also introduced, allowing the processor to switch to other tasks during the reconfiguration operation. Together, these contributions lead to significant performance improvements over current partial reconfiguration subsystems. The configuration controller was tested using four partially reconfigurable system implementations: (i) one targeting the Hard IP PowerPC405 on Virtex-4, (ii) a second targeting the Soft IP MicroBlaze on Virtex-5, (iii) a third targeting the Hard IP PowerPC440 on Virtex-5, and (iv) a fourth targeting the Hard IP PowerPC440 on Virtex-5 with adaptive feedback. The adaptive feedback Virtex-5 system can use internal voltage and temperature measurements from the Xilinx System Monitor IP to dynamically increase or decrease the speed of reconfiguration and/or change other reconfigurable aspects of the system to better match the environment.

    Performance and area evaluations of processor-based benchmarks on FPGA devices

    Computing systems on SoCs have been a long-term research topic since FPGA technology emerged, owing to its re-programmable fabric, reconfigurable computing, and fast time to market. During the last decade, a uni-processor in an SoC has no longer been able to keep up with the fast-growing market for complex applications such as mobile-phone audio and video encoding and image and network processing. Because the number of transistors on a silicon wafer keeps increasing, recent FPGAs and embedded systems are advancing toward multi-processor-based designs to meet demanding performance requirements. The age of the MPSoC is therefore approaching. In addition, most embedded processors are soft-cores, because they are flexible, can be reconfigured for specific software functions, and make it easy to build homogeneous multi-processor systems for parallel programming. Moreover, behavioural synthesis tools are becoming much more powerful: they can create datapaths of logic units from high-level algorithms (for example, C to HDL) and support partitioning in a concurrent HW/SW methodology. A range of embedded processors can be implemented on an FPGA-based prototype to integrate the CPUs on a programmable device. This research first presents the different types of computer architecture in modern embedded processors that suit different types of software applications (e.g. multi-threaded operations or complex functions) on FPGA-based SoCs; secondly investigates their capability by executing a wide range of multimedia software codes (integer arithmetic only) on different processor-system models (uni-processor, multi-processor, or co-design); and finally compares the results in terms of benchmarks and resource utilization within FPGAs. All the examined programs were written in standard C and executed on varying numbers of soft-core processors or hardware units to obtain the execution times.
However, the number of processors and their customizable configurations, or the hardware datapaths being generated, are limited by the resources of the target FPGA, and designers need to understand the FPGA-based trade-off under consideration: speed versus area. For this experimental purpose, I divided the benchmarks into DLP and HLS categories, which are "data" and "function" intensive respectively. The DLP programs are executed on the LEON3 MP and LE1 CMP multi-processor systems, and the HLS programs on the LegUp co-design system, on target FPGAs. As a preliminary step, the performance of the soft-core processors is examined by executing all the benchmarks. This thesis centres on the execution times (or speed-up) and the area breakdown on FPGA devices for the different programs.

    Design and resource management of reconfigurable multiprocessors for data-parallel applications

    FPGA (Field-Programmable Gate Array)-based custom reconfigurable computing machines have established themselves as low-cost and low-risk alternatives to ASIC (Application-Specific Integrated Circuit) implementations and general-purpose microprocessors in accelerating a wide range of computation-intensive applications. Most often they are Application Specific Programmable Circuits (ASPCs), which are developer-programmable instead of user-programmable. The major disadvantages of ASPCs are minimal programmability, and significant time and energy overheads caused by the hardware reconfiguration required when the problem size exceeds the available reconfigurable resources; these problems are expected to become more serious with increases in the FPGA chip size. On the other hand, dominant high-performance computing systems, such as PC clusters and SMPs (Symmetric Multiprocessors), suffer from high communication latencies and/or scalability problems. This research introduces low-cost, user-programmable and reconfigurable MultiProcessor-on-a-Programmable-Chip (MPoPC) systems for high-performance, low-cost computing. It also proposes a relevant resource management framework that deals with performance, power consumption and energy issues. These semi-customized systems significantly reduce runtime device reconfiguration by employing user-programmable processing elements that are reusable for different tasks in large, complex applications. For the sake of illustration, two different types of MPoPCs with hardware FPUs (floating-point units) are designed and implemented for credible performance evaluation and modeling: the coarse-grain MIMD (Multiple-Instruction, Multiple-Data) CG-MPoPC machine based on a processor IP (Intellectual Property) core and the mixed-mode (MIMD, SIMD or M-SIMD) variant-grain HERA (HEterogeneous Reconfigurable Architecture) machine.
In addition to alleviating the above difficulties, MPoPCs can offer several performance and energy advantages to our data-parallel applications when compared to ASPCs; they are simpler and more scalable, and have less verification time and cost. Various common computation-intensive benchmark algorithms, such as matrix-matrix multiplication (MMM) and LU factorization, are studied and their parallel solutions are shown for the two MPoPCs. The performance is evaluated with large sparse real-world matrices primarily from power engineering. We expect even further performance gains on MPoPCs in the near future by employing ever-improving FPGAs. The innovative nature of this work has the potential to guide research in this emerging field of high-performance, low-cost reconfigurable computing. The largest advantage of reconfigurable logic lies in its large degree of hardware customization and reconfiguration, which allows reusing the resources to match the computation and communication needs of applications. Therefore, a major effort in the presented design methodology for mixed-mode MPoPCs, like HERA, is devoted to effective resource management. A two-phase approach is applied. A mixed-mode weighted Task Flow Graph (w-TFG) is first constructed for any given application, where tasks are classified according to their most appropriate computing mode (e.g., SIMD or MIMD). At compile time, an architecture is customized and synthesized for the TFG using an Integer Linear Programming (ILP) formulation and a parameterized hardware component library. Various run-time scheduling schemes with different performance-energy objectives are proposed. A system-level energy model for HERA, which is based on low-level implementation data and run-time statistics, is proposed to guide performance-energy trade-off decisions. A parallel power flow analysis technique based on Newton's method is proposed and employed to verify the methodology.
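
The abstract names Newton's method as the basis of its power flow analysis but gives no formulation. As a rough illustration of the iteration only, here is a scalar sketch; a real power flow solver works on a system of nonlinear equations and replaces the division by the derivative with a linear solve against the Jacobian (which is where the LU factorization mentioned above would come in). The function name `newton` and the test equation are assumptions chosen for illustration.

```python
def newton(f, df, x0, tol=1e-10, max_iter=50):
    """Find a root of f by Newton's method, starting from x0.

    f  : function whose root is sought
    df : derivative of f
    """
    x = x0
    for _ in range(max_iter):
        step = f(x) / df(x)   # scalar analogue of solving J * step = f(x)
        x -= step
        if abs(step) < tol:   # stop once the update is negligible
            return x
    raise RuntimeError("Newton iteration did not converge")


# Example: solve x^2 - 2 = 0 starting from x0 = 1.
root = newton(lambda x: x * x - 2.0, lambda x: 2.0 * x, x0=1.0)
```

Each iteration is dominated by the linear solve in the multi-dimensional case, which is the part a data-parallel machine would be expected to accelerate.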

    Optimization of a hardware/software coprocessing platform for EEG eyeblink detection and removal

    The feasibility of implementing a real-time system for removing eyeblink artifacts from electroencephalogram (EEG) recordings using a hardware/software coprocessing platform was investigated. A software-based wavelet and independent component analysis (ICA) eyeblink detection and removal process was extended to enable variation in its processing parameters. Exploiting the efficiency of hardware and the reconfigurability of software, it was ported to a field programmable gate array (FPGA) development platform, which was found to be capable of implementing the revised algorithm, although not in real time. The implemented hardware and software solution was applied to a collection of both simulated and clinically acquired EEG data with known artifact and waveform characteristics to assess its speed and accuracy. Configured for optimal accuracy in terms of minimal false positives and negatives as well as maintaining the integrity of the underlying EEG, especially when encountering EEG waveform patterns with an appearance similar to eyeblink artifacts, the system was capable of processing a 10-second EEG epoch in an average of 123 seconds. Configured for efficiency, but with diminished accuracy, the system required an average of 34 seconds. Varying the ICA contrast function showed that the Gaussian nonlinearity provided the best combination of reliability and accuracy, albeit with a long execution time. The cubic nonlinearity was fast but unreliable, while the hyperbolic tangent contrast function frequently diverged. It is believed that the use of programmable logic with increased logic capacity and processing speed may enable this approach to achieve the objective of real-time operation.
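
For readers unfamiliar with the three ICA nonlinearities compared in this abstract, the sketch below shows the forms commonly used in FastICA-style algorithms. That the thesis uses exactly these forms, and the scaling parameter `a`, are assumptions; the abstract only names the nonlinearities.

```python
import math


def g_tanh(u, a=1.0):
    """Hyperbolic tangent nonlinearity; 'a' is an assumed scaling constant."""
    return math.tanh(a * u)


def g_gauss(u):
    """Gaussian nonlinearity: damps large inputs toward zero."""
    return u * math.exp(-u * u / 2.0)


def g_cube(u):
    """Cubic nonlinearity: cheapest to compute, grows without bound."""
    return u ** 3


# Compare the response to a typical sample and to a large outlier.
for u in (0.5, 5.0):
    print(u, g_tanh(u), g_gauss(u), g_cube(u))
```

The cubic form needs only multiplications, while the Gaussian form needs an exponential, which is one plausible reading of the speed difference reported above. The cubic response to an outlier is also far larger than the other two, which may relate to its reported unreliability, though the abstract does not state the cause.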

    Conceptual design and realization of a dynamic partial reconfiguration extension of an existing soft-core processor

    Many modern field-programmable gate arrays (FPGAs) support partial reconfiguration, which allows only part of a design to be replaced dynamically at run time. In this thesis, partial reconfiguration capability is integrated into the VHDL Integrated Softcore Architecture for Reconfigurable Devices (ViSARD), developed at Technische Universität Ilmenau for hard real-time tasks requiring high-precision floating-point calculations. Specifically, its arithmetic logic unit is modified to allow floating-point execution units to be exchanged. The design goals of the partial reconfiguration system are high speed, low latency, low resource overhead, and hard real-time capability.
These goals are reached by implementing a custom partial reconfiguration controller that loads partial bitstreams from external RAM over a standard AXI bus and by extending the ViSARD accordingly. In a test design that switched between three different configurations, each containing between one and three execution units, the proposed partial reconfiguration system achieved the maximum specified bitstream throughput on the target FPGA and reduced look-up table usage by roughly 40%.

    Development of a Flexible FPGA-Based Platform for Flight Control System Research

    This work is part of ongoing research conducted at Virginia Commonwealth University relating to unmanned aerial vehicles. The primary objective of this thesis was to develop a flexible, high-performance autopilot platform in order to facilitate research on advanced flight control algorithms. A dual FPGA-based system architecture utilizing a stacked, multi-board design was created to meet this goal. Processing tasks were split between the two FPGA devices, allowing for improved system timing and increased throughput. A combination of analog and digital filtering techniques was employed in the new system, resulting in enhanced sensor accuracy and precision compared to the previous-generation autopilot system. Several important improvements to the safety and reliability of the overall system were also achieved.

    Achieving a better balance between productivity and performance on FPGAs through Heterogeneous Extensible Multiprocessor Systems

    Field Programmable Gate Arrays (FPGAs) were first introduced circa 1980, and they held the promise of delivering performance levels associated with customized circuits, but with productivity levels more closely associated with software development. Achieving both performance and productivity objectives has been a long-standing challenge for the reconfigurable computing community and remains unsolved today. On one hand, vendor-supplied design flows have tended towards achieving high levels of performance through gate-level customization, but at the cost of very low productivity. On the other hand, FPGA densities are following Moore's law and can now support complete multiprocessor system architectures. Thus FPGAs can be turned into an architecture with programmable processors, which brings productivity but sacrifices the peak performance advantages of custom circuits. In this thesis we explore how the two use cases can be combined to achieve the best of both. The flexibility of FPGAs to host a heterogeneous multiprocessor system with different types of programmable processors and custom accelerators allows software developers to design a platform that matches the unique performance needs of their application. However, no automated approaches are currently publicly available to create such heterogeneous architectures or the software support for these platforms. Creating base architectures, configuring multiple tool chains, and repetitive engineering design efforts can and should be automated. This thesis introduces the Heterogeneous Extensible Multiprocessor System (HEMPS) template approach, which allows an FPGA to be programmed with productivity levels close to those associated with parallel processing, and with performance levels close to those associated with customized circuits.
The work in this thesis introduces an ArchGen script to automate the generation of HEMPS systems, as well as a library of portable and self-tuning polymorphic functions. These tools abstract away the HW/SW co-design details and provide a transparent programming language to capture different levels of parallelism without sacrificing productivity or portability.

    Adaptive Distributed Architectures for Future Semiconductor Technologies.

    Year after year, semiconductor manufacturing has been able to integrate more components in a single computer chip. These improvements have been possible through systematic shrinking of the basic computational element, the transistor. This trend has allowed computers to progressively become faster, more efficient and less expensive. As this trend continues, experts foresee that current computer designs will face new challenges in utilizing the minuscule devices made available by future semiconductor technologies. Today's microprocessor designs are not fit to overcome these challenges, since they are constrained by their inability to handle component failures, by their lack of adaptability to a wide range of custom modules optimized for specific applications, and by their limited design modularity. The focus of this thesis is to develop original computer architectures that can not only survive these new challenges, but also leverage the vast number of transistors available to unlock better performance and efficiency. The work explores and evaluates new software and hardware techniques to enable the development of novel adaptive and modular computer designs. The thesis first explores an infrastructure to quantitatively assess the fallacies of current systems and their inadequacy to operate on unreliable silicon. In light of these findings, specific solutions are then proposed to strengthen digital system architectures, both through hardware and software techniques. The thesis culminates with the proposal of a radically new architecture design that can fully adapt dynamically to operate on the hardware resources available on chip, however limited or abundant those may be. PhD, Computer Science and Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/102405/1/apellegr_1.pd

    Implementing an Embedded Linux System in Xilinx Zynq

    The final goal of this project is to develop and implement a custom Embedded Linux Operating System (OS) integrated with a specific PL peripheral. The OS is developed on the ZedBoard (Zynq Evaluation & Development Board) development kit, built around Xilinx's Zynq-7000 All Programmable System on Chip, which contains a dual-core ARM Cortex-A9 and a 7 Series Artix-7 FPGA. This Final Bachelor Thesis therefore explains in detail how to create, configure, build and implement an Embedded Linux OS on the ZedBoard. The development process is structured in six chapters and two appendices, following the logical and chronological order in which the project should be carried out. An overview of the chapters is shown below. Chapter 1: overall vision of the goals of this project and why they are of interest. Chapter 2: short introduction to embedded systems, the ZedBoard, the Xilinx Design Environment, and some GNU tools. Chapter 3: configuration and implementation of a basic, custom Embedded Linux OS, with emphasis on the role of BuildRoot in this process. Chapter 4: configuration and implementation of a complete Ubuntu Linux OS; this Ubuntu version can be used for the same tasks as any conventional PC. Chapter 5: summary of the achieved objectives and the respective conclusions. Chapter 6: the bibliography used for this thesis. Appendix 1: the whole step-by-step process of developing and implementing the two operating systems on the ZedBoard. Appendix 2: the tools and programs that must be installed before starting this project. Escuela Técnica Superior de Ingeniería de Telecomunicación, Universidad Politécnica de Cartagen