205 research outputs found

    An FPGA Implementation of HW/SW Codesign Architecture for H.263 Video Coding

    Chapter 12 http://www.intechopen.com/download/pdf/pdfs_id/1574

    Efficient Architecture and Implementation of Vector Median Filter in Co-Design Context

    This work presents an efficient, fast parallel architecture of the Vector Median Filter (VMF) using a combined hardware/software (HW/SW) implementation. The hardware part of the system is implemented in VHDL, whereas the software part is developed in C/C++. The software part of the embedded system runs on the NIOS-II softcore processor under the µClinux operating system. The comparison between the software and HW/SW solutions shows that adding a hardware part to the design speeds up the filtering process compared to the software solution. This efficient embedded system implementation can perform well in several image processing applications.

    Embedded System Architecture for Mobile Augmented Reality. Sailor Assistance Case Study

    With upcoming see-through displays, new kinds of Augmented Reality applications are emerging. However, this also raises questions about the design of the associated embedded systems, which must be lightweight and handle object positioning, heterogeneous sensors, wireless communications and graphic computation. This paper studies the specific case of a promising Mobile AR processor, which is different from usual graphics applications. A complete architecture is described, designed and prototyped on FPGA. It includes hardware/software partitioning based on the analysis of application requirements. The specification of an original and flexible coprocessor is detailed. Choices as well as optimizations of algorithms are also described. Implementation results and performance evaluation show the relevance of the proposed approach and demonstrate a new kind of architecture focused on object processing and optimized for the AR domain.

    Development of a multi-core and multi-accelerator platform for approximate computing

    Graduation project (Licentiate degree in Electronics Engineering), Instituto Tecnológico de Costa Rica, Escuela de Ingeniería Electrónica, 2017. Changes in current technologies have introduced a gap between the ever-growing needs of users and the state of present designs. As data-intensive and computation-heavy applications move forward, the current trend reaches for greater performance. Approximate computing enters this scheme to boost a system's overall attributes by exploiting the intrinsic error tolerance of both software and hardware. This work proposes a multi-core and multi-accelerator platform design that uses both exact and approximate versions and provides interaction with a software counterpart to ensure usage of both layouts. A set of five different approximate accelerator versions and one exact version is presented for three different image processing filters (Laplace, Sobel and Gauss), along with their respective characterization in terms of power, area and delay time. This characterization shows better results for design versions 2 and 3. Three different interface designs for the accelerators are then presented, along with a softcore processor, Altera's NIOS II. The results gathered demonstrate a definite improvement when using approximate accelerators in comparison with software and exact accelerator implementations. Memory access and filter operation times, for two different matrix sizes, show a gain of 500, 2000 and 1500 cycles for the Laplace, Gauss and Sobel filters respectively when compared with software times, and a decrease in the range of 28-84, 20-40 and 68-100 ticks against the use of an exact accelerator.

    ON FPGA BASED ACCELERATION OF IMAGE PROCESSING IN MOBILE ROBOTICS

    In visual navigation tasks, a lack of computational resources is one of the main limitations preventing micro robotic platforms from being deployed in autonomous missions. This is because most current visual navigation techniques rely on the detection of salient points, which is computationally very demanding. In this paper, FPGA-assisted acceleration of image processing is considered to overcome the limited computational resources available on board and to enable high processing speeds, while it may also lower the power consumption of the system. The paper reports on a performance evaluation of CPU-based and FPGA-based implementations of a visual teach-and-repeat navigation system based on the detection and tracking of FAST image salient points. The results indicate that even the computationally efficient FAST algorithm can benefit from a parallel (low-cost) FPGA-based implementation that has a competitive processing time and, more importantly, is more power efficient.

    Run-time management for future MPSoC platforms

    In recent years, we have been witnessing the dawning of the Multi-Processor System-on-Chip (MPSoC) era. In essence, this era is triggered by the need to handle more complex applications, while reducing the overall cost of embedded (handheld) devices. This cost will mainly be determined by the cost of the hardware platform and the cost of designing applications for that platform. The cost of a hardware platform will partly depend on its production volume. In turn, this means that flexible, (easily) programmable multi-purpose platforms will exhibit a lower cost. A multi-purpose platform not only requires flexibility, but should also combine high performance with low power consumption. To this end, MPSoC devices integrate computer architectural properties of various computing domains. Just like large-scale parallel and distributed systems, they contain multiple heterogeneous processing elements interconnected by a scalable, network-like structure. This helps in achieving scalable high performance. As in most mobile or portable embedded systems, there is a need for low-power operation and real-time behavior. The cost of designing applications is equally important. Indeed, the actual value of future MPSoC devices is not contained within the embedded multiprocessor IC, but in their capability to provide the user of the device with a range of services or experiences. So from an application viewpoint, MPSoCs are designed to efficiently process multimedia content in applications like video players, video conferencing, 3D gaming, augmented reality, etc. Such applications typically require a lot of processing power and a significant amount of memory. To keep up with ever-evolving user needs and with new application standards appearing at a fast pace, MPSoC platforms need to be easily programmable. Application scalability, i.e. 
the ability to use just enough platform resources according to the user requirements and with respect to the device capabilities, is also an important factor. Hence scalability, flexibility, real-time behavior, high performance, low power consumption and, finally, programmability are key components in realizing the success of MPSoC platforms. The run-time manager is logically located between the application layer and the platform layer. It has a crucial role in realizing these MPSoC requirements. As it abstracts the platform hardware, it improves platform programmability. By deciding on resource assignment at run-time, based on the performance requirements of the user, the needs of the application and the capabilities of the platform, it contributes to flexibility, scalability and low-power operation. As it has an arbiter function between different applications, it enables real-time behavior. This thesis details the key components of such an MPSoC run-time manager and provides a proof-of-concept implementation. These key components include application quality management algorithms linked to MPSoC resource management mechanisms and policies, adapted to the provided MPSoC platform services. First, we describe the role, the responsibilities and the boundary conditions of an MPSoC run-time manager in a generic way. This includes a definition of the multiprocessor run-time management design space, a description of the run-time manager design trade-offs and a brief discussion on how these trade-offs affect the key MPSoC requirements. This design space definition and the trade-offs are illustrated based on ongoing research and on existing commercial and academic multiprocessor run-time management solutions. Subsequently, we introduce a fast and efficient resource allocation heuristic that considers FPGA fabric properties such as fragmentation. 
In addition, this thesis introduces a novel task assignment algorithm for handling soft IP cores, denoted hierarchical configuration. Hierarchical configuration managed by the run-time manager enables easier application design and increases the run-time spatial mapping freedom. In turn, this improves the performance of the resource assignment algorithm. Furthermore, we introduce run-time task migration components. We detail a new run-time task migration policy closely coupled to the run-time resource assignment algorithm. In addition to detailing a design-environment supported mechanism that enables moving tasks between an ISP and fine-grained reconfigurable hardware, we also propose two novel task migration mechanisms tailored to the Network-on-Chip environment. Finally, we propose a novel mechanism for task migration initiation, based on reusing debug registers in modern embedded microprocessors. We propose a reactive on-chip communication management mechanism. We show that by exploiting an injection rate control mechanism it is possible to provide a communication management system capable of providing a soft (reactive) QoS in a NoC. We introduce a novel, platform-independent run-time algorithm to perform quality management, i.e. to select an application quality operating point at run-time based on the user requirements and the available platform resources, as reported by the resource manager. This contribution also proposes a novel way to manage the interaction between the quality manager and the resource manager. In order to have a realistic, reproducible and flexible run-time manager testbench with respect to applications with multiple quality levels and implementation trade-offs, we have created an input data generation tool denoted Pareto Surfaces For Free (PSFF). 
The PSFF tool is, to the best of our knowledge, the first tool that generates multiple realistic application operating points either based on profiling information of a real-life application or based on a designer-controlled random generator. Finally, we provide a proof-of-concept demonstrator that combines these concepts and shows how these mechanisms and policies can operate in real-life situations. In addition, we show that the proposed solutions can be integrated into existing platform operating systems.

    Leros: A Tiny Microcontroller for FPGAs

    Leros is a tiny microcontroller that is optimized for current low-cost FPGAs. Leros is designed with a balanced ratio of logic to on-chip memory. The design goal is a microcontroller that can be clocked at about half the speed of a pipelined on-chip memory and that consumes fewer than 300 logic cells. The architecture, which follows from the design goals, is a pipelined 16-bit accumulator processor. An implementation of Leros needs at least one on-chip memory block and a few hundred logic cells. The application areas of Leros are twofold: First, it can be used as an intelligent peripheral device for auxiliary functions in an FPGA-based system-on-chip design. Second, the very small size of Leros makes it an attractive softcore for many-core research with low-cost FPGAs.

    Optimized Fast Fourier Transform Architecture Using Instruction Set Architecture Extension In Low-End Digital Signal Controller

    Smart microgrids have emerged as a viable solution for emergency situations occurring at the main electricity grid. The main concern in a smart microgrid is the degradation of power quality caused by harmonic distortion originating from non-linear equipment. With the rapid development of power electronic technology, the increase of harmonic-producing loads in smart microgrids necessitates a new digital signal controller architecture for the harmonic measurement system. While current system configurations are directed towards 32-bit architectures, these show higher requirements in area footprint and multi-core setup. This thesis presents the design of a low-end digital signal controller architecture using instruction set architecture (ISA) extension for the implementation of the harmonic measurement system in a smart microgrid. A new architecture, called UTeMRISC, is developed from a baseline 8-bit microcontroller with the capability to perform signal processing applications such as the Fast Fourier Transform (FFT). The architecture is improved using the Application-Specific Instruction Set Processor (ASIP) approach by extending the instruction set architecture to 16-bit length. Instruction set customization is implemented to enable the execution of computationally intensive tasks. The entire architecture is described in the Verilog Hardware Description Language (HDL) and implemented on a Virtex-6 FPGA board. In the test programs, UTeMRISC has demonstrated faster execution times and a higher maximum operating frequency without significantly increasing the core's resource utilization. Compared to the initial processor architecture, the support of the extended ISA has increased the UTeMRISC core by 21.8% but at the same time allows the Fast Fourier Transform algorithm to execute up to 5× faster. 
The combined effort of ISA extension and optimized instruction set generation results in up to 1 megasample per second, which translates to a 66.8% increase in data throughput in the FFT algorithm when compared to a 32-bit architecture. This research shows that, with a comprehensive ASIP methodology and ISA extension, a low-end digital signal controller architecture is feasible and effective for implementation in a harmonic measurement system for a smart microgrid.