211 research outputs found

    A cross-platform OpenVX library for FPGA accelerators

    Get PDF
    FPGAs are an excellent platform to implement computer vision applications, since these applications tend to offer a high level of parallelism with many data-independent operations. However, the freedom in the solution design space of FPGAs represents a problem because each solution must be individually designed, verified, and tuned. The emergence of High Level Synthesis (HLS) helps solving this problem and has allowed the implementation of open programming standards as OpenVX for computer vision applications on FPGAs, such as the HiF1ipVX library developed exclusively for Xilinx devices. Although with the HiF1ipVX library, designers can develop solutions efficiently on Xilinx, they do not have an approach to port and run their code on FPGAs from other manufacturers. This work extends the HiFlipVX capabilities in two significant ways: supporting Intel FPGA devices and enabling execution on discrete FPGA accelerators. To provide both without affecting user-facing code, the new carried out implementation combines two HLS programming models: C++, using Intel''s system of tasks, and OpenCL, which provides the CPU interoperability. Comparing with pure OpenCL implementations, this work reduces kernel dispatch resources, saving up to 24% of ALUT resources for each kernel in a graph, and improves performance 2.6 x and energy consumption 1.6 x on average for a set of representative applications, compared with state-of-the-art frameworks

    FPGA Accelerators on Heterogeneous Systems: An Approach Using High Level Synthesis

    Get PDF
    La evolución de las FPGAs como dispositivos para el procesamiento con alta eficiencia energética y baja latencia de control, comparada con dispositivos como las CPUs y las GPUs, las han hecho atractivas en el ámbito de la computación de alto rendimiento (HPC).A pesar de las inumerables ventajas de las FPGAs, su inclusión en HPC presenta varios retos. El primero, la complejidad que supone la programación de las FPGAs comparada con dispositivos como las CPUs y las GPUs. Segundo, el tiempo de desarrollo es alto debido al proceso de síntesis del hardware. Y tercero, trabajar con más arquitecturas en HPC requiere el manejo y la sintonización de los detalles de cada dispositivo, lo que añade complejidad.Esta tesis aborda estos 3 problemas en diferentes niveles con el objetivo de mejorar y facilitar la adopción de las FPGAs usando la síntesis de alto nivel(HLS) en sistemas HPC.En un nivel próximo al hardware, en esta tesis se desarrolla un modelo analítico para las aplicaciones limitadas en memoria, que es una situación común en aplicaciones de HPC. El modelo, desarrollado para kernels programados usando HLS, puede predecir el tiempo de ejecución con alta precisión y buena adaptabilidad ante cambios en la tecnología de la memoria, como las DDR4 y HBM2, y en las variaciones en la frecuencia del kernel. Esta solución puede aumentar potencialmente la productividad de las personas que programan, reduciendo el tiempo de desarrollo y optimización de las aplicaciones.Entender los detalles de bajo nivel puede ser complejo para las programadoras promedio, y el desempeño de las aplicaciones para FPGA aún requiere un alto nivel en las habilidades de programación. Por ello, nuestra segunda propuesta está enfocada en la extensión de las bibliotecas con una propuesta para cómputo en visión artificial que sea portable entre diferentes fabricantes de FPGAs. La biblioteca se ha diseñado basada en templates, lo que permite una biblioteca que da flexibilidad a la generación del hardware y oculta decisiones de diseño críticas como la comunicación entre nodos, el modelo de concurrencia, y la integración de las aplicaciones en el sistema heterogéneo para facilitar el desarrollo de grafos de visión artificial que pueden ser complejos.Finalmente, en el runtime del host del sistema heterogéneo, hemos integrado la FPGA para usarla de forma trasparente como un dispositivo acelerador para la co-ejecución en sistemas heterogéneos. Hemos hecho una serie propuestas de altonivel de abstracción que abarca los mecanismos de sincronización y políticas de balanceo en un sistema altamente heterogéneo compuesto por una CPU, una GPU y una FPGA. Se presentan los principales retos que han inspirado esta investigación y los beneficios de la inclusión de una FPGA en rendimiento y energía.En conclusión, esta tesis contribuye a la adopción de las FPGAs para entornos HPC, aportando soluciones que ayudan a reducir el tiempo de desarrollo y mejoran el desempeño y la eficiencia energética del sistema.---------------------------------------------The emergence of FPGAs in the High-Performance Computing domain is arising thanks to their promise of better energy efficiency and low control latency, compared with other devices such as CPUs or GPUs.Albeit these benefits, their complete inclusion into HPC systems still faces several challenges. First, FPGA complexity means its programming more difficult compared to devices such as CPU and GPU. Second, the development time is longer due to the required synthesis effort. And third, working with multiple devices increments the details that should be managed and increase hardware complexity.This thesis tackles these 3 problems at different stack levels to improve and to make easier the adoption of FPGAs using High-Level Synthesis on HPC systems. At a close to the hardware level, this thesis contributes with a new analytical model for memory-bound applications, an usual situation for HPC applications. The model for HLS kernels can anticipate application performance before place and route, reducing the design development time. Our results show a high precision and adaptable model for external memory technologies such as DDR4 and HBM2, and kernel frequency changes. This solution potentially increases productivity, reducing application development time.Understanding low-level implementation details is difficult for average programmers, and the development of FPGA applications still requires high proficiency program- ming skills. For this reason, the second proposal is focused on the extension of a computer vision library to be portable among two of the main FPGA vendors. The template-based library allows hardware flexibility and hides design decisions such as the communication among nodes, the concurrency programming model, and the application’s integration in the heterogeneous system, to develop complex vision graphs easily.Finally, we have transparently integrated the FPGA in a high level framework for co-execution with other devices. We propose a set of high level abstractions covering synchronization mechanism and load balancing policies in a highly heterogeneous system with CPU, GPU, and FPGA devices. We present the main challenges that inspired this research and the benefits of the FPGA use demonstrating performance and energy improvements.<br /

    Hardware Accelarated Visual Tracking Algorithms. A Systematic Literature Review

    Get PDF
    Many industrial applications need object recognition and tracking capabilities. The algorithms developed for those purposes are computationally expensive. Yet ,real time performance, high accuracy and small power consumption are essential measures of the system. When all these requirements are combined, hardware acceleration of these algorithms becomes a feasible solution. The purpose of this study is to analyze the current state of these hardware acceleration solutions, which algorithms have been implemented in hardware and what modifications have been done in order to adapt these algorithms to hardware.Siirretty Doriast

    Efficient design and implementation of image processing algorithms on reconfigurable hardware using Handel-C

    Full text link
    Computer manipulation of images is generally defined as Digital Image Processing (DIP). DIP is used in variety of applications, including video surveillance, target recognition, and image enhancement. These applications are usually implemented in software but may use special purpose hardware for speed. With advances in the VLSI technology hardware implementation has become an attractive alternative. Assigning complex computation tasks to hardware and exploiting the parallelism and pipelining in algorithms yield significant speedup in running times. In this thesis the image processing algorithms like median filter, basic morphological operators, convolution and edge detection algorithms are implemented on FPGA. A pipelined architecture of these algorithms is presented. The proposed architectures are capable of producing one output on every clock cycle. The hardware modeling was accomplished using Handel-C (DK2 environment). The algorithm was tested on standard image processing benchmarks and the results are compared with that obtained on software

    Computer Vision Based Kidney’s (HK-2) Damaged Cells Classification with Reconfigurable Hardware Accelerator (FPGA)

    Get PDF
    In medical and health sciences, detection of cell injury plays an important role in diagnosis, personal treatment and disease prevention. Despite recent advancements in tools and methods for image classification, it is challenging to classify cell images with higher precision and accuracy. Cell classification based on computer vision offers significant benefits in biomedicine and healthcare. There have been studies reported where cell classification techniques have been complemented by Artificial Intelligence-based classifiers such as Convolutional Neural Networks. These classifiers suffer from the drawback of the scale of computational resources required for training and hence do not offer real-time classification capabilities for an embedded system plat-form. Field Programmable Gate Arrays (FPGAs) offer the flexibility of hardware reconfiguration and have emerged as a viable platform for algorithm acceleration. Given that the logic resources and on-chip memory available on a single device are still limited, hardware/software co-design is proposed where image pre-processing and network training was performed in software and trained architectures were mapped onto an FPGA device (Nexys4DDR) for real-time cell classification. This paper demonstrates that the embedded hardware-based cell classifier performs with almost 100% accuracy in detecting different types of damaged kidney cells

    Computer Vision-Based Kidney’s (HK-2) Damaged Cells Classification with Reconfigurable Hardware Accelerator (FPGA)

    Get PDF
    In medical and health sciences, the detection of cell injury plays an important role in diagnosis, personal treatment and disease prevention. Despite recent advancements in tools and methods for image classification, it is challenging to classify cell images with higher precision and accuracy. Cell classification based on computer vision offers significant benefits in biomedicine and healthcare. There have been studies reported where cell classification techniques have been complemented by Artificial Intelligence-based classifiers such as Convolutional Neural Networks. These classifiers suffer from the drawback of the scale of computational resources required for training and hence do not offer real-time classification capabilities for an embedded system platform. Field Programmable Gate Arrays (FPGAs) offer the flexibility of hardware reconfiguration and have emerged as a viable platform for algorithm acceleration. Given that the logic resources and on-chip memory available on a single device are still limited, hardware/software co-design is proposed where image pre-processing and network training were performed in software, and trained architectures were mapped onto an FPGA device (Nexys4DDR) for real-time cell classification. This paper demonstrates that the embedded hardware-based cell classifier performs with almost 100% accuracy in detecting different types of damaged kidney cells

    FPGA e sistemas embutidos multi-core para processamento de vídeo

    Get PDF
    Mestrado em Engenharia Electrónica e TelecomunicaçõesO presente trabalho apresenta técnicas de processamento digital de sinal, nomeadamente em processamento de vídeo, recorrendo a tecnologia FPGA. Consiste numa introdução teórica sobre tópicos tais como o papel da visão artificial nos dias de hoje, reconhecimento de imagem, e técnicas matemáticas de processamento e análise morfol ógica de imagem. Aborda o tema do papel das FPGAs na tecnologia actual, e as suas vantagens quando utilizadas no processamento digital de sinal. Finalmente e demonstrado e explicado o algoritmo implementado na FPGA para deteção de contornos no processamento de vídeo, concluindo com uma análise a nível da sua eficiência, e discussão de melhorias a fazer num possível trabalho futuro em termos de otimização de recursos utilizados e velocidade de processamento.The present work presents techniques of digital signal processing, namely in video processing, using FPGA technology. It consists of a theoretical introduction about topics such as the role of artificial vision nowadays, image recognition and mathematical techniques of image processing and morphological analysis. It discusses the role of an FPGA in today's technology and its advantages when used in digital signal processing. Finally, it is demonstrated and explained the algorithm that was implemented in the FPGA for edge detection in video processing, concluding with an analysis in terms of efficiency, and discussion of improvements to do in a possible future work regarding the optimization of used resources and also of its processing speed

    Image Processing Using FPGAs

    Get PDF
    This book presents a selection of papers representing current research on using field programmable gate arrays (FPGAs) for realising image processing algorithms. These papers are reprints of papers selected for a Special Issue of the Journal of Imaging on image processing using FPGAs. A diverse range of topics is covered, including parallel soft processors, memory management, image filters, segmentation, clustering, image analysis, and image compression. Applications include traffic sign recognition for autonomous driving, cell detection for histopathology, and video compression. Collectively, they represent the current state-of-the-art on image processing using FPGAs

    FPGA Implementation for Image Edge Detection using Xilinx System Generator

    Get PDF
    Edge detection of an image is the primary and significant step in image processing. Image edge detection plays a vital role in computer vision and image processing. Hardware implementation of image edge detection is essential for real time application and it is used to increase the speed of operation. Field Programmable Gate Array(FPGA) plays a vital role in hardware implementation of image processing application because of its re-programmability and parallelism. The proposed work is FPGA implementation of image edge detection. The hardware implementation of edge detection algorithm is done using the most efficient tool called Xilinx System Generator(XSG).‘Xilinx System Generator’ (XSG) tool is used for system modeling and FPGA programming. The Xilinx System Generator tool is a new application in image processing, and offers a model based design for processing. The algorithms are designed by blocks and it also supports MATLAB codes through user customizable blocks. This paper aims at developing algorithmic models in MATLAB using Xilinx blockset for specific role then creating workspace in MATLAB to process image pixels and performing hardware implementation on FPGA
    corecore