
167,129 research outputs found

2D hardware acceleration

Authors: Bodenmann Joel; Corthay François
Publication date: 04/05/2018

The objective of the project is to develop an IP core that provides hardware acceleration for common 2D rendering operations in an embedded system. The requirements for graphical user interfaces (GUIs) on modern display- and touchscreen-based systems are increasing steadily. Rendering complex and attractive GUIs requires a lot of processing power. At the same time, the energy consumption of most of these embedded systems should decrease. Off-loading processor-intensive tasks such as the rendering of 2D shapes to dedicated hardware vastly decreases rendering time and frees processor resources, which leads to a faster GUI and a system that consumes less power.

RERO DOC Digital Library

Achieving High Speed CFD simulations: Optimization, Parallelization, and FPGA Acceleration for the unstructured DLR TAU Code

Authors: Andres-Perez Esther; Caloto Aitor; Widhalm Markus
Publication date: 01/01/2009

Today, large-scale parallel simulations are fundamental tools for handling complex problems. The number of processors in current computation platforms has recently increased, and it is therefore necessary to optimize application performance and to enhance the scalability of massively parallel systems. In addition, new heterogeneous architectures that combine conventional processors with specific hardware, such as FPGAs, to accelerate the most time-consuming functions are considered a strong alternative for boosting performance. In this paper, the performance of the DLR TAU code is analyzed and optimized. The improvement of the code efficiency is addressed through three key activities: optimization, parallelization, and hardware acceleration. First, a profiling analysis of the most time-consuming processes of the Reynolds-averaged Navier-Stokes flow solver on a three-dimensional unstructured mesh is performed. Then, the scalability of the code is studied and new partitioning algorithms are tested to identify the most suitable ones for the selected applications. Finally, a feasibility study on the application of FPGAs and GPUs for the hardware acceleration of CFD simulations is presented.

Institute of Transport Research:Publications

Crossref

Hardware Acceleration of Cipher Attack

Author: Okuliar Adam
Publication venue: Vysoké učení technické v Brně. Fakulta informačních technologií
Publication date: 01/01/2009

Hardware acceleration is often a good way to achieve significantly better performance when processing large amounts of data or when implementing an algorithm that parallelizes well. The aim of this work is to demonstrate the results of using FPGA circuits to implement an algorithm of exponential complexity. The chosen algorithm is a brute-force attack on the WEP encryption algorithm with a 40-bit key. The goal of the work is to compare the properties and performance of software and hardware implementations of the chosen algorithm.
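To make the idea of a brute-force key search concrete, here is a minimal software sketch in Python. It is not taken from the thesis. It assumes the usual WEP construction (RC4 keyed with a 3-byte IV followed by the 5-byte secret, and a CRC-32 integrity value appended to the plaintext); the byte order of the integrity value, the helper names, and the sample frame are illustrative assumptions. To keep it runnable, only the last two key bytes are treated as unknown; a full 40-bit search would cover 2^40 candidates.

```python
import zlib


def rc4(key: bytes, data: bytes) -> bytes:
    """Plain RC4: key scheduling followed by keystream generation and XOR."""
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) & 0xFF
        s[i], s[j] = s[j], s[i]
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) & 0xFF
        j = (j + s[i]) & 0xFF
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) & 0xFF])
    return bytes(out)


def wep_encrypt(iv: bytes, secret: bytes, plaintext: bytes) -> bytes:
    """Hypothetical helper: encrypt plaintext plus CRC-32 check value."""
    icv = zlib.crc32(plaintext).to_bytes(4, "little")  # byte order is an assumption
    return rc4(iv + secret, plaintext + icv)


def icv_ok(iv: bytes, secret: bytes, ciphertext: bytes) -> bool:
    """Decrypt with a candidate key and test whether the check value matches."""
    decrypted = rc4(iv + secret, ciphertext)
    body, icv = decrypted[:-4], decrypted[-4:]
    return zlib.crc32(body).to_bytes(4, "little") == icv


def brute_force(iv, ciphertext, known_prefix, unknown_bytes):
    """Try every value of the unknown trailing key bytes in order."""
    for candidate in range(256 ** unknown_bytes):
        secret = known_prefix + candidate.to_bytes(unknown_bytes, "big")
        if icv_ok(iv, secret, ciphertext):
            return secret
    return None


# Illustrative data, not from the source.
iv = b"\xa1\xb2\xc3"
secret = b"\x01\x02\x03\x0a\x3c"
ciphertext = wep_encrypt(iv, secret, b"hello, WEP frame")

# Reduced search: the first three key bytes are assumed known.
recovered = brute_force(iv, ciphertext, secret[:3], unknown_bytes=2)
print(recovered.hex())
```

Every candidate key is tested independently of the others, which is the property that makes this kind of search a natural fit for parallel hardware.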

Digital library of Brno University of Technology

National Repository of Grey Literature

Binarized Convolutional Neural Networks with Separable Filters for Efficient Hardware Acceleration

Authors: Gupta Rajesh K.; Lin Jeng-Hau; Srivastava Mani; Tu Zhuowen; Xing Tianwei; Zhang Zhiru; Zhao Ritchie
Publication date: 15/07/2017

State-of-the-art convolutional neural networks are enormously costly in both compute and memory, demanding massively parallel GPUs for execution. Such networks strain the computational capabilities and energy available to embedded and mobile processing platforms, restricting their use in many important applications. In this paper, we push the boundaries of hardware-effective CNN design by proposing BCNN with Separable Filters (BCNNw/SF), which applies Singular Value Decomposition (SVD) on BCNN kernels to further reduce computational and storage complexity. To enable its implementation, we provide a closed form of the gradient over SVD to calculate the exact gradient with respect to every binarized weight in backward propagation. We verify BCNNw/SF on the MNIST, CIFAR-10, and SVHN datasets, and implement an accelerator for CIFAR-10 on FPGA hardware. Our BCNNw/SF accelerator realizes memory savings of 17% and execution time reduction of 31.3% compared to BCNN with only minor accuracy sacrifices.

Comment: 9 pages, 6 figures, accepted for Embedded Vision Workshop (CVPRW
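The following Python sketch illustrates why a separable filter is cheaper to apply. It is not the authors' method: it skips the SVD step entirely and simply assumes a 3x3 kernel that is already the outer product of two ±1 vectors. The function names and the test image are made up for illustration. Under that assumption, one 2D pass with k·k multiplications per output can be replaced by a horizontal and a vertical 1D pass with k multiplications each.

```python
def correlate2d(image, kernel):
    """Valid-mode 2D correlation with a full kernel (k*k multiplies per output)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]


def correlate_separable(image, col_vec, row_vec):
    """Same result for kernel = outer(col_vec, row_vec), using two 1D passes."""
    kh, kw = len(col_vec), len(row_vec)
    # Horizontal pass: filter every image row with row_vec.
    rows = [[sum(row[c + j] * row_vec[j] for j in range(kw))
             for c in range(len(row) - kw + 1)]
            for row in image]
    # Vertical pass: filter every column of the intermediate result with col_vec.
    return [[sum(rows[r + i][c] * col_vec[i] for i in range(kh))
             for c in range(len(rows[0]))]
            for r in range(len(rows) - kh + 1)]


# Illustrative binarized, rank-1 kernel: every entry is +1 or -1.
col_vec = [1, -1, 1]
row_vec = [1, 1, -1]
kernel = [[u * v for v in row_vec] for u in col_vec]

# Small synthetic integer image (5 rows, 6 columns).
image = [[(3 * r + 5 * c) % 7 for c in range(6)] for r in range(5)]

full = correlate2d(image, kernel)
separable = correlate_separable(image, col_vec, row_vec)
print(full == separable)
```

Storage shrinks in the same way: the two vectors hold 2k values instead of the k·k values of the full kernel.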

arXiv.org e-Print Archive

Crossref

Hardware-Based Acceleration of Image Filtration

Author: Zelinka Martin
Publication venue: Vysoké učení technické v Brně. Fakulta informačních technologií
Publication date: 01/01/2009

This bachelor's thesis deals with the hardware-based acceleration of image filtering using FIR filters. It defines basic concepts of digital images, describes the principle of filtering, and briefly explains the techniques used for edge detection and image smoothing. The main aim of the work is an analysis of several FIR filter acceleration methods suitable for implementation in hardware, followed by an implementation of the selected method that allows configuration changes at runtime and is designed for maximum throughput. The thesis concludes with an evaluation of the method in terms of throughput and a comparison with an optimal software implementation.
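As a rough software analogue of an FIR filter whose configuration can change at runtime, the Python sketch below models a tapped delay line with reloadable coefficients. It is an illustration only, not the design from the thesis. The class name, the coefficient sets, and the sample pixel row are assumptions. Each output is y[n] = sum over k of c[k]·x[n-k].

```python
from collections import deque


class FirFilter:
    """Direct-form FIR filter over a stream of samples (one pixel per step)."""

    def __init__(self, coeffs):
        self.load(coeffs)

    def load(self, coeffs):
        """Replace the coefficients at runtime and clear the delay line."""
        self.coeffs = list(coeffs)
        self.delay = deque([0] * len(self.coeffs), maxlen=len(self.coeffs))

    def step(self, sample):
        """Shift in one sample and return the weighted sum of the taps."""
        self.delay.appendleft(sample)  # the oldest sample drops off the far end
        return sum(c * x for c, x in zip(self.coeffs, self.delay))

    def run(self, samples):
        return [self.step(s) for s in samples]


# One image row with a step edge between the third and fourth pixel.
row = [10, 10, 10, 50, 50, 50]

fir = FirFilter([1, 0, -1])   # simple edge detector: x[n] - x[n-2]
edges = fir.run(row)

fir.load([1, 1, 1])           # reconfigure to an unnormalized smoothing sum
smoothed = fir.run(row)

print(edges)     # [10, 10, 0, 40, 40, 0]
print(smoothed)  # [10, 20, 30, 70, 110, 150]
```

The first outputs of each run reflect the zero-filled delay line rather than the image content; a real design would have to decide how to treat image borders.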

Digital library of Brno University of Technology

National Repository of Grey Literature