
    A general framework for efficient FPGA implementation of matrix product

    Original article can be found at: http://www.medjcn.com/ Copyright Softmotor Limited.
    High-performance systems are required by developers for fast processing of computationally intensive applications. Reconfigurable hardware devices in the form of Field-Programmable Gate Arrays (FPGAs) have been proposed as viable system building blocks for constructing high-performance systems at an economical price. Given the importance and use of matrix algorithms in scientific computing applications, they seem ideal candidates to harness and exploit the advantages offered by FPGAs. In this paper, a system for generating matrix algorithm cores is described. The system provides a catalogue of efficient, user-customizable cores designed for FPGA implementation, spanning three matrix algorithm categories: (i) matrix operations, (ii) matrix transforms and (iii) matrix decomposition. A generated core can be either a general-purpose core or an application-specific one. The methodology used in the design and implementation of two image processing application cores is presented. The first core is a fully pipelined matrix multiplier for colour space conversion based on distributed arithmetic principles, while the second is a parallel floating-point matrix multiplier designed for 3D affine transformations. Peer reviewed
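    The abstract describes the colour space conversion core only at a high level. As a purely illustrative sketch (not the paper's actual core), the C fragment below shows the kind of fixed-coefficient 3x3 matrix product, here RGB to YCbCr with BT.601 coefficients assumed and scaled to 16-bit fixed point, that a distributed-arithmetic FPGA implementation would realize with lookup tables and adders instead of multipliers.

```c
#include <stdint.h>

/* Hypothetical reference model of a fixed-coefficient 3x3 colour space
 * conversion (RGB -> YCbCr, full-range BT.601 coefficients assumed).
 * Coefficients are scaled by 2^16 so the product reduces to shifts and
 * adds, the form that distributed-arithmetic FPGA cores exploit.       */
static const int32_t CSC[3][3] = {
    {  19595,  38470,   7471 },   /*  0.299   0.587   0.114  */
    { -11059, -21709,  32768 },   /* -0.169  -0.331   0.500  */
    {  32768, -27439,  -5329 },   /*  0.500  -0.419  -0.081  */
};

void rgb_to_ycbcr(const uint8_t rgb[3], uint8_t out[3])
{
    for (int i = 0; i < 3; ++i) {
        int32_t acc = 0;
        for (int j = 0; j < 3; ++j)
            acc += CSC[i][j] * (int32_t)rgb[j];
        acc >>= 16;                    /* back to 8-bit range (arithmetic shift assumed) */
        if (i > 0) acc += 128;         /* offset for Cb and Cr   */
        if (acc < 0)   acc = 0;        /* saturate               */
        if (acc > 255) acc = 255;
        out[i] = (uint8_t)acc;
    }
}
```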

    FPGA-Based Bandwidth Selection for Kernel Density Estimation Using High Level Synthesis Approach

    FPGA technology can offer significantly higher performance at much lower power consumption than is available from CPUs and GPUs in many computational problems. Unfortunately, programming for FPGAs (using hardware description languages, HDL) is a difficult, non-trivial task that is not intuitive for C/C++/Java programmers. To bridge the gap between programming effectiveness and difficulty, the High Level Synthesis (HLS) approach is being promoted by the main FPGA vendors. Nowadays, time-intensive calculations are mainly performed on GPU/CPU architectures, but they can also be performed successfully using the HLS approach. In this paper we implement a bandwidth selection algorithm for kernel density estimation (KDE) using HLS and show the techniques used to optimize the final FPGA implementation. We also show that FPGA speedups over highly optimized CPU and GPU implementations are quite substantial. Moreover, the power consumption of FPGA devices is usually much lower than the typical power consumption of present CPUs and GPUs. Comment: 23 pages, 6 figures, extended version of the initial paper
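    The abstract does not name the specific bandwidth selection algorithm. As an assumption-laden sketch, the C code below evaluates a generic least-squares cross-validation score for candidate bandwidths with a Gaussian kernel; its O(n^2) pairwise sums are representative of the time-intensive computation that HLS would map onto the FPGA, but it should not be read as the paper's method.

```c
#include <math.h>

/* Gaussian kernel K and its self-convolution K*K (N(0,1) and N(0,2) densities). */
static double K(double u)  { return exp(-0.5  * u * u) / sqrt(2.0 * M_PI); }
static double KK(double u) { return exp(-0.25 * u * u) / (2.0 * sqrt(M_PI)); }

/* Least-squares cross-validation score for bandwidth h (O(n^2) per candidate):
 * LSCV(h) = 1/(n^2 h) * sum_{i,j} (K*K)((x_i-x_j)/h)
 *         - 2/(n(n-1)h) * sum_{i!=j} K((x_i-x_j)/h)                           */
static double lscv(const double *x, int n, double h)
{
    double s1 = 0.0, s2 = 0.0;
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j) {
            double u = (x[i] - x[j]) / h;
            s1 += KK(u);
            if (i != j) s2 += K(u);
        }
    return s1 / (n * (double)n * h) - 2.0 * s2 / (n * (double)(n - 1) * h);
}

/* Pick the best h from a grid of candidates (steps >= 2).  The doubly nested
 * loop over all sample pairs is the kind of workload worth offloading.       */
double select_bandwidth(const double *x, int n, double h_min, double h_max, int steps)
{
    double best_h = h_min, best_score = INFINITY;
    for (int k = 0; k < steps; ++k) {
        double h = h_min + k * (h_max - h_min) / (steps - 1);
        double score = lscv(x, n, h);
        if (score < best_score) { best_score = score; best_h = h; }
    }
    return best_h;
}
```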

    OPTIMAL AREA AND PERFORMANCE MAPPING OF K-LUT BASED FPGAS

    FPGA circuits are increasingly used in many fields: for rapid prototyping of new products (including fast ASIC implementation), for logic emulation, for producing small numbers of a device, or when a device must be reconfigurable in use (reconfigurable computing). Determining whether an arbitrary function of a given width can be implemented by a programmable logic block is, unfortunately, in general a very difficult problem, known as the Boolean matching problem. This paper introduces a new, implemented algorithm able to map combinational networks onto k-LUT based FPGAs for both area and performance. Keywords: k-LUT based FPGAs, combinational circuits, performance-driven mapping.
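    The paper's mapping algorithm itself is not reproduced in the abstract. The hypothetical C snippet below only illustrates the easy end of Boolean matching: deciding whether a single function, given as a truth table, fits into one k-input LUT by computing its support size. Matching and decomposing wider functions across several LUTs is the hard problem the algorithm addresses.

```c
#include <stdint.h>
#include <stdbool.h>

/* Truth table of an n-input function (n <= 6) packed into a uint64_t:
 * bit m holds f(x_{n-1}..x_0) for the input assignment encoded by m.
 * Bits above 2^n must be zero.                                        */

/* Precomputed masks: var_mask[v] marks the minterms in which x_v = 1. */
static const uint64_t var_mask[6] = {
    0xAAAAAAAAAAAAAAAAULL, 0xCCCCCCCCCCCCCCCCULL, 0xF0F0F0F0F0F0F0F0ULL,
    0xFF00FF00FF00FF00ULL, 0xFFFF0000FFFF0000ULL, 0xFFFFFFFF00000000ULL,
};

/* f depends on x_v iff its cofactors f|x_v=0 and f|x_v=1 differ.      */
static bool depends_on(uint64_t tt, int v)
{
    uint64_t cof1 = (tt & var_mask[v]) >> (1u << v);
    uint64_t cof0 = tt & ~var_mask[v];
    return cof0 != cof1;
}

/* Number of variables the function really uses (its support size).    */
static int support_size(uint64_t tt, int nvars)
{
    int s = 0;
    for (int v = 0; v < nvars; ++v)
        if (depends_on(tt, v)) ++s;
    return s;
}

/* Trivial single-LUT check: a function fits in one k-input LUT exactly
 * when its support size is at most k.                                  */
bool fits_in_k_lut(uint64_t tt, int nvars, int k)
{
    return support_size(tt, nvars) <= k;
}
```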

    Evaluation of Single-Chip, Real-Time Tomographic Data Processing on FPGA - SoC Devices

    A novel approach to tomographic data processing has been developed and evaluated using the Jagiellonian PET (J-PET) scanner as an example. We propose a system in which there is no need for a powerful processing facility local to the scanner, capable of reconstructing images on the fly. Instead, we introduce a Field Programmable Gate Array (FPGA) System-on-Chip (SoC) platform connected directly to the data streams coming from the scanner, which can perform event building, filtering, coincidence search and Region-Of-Response (ROR) reconstruction in the programmable logic, and visualization on the integrated processors. The platform significantly reduces the data volume by converting raw data to a list-mode representation, while generating the visualization on the fly. Comment: IEEE Transactions on Medical Imaging, 17 May 201
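    As a software illustration of one of the listed processing stages, the C sketch below performs a sliding-window coincidence search over a time-sorted hit stream and emits list-mode pairs whose endpoints define the ROR. The record fields, units and window logic are assumptions made for illustration, not the J-PET data format, and the FPGA version would be a streaming pipeline rather than a loop over arrays.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical single-hit record after event building (fields assumed). */
typedef struct {
    double   time_ps;     /* hit time, picoseconds                       */
    uint16_t strip_id;    /* scintillator strip that fired               */
    double   pos_mm;      /* position along the strip                    */
} hit_t;

/* List-mode coincidence: the two hits define the Region Of Response.    */
typedef struct { hit_t a, b; } coincidence_t;

/* Sliding-window coincidence search over a time-sorted hit stream.      */
size_t find_coincidences(const hit_t *hits, size_t n, double window_ps,
                         coincidence_t *out, size_t max_out)
{
    size_t found = 0;
    for (size_t i = 0; i + 1 < n && found < max_out; ++i) {
        for (size_t j = i + 1; j < n; ++j) {
            if (hits[j].time_ps - hits[i].time_ps > window_ps)
                break;                          /* stream is time-sorted  */
            if (hits[j].strip_id == hits[i].strip_id)
                continue;                       /* same strip: not a pair */
            out[found].a = hits[i];
            out[found].b = hits[j];
            if (++found == max_out) break;
        }
    }
    return found;
}
```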

    Achieving High Speed CFD simulations: Optimization, Parallelization, and FPGA Acceleration for the unstructured DLR TAU Code

    Today, large-scale parallel simulations are fundamental tools for handling complex problems. The number of processors in current computation platforms has recently increased, and it is therefore necessary to optimize application performance and to enhance the scalability of massively parallel systems. In addition, new heterogeneous architectures, which combine conventional processors with specific hardware such as FPGAs to accelerate the most time-consuming functions, are considered a strong alternative for boosting performance. In this paper, the performance of the DLR TAU code is analyzed and optimized. The improvement of code efficiency is addressed through three key activities: optimization, parallelization and hardware acceleration. First, a profiling analysis of the most time-consuming processes of the Reynolds-Averaged Navier-Stokes flow solver on a three-dimensional unstructured mesh is performed. Then, the scalability of the code is studied and new partitioning algorithms are tested to identify the most suitable ones for the selected applications. Finally, a feasibility study on the application of FPGAs and GPUs for the hardware acceleration of CFD simulations is presented.
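    To make the acceleration target concrete, the hypothetical C kernel below shows the kind of edge-based residual accumulation loop that typically dominates unstructured finite-volume solvers; its indirect, scatter-add memory access pattern is what makes both mesh partitioning quality and FPGA/GPU offload non-trivial. It illustrates the class of kernel that profiling would flag, not actual code from TAU.

```c
#include <stddef.h>

/* Hypothetical edge-based residual accumulation over an unstructured
 * mesh: NVAR conserved variables per point, one flux per edge,
 * scatter-add to both endpoints of the edge.                           */
enum { NVAR = 5 };   /* density, three momentum components, energy */

void accumulate_residuals(size_t n_edges,
                          const size_t (*edge)[2],      /* edge -> two mesh points   */
                          const double (*flux)[NVAR],   /* precomputed flux per edge */
                          double (*residual)[NVAR])     /* per-point residual, inout */
{
    for (size_t e = 0; e < n_edges; ++e) {
        size_t p = edge[e][0], q = edge[e][1];
        for (int v = 0; v < NVAR; ++v) {
            residual[p][v] += flux[e][v];   /* flux leaves point p */
            residual[q][v] -= flux[e][v];   /* and enters point q  */
        }
    }
}
```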

    Generalized disjunction decomposition for evolvable hardware

    Get PDF
    Evolvable hardware (EHW) refers to self-reconfigurable hardware design, where the configuration is under the control of an evolutionary algorithm (EA). One of the main difficulties in using EHW to solve real-world problems is scalability, which limits the size of the circuit that may be evolved. This paper outlines a new type of decomposition strategy for EHW, the "generalized disjunction decomposition" (GDD), which allows the evolution of large circuits. The proposed method has been extensively tested, not only with the multipliers and parity-bit problems traditionally used in the EHW community, but also with logic circuits taken from the Microelectronics Center of North Carolina (MCNC) benchmark library and randomly generated circuits. In order to achieve statistically relevant results, each analyzed logic circuit has been evolved 100 times, and the average of these results is presented and compared with other EHW techniques. This approach is necessary because of the probabilistic nature of EAs; the same logic circuit may not be solved in the same way if tested several times. The proposed method has been examined in an extrinsic EHW system using the (1 + lambda) evolution strategy. The results obtained demonstrate that GDD significantly improves the evolution of logic circuits in terms of the number of generations, reduces computational time (it reduces the time required for a single iteration of the EA), and enables the evolution of larger circuits never before evolved. In addition to the proposed method, a short overview of EHW systems together with the most recent applications in electrical circuit design is provided.
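    The (1 + lambda) evolution strategy mentioned above has a simple structure; the C sketch below shows a generic skeleton of it over an abstract bit-string genome, with the circuit encoding and fitness evaluation left as caller-supplied assumptions. It illustrates the EA loop whose per-iteration cost GDD reduces, not the authors' EHW system.

```c
#include <stdlib.h>
#include <string.h>

/* (1 + lambda) evolution strategy skeleton: one parent, lambda mutated
 * offspring per generation, best individual (parent included) survives.
 * The genome is an abstract bit string standing in for a circuit
 * configuration; the fitness function is supplied by the caller.        */

typedef double (*fitness_fn)(const unsigned char *genome, size_t len);

static void mutate(unsigned char *genome, size_t len, double rate)
{
    for (size_t i = 0; i < len * 8; ++i)
        if ((double)rand() / RAND_MAX < rate)
            genome[i / 8] ^= (unsigned char)(1u << (i % 8));   /* flip one bit */
}

void evolve_1_plus_lambda(unsigned char *parent, size_t len, int lambda,
                          double mut_rate, int generations, fitness_fn fit)
{
    unsigned char *child = malloc(len);
    unsigned char *best  = malloc(len);
    double best_fit = fit(parent, len);

    for (int g = 0; g < generations; ++g) {
        memcpy(best, parent, len);
        for (int k = 0; k < lambda; ++k) {
            memcpy(child, parent, len);
            mutate(child, len, mut_rate);
            double f = fit(child, len);
            if (f >= best_fit) {            /* accepting equal fitness aids drift */
                best_fit = f;
                memcpy(best, child, len);
            }
        }
        memcpy(parent, best, len);          /* survivor becomes next parent */
    }
    free(child);
    free(best);
}
```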