A FPGA system for QRS complex detection based on Integer Wavelet Transform

Author: Karadaglic Dejan
Milosevic Danijela
Mirkovic Mimo
Stojanovic Radovan
Publication venue: 'Walter de Gruyter GmbH'
Publication date: 01/01/2011

Because of the complexity of their mathematical computation, many QRS detectors are implemented in software and cannot operate in real time. The paper presents a real-time, hardware-based solution for this task. It employs the Integer Wavelet Transform to filter the ECG signal and extract the QRS complex. The system includes several components and is incorporated in a single FPGA chip, which makes it suitable for direct embedding in medical instruments or wearable health care devices. It has sufficient accuracy (about 95%), remarkable noise immunity, and low cost. Additionally, each system component is composed of several identical blocks/cells, which makes the design highly generic. The capacity of today's FPGAs allows even dozens of detectors to be placed in a single chip. After a theoretical introduction to wavelets and a review of their application in QRS detection, it is shown how some basic wavelets can be optimized for easy hardware implementation. For this purpose, migration to integer arithmetic and additional simplifications in the calculations are required. Further, the system architecture is presented, with demonstrations in both software simulation and real testing. Finally, the working performance and preliminary results are outlined and discussed. The same principle can be applied to other signals where a hardware implementation of the wavelet transform can be of benefit.
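
The abstract does not say which wavelet or which integer scheme the authors use. As a minimal sketch of the general idea only, assuming a one-level Haar wavelet computed with lifting steps (the function names `int_haar_forward` and `int_haar_inverse` are hypothetical), an integer-to-integer transform needs only additions, subtractions and shifts, which is what makes this family attractive for FPGA logic:

```python
def int_haar_forward(samples):
    """One level of an integer Haar transform via lifting (assumed variant).

    Returns (approx, detail); every value stays an integer.
    """
    if len(samples) % 2:
        raise ValueError("expected an even number of samples")
    approx, detail = [], []
    for i in range(0, len(samples), 2):
        a, b = samples[i], samples[i + 1]
        d = b - a            # detail: difference of the pair
        s = a + (d >> 1)     # approximation: floor of the pair's mean
        approx.append(s)
        detail.append(d)
    return approx, detail


def int_haar_inverse(approx, detail):
    """Undo int_haar_forward exactly by reversing the lifting steps."""
    samples = []
    for s, d in zip(approx, detail):
        a = s - (d >> 1)
        b = a + d
        samples.extend([a, b])
    return samples


ecg_like = [3, 7, -2, 5, 10, 10, 0, -9]   # made-up sample values
approx, detail = int_haar_forward(ecg_like)
restored = int_haar_inverse(approx, detail)
```

Because each lifting step is reversed exactly, `restored` equals the input even though the mean is rounded down. A QRS detector would typically threshold the detail coefficients over several levels, a step not shown here.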

Directory of Open Access Journals

SCIDAR - Дигитална архива Универзитета у Крагујевцу

ResearchOnline@GCU

FPGA-Based CNN Inference Accelerator Synthesized from Multi-Threaded C Software

Author: Anderson Jason H.
Brothers John
Grady Brett
Kim Jin Hee
Lian Ruolong
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 27/07/2018

A deep-learning inference accelerator is synthesized from a C-language software program parallelized with Pthreads. The software implementation uses the well-known producer/consumer model, with parallel threads interconnected by FIFO queues. The LegUp high-level synthesis (HLS) tool synthesizes the threads into parallel FPGA hardware, translating software parallelism into spatial parallelism. A complete system is generated in which convolution, pooling and padding are realized in the synthesized accelerator, with the remaining tasks executing on an embedded ARM processor. The accelerator incorporates reduced precision and a novel approach to zero-weight skipping in convolution. On a mid-sized Intel Arria 10 SoC FPGA, peak performance on VGG-16 is 138 effective GOPS.
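
The paper's software is written in C with Pthreads. Purely to illustrate the producer/consumer model it mentions, here is a rough Python sketch using the standard `threading` and `queue` modules. The helper names (`producer`, `stage`, `run_pipeline`) and the toy stage functions are assumptions for illustration and are not taken from the paper:

```python
import queue
import threading

SENTINEL = None  # end-of-stream marker; real items must not be None


def producer(items, out_q):
    """Push every item into the first FIFO, then signal end of stream."""
    for item in items:
        out_q.put(item)
    out_q.put(SENTINEL)


def stage(fn, in_q, out_q):
    """Consume from one FIFO, apply fn, and produce into the next FIFO."""
    while True:
        item = in_q.get()
        if item is SENTINEL:
            out_q.put(SENTINEL)  # forward the marker downstream
            return
        out_q.put(fn(item))


def run_pipeline(items, fns, depth=4):
    """Chain one thread per stage through bounded FIFO queues."""
    queues = [queue.Queue(maxsize=depth) for _ in range(len(fns) + 1)]
    threads = [threading.Thread(target=producer, args=(items, queues[0]))]
    for fn, q_in, q_out in zip(fns, queues, queues[1:]):
        threads.append(threading.Thread(target=stage, args=(fn, q_in, q_out)))
    for t in threads:
        t.start()
    results = []
    while True:  # the calling thread acts as the final consumer
        item = queues[-1].get()
        if item is SENTINEL:
            break
        results.append(item)
    for t in threads:
        t.join()
    return results


out = run_pipeline([1, 2, 3], [lambda v: v * 2, lambda v: v + 1])
```

Each stage runs concurrently and blocks when its output queue is full, which mirrors how fixed-depth hardware FIFOs apply back-pressure between pipeline stages.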

arXiv.org e-Print Archive

Crossref

Our Future Engineers Can Bridge the Software/Hardware Paradigm Chasm

Author: Huebner M
Reis R
Stroobandt Dirk
Publication date: 01/01/2007

Ghent University Academic Bibliography

Low-complexity RLS algorithms using dichotomous coordinate descent iterations

Author: Liu Jie
White George P.
Zakharov Yuriy V.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/07/2008

In this paper, we derive low-complexity recursive least squares (RLS) adaptive filtering algorithms. We express the RLS problem in terms of auxiliary normal equations with respect to increments of the filter weights and apply this approach to the exponentially weighted and sliding window cases to derive new RLS techniques. For solving the auxiliary equations, line search methods are used. We first consider conjugate gradient iterations with a complexity of O(N^2) operations per sample, N being the number of filter weights. To reduce the complexity and make the algorithms more suitable for finite precision implementation, we propose a new dichotomous coordinate descent (DCD) algorithm and apply it to the auxiliary equations. This results in a transversal RLS adaptive filter with complexity as low as 3N multiplications per sample, which is only slightly higher than the complexity of the least mean squares (LMS) algorithm (2N multiplications). Simulations are used to compare the performance of the proposed algorithms against the classical RLS and known advanced adaptive algorithms. A fixed-point FPGA implementation of the proposed DCD-based RLS algorithm is also discussed, and results of this implementation are presented.
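
The abstract gives no pseudocode for the DCD iterations. As a hedged sketch only, assuming a simple cyclic coordinate-descent variant with a step size that is halved at each stage (the name `dcd_solve` and its parameters `amplitude`, `num_bits` and `max_updates` are hypothetical, and the paper's own variant may differ), solving R h = beta could look like this:

```python
def dcd_solve(R, beta, amplitude=1.0, num_bits=16, max_updates=1000):
    """Approximately solve R h = beta for symmetric positive definite R.

    Assumed cyclic variant: the step size is a power-of-two fraction of
    `amplitude`, so in fixed point each update reduces to shifts and adds.
    """
    n = len(beta)
    h = [0.0] * n
    r = list(beta)                 # residual beta - R h, with h = 0 initially
    alpha = amplitude
    updates = 0
    for _ in range(num_bits):
        alpha /= 2.0               # move on to the next, finer bit
        changed = True
        while changed:             # sweep until no coordinate moves
            changed = False
            for k in range(n):
                if abs(r[k]) > (alpha / 2.0) * R[k][k]:
                    step = alpha if r[k] > 0 else -alpha
                    h[k] += step
                    for i in range(n):
                        r[i] -= step * R[i][k]   # keep the residual current
                    updates += 1
                    changed = True
                    if updates >= max_updates:
                        return h
    return h


R = [[2.0, 0.5], [0.5, 1.0]]       # made-up example matrix
beta = [0.25, -0.375]              # equals R applied to [0.25, -0.5]
h = dcd_solve(R, beta)
```

The sketch assumes every component of the solution lies within plus or minus `amplitude`. The update cap `max_updates` bounds the work per call, which is the kind of knob that lets such a solver trade accuracy for a fixed operation budget.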

Crossref

White Rose Research Online