Search CORE

10,563 research outputs found

Towards a Scalable Hardware/Software Co-Design Platform for Real-time Pedestrian Tracking Based on a ZYNQ-7000 Device

Author: Buckley Kevan
Sillitoe Ian
Yang Shufan
Yu Zheqi
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/10/2017
Field of study

Currently, most designers face a daunting task to research different design flows and learn the intricacies of specific software from various manufacturers in hardware/software co-design. An urgent need of creating a scalable hardware/software co-design platform has become a key strategic element for developing hardware/software integrated systems. In this paper, we propose a new design flow for building a scalable co-design platform on FPGA-based system-on-chip. We employ an integrated approach to implement a histogram oriented gradients (HOG) and a support vector machine (SVM) classification on a programmable device for pedestrian tracking. Not only was hardware resource analysis reported, but the precision and success rates of pedestrian tracking on nine open access image data sets are also analysed. Finally, our proposed design flow can be used for any real-time image processingrelated products on programmable ZYNQ-based embedded systems, which benefits from a reduced design time and provide a scalable solution for embedded image processing products

Crossref

Enlighten

FPGA-accelerated machine learning inference as a service for particle physics computing

Author: Duarte Javier
Harris Philip
Hauck Scott
Holzman Burt
Hsu Shih-Chieh
Jindariani Sergo
Khan Suffian
Kreis Benjamin
Lee Brian
Liu Mia
Lončar Vladimir
Ngadiuba Jennifer
Pedro Kevin
Perez Brandon
Pierini Maurizio
Rankin Dylan
Trahms Matthew
Tran Nhan
Tsaris Aristeidis
Versteeg Colin
Way Ted W.
Werran Dustin
Wu Zhenbin
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 18/04/2019
Field of study

New heterogeneous computing paradigms on dedicated hardware with increased parallelization, such as Field Programmable Gate Arrays (FPGAs), offer exciting solutions with large potential gains. The growing applications of machine learning algorithms in particle physics for simulation, reconstruction, and analysis are naturally deployed on such platforms. We demonstrate that the acceleration of machine learning inference as a web service represents a heterogeneous computing solution for particle physics experiments that potentially requires minimal modification to the current computing model. As examples, we retrain the ResNet-50 convolutional neural network to demonstrate state-of-the-art performance for top quark jet tagging at the LHC and apply a ResNet-50 model with transfer learning for neutrino event classification. Using Project Brainwave by Microsoft to accelerate the ResNet-50 image classification model, we achieve average inference times of 60 (10) milliseconds with our experimental physics software framework using Brainwave as a cloud (edge or on-premises) service, representing an improvement by a factor of approximately 30 (175) in model inference latency over traditional CPU inference in current experimental hardware. A single FPGA service accessed by many CPUs achieves a throughput of 600--700 inferences per second using an image batch of one, comparable to large batch-size GPU throughput and significantly better than small batch-size GPU throughput. Deployed as an edge or cloud service for the particle physics computing model, coprocessor accelerators can have a higher duty cycle and are potentially much more cost-effective.Comment: 16 pages, 14 figures, 2 table

arXiv.org e-Print Archive

CERN Document Server

A novel system architecture for real-time low-level vision

Author: Benedetti A.
Perona P.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/1999
Field of study

A novel system architecture that exploits the spatial locality in memory access that is found in most low-level vision algorithms is presented. A real-time feature selection system is used to exemplify the underlying ideas, and an implementation based on commercially available Field Programmable Gate Arrays (FPGA’s) and synchronous SRAM memory devices is proposed. The peak memory access rate of a system based on this architecture is estimated at 2.88 G-Bytes/s, which represents a four to five times improvement with respect to existing reconfigurable computers

Caltech Authors

Real-time human action recognition on an embedded, reconfigurable video processing architecture

Author: A. Aizerman
A.F. Bobick
B. Farnell
Chris Bailey
G.R. Bradski
Hongying Meng
J.K. Aggarwal
Michael Freeman
Nick Pears
T. Moeslund
T. Ogata
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2008
Field of study

Copyright @ 2008 Springer-Verlag.In recent years, automatic human motion recognition has been widely researched within the computer vision and image processing communities. Here we propose a real-time embedded vision solution for human motion recognition implemented on a ubiquitous device. There are three main contributions in this paper. Firstly, we have developed a fast human motion recognition system with simple motion features and a linear Support Vector Machine (SVM) classifier. The method has been tested on a large, public human action dataset and achieved competitive performance for the temporal template (eg. “motion history image”) class of approaches. Secondly, we have developed a reconfigurable, FPGA based video processing architecture. One advantage of this architecture is that the system processing performance can be reconfiured for a particular application, with the addition of new or replicated processing cores. Finally, we have successfully implemented a human motion recognition system on this reconfigurable architecture. With a small number of human actions (hand gestures), this stand-alone system is performing reliably, with an 80% average recognition rate using limited training data. This type of system has applications in security systems, man-machine communications and intelligent environments.DTI and Broadcom Ltd

Crossref

UCL Discovery

Brunel University Research Archive