DPP-PMRF: Rethinking Optimization for a Probabilistic Graphical Model Using Data-Parallel Primitives
We present a new parallel algorithm for probabilistic graphical model
optimization. The algorithm relies on data-parallel primitives (DPPs), which
provide portable performance across hardware architectures. We evaluate results on
CPUs and GPUs for an image segmentation problem. Compared to a serial baseline,
we observe runtime speedups of up to 13X (CPU) and 44X (GPU). We also compare
our performance to a reference, OpenMP-based algorithm, and find speedups of up
to 7X (CPU).
Comment: LDAV 2018, October 201
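The abstract above does not list which primitives the algorithm uses. As a rough illustration only, the sketch below shows three common data-parallel primitives (map, reduce, exclusive scan) and composes them into a stream compaction. The function names are hypothetical, the code is not the DPP-PMRF algorithm, and serial Python stands in for what a DPP library would execute in parallel on a CPU or GPU.

```python
from functools import reduce
from itertools import accumulate
import operator


def dpp_map(fn, xs):
    # Map: apply fn to every element independently (trivially parallel).
    return [fn(x) for x in xs]


def dpp_reduce(op, xs, init):
    # Reduce: combine all elements with an associative operator.
    return reduce(op, xs, init)


def dpp_exclusive_scan(xs):
    # Exclusive prefix sum: output[i] is the sum of xs[0..i-1].
    return list(accumulate(xs, initial=0))[:-1]


def dpp_compact(xs, keep):
    # Stream compaction composed from map, scan, reduce and a scatter.
    flags = dpp_map(lambda x: 1 if keep(x) else 0, xs)
    offsets = dpp_exclusive_scan(flags)          # output slot for each kept item
    total = dpp_reduce(operator.add, flags, 0)   # number of kept items
    out = [None] * total
    for x, flag, slot in zip(xs, flags, offsets):
        if flag:
            out[slot] = x                        # scatter: each write is independent
    return out


if __name__ == "__main__":
    print(dpp_compact([5, 2, 8, 1, 9], lambda v: v > 4))
```

Expressing an algorithm as a composition of such primitives is what allows the same code to target different hardware back ends, since only the primitives need an architecture-specific implementation.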
Towards a Scalable Hardware/Software Co-Design Platform for Real-time Pedestrian Tracking Based on a ZYNQ-7000 Device
Designers engaged in hardware/software co-design currently face the
daunting task of researching different design flows and learning the
intricacies of specific software from various manufacturers.
Creating a scalable hardware/software co-design platform has become
an urgent need and a key strategic element for developing
hardware/software integrated systems.
In this paper, we propose a new design flow for building
a scalable co-design platform on an FPGA-based system-on-chip.
We employ an integrated approach to implement a histogram of
oriented gradients (HOG) descriptor and a support vector machine (SVM)
classifier on a programmable device for pedestrian tracking.
We report a hardware resource analysis and analyse the
precision and success rates of pedestrian tracking on nine
open-access image data sets. Finally, the proposed
design flow can be used for any real-time image-processing-related
product on programmable ZYNQ-based embedded
systems; it reduces design time and provides a
scalable solution for embedded image processing products.
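To make the HOG and SVM stages concrete, here is a minimal software sketch, assuming a single grayscale cell, central-difference gradients, 9 unsigned-orientation bins, and a linear SVM. The function names, bin count, and weights are illustrative assumptions; the paper's hardware implementation on the ZYNQ device is not described in the abstract and may differ in every one of these choices.

```python
import math


def cell_histogram(cell, bins=9):
    # Gradient-orientation histogram for one cell (a list of pixel rows).
    # Border pixels are skipped because central differences need neighbours.
    height, width = len(cell), len(cell[0])
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]
            gy = cell[y + 1][x] - cell[y - 1][x]
            magnitude = math.hypot(gx, gy)
            # Unsigned orientation folded into [0, 180) degrees.
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[int(angle // bin_width) % bins] += magnitude
    return hist


def linear_svm_score(weights, bias, features):
    # Decision value of a linear SVM; a positive score means "pedestrian"
    # under the (assumed) labelling convention.
    return sum(w * f for w, f in zip(weights, features)) + bias


if __name__ == "__main__":
    # A 4x4 cell containing a vertical edge: all gradient energy is horizontal.
    cell = [[0, 0, 10, 10] for _ in range(4)]
    features = cell_histogram(cell)
    print(features)
    print(linear_svm_score([0.1] * 9, -1.0, features))
```

A full HOG descriptor would also normalise histograms over overlapping blocks of cells before classification; that step is omitted here for brevity.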
Concurrent Viola Jones classifiers on a portable Beowulf cluster : a thesis presented in partial fulfilment of the requirements for the degree of Master of Engineering in Mechatronics at Massey University
Real-time computer vision is an interesting application for supercomputing; real-time applications (vision processing in particular) employ special-purpose hardware such as DSPs to achieve high performance. This thesis explores parallel computers, particularly commodity general-purpose hardware. To better understand the economics of supercomputing as it relates to mobile computing (low power, rugged design), we also build a prototype mobile computer. A new communication layer is built in which the locality of the nodes allows the protocols to be optimised to reduce latency. Finally, a study and in-depth results of the Viola Jones object detector running in parallel are presented, followed by reflection and future work based on the current results and platform.