    DPP-PMRF: Rethinking Optimization for a Probabilistic Graphical Model Using Data-Parallel Primitives

    We present a new parallel algorithm for probabilistic graphical model optimization. The algorithm relies on data-parallel primitives (DPPs), which provide portable performance across hardware architectures. We evaluate results on CPUs and GPUs for an image segmentation problem. Compared to a serial baseline, we observe runtime speedups of up to 13X (CPU) and 44X (GPU). We also compare our performance to a reference OpenMP-based algorithm and find speedups of up to 7X (CPU). Comment: LDAV 2018, October 201

    Towards a Scalable Hardware/Software Co-Design Platform for Real-time Pedestrian Tracking Based on a ZYNQ-7000 Device

    Currently, most designers face the daunting task of researching different design flows and learning the intricacies of specific software from various manufacturers in hardware/software co-design. Creating a scalable hardware/software co-design platform has become an urgent need and a key strategic element for developing hardware/software integrated systems. In this paper, we propose a new design flow for building a scalable co-design platform on an FPGA-based system-on-chip. We employ an integrated approach to implement a histogram of oriented gradients (HOG) descriptor and a support vector machine (SVM) classifier on a programmable device for pedestrian tracking. We report a hardware resource analysis and also analyse the precision and success rates of pedestrian tracking on nine open-access image data sets. Finally, our proposed design flow can be used for any real-time image-processing-related product on programmable ZYNQ-based embedded systems; it reduces design time and provides a scalable solution for embedded image processing products.

    Concurrent Viola Jones classifiers on a portable Beowulf cluster : a thesis presented in partial fulfilment of the requirements for the degree of Master of Engineering in Mechatronics at Massey University

    Real-time computer vision is an interesting application for supercomputing: real-time applications (vision processing in particular) employ special-purpose hardware such as DSPs to achieve high performance. This thesis explores parallel computers, particularly commodity general-purpose hardware. We also build a prototype mobile computer to better understand the economics of supercomputing, specifically as it relates to mobile computing (low power, rugged design). A new communication layer is built, whereby the locality of the nodes allows the protocols to be optimised to reduce latency. Finally, a study of the Viola Jones object detector running in parallel is presented with in-depth results, followed by reflection and future work based on the current results and platform.