Efficient Irregular Wavefront Propagation Algorithms on Hybrid CPU-GPU Machines
In this paper, we address the problem of efficient execution of a computation
pattern, referred to here as the irregular wavefront propagation pattern
(IWPP), on hybrid systems with multiple CPUs and GPUs. The IWPP is common in
several image processing operations. In the IWPP, data elements in the
wavefront propagate waves to their neighboring elements on a grid if a
propagation condition is satisfied. Elements receiving the propagated waves
become part of the wavefront. This pattern results in irregular data accesses
and computations. We develop and evaluate strategies for efficient computation
and propagation of wavefronts using a multi-level queue structure. This queue
structure improves the utilization of fast memories in a GPU and reduces
synchronization overheads. We also develop a tile-based parallelization
strategy to support execution on multiple CPUs and GPUs. We evaluate our
approaches on a state-of-the-art GPU-accelerated machine (equipped with 3 GPUs
and 2 multicore CPUs) using the IWPP implementations of two widely used image
processing operations: morphological reconstruction and Euclidean distance
transform. Our results show significant performance improvements on GPUs. The
cooperative use of multiple CPUs and GPUs attains speedups of 50x and 85x
with respect to single-core CPU executions for morphological reconstruction and
Euclidean distance transform, respectively.
Comment: 37 pages, 16 figures
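As a rough illustration of the propagation pattern described in this abstract, the following is a minimal sequential sketch of queue-based morphological reconstruction in Python. It is not the paper's multi-level GPU queue or its tile-based strategy. The function name `reconstruct`, the 4-connected neighborhood, and seeding the wavefront with every pixel are assumptions made for brevity.

```python
from collections import deque

def reconstruct(marker, mask):
    """Grayscale reconstruction by dilation on a 2D grid (4-connected).

    Hypothetical helper for illustration only: a single sequential queue
    stands in for the wavefront.
    """
    rows, cols = len(marker), len(marker[0])
    # The marker image is clamped so that it never exceeds the mask.
    out = [[min(m, k) for m, k in zip(mrow, krow)]
           for mrow, krow in zip(marker, mask)]
    # Simplifying assumption: every pixel starts in the wavefront.
    wavefront = deque((r, c) for r in range(rows) for c in range(cols))
    while wavefront:
        r, c = wavefront.popleft()
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                # Propagation condition: the neighbor can be raised,
                # but never above its mask value.
                candidate = min(out[r][c], mask[nr][nc])
                if candidate > out[nr][nc]:
                    out[nr][nc] = candidate
                    # The updated neighbor joins the wavefront.
                    wavefront.append((nr, nc))
    return out
```

Which elements enter the queue depends on the data, which is the source of the irregular accesses the abstract mentions. In the one-row input `marker = [[3, 0, 0, 0, 0]]`, `mask = [[3, 3, 0, 4, 4]]`, the wave from the first pixel reaches the second pixel and then stops at the zero-valued mask pixel.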
Accelerating Sensitivity Analysis in Microscopy Image Segmentation Workflows
With the increasing availability of digital microscopy imaging equipment,
there is a demand for efficient execution of whole slide tissue image
applications. Through the process of sensitivity analysis it is possible to
improve the output quality of such applications, and thus, improve the desired
analysis quality. Due to the high computational cost of such analyses and the
recurrent nature of executed tasks from sensitivity analysis methods (i.e.,
reexecution of tasks), the opportunity for computation reuse arises. By
performing computation reuse we can optimize the run time of sensitivity
analysis applications. This work therefore focuses on finding new ways to take
advantage of computation reuse opportunities on multiple task abstraction
levels. This is done by presenting the coarse-grain merging strategy and the
new fine-grain merging algorithms, implemented on top of the Region Templates
Framework.
Comment: 44 pages
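The general idea of computation reuse across sensitivity analysis runs can be sketched as follows in Python. This is a simplified memoization of workflow stages keyed by their parameter prefix, not the coarse-grain or fine-grain merging algorithms of the paper, and it does not use the Region Templates Framework. The names `run_with_reuse`, `normalize`, and `segment` are hypothetical.

```python
def run_with_reuse(stages, parameter_sets):
    """Run a linear workflow once per parameter set, reusing repeated work.

    stages: list of (name, function) pairs; each function takes the previous
    stage's output and this stage's parameter value.
    parameter_sets: list of tuples with one parameter value per stage.
    Returns the final results and the number of stage executions performed.
    """
    cache = {}
    executed = 0
    results = []
    for params in parameter_sets:
        value = None
        key = ()
        for (name, func), p in zip(stages, params):
            # A stage's output depends on all parameters up to this stage.
            key = key + ((name, p),)
            if key not in cache:
                cache[key] = func(value, p)
                executed += 1
            value = cache[key]
        results.append(value)
    return results, executed


# Hypothetical two-stage workflow used only to exercise the sketch.
def normalize(_, p):
    return p * 10

def segment(x, p):
    return x + p

stages = [("normalize", normalize), ("segment", segment)]
results, executed = run_with_reuse(stages, [(1, 2), (1, 3), (2, 2)])
```

In this toy run the second parameter set shares its first-stage parameter with the first set, so `normalize` is executed only twice across the three runs.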