Distributed Connected Component Filtering and Analysis in 2-D and 3-D Tera-Scale Data Sets
Connected filters and multi-scale tools are region-based operators acting on the connected components of an image. Component trees are image representations that perform these operations efficiently, as they represent the inclusion relationship of the connected components hierarchically. This paper presents disccofan (DIStributed Connected COmponent Filtering and ANalysis), a new method that extends the previous 2-D implementation of the Distributed Component Forests (DCFs) to handle 3-D processing and higher dynamic range data sets. disccofan combines shared and distributed memory techniques to efficiently compute component trees, user-defined attribute filters, and multi-scale analysis. Compared to similar methods, disccofan is faster and scales better on low and moderate dynamic range images, and is the only method with a speed-up larger than 1 on a realistic, astronomical floating-point data set. It achieves a speed-up of 11.20 using 48 processes to compute the DCF of a 162-gigapixel, single-precision floating-point 3-D data set, while reducing the memory used by a factor of 22. This approach is suitable to perform attribute filtering and multi-scale analysis on very large 2-D and 3-D data sets, up to single-precision floating-point value
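To make the idea of a connected filter concrete, here is a minimal sequential sketch of the simplest case: an area filter on a 2-D binary image, which keeps or removes each connected component as a whole. It is not disccofan and does not build a component tree; the function name `area_filter`, the 4-connectivity, and the sample image are assumptions chosen for illustration.

```python
from collections import deque


def area_filter(image, min_area):
    """Keep only the 4-connected foreground components with at least min_area pixels.

    `image` is a list of rows holding 0 (background) or 1 (foreground).
    Hypothetical helper for illustration; sequential and binary-only.
    """
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    seen = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if image[r][c] == 0 or seen[r][c]:
                continue
            # Collect one connected component with a breadth-first search.
            component = []
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                component.append((y, x))
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and image[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # The attribute (area) decides the fate of the whole component.
            if len(component) >= min_area:
                for y, x in component:
                    out[y][x] = 1
    return out


image = [
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 0, 0],
    [0, 1, 1, 1],
]
filtered = area_filter(image, 3)  # the isolated pixel at row 1, column 3 is removed
```

Because a component is either kept entirely or removed entirely, the filter never moves or distorts the contours of the components that survive. Component trees generalize this to grey-scale images by organizing the components of all threshold levels in one hierarchy.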
Parallel Attribute Computation for Distributed Component Forests
Component trees are powerful image processing tools to analyze the connected components of an image. One attractive strategy is to build the nested relations first and derive the components' attributes afterward, so that the user can switch between different attribute functions without having to re-compute the entire tree. So far only sequential algorithms allow such an approach; no parallel algorithm is available. In this paper, we extend a recent method using distributed memory techniques to enable posterior attribute computation in a parallel or distributed manner. This novel approach significantly reduces the computational time needed for combining several attribute functions interactively in Giga- and Tera-Scale data sets
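The "build the tree first, derive attributes afterward" strategy can be sketched sequentially as follows. This is only an illustration under stated assumptions, not the parallel method of the paper: the tree is given as a parent array in which every child has a larger index than its parent, and the names `compute_attribute`, `parent`, `pixel_counts`, and `grey_levels` are hypothetical.

```python
def compute_attribute(parent, own_values, combine):
    """Derive one attribute per node from an already-built tree.

    parent[i] is the index of the parent of node i (-1 for the root).
    Assumption: every child has a larger index than its parent, so a
    single backward sweep visits all children before their parent.
    """
    attr = list(own_values)
    for node in range(len(parent) - 1, -1, -1):
        p = parent[node]
        if p >= 0:
            # Merge the finished child attribute into its parent.
            attr[p] = combine(attr[p], attr[node])
    return attr


# A small tree: node 0 is the root, nodes 1 and 2 are its children,
# nodes 3 and 4 are children of node 1.
parent = [-1, 0, 0, 1, 1]

# Attribute 1: area, accumulated from the pixels owned by each node.
pixel_counts = [2, 1, 3, 4, 1]
area = compute_attribute(parent, pixel_counts, lambda a, b: a + b)

# Attribute 2: maximum grey level inside each component, on the same tree.
grey_levels = [10, 20, 15, 40, 30]
max_level = compute_attribute(parent, grey_levels, max)
```

The tree (`parent`) is computed once and reused; only the cheap accumulation pass is repeated when the user switches from one attribute function to another.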
C Language Extensions for Hybrid CPU/GPU Programming with StarPU
Modern platforms used for high-performance computing (HPC) include machines
with both general-purpose CPUs, and "accelerators", often in the form of
graphical processing units (GPUs). StarPU is a C library to exploit such
platforms. It provides users with ways to define "tasks" to be executed on CPUs
or GPUs, along with the dependencies among them, and automatically
schedules them over all the available processing units. In doing so, it also
relieves programmers from the need to know the underlying architecture details:
it adapts to the available CPUs and GPUs, and automatically transfers data
between main memory and GPUs as needed. While StarPU's approach is successful
at addressing run-time scheduling issues, being a C library makes for a poor
and error-prone programming interface. This paper presents an effort started in
2011 to promote some of the concepts exported by the library as C language
constructs, by means of an extension of the GCC compiler suite. Our main
contribution is the design and implementation of language extensions that map
to StarPU's task programming paradigm. We argue that the proposed extensions
make it easier to get started with StarPU, eliminate errors that can occur when
using the C library, and help diagnose possible mistakes. We conclude on future
work
Boosting Multi-Core Reachability Performance with Shared Hash Tables
This paper focuses on data structures for multi-core reachability, which is a
key component in model checking algorithms and other verification methods. A
cornerstone of an efficient solution is the storage of visited states. In
related work, static partitioning of the state space was combined with
thread-local storage and resulted in reasonable speedups, but left open whether
improvements are possible. In this paper, we present a scaling solution for
shared state storage which is based on a lockless hash table implementation.
The solution is specifically designed for the cache architecture of modern
CPUs. Because model checking algorithms impose loose requirements on the hash
table operations, their design can be streamlined substantially compared to
related work on lockless hash tables. Still, an implementation of the hash
table presented here has dozens of sensitive performance parameters (bucket
size, cache line size, data layout, probing sequence, etc.). We analyzed their
impact and compared the resulting speedups with related tools. Our
implementation outperforms two state-of-the-art multi-core model checkers (SPIN
and DiVinE) by a substantial margin, while placing fewer constraints on the
load balancing and search algorithms. Comment: preliminary repor
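The role of visited-state storage in reachability can be sketched with a small open-addressing table. This is a single-threaded illustration only, not the lockless table of the paper: the class `StateTable`, the `find_or_put` operation name, linear probing, and the toy successor function are all assumptions, and the comment marks where a multi-core version would need an atomic compare-and-swap.

```python
from collections import deque


class StateTable:
    """Fixed-size open-addressing set of visited states (sequential sketch)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.slots = [None] * capacity

    def find_or_put(self, state):
        """Return True if `state` was already stored; otherwise store it and return False."""
        index = hash(state) % self.capacity
        for step in range(self.capacity):
            slot = (index + step) % self.capacity  # linear probing
            if self.slots[slot] is None:
                # A lockless multi-core table would claim this slot with an
                # atomic compare-and-swap; here a plain write is enough.
                self.slots[slot] = state
                return False
            if self.slots[slot] == state:
                return True
        raise RuntimeError("state table is full")


def reachable(initial, successors, capacity=64):
    """Count the states reachable from `initial` by breadth-first search."""
    table = StateTable(capacity)
    table.find_or_put(initial)
    queue = deque([initial])
    count = 0
    while queue:
        state = queue.popleft()
        count += 1
        for nxt in successors(state):
            if not table.find_or_put(nxt):  # only unseen states are explored
                queue.append(nxt)
    return count


def counter_successors(state):
    """Toy transition system on the integers 0..9 (hypothetical example)."""
    return [(state + 1) % 10, (state * 2) % 10]


num_states = reachable(0, counter_successors)
```

The search needs only one combined operation from the table, which is the kind of loose requirement the abstract refers to: there is no deletion and no separate lookup, so the table design can stay simple.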
Spectral-spatial classification of n-dimensional images in real-time based on segmentation and mathematical morphology on GPUs
The objective of this thesis is to develop efficient schemes for spectral-spatial n-dimensional image
classification. By efficient schemes, we mean schemes that produce good classification results in
terms of accuracy, as well as schemes that can be executed in real-time on low-cost computing
infrastructures, such as the Graphics Processing Units (GPUs) shipped in personal computers. The
n-dimensional images include images with two and three dimensions, such as images coming from
the medical domain, and also images ranging from ten to hundreds of dimensions, such as the multi- and
hyperspectral images acquired in remote sensing.
In image analysis, classification is a regularly used method for information retrieval in areas such as
medical diagnosis, surveillance, manufacturing and remote sensing, among others. In addition, as
the hyperspectral images have been widely available in recent years owing to the reduction in the
size and cost of the sensors, the number of applications at lab scale, such as food quality control, art
forgery detection, disease diagnosis and forensics has also increased. Although there are many
spectral-spatial classification schemes, most are computationally inefficient in terms of execution
time. In addition, the need for efficient computation on low-cost computing infrastructures is
increasing in line with the incorporation of technology into everyday applications.
In this thesis we have proposed two spectral-spatial classification schemes: one based on
segmentation and the other based on wavelets and mathematical morphology. These schemes were
designed with the aim of producing good classification results and they perform better than other
schemes found in the literature based on segmentation and mathematical morphology in terms of
accuracy. Additionally, it was necessary to develop techniques and strategies for efficient GPU
computing, for example, a block-asynchronous strategy, resulting in an efficient implementation on
GPU of the aforementioned spectral-spatial classification schemes. The optimal GPU parameters
were analyzed and different data partitioning and thread block arrangements were studied to exploit
the GPU resources. The results show that the GPU is an adequate computing platform for on-board
processing of hyperspectral information