    Concurrent Computation of Attribute Filters on Shared Memory Parallel Machines

    Distributed Connected Component Filtering and Analysis in 2-D and 3-D Tera-Scale Data Sets

    Connected filters and multi-scale tools are region-based operators acting on the connected components of an image. Component trees are image representations that allow these operations to be performed efficiently, as they represent the inclusion relationship of the connected components hierarchically. This paper presents disccofan (DIStributed Connected COmponent Filtering and ANalysis), a new method that extends the previous 2-D implementation of Distributed Component Forests (DCFs) to handle 3-D processing and higher dynamic range data sets. disccofan combines shared- and distributed-memory techniques to efficiently compute component trees, user-defined attribute filters, and multi-scale analyses. Compared to similar methods, disccofan is faster and scales better on low and moderate dynamic range images, and it is the only method with a speed-up larger than 1 on a realistic, astronomical floating-point data set. It achieves a speed-up of 11.20 using 48 processes to compute the DCF of a 162 Gigapixel, single-precision floating-point 3-D data set, while reducing the memory used by a factor of 22. This approach is suitable for attribute filtering and multi-scale analysis of very large 2-D and 3-D data sets, up to single-precision floating-point values.
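
    To make the notion of a connected attribute filter concrete, the sketch below applies a simple area criterion to the connected components of a binary image using union-find. It is a minimal, single-threaded illustration of the kind of filtering described above, not the distributed component-forest algorithm of disccofan; the image layout and the threshold name `lambda` are assumptions made for the example.

```c
#include <stdlib.h>

/* Minimal single-threaded area filter on a binary image:
 * keep only connected components (4-connectivity) with area >= lambda.
 * Illustrates an attribute filter, not the distributed DCF algorithm. */

static int find(int *parent, int p) {
    while (parent[p] != p) {
        parent[p] = parent[parent[p]];   /* path halving */
        p = parent[p];
    }
    return p;
}

static void unite(int *parent, int *area, int a, int b) {
    a = find(parent, a);
    b = find(parent, b);
    if (a == b) return;
    parent[b] = a;
    area[a] += area[b];                  /* root accumulates the component area */
}

/* img: width*height array of 0/1 values, filtered in place */
void area_filter(int *img, int width, int height, int lambda) {
    int n = width * height;
    int *parent = malloc(n * sizeof *parent);
    int *area   = malloc(n * sizeof *area);

    for (int p = 0; p < n; p++) { parent[p] = p; area[p] = img[p]; }

    /* union each foreground pixel with its left and top foreground neighbours */
    for (int y = 0; y < height; y++)
        for (int x = 0; x < width; x++) {
            int p = y * width + x;
            if (!img[p]) continue;
            if (x > 0 && img[p - 1])     unite(parent, area, p, p - 1);
            if (y > 0 && img[p - width]) unite(parent, area, p, p - width);
        }

    /* remove components whose area is below the threshold */
    for (int p = 0; p < n; p++)
        if (img[p] && area[find(parent, p)] < lambda)
            img[p] = 0;

    free(parent);
    free(area);
}
```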

    Parallel Attribute Computation for Distributed Component Forests

    Component trees are powerful image processing tools for analyzing the connected components of an image. One attractive strategy consists of building the nested relations first and deriving the components' attributes afterward, so that the user can switch between different attribute functions without having to re-compute the entire tree. So far, only sequential algorithms allow such an approach; no parallel algorithm is available. In this paper, we extend a recent method using distributed memory techniques to enable posterior attribute computation in a parallel or distributed manner. This novel approach significantly reduces the computational time needed for combining several attribute functions interactively on Giga- and Tera-Scale data sets.
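
    As a rough illustration of posterior attribute computation, the sketch below assumes a component tree that has already been built and is stored as a parent array in which every parent index is smaller than its child's index (a common canonical ordering). The area of every node is then obtained in a single bottom-up pass; the one-pixel-per-node simplification and the function name are assumptions for the example, not the paper's actual data structures.

```c
#include <stddef.h>

/* Posterior computation of the "area" attribute on an existing component tree.
 * Assumption: parent[i] < i for every non-root node i (a root has parent[i] == i),
 * so a single reverse scan folds every child into its parent before that parent
 * is itself folded. */
void compute_area(const int *parent, size_t n_nodes, int *area)
{
    /* each node initially accounts for the pixels directly attached to it;
       here we assume one pixel per node for simplicity */
    for (size_t i = 0; i < n_nodes; i++)
        area[i] = 1;

    /* bottom-up accumulation: children are folded into their parents */
    for (size_t i = n_nodes; i-- > 1; )
        if ((size_t)parent[i] != i)
            area[parent[i]] += area[i];
}
```

    Switching to a different attribute (for instance a bounding box or perimeter) only replaces the initialization and the accumulation rule, while the tree itself is left untouched; that separation is the property the paper makes available in a parallel and distributed setting.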

    C Language Extensions for Hybrid CPU/GPU Programming with StarPU

    Modern platforms used for high-performance computing (HPC) include machines with both general-purpose CPUs and "accelerators", often in the form of graphical processing units (GPUs). StarPU is a C library to exploit such platforms. It provides users with ways to define "tasks" to be executed on CPUs or GPUs, along with the dependencies among them, and it automatically schedules them over all the available processing units. In doing so, it also relieves programmers from the need to know the underlying architecture details: it adapts to the available CPUs and GPUs, and automatically transfers data between main memory and GPUs as needed. While StarPU's approach is successful at addressing run-time scheduling issues, being a C library makes for a poor and error-prone programming interface. This paper presents an effort started in 2011 to promote some of the concepts exported by the library as C language constructs, by means of an extension of the GCC compiler suite. Our main contribution is the design and implementation of language extensions that map to StarPU's task programming paradigm. We argue that the proposed extensions make it easier to get started with StarPU, eliminate errors that can occur when using the C library, and help diagnose possible mistakes. We conclude with future work.
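
    For context, the plain C library interface that the proposed language extensions aim to simplify looks roughly like the sketch below, which scales a vector with one CPU task. It is based on the publicly documented StarPU codelet/task API as we recall it (names such as STARPU_MAIN_RAM may differ between StarPU versions) and is an illustrative sketch, not code taken from the paper.

```c
#include <starpu.h>

/* CPU implementation of the task: multiply every element by a factor. */
static void scal_cpu(void *buffers[], void *cl_arg)
{
    float factor = *(float *)cl_arg;
    struct starpu_vector_interface *v = buffers[0];
    unsigned n = STARPU_VECTOR_GET_NX(v);
    float *x = (float *)STARPU_VECTOR_GET_PTR(v);
    for (unsigned i = 0; i < n; i++)
        x[i] *= factor;
}

/* Codelet: describes the implementations and data access modes of the task. */
static struct starpu_codelet scal_cl = {
    .cpu_funcs = { scal_cpu },
    .nbuffers  = 1,
    .modes     = { STARPU_RW },
};

int main(void)
{
    float x[1024], factor = 2.0f;
    for (int i = 0; i < 1024; i++) x[i] = (float)i;

    if (starpu_init(NULL) != 0) return 1;

    /* register the vector so StarPU can manage transfers and coherency */
    starpu_data_handle_t handle;
    starpu_vector_data_register(&handle, STARPU_MAIN_RAM,
                                (uintptr_t)x, 1024, sizeof(float));

    /* create, fill in, and submit one task */
    struct starpu_task *task = starpu_task_create();
    task->cl          = &scal_cl;
    task->handles[0]  = handle;
    task->cl_arg      = &factor;
    task->cl_arg_size = sizeof(factor);
    task->synchronous = 1;               /* wait for completion */
    starpu_task_submit(task);

    starpu_data_unregister(handle);
    starpu_shutdown();
    return 0;
}
```

    The boilerplate around codelets, data registration, and task fields is the kind of mechanics the paper's GCC-based extensions aim to hide behind annotated functions, so that the compiler can also catch mistakes that the library API would only reveal at run time.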

    Boosting Multi-Core Reachability Performance with Shared Hash Tables

    This paper focuses on data structures for multi-core reachability, which is a key component in model checking algorithms and other verification methods. A cornerstone of an efficient solution is the storage of visited states. In related work, static partitioning of the state space was combined with thread-local storage; this resulted in reasonable speedups but left open whether improvements are possible. In this paper, we present a scalable solution for shared state storage based on a lockless hash table implementation. The solution is specifically designed for the cache architecture of modern CPUs. Because model checking algorithms impose loose requirements on the hash table operations, their design can be streamlined substantially compared to related work on lockless hash tables. Still, an implementation of the hash table presented here has dozens of sensitive performance parameters (bucket size, cache line size, data layout, probing sequence, etc.). We analyzed their impact and compared the resulting speedups with related tools. Our implementation outperforms two state-of-the-art multi-core model checkers (SPIN and DiVinE) by a substantial margin, while placing fewer constraints on the load balancing and search algorithms. (Comment: preliminary report.)
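
    The sketch below shows the core idea of such a shared, lockless find-or-put table using C11 atomics: open addressing with linear probing, where racing threads claim an empty bucket with a single compare-and-swap. It is a minimal illustration only; the paper's design additionally tunes bucket size, cache-line layout, and the probing sequence, and the hash function and the assumption that a state fits in 64 bits are ours.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TABLE_SIZE (1u << 20)   /* power of two for cheap modulo */
#define EMPTY 0ull              /* assume 0 is never a valid state */

static _Atomic uint64_t table[TABLE_SIZE];

/* simple mixing hash (illustrative, not the one from the paper) */
static uint64_t mix(uint64_t x)
{
    x ^= x >> 33; x *= 0xff51afd7ed558ccdULL;
    x ^= x >> 33; x *= 0xc4ceb9fe1a85ec53ULL;
    x ^= x >> 33;
    return x;
}

/* Returns true if 'state' was newly inserted, false if it was already present.
 * Lock-free: threads race on the same bucket with one compare-and-swap. */
bool find_or_put(uint64_t state)
{
    uint64_t h = mix(state);
    for (uint64_t i = 0; i < TABLE_SIZE; i++) {
        size_t idx = (size_t)((h + i) & (TABLE_SIZE - 1));   /* linear probing */
        uint64_t cur = atomic_load_explicit(&table[idx], memory_order_acquire);
        if (cur == state)
            return false;                        /* state already visited */
        if (cur == EMPTY) {
            uint64_t expected = EMPTY;
            if (atomic_compare_exchange_strong_explicit(
                    &table[idx], &expected, state,
                    memory_order_acq_rel, memory_order_acquire))
                return true;                     /* we inserted it */
            if (expected == state)
                return false;                    /* another thread inserted it */
            /* slot was taken by a different state: keep probing */
        }
    }
    return false;  /* table full; a real implementation would resize or abort */
}
```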

    Spectral-spatial classification of n-dimensional images in real-time based on segmentation and mathematical morphology on GPUs

    The objective of this thesis is to develop efficient schemes for spectral-spatial n-dimensional image classification. By efficient schemes, we mean schemes that produce good classification results in terms of accuracy, as well as schemes that can be executed in real-time on low-cost computing infrastructures, such as the Graphics Processing Units (GPUs) shipped in personal computers. The n-dimensional images include images with two and three dimensions, such as images coming from the medical domain, as well as images ranging from ten to hundreds of dimensions, such as the multi- and hyperspectral images acquired in remote sensing. In image analysis, classification is a regularly used method for information retrieval in areas such as medical diagnosis, surveillance, manufacturing and remote sensing, among others. In addition, as hyperspectral images have become widely available in recent years owing to the reduction in the size and cost of the sensors, the number of lab-scale applications, such as food quality control, art forgery detection, disease diagnosis and forensics, has also increased. Although there are many spectral-spatial classification schemes, most are computationally inefficient in terms of execution time. In addition, the need for efficient computation on low-cost computing infrastructures is increasing in line with the incorporation of technology into everyday applications. In this thesis we have proposed two spectral-spatial classification schemes: one based on segmentation and the other based on wavelets and mathematical morphology. These schemes were designed with the aim of producing good classification results, and they outperform, in terms of accuracy, other schemes found in the literature based on segmentation and mathematical morphology. Additionally, it was necessary to develop techniques and strategies for efficient GPU computing, for example a block-asynchronous strategy, resulting in an efficient GPU implementation of the aforementioned spectral-spatial classification schemes. The optimal GPU parameters were analyzed, and different data partitioning and thread block arrangements were studied to exploit the GPU resources. The results show that the GPU is an adequate computing platform for on-board processing of hyperspectral information.
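
    As a small, self-contained example of the mathematical morphology building block used by the second scheme, the sketch below performs a grayscale erosion with a flat 3x3 structuring element. It is a sequential reference version written only to make the operation concrete; a GPU implementation would map tiles of output pixels to thread blocks along the lines of the data partitioning and thread block arrangements studied in the thesis.

```c
#include <limits.h>

/* Grayscale erosion with a flat 3x3 structuring element:
 * each output pixel becomes the minimum of its 3x3 neighbourhood.
 * Border pixels use only the neighbours that fall inside the image. */
void erode3x3(const unsigned char *in, unsigned char *out, int width, int height)
{
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            int min = UCHAR_MAX;
            for (int dy = -1; dy <= 1; dy++) {
                for (int dx = -1; dx <= 1; dx++) {
                    int ny = y + dy, nx = x + dx;
                    if (ny < 0 || ny >= height || nx < 0 || nx >= width)
                        continue;
                    int v = in[ny * width + nx];
                    if (v < min) min = v;
                }
            }
            out[y * width + x] = (unsigned char)min;
        }
    }
}
```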