6 research outputs found

    Air pollution modelling using a graphics processing unit with CUDA

    Get PDF
    The Graphics Processing Unit (GPU) is a powerful tool for parallel computing. In the past years the performance and capabilities of GPUs have increased, and the Compute Unified Device Architecture (CUDA) - a parallel computing architecture - has been developed by NVIDIA to utilize this performance in general purpose computations. Here we show for the first time a possible application of GPU for environmental studies serving as a basement for decision making strategies. A stochastic Lagrangian particle model has been developed on CUDA to estimate the transport and the transformation of the radionuclides from a single point source during an accidental release. Our results show that parallel implementation achieves typical acceleration values in the order of 80-120 times compared to CPU using a single-threaded implementation on a 2.33 GHz desktop computer. Only very small differences have been found between the results obtained from GPU and CPU simulations, which are comparable with the effect of stochastic transport phenomena in atmosphere. The relatively high speedup with no additional costs to maintain this parallel architecture could result in a wide usage of GPU for diversified environmental applications in the near future.Comment: 5 figure

    Simulation of reaction-diffusion processes in three dimensions using CUDA

    Get PDF
    Numerical solution of reaction-diffusion equations in three dimensions is one of the most challenging applied mathematical problems. Since these simulations are very time consuming, any ideas and strategies aiming at the reduction of CPU time are important topics of research. A general and robust idea is the parallelization of source codes/programs. Recently, the technological development of graphics hardware created a possibility to use desktop video cards to solve numerically intensive problems. We present a powerful parallel computing framework to solve reaction-diffusion equations numerically using the Graphics Processing Units (GPUs) with CUDA. Four different reaction-diffusion problems, (i) diffusion of chemically inert compound, (ii) Turing pattern formation, (iii) phase separation in the wake of a moving diffusion front and (iv) air pollution dispersion were solved, and additionally both the Shared method and the Moving Tiles method were tested. Our results show that parallel implementation achieves typical acceleration values in the order of 5-40 times compared to CPU using a single-threaded implementation on a 2.8 GHz desktop computer.Comment: 8 figures, 5 table

    CUDA Implementation of a Navier-Stokes Solver on Multi-GPU Desktop Platforms for Incompressible Flows

    Get PDF
    Graphics processor units (GPU) that are traditionally designed for graphics rendering have emerged as massively-parallel co-processors to the central processing unit (CPU). Small-footprint desktop supercomputers with hundreds of cores that can deliver teraflops peak performance at the price of conventional workstations have been realized. A computational fluid dynamics (CFD) simulation capability with rapid computational turnaround time has the potential to transform engineering analysis and design optimization procedures. We describe the implementation of a Navier-Stokes solver for incompressible fluid flow using desktop platforms equipped with multi-GPUs. Specifically, NVIDIA’s Compute Unified Device Architecture (CUDA) programming model is used to implement the discretized form of the governing equations. The projection algorithm to solve the incompressible fluid flow equations is divided into distinct CUDA kernels, and a unique implementation that exploits the memory hierarchy of the CUDA programming model is suggested. Using a quad-GPU platform, we observe two orders of magnitude speedup relative to a serial CPU implementation. Our results demonstrate that multi-GPU desktops can serve as a cost-effective small-footprint parallel computing platform to accelerate CFD simulations substantially. I. Introductio

    Efficient Knowledge Extraction from Structured Data

    Get PDF
    Knowledge extraction from structured data aims for identifying valid, novel, potentially useful, and ultimately understandable patterns in the data. The core step of this process is the application of a data mining algorithm in order to produce an enumeration of particular patterns and relationships in large databases. Clustering is one of the major data mining tasks and aims at grouping the data objects into meaningful classes (clusters) such that the similarity of objects within clusters is maximized, and the similarity of objects from different clusters is minimized. In this thesis, we advance the state-of-the-art data mining algorithms for analyzing structured data types. We describe the development of innovative solutions for hierarchical data mining. The EM-based hierarchical clustering method ITCH (Information-Theoretic Cluster Hierarchies) is designed to propose solid solutions for four different challenges. (1) to guide the hierarchical clustering algorithm to identify only meaningful and valid clusters. (2) to represent each cluster content in the hierarchy by an intuitive description with e.g. a probability density function. (3) to consistently handle outliers. (4) to avoid difficult parameter settings. ITCH is built on a hierarchical variant of the information-theoretic principle of Minimum Description Length (MDL). Interpreting the hierarchical cluster structure as a statistical model of the dataset, it can be used for effective data compression by Huffman coding. Thus, the achievable compression rate induces a natural objective function for clustering, which automatically satisfies all four above mentioned goals. The genetic-based hierarchical clustering algorithm GACH (Genetic Algorithm for finding Cluster Hierarchies) overcomes the problem of getting stuck in a local optimum by a beneficial combination of genetic algorithms, information theory and model-based clustering. Besides hierarchical data mining, we also made contributions to more complex data structures, namely objects that consist of mixed type attributes and skyline objects. The algorithm INTEGRATE performs integrative mining of heterogeneous data, which is one of the major challenges in the next decade, by a unified view on numerical and categorical information in clustering. Once more, supported by the MDL principle, INTEGRATE guarantees the usability on real world data. For skyline objects we developed SkyDist, a similarity measure for comparing different skyline objects, which is therefore a first step towards performing data mining on this kind of data structure. Applied in a recommender system, for example SkyDist can be used for pointing the user to alternative car types, exhibiting a similar price/mileage behavior like in his original query. For mining graph-structured data, we developed different approaches that have the ability to detect patterns in static as well as in dynamic networks. We confirmed the practical feasibility of our novel approaches on large real-world case studies ranging from medical brain data to biological yeast networks. In the second part of this thesis, we focused on boosting the knowledge extraction process. We achieved this objective by an intelligent adoption of Graphics Processing Units (GPUs). The GPUs have evolved from simple devices for the display signal preparation into powerful coprocessors that do not only support typical computer graphics tasks but can also be used for general numeric and symbolic computations. As major advantage, GPUs provide extreme parallelism combined with a high bandwidth in memory transfer at low cost. In this thesis, we propose algorithms for computationally expensive data mining tasks like similarity search and different clustering paradigms which are designed for the highly parallel environment of a GPU, called CUDA-DClust and CUDA-k-means. We define a multi-dimensional index structure which is particularly suited to support similarity queries under the restricted programming model of a GPU. We demonstrate the superiority of our algorithms running on GPU over their conventional counterparts on CPU in terms of efficiency
    corecore