Vivaldi: a Python-like domain-specific language for volume rendering and processing on distributed multi CPU-GPU systems

Author: Choi Hyungsuk
Publication venue: Graduate school of UNIST
Publication date: 01/02/2015

Department of Computer Engineering

In this thesis, a Python-like domain-specific language called Vivaldi is proposed. Vivaldi is based on Python, but can also provide parallel volume rendering and processing on distributed heterogeneous systems. Nowadays, visualization requires high performance for processing large data and creating high-quality images. Computing systems have therefore advanced to meet this requirement; one example is graphics processing unit (GPU) clusters. However, even though high-performance systems exist, using them effectively is not easy for non-experts such as domain scientists and researchers, because they require expert programming skills and a significant amount of software development for parallel programming, distributed systems, and heterogeneous architectures. One aim of Vivaldi is to minimize these required programming skills using a domain-specific language for volume rendering and visualization that is Python-like and platform independent. In this language, a parallel visualization model, a virtual shared memory model, and a platform-independent architecture for distributed heterogeneous systems are proposed. The parallel visualization model provides a simple means to implement visualization pipelines. The virtual shared memory model enables the use of cluster memory without Message Passing Interface (MPI) and CUDA. Finally, the platform-independent design integrates central processing units (CPUs) and GPUs into a common, domain-specific language. Vivaldi code was compared to C++ implementations in terms of number of lines, performance, and scalability. The results show that Vivaldi achieved comparable scalability and performance while requiring much less programming effort.

ScholarWorks@UNIST

Unleashing the Power of Distributed CPU/GPU Architectures: Massive Astronomical Data Analysis and Visualization case study

Author: Barnes D. G.
Fluke C. J.
Hassan A. H.
Publication date: 28/11/2011

Upcoming and future astronomy research facilities will systematically generate terabyte-sized data sets, moving astronomy into the Petascale data era. While such facilities will provide astronomers with unprecedented levels of accuracy and coverage, the increases in dataset size and dimensionality will pose serious computational challenges for many current astronomy data analysis and visualization tools. With such data sizes, even simple data analysis tasks (e.g. calculating a histogram or computing data minimum/maximum) may not be achievable without access to a supercomputing facility. To effectively handle such dataset sizes, which exceed today's single machine memory and processing limits, we present a framework that exploits the distributed power of GPUs and many-core CPUs, with a goal of providing data analysis and visualization tasks as a service for astronomers. By mixing shared and distributed memory architectures, our framework effectively utilizes the underlying hardware infrastructure, handling both batched and real-time data analysis and visualization tasks. Offering such functionality in a "software as a service" manner will reduce the total cost of ownership, provide an easy-to-use tool to the wider astronomical community, and enable a more optimized utilization of the underlying hardware infrastructure.

Comment: 4 pages, 1 figure. To appear in the proceedings of ADASS XXI, ed. P. Ballester and D. Egret, ASP Conf. Series

arXiv.org e-Print Archive

Swinburne Research Bank
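
The abstract above mentions histograms and minimum/maximum as tasks that stop fitting on a single machine. A minimal sketch of the usual way such tasks are split up is shown here: each worker computes partial statistics over its own chunk, and the partial results are merged. This is not the authors' framework; the function names `partial_stats` and `merge_stats`, the fixed histogram range, and the toy data are all assumptions made for illustration, and chunks are assumed to be non-empty.

```python
from typing import Iterable, List, Sequence, Tuple

Stats = Tuple[float, float, List[int]]


def partial_stats(chunk: Sequence[float], lo: float, hi: float, bins: int) -> Stats:
    """Min, max, and fixed-range histogram of one non-empty chunk (hypothetical helper)."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for value in chunk:
        if lo <= value <= hi:
            # Clamp so that value == hi falls into the last bin.
            index = min(int((value - lo) / width), bins - 1)
            counts[index] += 1
    return min(chunk), max(chunk), counts


def merge_stats(parts: Iterable[Stats]) -> Stats:
    """Combine per-chunk results; every step is an associative reduction."""
    parts = list(parts)
    global_min = min(p[0] for p in parts)
    global_max = max(p[1] for p in parts)
    counts = [sum(column) for column in zip(*(p[2] for p in parts))]
    return global_min, global_max, counts


# Toy data: three chunks standing in for pieces of a larger-than-memory dataset.
chunks = [[0.5, 1.5, 2.5], [3.5, 0.1], [9.9, 4.2, 4.8]]
parts = [partial_stats(chunk, 0.0, 10.0, 5) for chunk in chunks]
global_min, global_max, histogram = merge_stats(parts)
```

Because the merge step only needs the small per-chunk summaries, the chunks themselves never have to be resident on one machine at the same time.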

GPU Accelerated Particle Visualization with Splotch

Author: Dolag Klaus
Dykes Tim
Gheller Claudio
Krokos Mel
Rivi Marzia
Publication venue: Elsevier BV
Publication date: 23/03/2014

Splotch is a rendering algorithm for exploration and visual discovery in particle-based datasets coming from astronomical observations or numerical simulations. The strengths of the approach are production of high-quality imagery and support for very large-scale datasets through an effective mix of the OpenMP and MPI parallel programming paradigms. This article reports our experiences in re-designing Splotch to exploit emerging HPC architectures, which are increasingly populated with GPUs. A performance model is introduced for data transfers, computations and memory access, to guide our re-factoring of Splotch. A number of parallelization issues are discussed, in particular relating to race conditions and workload balancing, towards achieving optimal performance. Our implementation was accomplished by using the CUDA programming paradigm. Our strategy is founded on novel schemes achieving optimized data organisation and classification of particles. We deploy a reference simulation to present performance results on acceleration gains and scalability. Finally, we outline our vision for future work, including possibilities for further optimisations and exploitation of emerging technologies.

Comment: 25 pages, 9 figures. Astronomy and Computing (2014)

arXiv.org e-Print Archive

Portsmouth University Research Portal (Pure)

DPP-PMRF: Rethinking Optimization for a Probabilistic Graphical Model Using Data-Parallel Primitives

Author: Bethel E. Wes
Camp David
Childs Hank
Heinemann Colleen
Lessley Brenton
Perciano Talita
Publication date: 13/09/2018

We present a new parallel algorithm for probabilistic graphical model optimization. The algorithm relies on data-parallel primitives (DPPs), which provide portable performance across hardware architectures. We evaluate results on CPUs and GPUs for an image segmentation problem. Compared to a serial baseline, we observe runtime speedups of up to 13X (CPU) and 44X (GPU). We also compare our performance to a reference OpenMP-based algorithm, and find speedups of up to 7X (CPU).

Comment: LDAV 2018, October 2018

arXiv.org e-Print Archive

Crossref

eScholarship - University of California
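
To make the idea of data-parallel primitives in the abstract above concrete, the following sketch builds a stream-compaction step (keep only the elements that pass a test) entirely out of map, scan, and reduce. It does not reproduce the DPP-PMRF algorithm, and the abstract does not say which primitives the authors use; the `dpp_*` names and the toy pixel data are assumptions. The primitives are written serially here, but each one has a well-known parallel implementation, which is what makes code written this way portable.

```python
from functools import reduce
from itertools import accumulate
from operator import add


def dpp_map(fn, xs):
    """Apply fn independently to every element."""
    return [fn(x) for x in xs]


def dpp_reduce(op, xs, init):
    """Fold all elements into one value with an associative operator."""
    return reduce(op, xs, init)


def dpp_exclusive_scan(xs):
    """Exclusive prefix sum: element i holds the sum of xs[0..i-1]."""
    return list(accumulate([0] + list(xs)))[:-1]


def dpp_compact(xs, keep):
    """Stream compaction expressed only with map, scan, reduce, and a scatter."""
    flags = dpp_map(lambda x: 1 if keep(x) else 0, xs)
    offsets = dpp_exclusive_scan(flags)   # output slot for each kept element
    total = dpp_reduce(add, flags, 0)     # size of the output
    out = [None] * total
    for x, flag, offset in zip(xs, flags, offsets):
        if flag:
            out[offset] = x               # scatter: each write hits a distinct slot
    return out


# Toy example: keep the "bright" pixels of a tiny image row.
pixels = [0.1, 0.8, 0.4, 0.9, 0.7]
bright = dpp_compact(pixels, lambda p: p > 0.5)
```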

Interactive Visualization of the Largest Radioastronomy Cubes

Author: A.H. Hassan
Barnes
Becciani
Beeson
C.J. Fluke
D.G. Barnes
Drebin
Goel
Graham
Hamada
Lacroute
Levoy
Li
Lombeyda
McClure-Griffiths
Oosterloo
Pence
Sabella
Samet
Schive
Thibault
Wayth
Publication venue: Elsevier BV
Publication date: 31/07/2010

3D visualization is an important data analysis and knowledge discovery tool; however, interactive visualization of large 3D astronomical datasets poses a challenge for many existing data visualization packages. We present a solution to interactively visualize larger-than-memory 3D astronomical data cubes by utilizing a heterogeneous cluster of CPUs and GPUs. The system partitions the data volume into smaller sub-volumes that are distributed over the rendering workstations. GPU-based ray-casting volume rendering is performed to generate an image for each sub-volume; these images are composited to generate the whole-volume output, which is returned to the user. Datasets including the HI Parkes All Sky Survey (HIPASS - 12 GB) southern sky and the Galactic All Sky Survey (GASS - 26 GB) data cubes were used to demonstrate our framework's performance. The framework can render the GASS data cube with a maximum render time < 0.3 seconds at 1024 x 1024 pixels output resolution using 3 rendering workstations and 8 GPUs. Our framework will scale to visualize larger datasets, even of Terabyte order, if proper hardware infrastructure is available.

Comment: 15 pages, 12 figures. Accepted New Astronomy July 2010

arXiv.org e-Print Archive

Crossref

Swinburne Research Bank
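
The partition, render, and composite pipeline described in the abstract above can be sketched in a few lines. As a deliberate simplification, this sketch swaps GPU ray casting for a maximum intensity projection (MIP) along one axis, because MIP images can be composited with a per-pixel maximum in any order. The function names, the slab-wise split along the z axis, and the tiny test volume are assumptions for illustration, not details of the authors' system.

```python
def split_slabs(volume, parts):
    """Partition a volume[z][y][x] into at most `parts` contiguous slabs along z."""
    depth = len(volume)
    step = (depth + parts - 1) // parts
    return [volume[i:i + step] for i in range(0, depth, step)]


def render_mip(slab):
    """Render one sub-volume: per-pixel maximum along z (stand-in for ray casting)."""
    height, width = len(slab[0]), len(slab[0][0])
    return [[max(plane[y][x] for plane in slab) for x in range(width)]
            for y in range(height)]


def composite(images):
    """Combine per-slab images into the final image with a per-pixel maximum."""
    return [[max(pixels) for pixels in zip(*rows)] for rows in zip(*images)]


# Tiny 4 x 2 x 2 volume; each slab would live on a different rendering node.
volume = [
    [[1, 0], [0, 0]],
    [[0, 5], [0, 0]],
    [[0, 0], [7, 0]],
    [[2, 0], [0, 3]],
]
slabs = split_slabs(volume, parts=2)
image = composite([render_mip(slab) for slab in slabs])
```

With an order-dependent operator such as front-to-back alpha blending, the compositing step would also have to respect the depth order of the slabs; the maximum used here avoids that concern.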

Volume visualization of time-varying data using parallel, multiresolution and adaptive-resolution techniques

Author: Shams Sadaf
Publication venue: University of New Hampshire Scholars' Repository
Publication date: 01/01/2006

This paper presents a parallel rendering approach that allows high-quality visualization of large time-varying volume datasets. Multiresolution and adaptive-resolution techniques are also incorporated to improve the efficiency of the rendering. Three basic steps are needed to implement this kind of application. First, we divide the task by decomposing the data. This decomposition can be temporal, spatial, or a mix of both. After the data has been divided, each portion is rendered by a separate processor to create sub-images or frames. Finally, these sub-images or frames are assembled into a final image or animation. After developing this application, several experiments were performed to show that this approach indeed saves time when a reasonable number of processors are used. We also conclude that the optimal number of processors depends on the size of the dataset used.

UNH Scholars' Repository
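
The three steps in the abstract above (decompose, render in parallel, assemble) can be illustrated for the temporal case, where each time step becomes one frame. This is only a sketch under stated assumptions: threads stand in for separate processors, `render_frame` is a hypothetical placeholder that sums each column of a small 2D grid instead of performing real volume rendering, and the multiresolution techniques are not modeled.

```python
from concurrent.futures import ThreadPoolExecutor


def render_frame(step_index, grid):
    """Placeholder renderer: project a 2D grid to one row by summing each column."""
    return step_index, [sum(column) for column in zip(*grid)]


def render_animation(time_steps, workers=2):
    """Temporal decomposition: one task per time step, frames reassembled in order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(render_frame, range(len(time_steps)), time_steps))
    # Sort by frame index so assembly does not depend on completion order.
    results.sort(key=lambda result: result[0])
    return [frame for _, frame in results]


# Three time steps of a tiny dataset.
time_steps = [
    [[1, 2], [3, 4]],
    [[5, 6], [7, 8]],
    [[0, 1], [1, 0]],
]
animation = render_animation(time_steps)
```

A spatial decomposition would instead split each time step into sub-volumes and composite the resulting sub-images, and the two schemes can be combined.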