2,575 research outputs found

    DPP-PMRF: Rethinking Optimization for a Probabilistic Graphical Model Using Data-Parallel Primitives

    We present a new parallel algorithm for probabilistic graphical model optimization. The algorithm relies on data-parallel primitives (DPPs), which provide portable performance across hardware architectures. We evaluate results on CPUs and GPUs for an image segmentation problem. Compared to a serial baseline, we observe runtime speedups of up to 13X (CPU) and 44X (GPU). We also compare our performance to a reference OpenMP-based algorithm and find speedups of up to 7X (CPU). Comment: LDAV 2018, October 201
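As a rough illustration of what composing data-parallel primitives looks like, the sketch below builds stream compaction (keeping only the elements that pass a test) from map, exclusive scan, reduce, and scatter. It is not the DPP-PMRF algorithm; the function names (`dpp_map`, `dpp_exclusive_scan`, `dpp_reduce`, `compact`) are hypothetical, and the primitives run serially here, whereas a DPP library would execute each one in parallel on the CPU or GPU.

```python
from functools import reduce
from itertools import accumulate
import operator


def dpp_map(fn, xs):
    """Map: apply fn to every element independently."""
    return [fn(x) for x in xs]


def dpp_exclusive_scan(xs):
    """Exclusive prefix sum: out[i] is the sum of xs[0..i-1]."""
    if not xs:
        return []
    inclusive = list(accumulate(xs))
    return [0] + inclusive[:-1]


def dpp_reduce(xs, op=operator.add, init=0):
    """Reduce: combine all elements into a single value."""
    return reduce(op, xs, init)


def compact(xs, keep):
    """Stream compaction expressed only through the primitives above."""
    flags = dpp_map(lambda x: 1 if keep(x) else 0, xs)  # 1 marks a survivor
    offsets = dpp_exclusive_scan(flags)                 # output slot per survivor
    total = dpp_reduce(flags)                           # number of survivors
    out = [None] * total
    # Scatter: each survivor writes to its own slot, so writes never collide.
    for x, flag, slot in zip(xs, flags, offsets):
        if flag:
            out[slot] = x
    return out


if __name__ == "__main__":
    print(compact([5, 12, 7, 30, 2], lambda v: v > 6))
```

Because every step is a primitive with no ordering dependence between elements, the same program can in principle be retargeted to different hardware by swapping the primitive implementations.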

    Inviwo -- A Visualization System with Usage Abstraction Levels

    The complexity of today's visualization applications demands specific visualization systems tailored for the development of these applications. Frequently, such systems utilize levels of abstraction to improve the application development process, for instance by providing a data flow network editor. Unfortunately, these abstractions result in several issues, which need to be circumvented through an abstraction-centered system design. Often, a high level of abstraction hides low-level details, which makes it difficult to directly access the underlying computing platform, even though such access is important for achieving optimal performance. Therefore, we propose a layer structure developed for modern and sustainable visualization systems that allows developers to interact with all contained abstraction levels. We refer to these interaction capabilities as usage abstraction levels, since we target application developers with various levels of experience. We formulate the requirements for such a system, derive the desired architecture, and present how the concepts have been realized, by way of example, within the Inviwo visualization system. Furthermore, we address several specific challenges that arise during the realization of such a layered architecture, such as communication between different computing platforms, performance-centered encapsulation, and layer-independent development through support for cross-layer documentation and debugging capabilities.

    Streaming narrow-band algorithm: interactive computation and visualization of level sets

    Journal Article. Deformable isosurfaces, implemented with level-set methods, have demonstrated a great potential in visualization and computer graphics for applications such as segmentation, surface processing, and physically-based modeling. Their usefulness has been limited, however, by their high computational cost and reliance on significant parameter tuning. This paper presents a solution to these challenges by describing graphics processor (GPU) based algorithms for solving and visualizing level-set solutions at interactive rates. The proposed solution is based on a new, streaming implementation of the narrow-band algorithm. The new algorithm packs the level-set isosurface data into 2D texture memory via a multidimensional virtual memory system. As the level set moves, this texture-based representation is dynamically updated via a novel GPU-to-CPU message passing scheme. By integrating the level-set solver with a real-time volume renderer, a user can visualize and intuitively steer the level-set surface as it evolves. We demonstrate the capabilities of this technology for interactive volume segmentation and visualization.

    How GPU Rendering Affects Image Processing and Scientific Calculation Speed, Power and Energy on a Raspberry Pi

    In this thesis, we explore the speed, power, and energy performance of the same data processing task on the central processing unit (CPU) with and without acceleration by the Graphics Processing Unit (GPU) on the Raspberry Pi (RPI) microcomputer. We ran tests on the RPI in two areas. The first compared speed, power, and energy usage with and without GPU acceleration for image processing on the RPI model B+. The second compared speed, power, energy usage, and accuracy for scientific calculation with and without GPU acceleration on the RPI models B+ and 3B. We used a novel method to correlate graphics processing, CPU load, power consumption, and total energy consumption. Three different benchmarks were used to play a short video. OMXplayer was used with GPU rendering, while Mplayer and VLC player were used without GPU rendering. A 3D model simulator (3D Slash) benchmark was also used so that its power usage could be compared with that of the previous benchmarks. We used the system counter tool PERF and the system usage monitor TOP to acquire accurate CPU and Random-Access Memory (RAM) usage information. The first study design included a comparison of the running time, frame rate, power usage, and total energy consumed by the benchmarks. We used the Adafruit USB Power Gauge to log the power and energy consumed by the RPI, and its values were output to a CSV file for ease of graphing and calculation. The first study results showed that the number of frames rendered per second increased dramatically when hardware rendering was used, as did electrical power consumption. Interestingly, hardware rendering took less time than software rendering, and the total energy consumed by hardware rendering was lower than that consumed by software rendering, despite the power during hardware rendering being higher. In the second study, we used the Fast Fourier Transform (FFT) as the calculation to analyze. We developed six benchmark programs using three libraries: GPU_FFT, Fastest Fourier Transform in the West (FFTW), and Python SciPy FFTpack (SciPy FFT) [1-3]. They were used to perform FFTs in both one dimension (1D) and two dimensions (2D), using single-precision floating-point numbers as the primary data type. The study design includes a write-up of the code involved, a comparison of the accuracy of the results against the known solution, the running time, the power consumption during the calculation, and the total energy consumed by the calculation. The Power Gauge was used to measure the power and energy consumed by the RPI, as in the first study. In the second study, we found that General-purpose computing on graphics processing units (GPGPU) code was faster and more energy efficient than the serial code on both RPI models, without much sacrifice of precision. From the two studies, we conclude that particular types of data processing, such as image processing and typical calculations on complex-valued matrices, would gain numerous benefits in speed and energy expenditure with GPU rendering.
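The relationship between power and energy reported above can be made concrete with a small sketch: energy is power integrated over time, so a run that draws more power but finishes sooner can still consume less energy. The function name `energy_joules` and all sample values below are hypothetical illustrations, not measurements from the thesis, and the trapezoidal rule is one reasonable way to integrate a logged power trace, not necessarily the method the author used.

```python
def energy_joules(times_s, power_w):
    """Integrate a logged power trace (watts) over time (seconds).

    Uses the trapezoidal rule between consecutive samples and returns joules.
    """
    if len(times_s) != len(power_w):
        raise ValueError("times_s and power_w must have the same length")
    total = 0.0
    for i in range(1, len(times_s)):
        dt = times_s[i] - times_s[i - 1]
        total += 0.5 * (power_w[i] + power_w[i - 1]) * dt
    return total


if __name__ == "__main__":
    # Hypothetical traces: the "hardware" run draws more power but ends sooner.
    hardware = energy_joules([0.0, 5.0, 10.0], [2.0, 2.0, 2.0])
    software = energy_joules([0.0, 10.0, 20.0], [1.5, 1.5, 1.5])
    print(f"hardware: {hardware} J, software: {software} J")
    print("hardware used less energy:", hardware < software)
```

With these made-up numbers the higher-power run uses 20 J against 30 J for the slower run, which mirrors the qualitative pattern the abstract describes.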

    Interactive deformation and visualization of level set surfaces using graphics hardware

    Journal Article. Deformable isosurfaces, implemented with level-set methods, have demonstrated a great potential in visualization for applications such as segmentation, surface processing, and surface reconstruction. Their usefulness has been limited, however, by their high computational cost and reliance on significant parameter tuning. This paper presents a solution to these challenges by describing graphics processor (GPU) based algorithms for solving and visualizing level-set solutions at interactive rates. Our efficient GPU-based solution relies on packing the level-set isosurface data into a dynamic, sparse texture format. As the level set moves, this sparse data structure is updated via a novel GPU-to-CPU message passing scheme. When the level-set solver is integrated with a real-time volume renderer operating on the same packed format, a user can visualize and steer the deformable level-set surface as it evolves. In addition, the resulting isosurface can serve as a region-of-interest specifier for the volume renderer. This paper demonstrates the capabilities of this technology for interactive volume visualization and segmentation.