Multi-GPU aggregation-based AMG preconditioner for iterative linear solvers
We present and release in open source format a sparse linear solver which
efficiently exploits heterogeneous parallel computers. The solver can be easily
integrated into scientific applications that need to solve large and sparse
linear systems on modern parallel computers made of hybrid nodes hosting NVIDIA
Graphics Processing Unit (GPU) accelerators.
The work extends our previous efforts in the exploitation of a single GPU
accelerator and proposes an implementation, based on the hybrid MPI-CUDA
software environment, of a Krylov-type linear solver relying on an efficient
Algebraic MultiGrid (AMG) preconditioner already available in the BootCMatchG
library. Our design for the hybrid implementation has been driven by the best
practices for minimizing data communication overhead when multiple GPUs are
employed, yet preserving the efficiency of the single GPU kernels. Strong and
weak scalability results on well-known benchmark test cases of the new version
of the library are discussed. Comparisons with the NVIDIA AmgX solution show an
improvement of up to 2.0x in the solve phase.
A GPU-based, three-dimensional level set solver with curvature flow
Technical report. Level set methods are a powerful tool for implicitly representing deformable surfaces. Since their inception, these techniques have been used to solve problems in fields as varied as computer vision, scientific visualization, computer graphics and computational physics. With the power and flexibility of this approach, however, comes a large computational burden. In the level set approach, surface motion is computed via a partial differential equation (PDE) framework. One possibility for accelerating level-set based applications is to map the solver kernel onto a commodity graphics processing unit (GPU). GPUs are parallel, vector computers whose power is currently increasing at a faster rate than that of CPUs. In this work, we demonstrate a GPU-based, three-dimensional level set solver that is capable of computing curvature flow as well as other speed terms. Results are shown for this solver segmenting the brain surface from an MRI data set.
PC-CUBE: A Personal Computer Based Hypercube
PC-CUBE is an ensemble of IBM PCs or close compatibles connected in the hypercube topology with ordinary computer cables. Communication occurs at the rate of 115.2 kbaud via the RS-232 serial links. Available for PC-CUBE are the Crystalline Operating System III (CrOS III), the Mercury Operating System, and CUBIX and PLOTIX, which are parallel I/O and graphics libraries. A CrOS performance monitor was developed to facilitate the measurement of a program's communication and computation time and their effects on performance. Also available are CXLISP, a parallel version of the XLISP interpreter; GRAFIX, a set of graphics routines for the EGA and CGA; and a general execution profiler for determining the execution time spent in program subroutines. PC-CUBE provides a programming environment similar to that of all hypercube systems running CrOS III, Mercury and CUBIX. In addition, every node (personal computer) has its own graphics display monitor and storage devices. These allow data to be displayed or stored at every processor, which has much instructional value and enables easier debugging of applications. Some application programs taken from the book Solving Problems on Concurrent Processors (Fox 88) were implemented with graphics enhancement on PC-CUBE. The applications range from computing the Mandelbrot set and solving the Laplace equation, the wave equation and long-range force interactions, to WaTor, an ecological simulation.
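The hypercube topology mentioned in this abstract can be illustrated with a minimal sketch. It assumes the usual convention that the 2^d nodes are labeled with d-bit integers and that two nodes are linked when their labels differ in exactly one bit. The function names are hypothetical and are not part of CrOS III or any PC-CUBE software.

```python
def hypercube_neighbors(node: int, dim: int) -> list[int]:
    """Return the labels of the nodes directly linked to `node`.

    Assumes a `dim`-dimensional hypercube with nodes labeled 0 .. 2**dim - 1.
    Flipping bit k of the label gives the neighbor along dimension k.
    """
    if not 0 <= node < (1 << dim):
        raise ValueError("node label out of range for this dimension")
    return [node ^ (1 << k) for k in range(dim)]


def hop_distance(a: int, b: int) -> int:
    """Minimum number of links between nodes a and b.

    This is the number of bit positions in which the two labels differ.
    """
    return bin(a ^ b).count("1")


if __name__ == "__main__":
    # In a 3-dimensional cube (8 nodes), every node has 3 neighbors.
    print(hypercube_neighbors(5, 3))
    print(hop_distance(0, 7))
```

Under this labeling each machine needs only d links to join a 2^d-node ensemble, and no message travels more than d hops.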
Overlapping Multi-Processing and Graphics Hardware Acceleration: Performance Evaluation
Conference with proceedings and peer review. Recently, multi-processing has been shown to deliver good performance in rendering. However, in some applications, processors spend too much time executing tasks that could be done more efficiently through intensive use of new graphics hardware. In this paper we present a novel solution combining multi-processing and advanced graphics hardware, in which graphics pipelines are used both for classical visualization tasks and to perform geometric calculations, while the remaining computations are handled by multi-processors. The experiment is based on an implementation of a new parallel wavelet radiosity algorithm. The application is executed on the SGI Origin2000 connected to the SGI InfiniteReality2 rendering pipeline. A performance evaluation is presented. Keeping in mind that the approach can benefit all available workstations and super-computers, from small scale (2 processors and 1 graphics pipeline) to large scale ( processors and graphics pipelines), we highlight some important bottlenecks that impede performance. Nevertheless, our results show that this approach could be a promising avenue for scientific and engineering simulation and visualization applications that need intensive geometric calculations.
Scalable Interactive Volume Rendering Using Off-the-shelf Components
This paper describes an application of a second generation implementation of the Sepia architecture (Sepia-2) to interactive volumetric visualization of large rectilinear scalar fields. By employing pipelined associative blending operators in a sort-last configuration, a demonstration system with 8 rendering computers sustains 24 to 28 frames per second while interactively rendering large data volumes (1024x256x256 voxels, and 512x512x512 voxels). We believe interactive performance at these frame rates and data sizes is unprecedented. We also believe these results can be extended to other types of structured and unstructured grids and a variety of GL rendering techniques including surface rendering and shadow mapping. We show how to extend our single-stage crossbar demonstration system to multi-stage networks in order to support much larger data sizes and higher image resolutions. This requires solving a dynamic mapping problem for a class of blending operators that includes Porter-Duff compositing operators.
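The role of associative blending in sort-last compositing can be sketched with the Porter-Duff "over" operator. This is a minimal illustration, not the Sepia-2 implementation: it assumes premultiplied-alpha RGBA pixels stored as tuples of floats, and the names `over` and `composite` are hypothetical.

```python
from functools import reduce

# A pixel is (r, g, b, a) with color already multiplied by alpha (assumption).
Pixel = tuple[float, float, float, float]


def over(front: Pixel, back: Pixel) -> Pixel:
    """Porter-Duff 'over': place `front` on top of `back`."""
    fr, fg, fb, fa = front
    br, bg, bb, ba = back
    k = 1.0 - fa  # fraction of the back pixel still visible
    return (fr + k * br, fg + k * bg, fb + k * bb, fa + k * ba)


def composite(layers: list[Pixel]) -> Pixel:
    """Blend per-renderer pixels given in front-to-back order."""
    return reduce(over, layers)


if __name__ == "__main__":
    a = (0.5, 0.0, 0.0, 0.5)
    b = (0.0, 0.25, 0.0, 0.25)
    c = (0.0, 0.0, 1.0, 1.0)
    # Associativity: the grouping of partial results does not matter,
    # only their depth order does.
    print(over(over(a, b), c))
    print(over(a, over(b, c)))
```

Because "over" is associative, neighboring partial images can be merged in any grouping as long as depth order is kept, which is what allows the blending to be pipelined across rendering nodes.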