67 research outputs found

    GPU-based Streaming for Parallel Level of Detail on Massive Model Rendering

    Get PDF
    Rendering massive 3D models in real-time has long been recognized as a very challenging problem because of the limited computational power and memory space available in a workstation. Most existing rendering techniques, especially level of detail (LOD) processing, have suffered from their sequential execution natures, and does not scale well with the size of the models. We present a GPU-based progressive mesh simplification approach which enables the interactive rendering of large 3D models with hundreds of millions of triangles. Our work contributes to the massive rendering research in two ways. First, we develop a novel data structure to represent the progressive LOD mesh, and design a parallel mesh simplification algorithm towards GPU architecture. Second, we propose a GPU-based streaming approach which adopt a frame-to-frame coherence scheme in order to minimize the high communication cost between CPU and GPU. Our results show that the parallel mesh simplification algorithm and GPU-based streaming approach significantly improve the overall rendering performance

    Shape from Projections via Differentiable Forward Projector for Computed Tomography

    Full text link
    In computed tomography, the reconstruction is typically obtained on a voxel grid. In this work, however, we propose a mesh-based reconstruction method. For tomographic problems, 3D meshes have mostly been studied to simulate data acquisition, but not for reconstruction, for which a 3D mesh means the inverse process of estimating shapes from projections. In this paper, we propose a differentiable forward model for 3D meshes that bridge the gap between the forward model for 3D surfaces and optimization. We view the forward projection as a rendering process, and make it differentiable by extending recent work in differentiable rendering. We use the proposed forward model to reconstruct 3D shapes directly from projections. Experimental results for single-object problems show that the proposed method outperforms traditional voxel-based methods on noisy simulated data. We also apply the proposed method on electron tomography images of nanoparticles to demonstrate the applicability of the method on real data

    A parallel algorithm for construction of uniform grids

    Full text link

    A sparse octree gravitational N-body code that runs entirely on the GPU processor

    Get PDF
    We present parallel algorithms for constructing and traversing sparse octrees on graphics processing units (GPUs). The algorithms are based on parallel-scan and sort methods. To test the performance and feasibility, we implemented them in CUDA in the form of a gravitational tree-code which completely runs on the GPU.(The code is publicly available at: http://castle.strw.leidenuniv.nl/software.html) The tree construction and traverse algorithms are portable to many-core devices which have support for CUDA or OpenCL programming languages. The gravitational tree-code outperforms tuned CPU code during the tree-construction and shows a performance improvement of more than a factor 20 overall, resulting in a processing rate of more than 2.8 million particles per second.Comment: Accepted version. Published in Journal of Computational Physics. 35 pages, 12 figures, single colum

    Fast, effective BVH updates for dynamic ray-traced scenes using tree rotations

    Get PDF
    technical reportBounding volume hierarchies are a popular choice for ray tracing animated scenes due to the relative simplicity of refitting bounding volumes around moving geometry. However, the quality of such a refitted tree can degrade rapidly if objects in the scene deform or rearrange significantly as the animation progresses, resulting in dramatic increases in rendering times. Existing solutions involve occasional or heuristically triggered rebuilds of the BVH to reduce this effect. In this work, we describe how to efficiently extend refitting with local restructuring operations called tree rotations which can mitigate the effects that moving primitives have on BVH quality by rearranging nodes in the tree during each refit rather than triggering a full rebuild. The result is a fast, lightweight, incremental update algorithm that requires negligible memory, has minor update times and parallelizes easily, yet avoids significant degradation in tree quality or the need for rebuilding while maintaining fast rendering times. We show that our method approaches or exceeds the frame rates of other techniques and is consistently among the best options regardless of the animation scene

    A scalable H-matrix approach for the solution of boundary integral equations on multi-GPU clusters

    Get PDF
    In this work, we consider the solution of boundary integral equations by means of a scalable hierarchical matrix approach on clusters equipped with graphics hardware, i.e. graphics processing units (GPUs). To this end, we extend our existing single-GPU hierarchical matrix library hmglib such that it is able to scale on many GPUs and such that it can be coupled to arbitrary application codes. Using a model GPU implementation of a boundary element method (BEM) solver, we are able to achieve more than 67 percent relative parallel speed-up going from 128 to 1024 GPUs for a model geometry test case with 1.5 million unknowns and a real-world geometry test case with almost 1.2 million unknowns. On 1024 GPUs of the cluster Titan, it takes less than 6 minutes to solve the 1.5 million unknowns problem, with 5.7 minutes for the setup phase and 20 seconds for the iterative solver. To the best of the authors' knowledge, we here discuss the first fully GPU-based distributed-memory parallel hierarchical matrix Open Source library using the traditional H-matrix format and adaptive cross approximation with an application to BEM problems

    An Investigation into Animating Plant Structures within Real-time Constraints

    Get PDF
    This paper is an analysis of current developments in rendering botanical structures for scientic and entertainment purposes with a focus on visualising growth. The choices of practical investigations produce a novel approach for parallel parsing of difficult bracketed L-Systems, based upon the work of Lipp, Wonka and Wimmer (2010). Alongside this is a general overview of the issues involved when looking at growing systems, technical details involving programming for the Graphics Processing Unit (GPU) and other possible solutions for further work that also could achieve the project's goals
    • …
    corecore