
    Scalable ray tracing with multiple GPGPUs

    Rapid development in the field of computer graphics over the last 40 years has brought forth different techniques to render scenes. Rasterization is today’s most widely used technique, which in its most basic form sequentially draws thousands of polygons and applies textures to them. Ray tracing is an alternative method that mimics light transport by using rays to sample a scene in memory and render the color found at each ray’s scene intersection point. Although mainstream hardware directly supports rasterization, ray tracing would be the preferred technique due to its ability to produce highly crisp and realistic graphics, if hardware were not a limitation. Making an immediate hardware transition from rasterization to ray tracing would have a severe impact on the computer graphics industry, since it would require redevelopment of existing software that employs 3D graphics, so any transition to ray tracing would be gradual. Previous efforts to perform ray tracing on mainstream rasterizing hardware platforms with a single processor have performed poorly. This thesis explores how a multiple-GPGPU system can be used to render scenes via ray tracing. A ray tracing engine and API groundwork was developed using NVIDIA’s CUDA (Compute Unified Device Architecture) GPGPU programming environment and was used to evaluate performance scalability across a multi-GPGPU system. This engine supports triangle, sphere, disc, rectangle, and torus rendering. It also allows independent activation of graphics features including procedural texturing, Phong illumination, reflections, translucency, and shadows. Correctness of rendered images validates the ray traced results, and timing of rendered scenes benchmarks performance. The main test scene contains all object types, has a total of 32 objects, and applies all graphics features. Ray tracing this scene using two GPGPUs outperformed the single-GPGPU and single-CPU systems, yielding respective speedups of up to 1.8 and 31.25.
The results demonstrate the potential of treating a modern dual-GPU architecture as a dual-GPGPU system in order to facilitate a transition from rasterization to ray tracing.
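The core operation the abstract describes, finding the color at each ray's scene intersection point, reduces to ray-primitive intersection tests. The following is a minimal sketch (not code from the thesis; the function name and tuple-based vectors are illustrative) of the ray-sphere test, one of the five primitive types the engine supports:

```python
import math

def intersect_sphere(origin, direction, center, radius):
    """Return the distance t along the ray to the nearest sphere
    intersection in front of the origin, or None if the ray misses."""
    # Solve |origin + t*direction - center|^2 = radius^2 for t
    # (a quadratic in t).
    oc = tuple(o - c for o, c in zip(origin, center))
    a = sum(d * d for d in direction)
    b = 2.0 * sum(o * d for o, d in zip(oc, direction))
    c = sum(o * o for o in oc) - radius * radius
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return None  # ray misses the sphere entirely
    t = (-b - math.sqrt(disc)) / (2.0 * a)  # nearer of the two roots
    return t if t > 0.0 else None
```

Because each pixel's ray is independent, the image can be partitioned across GPGPUs (e.g. by scanline), which is what makes the near-linear two-device speedup reported above plausible.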

    Scalable data abstractions for distributed parallel computations

    The ability to express a program as a hierarchical composition of parts is an essential tool in managing the complexity of software, and a key abstraction this provides is the separation of the representation of data from the computation. Many current parallel programming models use a shared-memory model to provide data abstraction, but this does not scale well to large numbers of cores due to non-determinism and access latency. This paper proposes a simple programming model that allows scalable parallel programs to be expressed with distributed representations of data, giving the programmer the flexibility to employ shared or distributed styles of data-parallelism where applicable. The model admits an efficient implementation and, with the provision of a small set of primitive capabilities in the hardware, can be compiled to operate directly on the hardware, in the same way that stack-based allocation operates for subroutines in sequential machines.
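The distributed style of data-parallelism described above can be illustrated with a deliberately simple sketch (not the paper's model; the function names are illustrative): the data is partitioned into contiguous chunks that could each live on a separate node, and the computation is applied per chunk with results recombined in order.

```python
def partition(data, n):
    """Split data into n contiguous chunks -- a distributed
    representation where each chunk could reside on its own node."""
    k, r = divmod(len(data), n)
    chunks, start = [], 0
    for i in range(n):
        size = k + (1 if i < r else 0)  # spread the remainder evenly
        chunks.append(data[start:start + size])
        start += size
    return chunks

def distributed_map(func, data, n=4):
    """Apply func element-wise over the chunked representation and
    concatenate, preserving the original order."""
    return [func(x) for chunk in partition(data, n) for x in chunk]
```

The point of the sketch is that the partitioning is explicit in the program rather than hidden behind a shared address space, which is what allows the model to avoid shared-memory access latency at scale.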

    The use of primitives in the calculation of radiative view factors

    Compilations of radiative view factors (often in closed analytical form) are readily available in the open literature for commonly encountered geometries. For more complex three-dimensional (3D) scenarios, however, the effort required to solve the requisite multi-dimensional integrations needed to estimate a required view factor can be daunting, to say the least. In such cases, a combination of finite element methods (where the geometry in question is sub-divided into a large number of uniform, often triangular, elements) and Monte Carlo Ray Tracing (MC-RT) has been developed, although frequently the software implementation is suitable only for a limited set of geometrical scenarios. Driven initially by a need to calculate the radiative heat transfer occurring within an operational fibre-drawing furnace, this research set out to examine options whereby MC-RT could be used to cost-effectively calculate any generic 3D radiative view factor using current vectorisation technologies.
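The MC-RT idea referenced above can be shown in miniature for a case with a known closed-form answer: the view factor from a differential surface element to a coaxial parallel disc, which is analytically R²/(R²+H²). This is a minimal sketch, not the thesis code; the function name and sampling counts are illustrative.

```python
import math
import random

def view_factor_disc(R, H, samples=200_000, seed=1):
    """Monte Carlo estimate of the view factor from a differential
    element (normal +z, at the origin) to a coaxial disc of radius R
    in the plane z = H.  Analytical answer: R^2 / (R^2 + H^2)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        # Cosine-weighted hemisphere sample (pdf proportional to cos theta),
        # so the view factor is simply the fraction of rays that hit.
        phi = 2.0 * math.pi * rng.random()
        r = math.sqrt(rng.random())
        dx, dy = r * math.cos(phi), r * math.sin(phi)
        dz = math.sqrt(max(0.0, 1.0 - r * r))
        if dz <= 0.0:
            continue  # grazing ray never reaches the plane
        t = H / dz                      # intersect the plane z = H
        x, y = dx * t, dy * t
        if x * x + y * y <= R * R:      # inside the disc?
            hits += 1
    return hits / samples
```

The same hit-counting loop generalises to arbitrary 3D geometry once the plane test is replaced by intersection tests against triangulated surfaces, which is where vectorisation pays off.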

    Iterative transmission image reconstruction for the DPET positron emission tomograph

    Positron emission tomography (PET) systems use transmission imaging to compensate for attenuation. One commercial example of this approach is the Siemens Inveon Dedicated PET (DPET), a 120 mm bore system dedicated to the study of small animals. DPET transmission images are currently reconstructed using single slice rebinning followed by filtered backprojection. Single slice rebinning attributes the attenuation associated with an oblique line integral to the direct plane intersected at its axial midpoint. This leads to position-dependent axial blurring, especially for large diameter animals and objects with abrupt axial changes in diameter. The mathematics underlying filtered backprojection are based on assumptions that are not met by the scanner, including but not limited to data being sampled in a uniform fashion. These limitations can be alleviated by an iterative algorithm if the associated system model is made to match the physical set-up. The downside is typically viewed as a potentially prohibitive increase in the computational cost. In this dissertation, we report on the implementation and use of the Simultaneous Iterative Reconstruction Technique (SIRT) (a weighted least-squares solver) for transmission imaging on the DPET. We provide experimental evidence regarding the improvement in transmission image quality. We also show that these new, higher quality images can be computed in less than two minutes on the existing DPET host computer, thus making the approach practical. Computational speed is gained both algorithmically, through relaxation and the use of ordered subsets, and implementation-wise, through vector-based arithmetic and multi-core program execution.
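For context, the SIRT update the abstract names can be written as x ← x + λ·C·Aᵀ·R·(b − A·x), where R and C hold reciprocal row and column sums of the system matrix A. Below is a minimal dense sketch of that iteration (not the DPET implementation; variable names and the tiny test system are illustrative):

```python
def sirt(A, b, iters=200, relax=1.0):
    """Simultaneous Iterative Reconstruction Technique: a weighted
    least-squares solver for A x = b.  Per-iteration update:
        x <- x + relax * C A^T R (b - A x)
    with R, C the reciprocal row and column sums of A."""
    m, n = len(A), len(A[0])
    row = [sum(A[i]) or 1.0 for i in range(m)]
    col = [sum(A[i][j] for i in range(m)) or 1.0 for j in range(n)]
    x = [0.0] * n
    for _ in range(iters):
        # Residual of each measurement, weighted by reciprocal row sum.
        res = [(b[i] - sum(A[i][j] * x[j] for j in range(n))) / row[i]
               for i in range(m)]
        # Back-project the weighted residual, scaled by column sums.
        for j in range(n):
            x[j] += relax * sum(A[i][j] * res[i] for i in range(m)) / col[j]
    return x
```

Because every measurement is updated simultaneously, the inner loops vectorise and parallelise naturally, which matches the abstract's point about vector-based arithmetic and multi-core execution; ordered subsets further accelerate convergence by updating from subsets of rows in turn.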

    Parallel 3-D Method of Characteristics with Linear Source and Advanced Transverse Integration

    Full text link
    In the design and analysis of nuclear fission reactor systems, simulations are an essential tool for improving efficiency and safety. Neutronics simulations have always been limited by the available computational resources. This is because of the large discretizations necessary for the neutron transport equation, which has a 6-dimensional phase space for steady-state eigenvalue problems. The “gold standard” for 3-D neutron transport simulations is Monte Carlo with explicit geometry representation because it treats all dependent variables continuously. However, there are significant remaining challenges for Monte Carlo methods that prohibit widespread use and put them at a disadvantage compared to deterministic methods. The “gold standard” for deterministic 3-D neutron transport is the method of characteristics (MoC). Numerous deterministic methods exist for solving the transport equation, each with its own drawbacks. MoC is considered the “best” due to its ability to accurately model the exact geometry and approximate anisotropic scattering (other methods do just one of these well or become undesirably complex). The downside of the 3-D MoC method is the substantial computational resources required to discretize the problem. There has been renewed interest in assessing the state of the art for MoC and the tractability of this problem on the newest computer architectures. Previous work made significant strides in parallelizing the 3-D MoC algorithm for hundreds of thousands of processors, but ultimately did not prove viable due to the extreme compute resources required. Since then there has been progress in making 3-D MoC less computationally burdensome by adopting more advanced discretization methods that lead to fewer spatial mesh regions and rays; namely the linear-source approximation (LSA) and chord classification, or on-the-fly ray tracing.
The goal of this thesis is to continue progress in reducing the computational burden of MoC calculations, with a focus on three-dimensional calculations. This thesis pursues this goal through three related contributions: the use of graph theory for spatial decomposition, improvements to the LSA for multiphysics calculations, and a novel 3-D ray-tracing method with advanced transverse integration. Spatial decomposition is typically very beneficial, if not necessary, for whole-core direct transport methods. Previous works on 3-D MoC calculations have used simple spatial decomposition schemes that often resulted in poor load balancing, particularly when using the LSA. This work addresses that issue by utilizing graph partitioning methods to give better load balance, even in cases where the number of computational cells differs greatly across regions of the reactor. The LSA has previously been shown to allow the use of a coarser mesh while maintaining accuracy in pure neutronics calculations. However, the problems of interest typically involve multiple physics, such as isotopic depletion and thermal-hydraulic (T/H) feedback. This work improves the LSA for such problems by re-formulating the equations to eliminate an inefficiency in cases with non-constant cross sections. This is shown to significantly improve run-times and reduce memory usage, even in such cases. Finally, a novel 3-D ray-tracing method, based on the macroband, is developed to reduce the number of characteristic tracks necessary for accurate results. The method is compared against a traditional ray-tracing method for several benchmark problems. In several of these cases, the method is shown to significantly reduce the number of segments necessary for similar accuracy.
The ray-tracing method is also shown to have very desirable properties, such as near-monotonic convergence, and can act as more of a “black-box” solver.
    Ph.D., Nuclear Engineering & Radiological Sciences, University of Michigan, Horace H. Rackham School of Graduate Studies
    https://deepblue.lib.umich.edu/bitstream/2027.42/155260/1/apfitzge_1.pd
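The segment-by-segment sweep at the heart of MoC can be sketched in its flat-source form, which the thesis's linear-source approximation refines (this is a generic illustration, not code from the thesis; names are illustrative):

```python
import math

def moc_segment(psi_in, sigma_t, q, s):
    """One flat-source MoC segment: attenuate the incoming angular
    flux psi_in over a segment of length s with total cross section
    sigma_t, while accumulating the in-segment source q.
        psi_out = psi_in * e^(-sigma_t*s) + (q/sigma_t) * (1 - e^(-sigma_t*s))
    Also returns the segment-average flux used to tally scalar flux."""
    tau = sigma_t * s                 # optical thickness of the segment
    att = math.exp(-tau)
    psi_out = psi_in * att + (q / sigma_t) * (1.0 - att)
    psi_avg = q / sigma_t + (psi_in - psi_out) / tau
    return psi_out, psi_avg
```

The computational burden the thesis targets comes from chaining this update over millions of segments per ray and many rays per angle; fewer segments for the same accuracy (via the LSA and the macroband method) directly shrinks that cost.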

    GPU PERFORMANCE MODELLING AND OPTIMIZATION

    Ph.D., NUS-TU/e Joint Ph.D.

    Large-Scale Spatial Data Management on Modern Parallel and Distributed Platforms

    The rapidly growing volume of spatial data has made it desirable to develop efficient techniques for managing large-scale spatial data. Traditional spatial data management techniques cannot meet the efficiency and scalability requirements of large-scale spatial data processing. In this dissertation, we have developed new data-parallel designs for large-scale spatial data management that can better utilize modern, inexpensive commodity parallel and distributed platforms, including multi-core CPUs, many-core GPUs and computer clusters, to achieve both efficiency and scalability. After introducing background on spatial data management and modern parallel and distributed systems, we present our parallel designs for spatial indexing and spatial join query processing on both multi-core CPUs and GPUs for high efficiency, as well as their integration with Big Data systems for better scalability. Experimental results using real-world datasets demonstrate the effectiveness and efficiency of the proposed techniques for managing large-scale spatial data.
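The combination of spatial indexing and spatial join the abstract describes can be illustrated with the simplest data-parallel-friendly index, a uniform grid (a minimal sketch, not the dissertation's design; function names and parameters are illustrative):

```python
from collections import defaultdict

def grid_spatial_join(points_a, points_b, cell=1.0, radius=0.5):
    """Distance-based spatial join: bucket points_a into a uniform
    grid (the index), then for each point in points_b probe only the
    3x3 neighbouring cells instead of scanning all of points_a.
    Correct as long as radius <= cell."""
    grid = defaultdict(list)
    for p in points_a:
        grid[(int(p[0] // cell), int(p[1] // cell))].append(p)
    pairs, r2 = [], radius * radius
    for q in points_b:
        cx, cy = int(q[0] // cell), int(q[1] // cell)
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for p in grid.get((cx + dx, cy + dy), []):
                    if (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2 <= r2:
                        pairs.append((p, q))
    return pairs
```

Both the bucketing pass and the per-query probes are independent across elements, which is why this style of index maps well onto multi-core CPUs, GPUs, and cluster partitions alike.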