1,858 research outputs found

Fast Reliable Ray-tracing of Procedurally Defined Implicit Surfaces Using Revised Affine Arithmetic

Author: Comninos Peter
Fryazinov Oleg
Pasko Alexander
Publication venue: Bournemouth University
Publication date: 05/10/2009

Fast and reliable rendering of implicit surfaces is an important area in the field of implicit modelling. Direct rendering, namely ray-tracing, is shown to be a suitable technique for obtaining good-quality visualisations of implicit surfaces. We present a technique for reliable ray-tracing of arbitrary procedurally defined implicit surfaces by using a modification of Affine Arithmetic called Revised Affine Arithmetic. A wide range of procedurally defined implicit objects can be rendered using this technique, including polynomial surfaces, constructive solids, pseudo-random objects, procedurally defined microstructures, and others. We compare our technique with other reliable techniques based on Interval and Affine Arithmetic to show that our technique provides the fastest, while still reliable, ray-surface intersections and ray-tracing. We also suggest possible modifications for the GPU implementation of this technique for real-time rendering of relatively simple implicit models and for near real-time rendering of complex implicit models.
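To make the idea of a "reliable" ray-surface intersection concrete, here is a minimal sketch using plain Interval Arithmetic, one of the baseline techniques the abstract compares against. It is not the Revised Affine Arithmetic method itself. The helper names (`iadd`, `isub`, `imul`, `sphere`, `first_hit`) are hypothetical, the test surface is an assumed unit sphere, and outward rounding of floating-point bounds is omitted for brevity.

```python
from typing import Callable, Optional, Sequence, Tuple

Interval = Tuple[float, float]  # (lower bound, upper bound)

def iadd(a: Interval, b: Interval) -> Interval:
    return (a[0] + b[0], a[1] + b[1])

def isub(a: Interval, b: Interval) -> Interval:
    return (a[0] - b[1], a[1] - b[0])

def imul(a: Interval, b: Interval) -> Interval:
    products = (a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1])
    return (min(products), max(products))

def sphere(x: Interval, y: Interval, z: Interval, r: float = 1.0) -> Interval:
    """Interval enclosure of f(x, y, z) = x^2 + y^2 + z^2 - r^2."""
    squares = iadd(iadd(imul(x, x), imul(y, y)), imul(z, z))
    return isub(squares, (r * r, r * r))

def first_hit(f: Callable[..., Interval],
              origin: Sequence[float],
              direction: Sequence[float],
              t0: float, t1: float,
              eps: float = 1e-6) -> Optional[float]:
    """Return the start of the nearest t-interval (width < eps) that may hold a root."""
    stack = [(t0, t1)]
    while stack:
        lo, hi = stack.pop()
        # Enclose the ray segment origin + t * direction for t in [lo, hi].
        point = [iadd((o, o), imul((d, d), (lo, hi)))
                 for o, d in zip(origin, direction)]
        f_lo, f_hi = f(*point)
        if f_lo > 0.0 or f_hi < 0.0:
            continue  # 0 is outside the enclosure: no root in this segment
        if hi - lo < eps:
            return lo
        mid = 0.5 * (lo + hi)
        stack.append((mid, hi))  # far half is pushed first ...
        stack.append((lo, mid))  # ... so the near half is examined first
    return None

t = first_hit(sphere, (0.0, 0.0, -3.0), (0.0, 0.0, 1.0), 0.0, 10.0)
miss = first_hit(sphere, (0.0, 2.0, -3.0), (0.0, 0.0, 1.0), 0.0, 10.0)
```

A segment is discarded only when the enclosure proves that the function cannot vanish on it, which is why the search cannot skip a root. The price is wasted subdivision whenever the enclosure is looser than the true range of the function.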

Bournemouth University Research Online

High Performance Direct Gravitational N-body Simulations on Graphics Processing Units -- II: An implementation in CUDA

Author: Aarseth
Barnes
Buck
Fernando
Heggie
Jeroen Bédorf
Makino
Makino
Mark
McMillan
Moore
Nitadori
Owens
Owens
Pharr
Portegies Zwart
Portegies Zwart
Robert G. Belleman
Simon F. Portegies Zwart
Warren
Publication venue: 'Elsevier BV'
Publication date: 16/07/2007

We present the results of gravitational direct N-body simulations using the Graphics Processing Unit (GPU) on a commercial NVIDIA GeForce 8800GTX designed for gaming computers. The force evaluation of the N-body problem is implemented in "Compute Unified Device Architecture" (CUDA) using the GPU to speed up the calculations. We tested the implementation on three different N-body codes: two direct N-body integration codes, using the 4th order predictor-corrector Hermite integrator with block time-steps, and one Barnes-Hut treecode, which uses a 2nd order leapfrog integration scheme. The integration of the equations of motion for all codes is performed on the host CPU. We find that for N > 512 particles the GPU outperforms the GRAPE-6Af, if some softening in the force calculation is accepted. Without softening and for very small integration time steps the GRAPE still outperforms the GPU. We conclude that modern GPUs offer an attractive alternative to GRAPE-6Af special purpose hardware. Using the same time-step criterion, the total energy of the N-body system was conserved better than to one in 10^6 on the GPU, only about an order of magnitude worse than obtained with GRAPE-6Af. For N ≳ 10^5 the 8800GTX outperforms the host CPU by a factor of about 100 and runs at about the same speed as the GRAPE-6Af. Comment: Accepted for publication in New Astronomy.
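The force evaluation that the abstract offloads to the GPU is an all-pairs sum, which the following CPU-only sketch illustrates. It assumes Plummer-style softening, a_i = G * sum over j of m_j (r_j - r_i) / (|r_j - r_i|^2 + eps^2)^(3/2); the abstract does not state the exact softening form, and the function name `accelerations` and its parameters are hypothetical. This is not the paper's CUDA kernel.

```python
import math
from typing import List, Sequence

def accelerations(pos: Sequence[Sequence[float]],
                  mass: Sequence[float],
                  eps: float = 0.0,
                  G: float = 1.0) -> List[List[float]]:
    """Direct O(N^2) gravitational accelerations with softening length eps."""
    n = len(pos)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # a particle exerts no force on itself
            d = [pos[j][k] - pos[i][k] for k in range(3)]
            r2 = d[0] * d[0] + d[1] * d[1] + d[2] * d[2] + eps * eps
            inv_r3 = 1.0 / (r2 * math.sqrt(r2))
            for k in range(3):
                acc[i][k] += G * mass[j] * d[k] * inv_r3
    return acc

# Two unit masses one unit apart, first unsoftened and then softened.
pair = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
hard = accelerations(pair, [1.0, 1.0])
soft = accelerations(pair, [1.0, 1.0], eps=0.5)
```

Each particle's inner loop is independent of every other particle's, which is what makes this sum a natural fit for one GPU thread per particle. Softening also bounds the force at close encounters.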

arXiv.org e-Print Archive

Crossref

Leiden University Scholary Publications

UvA-DARE

International Migration, Integration and Social Cohesion online publications

Explicit Cache Management for Volume Ray-Casting on Parallel Architectures

Author: Doggett Michael
Ganestam Per
Jönsson Daniel
Ropinski Timo
Ynnerman Anders
Publication venue: Eurographics - European Association for Computer Graphics
Publication date: 01/01/2012

A major challenge when designing general purpose graphics hardware is to allow efficient access to texture data. Although different rendering paradigms vary with respect to their data access patterns, there is no flexibility when it comes to data caching provided by the graphics architecture. In this paper we focus on volume ray-casting, and show the benefits of algorithm-aware data caching. Our Marching Caches method exploits inter-ray coherence and thus utilizes the memory layout of the highly parallel processors by allowing them to share data through a cache which marches along with the ray front. By exploiting Marching Caches we can apply higher-order reconstruction and enhancement filters to generate more accurate and enriched renderings with an improved rendering performance. We have tested our Marching Caches with seven different filters, e.g., Catmull-Rom, B-spline, and ambient occlusion projection, and could show that a speed-up of four times can be achieved compared to using the caching implicitly provided by the graphics hardware, and that the memory bandwidth to global memory can be reduced by orders of magnitude. Throughout the paper, we will introduce the Marching Cache concept, provide implementation details and discuss the performance and memory bandwidth impact when using different filters.

Lund University Publications

Heterogeneous Acceleration for 5G New Radio Channel Modelling Using FPGAs and GPUs

Author: SHAH NASIR ALI
Publication venue: country:Italy
Publication date: 09/10/2023

The abstract is in the attachment.

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

Faster Ray Tracing through Hierarchy Cut Code

Author: Kou QiLong
Li Dan
Liu FengQi
Liu MeiZhi
Tan ZhaoNan
Xiang WeiLai
Xu PengZhan
Publication date: 19/07/2023

We propose a novel ray reordering technique to accelerate the ray tracing process by encoding and sorting rays prior to traversal. Instead of spatial coordinates, our method encodes rays according to the cuts of the hierarchical acceleration structure, which is called the hierarchy cut code. This approach can better adapt to the acceleration structure and obtain a more reliable encoding result. We also propose a compression scheme to decrease the sorting overhead by using a shorter sorting key. In addition, based on the phenomenon of boundary drift, we theoretically explain why existing reordering methods cannot achieve better performance by using longer sorting keys. The experiment demonstrates that our method can accelerate secondary ray tracing by up to 1.81 times, outperforming the existing methods. This result demonstrates the effectiveness of the hierarchy cut code and indicates that reordering techniques can achieve greater performance improvements, which is worth further research.

arXiv.org e-Print Archive

The use of primitives in the calculation of radiative view factors

Author: Walker Trevor John
Publication venue: Faculty of Engineering and Information Technologies, School of Chemical and Biomolecular Engineering
Publication date: 01/01/2014

Compilations of radiative view factors (often in closed analytical form) are readily available in the open literature for commonly encountered geometries. For more complex three-dimensional (3D) scenarios, however, the effort required to solve the multi-dimensional integrations needed to estimate a view factor can be daunting. In such cases, a combination of finite element methods (where the geometry in question is sub-divided into a large number of uniform, often triangular, elements) and Monte Carlo Ray Tracing (MC-RT) has been developed, although frequently the software implementation is suitable only for a limited set of geometrical scenarios. Driven initially by a need to calculate the radiative heat transfer occurring within an operational fibre-drawing furnace, this research set out to examine options whereby MC-RT could be used to cost-effectively calculate any generic 3D radiative view factor using current vectorisation technologies.
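As an illustration of how MC-RT estimates a view factor, the sketch below treats an assumed test geometry: two directly opposed, parallel unit squares. It emits cosine-weighted rays from random points on the lower square and counts the fraction that reach the upper one. The function name and parameters are hypothetical, and this is a plain scalar loop, not the vectorised, primitive-based approach the thesis investigates.

```python
import math
import random

def view_factor_parallel_squares(distance: float = 1.0,
                                 samples: int = 200_000,
                                 seed: int = 0) -> float:
    """Monte Carlo estimate of the view factor between two opposed unit squares."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        # Uniform emission point on the lower square (z = 0).
        x0, y0 = rng.random(), rng.random()
        # Cosine-weighted direction about the +z normal (diffuse emitter).
        u1, u2 = rng.random(), rng.random()
        r = math.sqrt(u1)
        phi = 2.0 * math.pi * u2
        dx, dy = r * math.cos(phi), r * math.sin(phi)
        dz = math.sqrt(1.0 - u1)  # > 0 because u1 < 1
        # Intersect the ray with the plane z = distance.
        t = distance / dz
        x1, y1 = x0 + t * dx, y0 + t * dy
        if 0.0 <= x1 <= 1.0 and 0.0 <= y1 <= 1.0:
            hits += 1
    return hits / samples

estimate = view_factor_parallel_squares()
```

For this particular geometry the closed-form result is approximately 0.1998, so the estimate can be checked against it. For geometries with no analytical solution, the same hit-counting loop applies once the plane test is replaced by a general ray-primitive intersection.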

Sydney eScholarship

DISPATCH: A Numerical Simulation Framework for the Exa-scale Era. I. Fundamentals

Author: Kuffmeier M.
Nordlund Å.
Popovas A.
Ramsey J. P.
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2018

We introduce a high-performance simulation framework that permits the semi-independent, task-based solution of sets of partial differential equations, typically manifesting as updates to a collection of 'patches' in space-time. A hybrid MPI/OpenMP execution model is adopted, where work tasks are controlled by a rank-local 'dispatcher' which selects, from a set of tasks generally much larger than the number of physical cores (or hardware threads), tasks that are ready for updating. The definition of a task can vary, for example, with some solving the equations of ideal magnetohydrodynamics (MHD), others non-ideal MHD, radiative transfer, or particle motion, and yet others applying particle-in-cell (PIC) methods. Tasks do not have to be grid-based, while tasks that are may use either Cartesian or orthogonal curvilinear meshes. Patches may be stationary or moving. Mesh refinement can be static or dynamic. A feature of decisive importance for the overall performance of the framework is that time steps are determined and applied locally; this allows potentially large reductions in the total number of updates required in cases when the signal speed varies greatly across the computational domain, and therefore a corresponding reduction in computing time. Another feature is a load balancing algorithm that operates 'locally' and aims to simultaneously minimise load and communication imbalance. The framework generally relies on already existing solvers, whose performance is augmented when run under the framework, due to more efficient cache usage, vectorisation, local time-stepping, plus near-linear and, in principle, unlimited OpenMP and MPI scaling. Comment: 17 pages, 8 figures. Accepted by MNRAS.

arXiv.org e-Print Archive

Copenhagen University Research Information System

Doctor of Philosophy

Author: Kensler Andrew E.
Publication venue: University of Utah
Publication date: 27/04/2011

This dissertation explores three key facets of software algorithms for custom hardware ray tracing: primitive intersection, shading, and acceleration structure construction. For the first, primitive intersection, we show how nearly all of the existing direct three-dimensional (3D) ray-triangle intersection tests are mathematically equivalent. Based on this, a genetic algorithm can automatically tune a ray-triangle intersection test for maximum speed on a particular architecture. We also analyze the components of the intersection test to determine how much floating point precision is required and design a numerically robust intersection algorithm. Next, for shading, we deconstruct Perlin noise into its basic parts and show how these can be modified to produce a gradient noise algorithm that improves the visual appearance. This improved algorithm serves as the basis for a hardware noise unit. Lastly, we show how an existing bounding volume hierarchy can be postprocessed using tree rotations to further reduce the expected cost to traverse a ray through it. This postprocessing also serves as the basis for an efficient update algorithm for animated geometry. Together, these contributions should improve the efficiency of both software- and hardware-based ray tracers.
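For readers unfamiliar with direct 3D ray-triangle tests, the sketch below shows one widely used formulation, the Möller-Trumbore test. It is an illustrative example only: the abstract does not say which tests the dissertation analyses or what its tuned or numerically robust variants look like, and the helper names and the `eps` tolerance are assumptions.

```python
from typing import Optional, Sequence, Tuple

Vec = Tuple[float, float, float]

def sub(a: Sequence[float], b: Sequence[float]) -> Vec:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a: Sequence[float], b: Sequence[float]) -> Vec:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def ray_triangle(orig: Vec, direction: Vec,
                 v0: Vec, v1: Vec, v2: Vec,
                 eps: float = 1e-9) -> Optional[float]:
    """Return the ray parameter t of the hit, or None if the ray misses."""
    e1, e2 = sub(v1, v0), sub(v2, v0)
    p = cross(direction, e2)
    det = dot(e1, p)
    if abs(det) < eps:
        return None  # ray is (nearly) parallel to the triangle plane
    inv_det = 1.0 / det
    s = sub(orig, v0)
    u = dot(s, p) * inv_det  # first barycentric coordinate
    if u < 0.0 or u > 1.0:
        return None
    q = cross(s, e1)
    v = dot(direction, q) * inv_det  # second barycentric coordinate
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(e2, q) * inv_det
    return t if t > eps else None  # reject hits behind the origin

tri = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
hit = ray_triangle((0.25, 0.25, -1.0), (0.0, 0.0, 1.0), *tri)
miss = ray_triangle((2.0, 2.0, -1.0), (0.0, 0.0, 1.0), *tri)
```

The order of the early exits and the choice of which products to compute first are the kind of implementation details that can differ between formulations with the same underlying mathematics.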

The University of Utah: J. Willard Marriott Digital Library

Doctor of Philosophy in Computer Science

Author: Kopta Daniel
Publication venue: University of Utah
Publication date: 01/01/2016

Ray tracing is becoming more widely adopted in offline rendering systems due to its natural support for high quality lighting. Since quality is also a concern in most real time systems, we believe ray tracing would be a welcome change in the real time world, but it is avoided due to insufficient performance. Since power consumption is one of the primary factors limiting the increase of processor performance, it must be addressed as a foremost concern in any future ray tracing system designs. This will require cooperating advances in both algorithms and architecture. In this dissertation I study ray tracing system designs from a data movement perspective, targeting the various memory resources that are the primary consumer of power on a modern processor. The result is high performance, low energy ray tracing architectures.

The University of Utah: J. Willard Marriott Digital Library