16 research outputs found
Real-time selective rendering
Traditional physically-based renderers can produce highly realistic imagery; however, they suffer from lengthy execution times, which make them impractical for interactive applications. Selective rendering exploits limitations in the human visual system to render images that are perceptually similar to high-fidelity renderings in a fraction of the time. This paper outlines current research being carried out by the author to tackle this problem, using a combination of ray-tracing acceleration techniques, GPU-based processing, and selective rendering methods. The research will also seek to confirm results published in the literature, which indicate that users fail to notice any quality degradation between high-fidelity imagery and a corresponding selective rendering.
Organization of pipelined computations in problems of describing gas-dynamic objects
Using a two-dimensional case as an example, the order in which cells are computed at each of the three stages is shown. The minimum number of processing units required for each stage is estimated. A start-up order for the processing units of each stage is proposed so that the pipelined computation runs continuously. Relations are derived for the number of clock cycles the processing units spend at each stage per iteration, as a function of the dimensions of the computational grid. The results can be used in developing real-time visualization systems for scientific research and for training simulators of various kinds.
An approach to visualizing triangulated surfaces by backward ray tracing
The article introduces the notion of an ultimate triangulated surface and considers ways of obtaining and specifying it. An algorithm is given for finding the intersection points of a projection ray with a triangle of this surface. For the proposed approach, the implementation details of screen subsampling are examined, the required data-bus bandwidth is estimated, computational costs are compared with the SaarCOR architecture, and the structure of a special-purpose processor is proposed, optimized for a parallel-pipelined architecture with a continuous stream of input data.
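The abstract above does not spell out its ray-triangle intersection algorithm. As an illustration only, the sketch below uses the well-known Möller-Trumbore test, which is a swapped-in technique and not necessarily the one the article describes. The function name `ray_triangle` and the tolerance `eps` are assumptions made for this example.

```python
def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def ray_triangle(origin, direction, v0, v1, v2, eps=1e-9):
    """Moller-Trumbore test (illustrative, not the article's algorithm).

    Returns the ray parameter t of the hit point, or None on a miss.
    """
    e1, e2 = sub(v1, v0), sub(v2, v0)
    p = cross(direction, e2)
    det = dot(e1, p)
    if abs(det) < eps:           # ray is (nearly) parallel to the triangle plane
        return None
    inv = 1.0 / det
    tvec = sub(origin, v0)
    u = dot(tvec, p) * inv       # first barycentric coordinate
    if u < 0.0 or u > 1.0:
        return None
    q = cross(tvec, e1)
    v = dot(direction, q) * inv  # second barycentric coordinate
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(e2, q) * inv
    return t if t > eps else None  # accept only hits in front of the origin

tri = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
hit = ray_triangle((0.25, 0.25, -1.0), (0.0, 0.0, 1.0), *tri)
miss = ray_triangle((2.0, 2.0, -1.0), (0.0, 0.0, 1.0), *tri)
```

Here `hit` is the distance along the ray to the triangle, and `miss` is `None` because the ray passes outside the triangle.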
Estimating performance of a ray-tracing ASIC design
Journal article. Recursive ray tracing is a powerful rendering technique used to compute realistic images by simulating the global light transport in a scene. Algorithmic improvements and FPGA-based hardware implementations of ray tracing have demonstrated real-time performance, but hardware that achieves performance levels comparable to commodity rasterization graphics chips is still not available. This paper describes the architecture and ASIC implementations of the DRPU design (Dynamic Ray Processing Unit), which closes this performance gap. The DRPU supports fully programmable shading and most kinds of dynamic scenes, and thus provides capabilities similar to those of current GPUs. It achieves high efficiency through SIMD processing of floating-point vectors, massive multithreading, synchronous execution of packets of threads, and careful management of caches for scene data. To support dynamic scenes, B-KD trees are used as spatial index structures; they are processed by a custom traversal and intersection unit and modified by an Update Processor on scene changes.
Efficient algorithms for occlusion culling and shadows
The goal of this research is to develop more efficient techniques for computing the visibility and shadows in real-time rendering of three-dimensional scenes. Visibility algorithms determine what is visible from a camera, whereas shadow algorithms solve the same problem from the viewpoint of a light source.
In rendering, a lot of computational resources are often spent on primitives that are not visible in the final image. One visibility algorithm for reducing the overhead is occlusion culling, which quickly discards the objects or primitives that are obstructed from the view by other primitives. A new method is presented for performing occlusion culling using silhouettes of meshes instead of triangles. Additionally, modifications are suggested to occlusion queries in order to reduce their computational overhead.
The performance of currently available graphics hardware depends on the ordering of input primitives. A new technique, called delay streams, is proposed as a generic solution to order-dependent problems. The technique significantly reduces the pixel processing requirements by improving the efficiency of occlusion culling inside graphics hardware. Additionally, the memory requirements of order-independent transparency algorithms are reduced.
A shadow map is a discretized representation of the scene geometry as seen by a light source. Typically the discretization causes difficult aliasing issues, such as jagged shadow boundaries and incorrect self-shadowing. A novel solution is presented for suppressing all types of aliasing artifacts by providing the correct sampling points for shadow maps, thus fully abandoning the previously used regular structures. Also, a simple technique is introduced for limiting the shadow map lookups to the pixels that get projected inside the shadow map.
The fillrate problem of hardware-accelerated shadow volumes is greatly reduced with a new hierarchical rendering technique. The algorithm performs per-pixel shadow computations only at visible shadow boundaries, and uses lower resolution shadows for the parts of the screen that are guaranteed to be either fully lit or fully in shadow.
The proposed techniques are expected to improve the rendering performance in most real-time applications that use 3D graphics, especially in computer games. More efficient algorithms for occlusion culling and shadows are important steps towards larger, more realistic virtual environments.
Implicit Object Space Partitioning: The No-Memory BVH
We present a new ray tracing algorithm that requires no explicit acceleration data structure and therefore no memory for one. Instead, the acceleration structure is represented in a completely implicit way by triangle reordering. This new implicit data structure is simple to build, efficient to traverse, and has a fast total time to image. It must be constructed only once and can be reused for arbitrary numbers of rays or ray batches without the need to rebuild the hierarchy. Due to its fast build times, it is well suited to dynamic and animated scenes. We compare it to classic acceleration data structures, such as a Bounding Volume Hierarchy, and analyze its efficiency.
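To make the idea of a hierarchy implied purely by element order more concrete, here is a deliberately simplified sketch. It is not the paper's algorithm: it reorders triangle centroids (stand-ins for triangles) in place by recursive median splits, so that any node is just an index range and its children are the two halves of that range. The names `build`, `longest_axis`, and `leaf_size` are assumptions made for this example.

```python
def longest_axis(points, lo, hi):
    """Axis (0, 1 or 2) along which points[lo:hi] has the largest extent."""
    columns = list(zip(*points[lo:hi]))
    extents = [max(c) - min(c) for c in columns]
    return extents.index(max(extents))

def build(points, lo=0, hi=None, leaf_size=2):
    """Reorder points in place; the ordering itself encodes the tree.

    Node = range [lo, hi); children = [lo, mid) and [mid, hi).
    No node records are allocated (illustrative simplification).
    """
    if hi is None:
        hi = len(points)
    if hi - lo <= leaf_size:
        return
    axis = longest_axis(points, lo, hi)
    points[lo:hi] = sorted(points[lo:hi], key=lambda p: p[axis])
    mid = (lo + hi) // 2
    build(points, lo, mid, leaf_size)
    build(points, mid, hi, leaf_size)

centroids = [(9.0, 1.0, 0.0), (1.0, 2.0, 0.5), (7.0, 0.0, 0.2), (3.0, 3.0, 0.1),
             (8.0, 2.5, 0.3), (0.0, 1.5, 0.4), (5.0, 0.5, 0.0), (2.0, 2.0, 0.2)]
original = list(centroids)
build(centroids)
root_axis = longest_axis(centroids, 0, len(centroids))
mid = len(centroids) // 2
```

Because the split axis depends only on the set of elements in a range, a traversal can recompute it from the range alone. How the paper itself encodes bounds in the triangle order is not described in the abstract, so that part is left out here.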
Software-Based Ray Tracing for Mobile Devices
Ray tracing is a way to produce realistic images of three-dimensional virtual scenes. Its cost scales more with the number of pixels in the image than with the amount of detail in the scene. This makes it an interesting application for mobile systems, which in general have smaller screens.
Modern high-performance ray tracing depends on special acceleration data structures such as bounding volume hierarchies. Compressing the size of the bounding volume hierarchy leads to smaller memory bandwidth usage. This should be especially beneficial for mobile systems, which in general have smaller memory bandwidth. Compression also reduces cache misses and memory usage. Unfortunately, compression reduces the quality of the data structure, leading the ray traversal into unnecessary computations. In addition, compression increases the amount of work which needs to be carried out in the performance critical inner loop.
Previous work on bounding volume hierarchy compression concentrates on inferring some of the coordinates from other coordinates or on using different integer precisions. This thesis concentrates on using half-precision floating-point numbers (halfs), which have potential due to their greater dynamic range. If the halfs are too inaccurate for use as plain world coordinates, they can be used with hierarchical encoding. This restores the quality of the data structure to its original level, but it requires even more work in the inner loop.
Halfs reduce the total memory usage by 7% and cache misses by 16%. Furthermore, they reduce power usage by 1.7%. The halfs’ effect on performance is heavily dependent on the targeted hardware’s support for them. If decompression of the halfs is too slow, they will have a negative impact. Compared to integers, halfs have better performance in the so-called teapot-in-a-stadium problem.
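The abstract does not give the storage layout used in the thesis. As a minimal sketch, assuming bounds are stored directly as half-precision world coordinates, the example below rounds the box minimum down and the maximum up so that the compressed box still encloses the original. All function names are hypothetical; only Python's standard `struct` module (format code `e` for IEEE 754 binary16) is used.

```python
import struct

def to_half(x):
    """Round x to the nearest half-precision value (returned as a Python float)."""
    return struct.unpack('<e', struct.pack('<e', x))[0]

def step(h, up):
    """Neighbouring half-precision value above (up=True) or below h."""
    bits = struct.unpack('<H', struct.pack('<e', h))[0]
    if h == 0.0:
        bits = 0x0001 if up else 0x8001  # smallest positive / negative subnormal
    elif (h > 0) == up:
        bits += 1                        # magnitude grows
    else:
        bits -= 1                        # magnitude shrinks
    return struct.unpack('<e', struct.pack('<H', bits))[0]

def round_down(x):
    h = to_half(x)
    return h if h <= x else step(h, up=False)

def round_up(x):
    h = to_half(x)
    return h if h >= x else step(h, up=True)

def compress_aabb(box_min, box_max):
    """Pack a box into 12 bytes (six halfs), rounding outward."""
    values = [round_down(c) for c in box_min] + [round_up(c) for c in box_max]
    return struct.pack('<6e', *values)

def decompress_aabb(packed):
    values = struct.unpack('<6e', packed)
    return values[:3], values[3:]

box_min, box_max = (0.1, -0.1, 0.3), (1.1, 2.2, 3.3)
packed = compress_aabb(box_min, box_max)
lo, hi = decompress_aabb(packed)
```

The packed box takes 12 bytes instead of the 24 needed for six single-precision floats. The outward rounding makes the box slightly larger, which is one way the loss of data-structure quality mentioned above can arise.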
Doctor of Philosophy in Computer Science
Dissertation. Ray tracing is becoming more widely adopted in offline rendering systems due to its natural support for high-quality lighting. Since quality is also a concern in most real-time systems, we believe ray tracing would be a welcome change in the real-time world, but it is avoided due to insufficient performance. Since power consumption is one of the primary factors limiting the increase of processor performance, it must be addressed as a foremost concern in any future ray tracing system design. This will require cooperating advances in both algorithms and architecture. In this dissertation I study ray tracing system designs from a data movement perspective, targeting the various memory resources that are the primary consumers of power on a modern processor. The result is high-performance, low-energy ray tracing architectures.
Doctor of Philosophy
Dissertation. This dissertation explores three key facets of software algorithms for custom hardware ray tracing: primitive intersection, shading, and acceleration structure construction. For the first, primitive intersection, we show how nearly all of the existing direct three-dimensional (3D) ray-triangle intersection tests are mathematically equivalent. Based on this, a genetic algorithm can automatically tune a ray-triangle intersection test for maximum speed on a particular architecture. We also analyze the components of the intersection test to determine how much floating-point precision is required and design a numerically robust intersection algorithm. Next, for shading, we deconstruct Perlin noise into its basic parts and show how these can be modified to produce a gradient noise algorithm that improves the visual appearance. This improved algorithm serves as the basis for a hardware noise unit. Lastly, we show how an existing bounding volume hierarchy can be postprocessed using tree rotations to further reduce the expected cost of traversing a ray through it. This postprocessing also serves as the basis for an efficient update algorithm for animated geometry. Together, these contributions should improve the efficiency of both software- and hardware-based ray tracers.