15 research outputs found

    Ray-Tracing Using SSE

    This thesis describes a technique for accelerating ray tracing using the SSE instruction set. To use SSE instructions as efficiently as possible, four rays enclosed in a single beam are traced in parallel. The algorithms used in ray tracing were vectorized, and a solution for splitting the beam of rays was also designed and implemented. Tests then measured image rendering time both for the case where all rays remain together in the beam and for the case where the beam contains only a single ray.
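
    The four-ray beam described above maps naturally onto SSE's four 32-bit float lanes when ray data is stored in structure-of-arrays form. Below is a minimal sketch of that layout, with plain Python standing in for the thesis's C/SSE code; the function name and packet size are illustrative, not the thesis's API.

```python
import math

def intersect_sphere4(ox, oy, oz, dx, dy, dz, center, radius):
    """Intersect four rays (given as SoA coordinate lists) with one sphere.
    Returns four hit distances (math.inf on miss), computed lane-by-lane
    here; an SSE version would process all four lanes at once."""
    cx, cy, cz = center
    hits = []
    for i in range(4):
        # Standard quadratic ray/sphere test per lane.
        lx, ly, lz = ox[i] - cx, oy[i] - cy, oz[i] - cz
        b = 2.0 * (dx[i] * lx + dy[i] * ly + dz[i] * lz)
        c = lx * lx + ly * ly + lz * lz - radius * radius
        disc = b * b - 4.0 * c   # assumes normalized directions (a == 1)
        if disc < 0.0:
            hits.append(math.inf)
            continue
        t = (-b - math.sqrt(disc)) / 2.0
        hits.append(t if t > 0.0 else math.inf)
    return hits
```

    An SSE version would replace the per-lane loop with `_mm_*` intrinsics; when some lanes miss while others hit, the beam-splitting logic mentioned in the abstract would fall back to handling the diverging rays separately.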

    Scalable ray tracing with multiple GPGPUs

    Rapid development in the field of computer graphics over the last 40 years has brought forth different techniques to render scenes. Rasterization is today’s most widely used technique, which in its most basic form sequentially draws thousands of polygons and applies texture to them. Ray tracing is an alternative method that mimics light transport by using rays to sample a scene in memory and render the color found at each ray’s scene intersection point. Although mainstream hardware directly supports rasterization, ray tracing would be the preferred technique due to its ability to produce highly crisp and realistic graphics, if hardware were not a limitation. Making an immediate hardware transition from rasterization to ray tracing would have a severe impact on the computer graphics industry, since it would require redevelopment of existing software employing 3D graphics, so any transition to ray tracing would be gradual. Previous efforts to perform ray tracing on mainstream rasterizing hardware platforms with a single processor have performed poorly. This thesis explores how a multiple-GPGPU system can be used to render scenes via ray tracing. A ray tracing engine and API groundwork was developed using NVIDIA’s CUDA (Compute Unified Device Architecture) GPGPU programming environment and was used to evaluate performance scalability across a multi-GPGPU system. This engine supports triangle, sphere, disc, rectangle, and torus rendering. It also allows independent activation of graphics features including procedural texturing, Phong illumination, reflections, translucency, and shadows. Correctness of rendered images validates the ray-traced results, and timing of rendered scenes benchmarks performance. The main test scene contains all object types, has a total of 32 objects, and applies all graphics features. Ray tracing this scene using two GPGPUs outperformed the single-GPGPU and single-CPU systems, yielding respective speedups of up to 1.8 and 31.25. The results demonstrate how much potential exists in treating a modern dual-GPU architecture as a dual-GPGPU system in order to facilitate a transition from rasterization to ray tracing.
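
    Because every pixel in a ray tracer is independent, one simple way to scale across multiple GPGPUs is to partition the frame's scanlines among devices. The sketch below is a hypothetical illustration of such screen-space partitioning, not code from the thesis's CUDA engine.

```python
def partition_scanlines(height, num_devices):
    """Assign scanlines round-robin to devices so each device gets a
    similar workload even when scene complexity varies across the frame."""
    return {d: [y for y in range(height) if y % num_devices == d]
            for d in range(num_devices)}
```

    Each device would then trace only its own scanlines, and the host would stitch the partial framebuffers together; interleaving (rather than splitting the frame into halves) helps balance load when one image region is much more expensive than another.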

    Doctor of Philosophy

    The embedded system space is characterized by a rapid evolution in the complexity and functionality of applications. In addition, the short time-to-market nature of the business motivates the use of programmable devices capable of meeting the conflicting constraints of low energy, high performance, and short design times. The keys to achieving these conflicting constraints are specialization and maximally extracting available application parallelism. General-purpose processors are flexible but are either too power hungry or lack the necessary performance. Application-specific integrated circuits (ASICs) efficiently meet the performance and power needs but are inflexible. Programmable domain-specific architectures (DSAs) are an attractive middle ground, but their design requires significant time, resources, and expertise in a variety of specialties, ranging from application algorithms to architecture and, ultimately, circuit design. This dissertation presents CoGenE, a design framework that automates the design of energy-performance-optimal DSAs for embedded systems. For a given application domain and a user-chosen initial architectural specification, CoGenE consists of a Compiler to generate the execution binary, a simulator Generator to collect performance/energy statistics, and an Explorer that modifies the current architecture to improve energy-performance-area characteristics. This process repeats automatically until the user-specified constraints are achieved, which removes or alleviates the time needed to understand the application, manually design the DSA, and generate object code for the DSA. Thus, CoGenE is a new design methodology that represents a significant improvement in performance, energy dissipation, design time, and resources. This dissertation employs the face recognition domain to showcase a flexible architectural design methodology that creates "ASIC-like" DSAs. 
    The DSAs are instruction set architecture (ISA)-independent and achieve good energy-performance characteristics by coscheduling the often conflicting constraints of data access, data movement, and computation through a flexible interconnect. This represents a significant increase in programming complexity and code generation time. To address this problem, the CoGenE compiler employs integer linear programming (ILP)-based 'interconnect-aware' scheduling techniques for automatic code generation. The CoGenE explorer employs an iterative technique to search the complete design space and select a set of energy-performance-optimal candidates. When compared to manual designs, results demonstrate that CoGenE produces superior designs for three application domains: face recognition, speech recognition, and wireless telephony. While CoGenE is well suited to applications that exhibit streaming behavior, multithreaded applications like ray tracing present a different but important challenge. To demonstrate its generality, CoGenE is evaluated in designing a novel multicore N-wide SIMD architecture, known as StreamRay, for the ray tracing domain. CoGenE is used to synthesize the SIMD execution cores, the compiler that generates the application binary, and the interconnection subsystem. Further, separating address and data computations in space reduces data movement and contention for resources, thereby significantly improving performance compared to existing ray tracing approaches.
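
    The Compiler-Generator-Explorer loop described above can be caricatured as a feedback search: evaluate an architecture, mutate it, keep improvements, and stop once constraints are met. The toy greedy sketch below uses made-up names and a made-up cost model; CoGenE's actual machinery (ILP scheduling, simulator generation) is far richer.

```python
def explore(initial_arch, evaluate, mutate, budget, max_iters=100):
    """Repeatedly mutate the architecture, keeping any change that lowers
    the cost, until the cost meets the budget or iterations run out."""
    arch, cost = initial_arch, evaluate(initial_arch)
    for _ in range(max_iters):
        if cost <= budget:           # user-specified constraint satisfied
            break
        cand = mutate(arch)          # propose an architectural change
        cand_cost = evaluate(cand)   # compile + simulate stands in here
        if cand_cost < cost:         # greedy: keep only improvements
            arch, cost = cand, cand_cost
    return arch, cost
```

    In the real framework the `evaluate` step corresponds to compiling the application and running the generated simulator to collect energy/performance statistics, and `mutate` to the Explorer's modification of the current architecture.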

    Doctor of Philosophy in Computer Science

    Ray tracing is becoming more widely adopted in offline rendering systems due to its natural support for high-quality lighting. Since quality is also a concern in most real-time systems, we believe ray tracing would be a welcome change in the real-time world, but it is avoided due to insufficient performance. Since power consumption is one of the primary factors limiting the increase of processor performance, it must be addressed as a foremost concern in any future ray tracing system design. This will require cooperating advances in both algorithms and architecture. In this dissertation I study ray tracing system designs from a data movement perspective, targeting the various memory resources that are the primary consumers of power on a modern processor. The result is high-performance, low-energy ray tracing architectures.

    On real-time ray tracing

    Rendering of increasingly complex and detailed objects and scenes, with physically correct light simulation, is an important problem for many fields ranging from medical imaging to computer games. While even the latest graphics processing units are unable to render truly massive models consisting of hundreds of millions of primitives, an algorithm known as ray tracing – which by its very nature approximates light transport – can be used to solve such problems. Ray tracing is a simple but powerful method known to produce high image quality, but it is also known for its slow execution speed. This thesis examines part of the research conducted to bring ray tracing into the interactive sphere. Specifically, it explores ray-triangle intersections, ray coherency, and kd-tree building and traversal. Even though these issues are examined in the context of interactive graphics, the insights provided by the analyzed literature also translate to other domains. Keywords: ray tracing, kd-tree
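
    The abstract does not name a particular ray-triangle test, but the Möller-Trumbore algorithm is a standard baseline in this literature; a minimal, unoptimized Python version is sketched below for reference.

```python
def ray_triangle(orig, dirn, v0, v1, v2, eps=1e-9):
    """Moller-Trumbore: return hit distance t, or None on a miss."""
    def sub(a, b): return (a[0] - b[0], a[1] - b[1], a[2] - b[2])
    def cross(a, b): return (a[1] * b[2] - a[2] * b[1],
                             a[2] * b[0] - a[0] * b[2],
                             a[0] * b[1] - a[1] * b[0])
    def dot(a, b): return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]
    e1, e2 = sub(v1, v0), sub(v2, v0)
    p = cross(dirn, e2)
    det = dot(e1, p)
    if abs(det) < eps:            # ray parallel to the triangle plane
        return None
    inv = 1.0 / det
    s = sub(orig, v0)
    u = dot(s, p) * inv           # first barycentric coordinate
    if u < 0.0 or u > 1.0:
        return None
    q = cross(s, e1)
    v = dot(dirn, q) * inv        # second barycentric coordinate
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(e2, q) * inv          # distance along the ray
    return t if t > eps else None
```

    Production versions vectorize this test over ray packets or triangle groups; the scalar form above only shows the arithmetic the thesis's surveyed variants optimize.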

    Doctor of Philosophy

    Balancing the trade-off between the spatial and temporal quality of interactive computer graphics imagery is one of the fundamental design challenges in the construction of rendering systems. Inexpensive interactive rendering hardware may deliver a high level of temporal performance if the level of spatial image quality is sufficiently constrained. In these cases, the spatial fidelity level is an independent parameter of the system and temporal performance is a dependent variable. The spatial quality parameter is selected for the system by the designer based on the anticipated graphics workload. Interactive ray tracing is one example; the algorithm is often selected due to its ability to deliver a high level of spatial fidelity, and the relatively lower level of temporal performance is readily accepted. This dissertation proposes an algorithm to perform fine-grained adjustments to the trade-off between the spatial quality of images produced by an interactive renderer and the temporal performance or quality of the rendered image sequence. The approach first determines the minimum amount of sampling work necessary to achieve a certain fidelity level, and then allows the surplus capacity to be directed towards spatial or temporal fidelity improvement. The algorithm consists of an efficient parallel spatial and temporal adaptive rendering mechanism and a control optimization problem that adjusts the sampling rate based on a characterization of the rendered imagery and constraints on the capacity of the rendering system.
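
    The fine-grained spatial/temporal adjustment can be pictured as a feedback controller on the sampling rate. The proportional controller below is a toy stand-in for the dissertation's control optimization problem; the function name, gain, and bounds are all made up for illustration.

```python
def adjust_spp(spp, frame_time, budget, gain=0.5, spp_min=1, spp_max=64):
    """Raise samples per pixel when under the frame-time budget (surplus
    capacity goes to spatial quality), lower them when over (temporal
    performance is restored)."""
    error = (budget - frame_time) / budget   # > 0 means surplus capacity
    new_spp = spp * (1.0 + gain * error)
    return max(spp_min, min(spp_max, int(round(new_spp))))
```

    Run once per frame, this drives render time toward the budget; the dissertation's mechanism additionally characterizes the rendered imagery so the surplus can be steered to spatial or temporal fidelity rather than a single knob.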

    Spin-scanning Cameras for Planetary Exploration: Imager Analysis and Simulation

    In this thesis, a novel approach to spaceborne imaging is investigated, building upon the scan imaging technique in which camera motion is used to construct an image. This thesis investigates its use with wide-angle (≥90° field of view) optics mounted on spin stabilised probes for large-coverage imaging of planetary environments, and focusses on two instruments. Firstly, a descent camera concept for a planetary penetrator. The imaging geometry of the instrument is analysed. Image resolution is highest at the penetrator’s nadir and lowest at the horizon, whilst any point on the surface is imaged with highest possible resolution when the camera’s altitude is equal to that point’s radius from nadir. Image simulation is used to demonstrate the camera’s images and investigate analysis techniques. A study of stereophotogrammetric measurement of surface topography using pairs of descent images is conducted. Measurement accuracies and optimum stereo geometries are presented. Secondly, the thesis investigates the EnVisS (Entire Visible Sky) instrument, under development for the Comet Interceptor mission. The camera’s imaging geometry, coverage and exposure times are calculated, and used to model the expected signal and noise in EnVisS observations. It is found that the camera’s images will suffer from low signal, and four methods for mitigating this – binning, coaddition, time-delay integration and repeat sampling – are investigated and described. Use of these methods will be essential if images of sufficient signal are to be acquired, particularly for conducting polarimetry, the performance of which is modelled using Monte Carlo simulation. Methods of simulating planetary cameras’ images are developed to facilitate the study of both cameras. 
    These methods enable the accurate simulation of planetary surfaces and cometary atmospheres, are based on Python libraries commonly used in planetary science, and are intended to be readily modified and expanded to facilitate the study of a variety of planetary cameras.
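
    Of the four signal-mitigation methods listed, coaddition illustrates the underlying statistics: signal grows linearly with the number of frames while noise adds in quadrature, so SNR grows as the square root of the frame count. The sketch below uses illustrative electron counts, not EnVisS mission values.

```python
import math

def snr_coadd(signal_e, read_noise_e, n_frames):
    """SNR of n_frames coadded exposures, in electrons, assuming Poisson
    (shot) noise on the signal plus Gaussian read noise per frame."""
    total_signal = n_frames * signal_e
    total_noise = math.sqrt(n_frames * (signal_e + read_noise_e ** 2))
    return total_signal / total_noise
```

    The same square-root scaling motivates binning and time-delay integration; the methods differ in whether photons are combined before or after readout, which changes how many read-noise contributions are accumulated.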

    Packet-based whitted and distribution ray tracing

    Technical report. Much progress has been made toward interactive ray tracing, but most research has focused specifically on ray casting. A common approach is to use "packets" of rays to amortize cost across sets of rays. Little is known about how well packet-based techniques will work for reflection and refraction rays, which do not share common origins and often have less directional coherence than viewing and shadow rays. Since the primary advantage of ray tracing over rasterization is the computation of global effects, such as accurate reflection and refraction, this lack of knowledge should be corrected. Our ultimate goal is to achieve interactive distribution ray tracing with randomized rays for glossy reflections, soft shadows, motion blur, and depth of field. But it is not clear whether the randomization would further erode the effectiveness of techniques used to accelerate ray casting. This paper addresses the question of whether packet-based ray algorithms can be effectively used for more than visibility computation. It is shown that, with the appropriate choice of data structure and packet assembly algorithm, useful algorithms for ray casting do indeed extend to both Whitted-style and distribution ray tracing programs.
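
    The packet idea rests on amortizing traversal work: an acceleration-structure node is tested once per packet, with a mask recording which rays actually enter it, and traversal continues only while some lane is active. The slab-test sketch below is a generic illustration of that masking, not the paper's data structure or packet assembly algorithm.

```python
import math

def ray_box(o, d, lo, hi):
    """Slab test: does the ray o + t*d (t >= 0) enter the box [lo, hi]?"""
    tmin, tmax = 0.0, math.inf
    for k in range(3):
        if abs(d[k]) < 1e-12:                 # ray parallel to this slab
            if o[k] < lo[k] or o[k] > hi[k]:
                return False
            continue
        t1, t2 = (lo[k] - o[k]) / d[k], (hi[k] - o[k]) / d[k]
        if t1 > t2:
            t1, t2 = t2, t1
        tmin, tmax = max(tmin, t1), min(tmax, t2)
    return tmin <= tmax

def packet_mask(origins, dirs, lo, hi):
    """Active mask for a packet at one node: True per ray that enters the
    box. The node's children are visited only if any lane is active."""
    return [ray_box(o, d, lo, hi) for o, d in zip(origins, dirs)]
```

    For the incoherent reflection and refraction rays the report studies, masks tend to fill with False sooner, which is exactly why packet reassembly strategies matter there.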

    Packet-based whitted and distribution ray tracing

    Much progress has been made toward interactive ray tracing, but most research has focused specifically on ray casting. A common approach is to use “packets” of rays to amortize cost across sets of rays. Whether “packets” can be used to speed up reflection and refraction rays is unclear. The issue is complicated since such rays do not share common origins and often have less directional coherence than viewing and shadow rays. Since the primary advantage of ray tracing over rasterization is the computation of global effects, such as accurate reflection and refraction, this lack of knowledge should be corrected. We are also interested in exploring whether distribution ray tracing, due to its stochastic properties, further erodes the effectiveness of techniques used to accelerate ray casting. This paper addresses the question of whether packet-based ray algorithms can be effectively used for more than visibility computation. We show that by choosing an appropriate data structure and a suitable packet assembly algorithm we can extend the idea of “packets” from ray casting to Whitted-style and distribution ray tracing, while maintaining efficiency.