1,964 research outputs found
Method and system for enabling real-time speckle processing using hardware platforms
An accelerator for the speckle atmospheric compensation algorithm may enable real-time speckle processing of video feeds that may enable the speckle algorithm to be applied in numerous real-time applications. The accelerator may be implemented in various forms, including hardware, software, and/or machine-readable media
Hardware acceleration of photon mapping
PhD ThesisThe quest for realism in computer-generated graphics has yielded a range of algorithmic
techniques, the most advanced of which are capable of rendering images at close to photorealistic
quality. Due to the realism available, it is now commonplace that computer graphics are used in
the creation of movie sequences, architectural renderings, medical imagery and product
visualisations.
This work concentrates on the photon mapping algorithm [1, 2], a physically based global
illumination rendering algorithm. Photon mapping excels in producing highly realistic, physically
accurate images.
A drawback to photon mapping however is its rendering times, which can be significantly longer
than other, albeit less realistic, algorithms. Not surprisingly, this increase in execution time is
associated with a high computational cost. This computation is usually performed using the
general purpose central processing unit (CPU) of a personal computer (PC), with the algorithm
implemented as a software routine. Other options available for processing these algorithms
include desktop PC graphics processing units (GPUs) and custom designed acceleration hardware
devices.
GPUs tend to be efficient when dealing with less realistic rendering solutions such as rasterisation,
however with their recent drive towards increased programmability they can also be used to
process more realistic algorithms. A drawback to the use of GPUs is that these algorithms often
have to be reworked to make optimal use of the limited resources available.
There are very few custom hardware devices available for acceleration of the photon mapping
algorithm. Ray-tracing is the predecessor to photon mapping, and although not capable of
producing the same physical accuracy and therefore realism, there are similarities between the
algorithms. There have been several hardware prototypes, and at least one commercial offering,
created with the goal of accelerating ray-trace rendering [3]. However, properties making many of
these proposals suitable for the acceleration of ray-tracing are not shared by photon mapping.
There are even fewer proposals for acceleration of the additional functions found only in photon
mapping.
All of these approaches to algorithm acceleration offer limited scalability. GPUs are inherently
difficult to scale, while many of the custom hardware devices available thus far make use of large
processing elements and complex acceleration data structures.
In this work we make use of three novel approaches in the design of highly scalable specialised
hardware structures for the acceleration of the photon mapping algorithm. Increased scalability is
gained through:
• The use of a brute-force approach in place of the commonly used smart approach, thus
eliminating much data pre-processing, complex data structures and large processing units
often required.
• The use of Logarithmic Number System (LNS) arithmetic computation, which facilitates a
reduction in processing area requirement.
• A novel redesign of the photon inclusion test, used within the photon search method of
the photon mapping algorithm. This allows an intelligent memory structure to be used for
the search.
The design uses two hardware structures, both of which accelerate one core rendering function.
Renderings produced using field programmable gate array (FPGA) based prototypes are presented,
along with details of 90nm synthesised versions of the designs which show that close to an orderof-
magnitude speedup over a software implementation is possible. Due to the scalable nature of
the design, it is likely that any advantage can be maintained in the face of improving processor
speeds.
Significantly, due to the brute-force approach adopted, it is possible to eliminate an often-used
software acceleration method. This means that the device can interface almost directly to a frontend
modelling package, minimising much of the pre-processing required by most other proposals
Body of Knowledge for Graphics Processing Units (GPUs)
Graphics Processing Units (GPU) have emerged as a proven technology that enables high performance computing and parallel processing in a small form factor. GPUs enhance the traditional computer paradigm by permitting acceleration of complex mathematics and providing the capability to perform weighted calculations, such as those in artificial intelligence systems. Despite the performance enhancements provided by this type of microprocessor, there exist tradeoffs in regards to reliability and radiation susceptibility, which may impact mission success. This report provides an insight into GPU architecture and its potential applications in space and other similar markets. It also discusses reliability, qualification, and radiation considerations for testing GPUs
Satellite on-board processing for earth resources data
Results of a survey of earth resources user applications and their data requirements, earth resources multispectral scanner sensor technology, and preprocessing algorithms for correcting the sensor outputs and for data bulk reduction are presented along with a candidate data format. Computational requirements required to implement the data analysis algorithms are included along with a review of computer architectures and organizations. Computer architectures capable of handling the algorithm computational requirements are suggested and the environmental effects of an on-board processor discussed. By relating performance parameters to the system requirements of each of the user requirements the feasibility of on-board processing is determined for each user. A tradeoff analysis is performed to determine the sensitivity of results to each of the system parameters. Significant results and conclusions are discussed, and recommendations are presented
Dynamic partial reconfiguration for pipelined digital systems— A Case study using a color space conversion engine
In digital hardware design, reconfigurable devices such as Field Programmable Gate Arrays (FPGAs) allow for a unique feature called partial reconfiguration PR). This refers to the reprogramming of a subset of the reconfigurable logic during active operation. PR allows multiple hardware blocks to be consolidated into a single partition, which can be reprogrammed at run-time as desired. This may reduce the logic circuit (and silicon area) requirements and greatly extend functionality. Furthermore, dynamic partial reconfiguration (DPR) refers to PR that does not halt the system during reprogramming. This allows for configuration to overlap with normal processing, potentially achieving better system performance than a static(halting) PR implementation. This work has investigated the advantages and trade-offs of DPR as applied to an existing color space conversion(CSC) engine provided by Hewlett-Packard (HP). Two versions were created: a single-pipeline engine, which can only overlap tasks in specific sequences; and a dual-pipeline engine, which can overlap any consecutive tasks. These were implemented in a Virtex-6 FPGA. Data communication occurs over the PCI Express (PCIe) interface. Test results show improvements in execution speed and resource utilization, though some are minor due to intrinsic characteristics of the CSC engine pipeline. The dual-pipeline version outperformed the single-pipeline in most test cases. Therefore, future work will focus on multiple-pipeline architectures
- …