1,791 research outputs found

    FPGA implementation of a simple 3D graphics pipeline

    Conventional methods for computing 3D projections are nowadays usually implemented on standard or graphics processors. The performance of these devices is limited mainly by their architecture, which to some extent works in a sequential manner. In this article we describe a project which uses parallel computation for simple projection of a wireframe 3D model. The algorithm is optimized for an FPGA-based implementation. The design of the numerical logic is described in VHDL, with several basic IP cores used mainly for computing trigonometric functions. The implemented algorithms allow smooth rotation of the model about two axes (azimuth and elevation) and a change of the viewing angle. Tests carried out on a Xilinx Spartan-6 FPGA development board resulted in real-time rendering at over 5000 fps. In the conclusion of the article, we discuss additional possibilities for increasing the computational output in graphics applications via the use of HPC (High Performance Computing).
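
As a rough software illustration of the kind of computation this abstract describes (rotation by azimuth and elevation followed by projection of a wireframe model), here is a minimal Python sketch. It is not the authors' VHDL design: the axis conventions, the perspective model, and the names `project_vertex`, `project_wireframe`, `distance`, and `focal` are all assumptions made for the example.

```python
import itertools
import math

# Unit cube used as a stand-in wireframe model (hypothetical test data).
CUBE_VERTICES = [tuple(float(c) for c in v)
                 for v in itertools.product((-1, 1), repeat=3)]
# An edge joins two vertices that differ in exactly one coordinate.
CUBE_EDGES = [(i, j)
              for i, j in itertools.combinations(range(len(CUBE_VERTICES)), 2)
              if sum(a != b for a, b in zip(CUBE_VERTICES[i], CUBE_VERTICES[j])) == 1]


def project_vertex(v, azimuth, elevation, distance=4.0, focal=1.0):
    """Rotate a point by azimuth (about y), then elevation (about x),
    and apply a simple perspective projection onto the screen plane."""
    x, y, z = v
    ca, sa = math.cos(azimuth), math.sin(azimuth)
    ce, se = math.cos(elevation), math.sin(elevation)
    # Azimuth: rotation about the vertical (y) axis.
    x1 = ca * x + sa * z
    z1 = -sa * x + ca * z
    # Elevation: rotation about the horizontal (x) axis.
    y2 = ce * y - se * z1
    z2 = se * y + ce * z1
    # Perspective divide; the camera is assumed to sit `distance` away along z.
    w = z2 + distance
    return (focal * x1 / w, focal * y2 / w)


def project_wireframe(vertices, edges, azimuth, elevation):
    """Return one 2D line segment per edge of the model."""
    pts = [project_vertex(v, azimuth, elevation) for v in vertices]
    return [(pts[i], pts[j]) for i, j in edges]


if __name__ == "__main__":
    for segment in project_wireframe(CUBE_VERTICES, CUBE_EDGES,
                                     math.radians(30), math.radians(20)):
        print(segment)
```

Each vertex is transformed independently of the others, which is the property that makes this workload a natural fit for parallel hardware.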

    A full field, 3-D velocimeter for microgravity crystallization experiments

    The programming and algorithms needed to implement a full-field, 3-D velocimeter for laminar flow systems, and the hardware needed to fully implement this ultimate system, are discussed. It appears that imaging using a synched pair of video cameras and digitizer boards, with synched rails for camera motion, will provide a viable solution to the laminar tracking problem. The algorithms given here are simple, which should speed processing. On a heavily loaded VAXstation 3100 the particle identification can take 15 to 30 seconds, with the tracking taking less than one second. It seems reasonable to assume that four image pairs can thus be acquired and analyzed in under one minute.

    TEAPOT: a toolset for evaluating performance, power and image quality on mobile graphics systems

    In this paper we present TEAPOT, a full system GPU simulator, whose goal is to allow the evaluation of the GPUs that reside in mobile phones and tablets. To this end, it has a cycle-accurate GPU model for evaluating performance, power models for the GPU, the memory subsystem and OLED screens, and image quality metrics. Unlike prior GPU simulators, TEAPOT supports the OpenGL ES 1.1/2.0 API, so that it can simulate all commercial graphical applications available for Android systems. To illustrate potential uses of this simulation infrastructure, we perform two case studies. We first evaluate the impact of the OS when simulating graphical applications. We show that the overall GPU power/performance is greatly affected by common OS tasks, such as image composition, and argue that application-level simulation is not sufficient to understand the overall GPU behavior. We then use the capabilities of TEAPOT to perform studies that trade image quality for energy. We demonstrate that by allowing for small distortions in the overall image quality, a significant amount of energy can be saved.

    Real Time 3-D Graphics Processing Hardware Design using Field-Programmable Gate Arrays.

    Three-dimensional graphics processing requires many complex algebraic and matrix-based operations to be performed in real time. In the early stages of graphics processing, such tasks were delegated to a Central Processing Unit (CPU). Over time, as more complex graphics rendering was demanded, CPU solutions became inadequate. To meet this demand, custom hardware solutions that take advantage of pipelining and massive parallelism became preferable to CPU software-based solutions. This fact has led to the many custom hardware solutions that are available today. Since real-time graphics processing requires extremely high performance, hardware solutions using Application Specific Integrated Circuits (ASICs) are the standard within the industry. While ASICs are a more than adequate solution for implementing high-performance custom hardware, the design, implementation and testing of ASIC-based designs are becoming cost prohibitive due to the massive up-front verification effort needed as well as the cost of fixing design defects. Field Programmable Gate Arrays (FPGAs) provide an alternative to the ASIC design flow. More importantly, in recent years FPGA technology has begun to improve in performance to the point where ASIC and FPGA performance has become comparable. In addition, FPGAs address many of the issues of the ASIC design flow. The ability to reconfigure FPGAs reduces the up-front verification effort and allows design defects to be fixed easily. This thesis demonstrates that a 3-D graphics processor implementation on an FPGA is feasible by implementing both a two-dimensional and a three-dimensional graphics processor prototype. Using a Xilinx Virtex 5 ML506 FPGA development kit, a fully functional wireframe graphics rendering engine is implemented using VHDL and Xilinx's development tools. A VHDL testbench was designed to verify that the graphics engine works functionally. This is followed by synthesizing the design for real hardware and developing test applications to verify the functionality and performance of the design. This thesis provides the groundwork for pushing forward the use of FPGA technology in graphics processing applications.

    GPGPU Accelerated Deep Object Classification on a Heterogeneous Mobile Platform

    Deep convolutional neural networks achieve state-of-the-art performance in image classification. The computational and memory requirements of such networks are, however, huge, which is an issue on embedded devices given their constraints. Most of this complexity derives from the convolutional layers, and in particular from the matrix multiplications they entail. This paper proposes a complete approach to image classification that provides the common layers used in neural networks. Namely, the proposed approach relies on a heterogeneous CPU-GPU scheme for performing convolutions in the transform domain. The Compute Unified Device Architecture (CUDA)-based implementation of the proposed approach is evaluated over three different image classification networks on a Tegra K1 CPU-GPU mobile processor. Experiments show that the presented heterogeneous scheme boasts a 50x speedup over the CPU-only reference and outperforms a GPU-based reference by 2x, while cutting the power consumption by nearly 30%.
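
To make the phrase "convolutions in the transform domain" concrete, the sketch below compares a direct 1-D convolution with one computed as a pointwise product of discrete Fourier transforms. The abstract does not say which transform the authors use or how the work is split between CPU and GPU, so the choice of the DFT, the 1-D setting, and the function names `dft`, `conv_direct`, and `conv_transform` are assumptions for illustration only.

```python
import cmath


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (or its inverse)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                for t in range(n))
        out.append(s / n if inverse else s)
    return out


def conv_direct(a, b):
    """Reference linear convolution computed in the signal domain."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def conv_transform(a, b):
    """Linear convolution via zero-padding, DFT, pointwise product, inverse DFT."""
    n = len(a) + len(b) - 1
    fa = dft(list(a) + [0.0] * (n - len(a)))
    fb = dft(list(b) + [0.0] * (n - len(b)))
    prod = [p * q for p, q in zip(fa, fb)]
    return [v.real for v in dft(prod, inverse=True)]


if __name__ == "__main__":
    signal = [1.0, 2.0, 3.0]
    kernel = [0.0, 1.0, 0.5]
    print(conv_direct(signal, kernel))
    print(conv_transform(signal, kernel))
```

The naive transform here is only for readability; a practical implementation would use a fast transform, which is where the savings over direct convolution come from.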

    Real-Time Visual Motion Detection by Spatiotemporal Energy Model Implemented on GPU

    The aim of this study is to develop a real-time visual motion detection system using a physiologically meaningful image processing algorithm. The spatiotemporal energy model has been recognized as the most plausible algorithm corresponding to the motion detection performed by simple and complex cells in area V1 of cats and macaque monkeys. Because of the parallelism of the brain, this algorithm is inherently highly parallel. Together with its locality, this means that spatiotemporal Gabor filtering and the subsequent energy extraction fit the architecture of present-day GPUs (Graphics Processing Units). Real-time motion detection at each pixel location over the entire input image is fundamental in many applications, for instance in robot vision and car-mounted cameras. Moreover, this system is open to further expansion based on physiological knowledge of the mammalian visual system.
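
The two steps named in this abstract, spatiotemporal Gabor filtering and energy extraction, can be sketched in a few lines of Python. This is a minimal illustration for a single location with one spatial dimension plus time, not the authors' GPU implementation; the filter parameters, the test grating, and the names `gabor_pair`, `motion_energy`, and `grating` are assumptions made for the example.

```python
import math


def gabor_pair(x, t, fx, ft, sigma_x=2.0, sigma_t=2.0):
    """Even and odd (quadrature) space-time Gabor weights at offset (x, t)."""
    env = math.exp(-(x * x) / (2 * sigma_x ** 2) - (t * t) / (2 * sigma_t ** 2))
    phase = 2 * math.pi * (fx * x + ft * t)
    return env * math.cos(phase), env * math.sin(phase)


def motion_energy(stimulus, fx, ft, radius=6):
    """Filter the stimulus with a quadrature pair and sum the squared responses."""
    even = odd = 0.0
    for x in range(-radius, radius + 1):
        for t in range(-radius, radius + 1):
            g_even, g_odd = gabor_pair(x, t, fx, ft)
            s = stimulus(x, t)
            even += g_even * s
            odd += g_odd * s
    return even * even + odd * odd


def grating(x, t):
    """Hypothetical test stimulus: a sinusoidal grating drifting toward +x."""
    return math.cos(2 * math.pi * (0.125 * x - 0.125 * t))


if __name__ == "__main__":
    rightward = motion_energy(grating, 0.125, -0.125)  # filter tuned to +x motion
    leftward = motion_energy(grating, 0.125, 0.125)    # filter tuned to -x motion
    print(rightward, leftward)
```

Because the response at each location depends only on a small space-time neighbourhood, the same computation can be carried out for every pixel independently.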

    A Multiprocessor three-dimensional graphics systems.

    by Hui Chau Man. Thesis (M.Phil.)--Chinese University of Hong Kong, 1991. Includes bibliographical references.

    ABSTRACT
    ACKNOWLEDGEMENTS
    TABLE OF CONTENTS
    CHAPTER 1 --- INTRODUCTION
        1.1 --- Computer Graphics Today
            1.1.1 --- 3D Graphics Synthesis Techniques
            1.1.2 --- Hardware-assisted Computer Graphics
        1.2 --- About The Thesis
    CHAPTER 2 --- GRAPHICS SYSTEM ARCHITECTURES
        2.1 --- Basic Structure of a Graphics Subsystem
        2.2 --- VLSI Graphics Chips
            2.2.1 --- The CRT Controllers
            2.2.2 --- The VLSI Graphics Processors
            2.2.3 --- Design Philosophies for VLSI Graphics Processors
        2.3 --- Graphics Boards
            2.3.1 --- The ARTIST 10 Graphics Controller
            2.3.2 --- The MATROX PG-1281 Graphics Controller
        2.4 --- High-end Graphics System Architectures
            2.4.1 --- Graphics Accelerator with Multiple Functional Units
            2.4.2 --- Parallel Processing Graphics Systems
            2.4.3 --- The Parallel Processor Architecture
            2.4.4 --- The Pipelined Architecture
        2.5 --- Comparisons and Discussions
            2.5.1 --- Parallel Processors versus Pipelined Processing
            2.5.2 --- Parallel Processors versus Multiple Functional Units
        2.6 --- Summary of High-end Graphics Systems
    CHAPTER 3 --- AN ISA 3D GRAPHICS DISPLAY SERVER
        3.1 --- Common ISA Graphics Cards
            3.1.1 --- Standard Video Display Cards
            3.1.2 --- Graphics Processing Boards
        3.2 --- A Depth Processor for the ISA computers
            3.2.1 --- The Z-buffer Algorithm for HLHSR
            3.2.2 --- Our Hardware Solution for HLHSR
            3.2.3 --- Design of the Depth Processor
            3.2.4 --- Structure of the Depth Processor
            3.2.5 --- The Depth Processor Operations
            3.2.6 --- Software Support
            3.2.7 --- Performance of the Depth Processor
        3.3 --- A VGA Accelerator for the ISA Computers
            3.3.1 --- Display Buffer Structure of the SuperVGA
            3.3.2 --- Design of the VGA Accelerator
            3.3.3 --- Structure of the VGA Accelerator
            3.3.4 --- Combining the VGA Accelerator and the Depth Processor
            3.3.5 --- Actual Performance of the DP-VA Board
            3.3.6 --- 3D Graphics Applications Using the DP-VA Board
        3.4 --- A 3D Graphics Display Server
        3.5 --- Host Connection for the 3D Graphics Display Server
            3.5.1 --- The Single Board Computers
            3.5.2 --- The VME-to-ISA bus convertor
            3.5.3 --- Structure of the VME-to-ISA Bus Convertor
            3.5.4 --- Communications through the bus convertor
        3.6 --- Physical Construction of the DP-VA Board and the Bus Convertor
        3.7 --- Summary
    CHAPTER 4 --- A MULTI-i860 3D GRAPHICS SYSTEM
        4.1 --- The i860 Processor
        4.2 --- Design of a Multiprocessor 3D Graphics System
            4.2.1 --- A Reconfigurable Processor-Pipeline System
            4.2.2 --- The Depth-Processing Unit
            4.2.3 --- A Multiprocessor Graphics System
        4.3 --- Structure of the Multi-i860 3D
            4.3.1 --- The 64-bit-wide Global Data Buses
            4.3.2 --- The 1280x1024 True-colour Display Unit
            4.3.3 --- The Depth Processing Unit
            4.3.4 --- The i860 Processing Units
            4.3.5 --- The System Control Unit
            4.3.6 --- Performance Prediction
        4.4 --- Summary
    CHAPTER 5 --- CONCLUSIONS
        5.1 --- The 3D Graphics Synthesis Pipeline
        5.2 --- 3D Graphics Hardware
        5.3 --- Design Approach for the ISA 3D Graphics Display Server
        5.4 --- Flexibility in the Multi-i860 3D Graphics System
        5.5 --- Future Work
    APPENDIX A --- DISPLAYING REALISTIC 3D SCENES
        A.1 --- Modelling 3D Objects in Boundary Representation
        A.2 --- Transformations of 3D scenes
            A.2.1 --- Composite Modelling Transformation
            A.2.2 --- Viewing Transformations
            A.2.3 --- Projection
            A.2.4 --- Window to Viewport Mapping
        A.3 --- Implementation of the Viewing Pipeline
            A.3.1 --- Defining the View Volume
            A.3.2 --- Normalization of The View Volume
            A.3.3 --- The Overall Transformation Pipeline
        A.4 --- Rendering Realistic 3D Scenes
            A.4.1 --- Scan-conversion of Lines and Polygons
            A.4.2 --- Hidden Surface Removal
            A.4.3 --- Shading
            A.4.4 --- The Complete 3D Graphics Pipeline
    APPENDIX B --- DEPTH PROCESSOR DESIGN DETAILS
        B.1 --- PAL Definitions
        B.2 --- Circuit Diagrams
        B.3 --- Depth Processor User's Guide
    APPENDIX C --- VGA ACCELERATOR DESIGN DETAILS
        C.1 --- PAL Definitions
        C.2 --- Circuit Diagram
        C.3 --- The DP-VA User's Guide
    APPENDIX D --- VME-TO-ISA BUS CONVERTOR DESIGN DETAILS
        D.1 --- PAL Definitions
        D.2 --- Circuit Diagrams
    APPENDIX E --- 3D GRAPHICS LIBRARY ROUTINES FOR THE DP-VA BOARD
        E.1 --- 3D Drawing Routines
        E.2 --- 3D Transformation Routines
        E.3 --- Shading Routines
    APPENDIX F --- PIPELINE CONFIGURATIONS FOR N PROCESSORS
    REFERENCE