7 research outputs found

    An Application-Oriented Approach for Accelerating Data-Parallel Computation with Graphics Processing Unit

    This paper presents a novel parallelization and quantitative characterization of various optimization strategies for data-parallel computation on a graphics processing unit (GPU) using NVIDIA's new GPU programming framework, Compute Unified Device Architecture (CUDA). CUDA is an easy-to-use development framework that has drawn the attention of many different application areas looking for dramatic speed-ups in their code. However, the performance tradeoffs in CUDA are not yet fully understood, especially for data-parallel applications. Consequently, we study two fundamental mathematical operations that are common in many data-parallel applications: convolution and accumulation. Specifically, we profile and optimize the performance of these operations on a 128-core NVIDIA GPU. We then characterize the impact of these operations on a video-based motion-tracking algorithm called vector coherence mapping, which consists of a series of convolutions and dynamically weighted accumulations, and present a comparison of different implementations and their respective performance profiles.
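
    The paper's own code is not reproduced here, but the kind of data-parallel convolution kernel it profiles can be sketched in CUDA as follows; the kernel name, mask size and memory layout are illustrative assumptions, not details taken from the paper.

    // Minimal sketch of a data-parallel 2D convolution kernel of the kind the
    // abstract describes; one thread computes one output pixel.
    #include <cuda_runtime.h>

    #define KSIZE 5                               // convolution mask width (assumed)
    __constant__ float d_mask[KSIZE * KSIZE];     // small mask kept in constant memory,
                                                  // filled on the host via cudaMemcpyToSymbol

    __global__ void convolve2D(const float* in, float* out, int width, int height)
    {
        int x = blockIdx.x * blockDim.x + threadIdx.x;
        int y = blockIdx.y * blockDim.y + threadIdx.y;
        if (x >= width || y >= height) return;

        float acc = 0.0f;
        int r = KSIZE / 2;
        for (int dy = -r; dy <= r; ++dy) {
            for (int dx = -r; dx <= r; ++dx) {
                // clamp to the image border so every thread reads valid memory
                int sx = min(max(x + dx, 0), width  - 1);
                int sy = min(max(y + dy, 0), height - 1);
                acc += in[sy * width + sx] * d_mask[(dy + r) * KSIZE + (dx + r)];
            }
        }
        out[y * width + x] = acc;                 // one accumulated result per thread
    }

    A matching launch would cover the image with a 2D grid of, for example, 16x16 thread blocks; the dynamically weighted accumulation mentioned in the abstract reduces to a similar per-pixel loop in which each contribution is scaled by a per-sample weight.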

    Using graphics devices in reverse: GPU-based Image Processing and Computer Vision


    Efficient occupancy grid computation on the GPU with lidar and radar for road boundary detection

    Accurate maps of the static environment are essential for many advanced driver-assistance systems. In this paper a new method for the fast computation of occupancy grid maps with laser range-finders and radar sensors is proposed. The approach utilizes the Graphics Processing Unit to overcome the limitations of classical occupancy grid computation in automotive environments. It is possible to generate highly accurate grid maps in just a few milliseconds without loss of sensor precision. Moreover, for a lower-resolution radar sensor it is shown that super-resolution algorithms can be applied to achieve the accuracy of a higher-resolution laser scanner. Finally, a novel histogram-based approach for road boundary detection with lidar and radar sensors is presented.
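
    A minimal sketch, assuming a standard log-odds grid representation, of how such a GPU update can be organised with one thread per range measurement; the struct layout, parameter names and update constants are illustrative and are not taken from the paper.

    // Illustrative CUDA sketch: one thread per lidar ray updates a log-odds
    // occupancy grid; atomicAdd keeps concurrent updates from different rays consistent.
    #include <cuda_runtime.h>

    struct Ray {
        float ox, oy;   // sensor origin in grid coordinates (cells)
        float hx, hy;   // measured hit point in grid coordinates
    };

    __global__ void updateGrid(float* logOdds, int gw, int gh,
                               const Ray* rays, int nRays,
                               float lFree, float lOcc)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i >= nRays) return;
        Ray r = rays[i];

        // march from the origin towards the hit in half-cell steps,
        // marking traversed cells as (more likely) free
        float dx = r.hx - r.ox, dy = r.hy - r.oy;
        int steps = (int)(sqrtf(dx * dx + dy * dy) * 2.0f);
        for (int s = 0; s < steps; ++s) {
            float t = (float)s / steps;
            int cx = (int)(r.ox + t * dx);
            int cy = (int)(r.oy + t * dy);
            if (cx >= 0 && cx < gw && cy >= 0 && cy < gh)
                atomicAdd(&logOdds[cy * gw + cx], lFree);   // lFree < 0
        }
        // the cell containing the hit becomes more likely occupied
        int cx = (int)r.hx, cy = (int)r.hy;
        if (cx >= 0 && cx < gw && cy >= 0 && cy < gh)
            atomicAdd(&logOdds[cy * gw + cx], lOcc);        // lOcc > 0
    }

    A careful implementation would traverse the ray exactly (for example with a DDA or Bresenham walk) and batch measurements to limit atomic contention; the sketch only shows the overall parallelisation pattern.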

    Locating moving objects in car-driving sequences


    Variational optic flow on the Sony PlayStation 3 – accurate dense flow fields for real-time applications

    While modern variational methods for optic flow computation offer dense flow fields and highly accurate results, their computational complexity has prevented their use in many real-time applications. With cheap modern parallel hardware such as the Sony PlayStation 3, new possibilities arise. For a linear and a nonlinear variant of the popular combined local-global (CLG) method, we present specific algorithms that are tailored towards real-time performance. They are based on bidirectional full multigrid methods, with a full approximation scheme (FAS) in the nonlinear setting. Their parallelisation on the Cell hardware uses a temporal instead of a spatial decomposition and processes operations in a vector-based manner. Memory latencies are reduced by a locality-preserving cache management and optimised access patterns. With images of size 316×252 pixels, we obtain dense flow fields at up to 210 frames per second.
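
    For context, the linear variant of the combined local-global (CLG) model mentioned above minimises an energy functional of the following form, written here in the standard notation of Bruhn, Weickert and Schnörr; the paper's own notation and discretisation may differ:

    E(u,v) \;=\; \int_{\Omega} \Big( \mathbf{w}^{\top} J_{\rho}(\nabla_3 f)\,\mathbf{w}
        \;+\; \alpha \big( |\nabla u|^{2} + |\nabla v|^{2} \big) \Big)\, dx\, dy,
    \qquad \mathbf{w} = (u, v, 1)^{\top},
    \quad J_{\rho}(\nabla_3 f) = K_{\rho} * \big( \nabla_3 f\, \nabla_3 f^{\top} \big)

    Here K_rho is a Gaussian of standard deviation rho and \nabla_3 f = (f_x, f_y, f_t)^T collects the spatio-temporal image derivatives. The nonlinear variant wraps both terms in robust penaliser functions, which is why the abstract's multigrid solver needs the full approximation scheme in that setting.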

    Optimization techniques for computationally expensive rendering algorithms

    Realistic rendering in computer graphics simulates the interactions of light and surfaces. While many accurate models for surface reflection and lighting, covering both solid surfaces and participating media, have been described, most of them rely on intensive computation. Common practices such as adding constraints and assumptions can increase performance, but they may compromise the quality of the resulting images or the variety of phenomena that can be accurately represented. In this thesis, we will focus on rendering methods that require large amounts of computational resources. Our intention is to consider several conceptually different approaches capable of reducing these requirements with only limited implications for the quality of the results. The first part of this work will study the rendering of time-varying participating media. Examples of this type of matter are smoke, optically thick gases and any material that, unlike a vacuum, scatters and absorbs the light that travels through it. We will focus on a subset of algorithms that approximate realistic illumination using images of real-world scenes. Starting from the traditional ray marching algorithm, we will suggest and implement different optimizations that allow the computation to be performed at interactive frame rates. This thesis will also analyze two different aspects of the generation of anti-aliased images. The first is targeted at the rendering of screen-space anti-aliased images and the reduction of the artifacts generated in rasterized lines and edges. We expect to describe an implementation that, working as a post-process, is efficient enough to be added to existing rendering pipelines with reduced performance impact. A third method will take advantage of the limitations of the human visual system (HVS) to reduce the resources required to render temporally anti-aliased images. While film and digital cameras naturally produce motion blur, rendering pipelines need to simulate it explicitly. This process is known to be one of the most significant burdens for every rendering pipeline. Motivated by this, we plan to run a series of psychophysical experiments targeted at identifying groups of motion-blurred images that are perceptually equivalent. A possible outcome is the proposal of criteria that may lead to reductions of the rendering budgets.
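
    As a rough illustration of the ray marching starting point mentioned above, a basic marching loop for an absorbing, emitting medium can be written as a CUDA device function; the normalised density-volume layout, parameter names and single-scattering-style accumulation are assumptions made for this sketch, not the thesis' actual implementation.

    // Sketch of classical ray marching through a participating medium:
    // accumulate in-scattered light while attenuating by Beer-Lambert absorption.
    #include <cuda_runtime.h>

    __device__ float3 rayMarch(const float* density, int3 dim,
                               float3 origin, float3 dir,      // in normalised [0,1) volume coords
                               float stepLen, int nSteps,
                               float sigmaA, float3 lightRadiance)
    {
        float transmittance = 1.0f;
        float3 radiance = make_float3(0.f, 0.f, 0.f);
        float3 p = origin;
        for (int i = 0; i < nSteps; ++i) {
            // nearest-neighbour lookup into the density volume (assumed layout)
            int3 c = make_int3((int)(p.x * dim.x), (int)(p.y * dim.y), (int)(p.z * dim.z));
            if (c.x < 0 || c.y < 0 || c.z < 0 ||
                c.x >= dim.x || c.y >= dim.y || c.z >= dim.z) break;
            float rho = density[(c.z * dim.y + c.y) * dim.x + c.x];

            // Beer-Lambert absorption over this segment
            float absorb = expf(-sigmaA * rho * stepLen);
            radiance.x += transmittance * (1.f - absorb) * lightRadiance.x;
            radiance.y += transmittance * (1.f - absorb) * lightRadiance.y;
            radiance.z += transmittance * (1.f - absorb) * lightRadiance.z;
            transmittance *= absorb;
            if (transmittance < 1e-3f) break;   // early ray termination

            p.x += dir.x * stepLen;
            p.y += dir.y * stepLen;
            p.z += dir.z * stepLen;
        }
        return radiance;
    }

    The optimizations the thesis targets (adaptive step sizes, empty-space skipping, early termination) all act on this inner loop, which is why it dominates the cost of rendering time-varying media.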

    Optical Flow Computation on Compute Unified Device Architecture
