52 research outputs found
A low-power geometric mapping co-processor for high-speed graphics application
In this article we present a novel design of a low-power geometric mapping co-processor that can be used for high-performance graphics system. The processor can carry out any single or a combination of transformations belonging to affine transformation family ranging from 1-D to 3-D. It allows interactive operations which can be defined either by a user (allowing it to be a stand-alone geometric transformation processor) or by a host processor (allowing it to be a co-processor to accelerate certain graphics operations). It occupies a silicon area of 6 mm2 and consumes 40 mW power when synthesized with 0.25?m technology
Playing Smart - Artificial Intelligence in Computer Games
Abstract: With this document we will present an overview of artificial intelligence in general and artificial intelligence in the context of its use in modern computer games in particular. To this end we will firstly provide an introduction to the terminology of artificial intelligence, followed by a brief history of this field of computer science and finally we will discuss the impact which this science has had on the development of computer games. This will be further illustrated by a number of case studies, looking at how artificially intelligent behaviour has been achieved in selected games
OpenCL Actors - Adding Data Parallelism to Actor-based Programming with CAF
The actor model of computation has been designed for a seamless support of
concurrency and distribution. However, it remains unspecific about data
parallel program flows, while available processing power of modern many core
hardware such as graphics processing units (GPUs) or coprocessors increases the
relevance of data parallelism for general-purpose computation.
In this work, we introduce OpenCL-enabled actors to the C++ Actor Framework
(CAF). This offers a high level interface for accessing any OpenCL device
without leaving the actor paradigm. The new type of actor is integrated into
the runtime environment of CAF and gives rise to transparent message passing in
distributed systems on heterogeneous hardware. Following the actor logic in
CAF, OpenCL kernels can be composed while encapsulated in C++ actors, hence
operate in a multi-stage fashion on data resident at the GPU. Developers are
thus enabled to build complex data parallel programs from primitives without
leaving the actor paradigm, nor sacrificing performance. Our evaluations on
commodity GPUs, an Nvidia TESLA, and an Intel PHI reveal the expected linear
scaling behavior when offloading larger workloads. For sub-second duties, the
efficiency of offloading was found to largely differ between devices. Moreover,
our findings indicate a negligible overhead over programming with the native
OpenCL API.Comment: 28 page
Recommended from our members
Object Space EWA Surface Splatting: A Hardware Accelerated Approach to High Quality Point Rendering
Elliptical weighted average (EWA) surface splatting is a technique for high quality rendering of point-sampled 3D objects. EWA surface splatting renders water-tight surfaces of complex point models with high quality, anisotropic texture filtering. In this paper we introduce a new multi-pass approach to perform EWA surface splatting on modern PC graphics hardware, called object space EWA splatting. We derive an object space formulation of the EWA filter, which is amenable for acceleration by conventional triangle-based graphics hardware. We describe how to implement the object space EWA filter using a two pass rendering algorithm. In the first rendering pass, visibility splatting is performed by shifting opaque surfel polygons backward along the viewing rays, while in the second rendering pass view-dependent EWA prefiltering is performed by deforming texture mapped surfel polygons. We use texture mapping and alpha blending to facilitate the splatting process. We implement our algorithm using programmable vertex and pixel shaders, fully exploiting the capabilities of todayâs graphics processing units (GPUs). Our implementation renders up to 3 million points per second on recent PC graphics hardware, an order of magnitude more than a pure software implementation of screen space EWA surface splatting.Engineering and Applied Science
OpenCLâbased implementation of an unstructured edgeâbased finite element convectionâdiffusion solver on graphics hardware
The solution of problems in computational fluid dynamics (CFD) represents a classical field for the application of advanced numerical methods. Many different approaches were developed over the years to address CFD applications. Good examples are finite volumes, finite differences (FD), and finite elements (FE) but also newer approaches such as the latticeâBoltzmann (LB), smooth particle hydrodynamics or the particle finite element method. FD and LB methods on regular grids are known to be superior in terms of raw computing speed, but using such regular discretization represents an important limitation in dealing with complex geometries. Here, we concentrate on unstructured approaches which are less common in the GPU world. We employ a nonstandard FE approach which leverages an optimized edgeâbased data structure allowing a highly parallel implementation. Such technique is applied to the ‘convectionâdiffusion’ problem, which is often considered as a first step towards CFD because of similarities to the nonconservative form of the Navier–Stokes equations. In this regard, an existing highly optimized parallel OpenMP solver is ported to graphics hardware based on the OpenCL platform. The optimizations performed are discussed in detail. A number of benchmarks prove that the GPUâaccelerated OpenCL code consistently outperforms the OpenMP version
Perspective accurate splatting
We present a novel algorithm for accurate, high quality point rendering, which is based on the formulation of splatting using homogeneous coordinates. In contrast to previous methods, this leads to perspective correct splat shapes, avoiding artifacts such as holes caused by the affine approximation of the perspective projection. Further, our algorithm implements the EWA resampling filter, hence providing high image quality with anisotropic texture filtering. We also present an extension of our rendering primitive that facilitates the display of sharp edges and corners. Finally, we describe an efficient implementation of the entire point rendering pipeline using vertex and fragment programs of current GPUs
Heterogeneous CPU/GPU Memory Hierarchy Analysis and Optimization
In this master thesis, we propose a scheduling reordering for heterogeneous processors based on a hysteresis detector to give some fairness and speedup to the memory request threads taking advantage of the bank level parallelism at the memory system organization
- âŠ