2,940 research outputs found
Parallel Alpha-Beta Algorithm on the GPU
In the paper we present the parallel implementation of the alpha-beta algorithm running on the graphics processing unit (GPU). We compare the speed of the parallel player with the standard serial one using the game of reversi with boards of different sizes. We show that for small boards the level of available parallelism is insufficient for efficient GPU utilization, but for larger boards substantial speed-ups can be achieved on the GPU. The results indicate that the GPU-based alpha-beta implementation would be advantageous for similar games of higher computational complexity (e.g. hex and go) in their standard form
The GPU vs Phi Debate: Risk Analytics Using Many-Core Computing
The risk of reinsurance portfolios covering globally occurring natural
catastrophes, such as earthquakes and hurricanes, is quantified by employing
simulations. These simulations are computationally intensive and require large
amounts of data to be processed. The use of many-core hardware accelerators,
such as the Intel Xeon Phi and the NVIDIA Graphics Processing Unit (GPU), are
desirable for achieving high-performance risk analytics. In this paper, we set
out to investigate how accelerators can be employed in risk analytics, focusing
on developing parallel algorithms for Aggregate Risk Analysis, a simulation
which computes the Probable Maximum Loss of a portfolio taking both primary and
secondary uncertainties into account. The key result is that both hardware
accelerators are useful in different contexts; without taking data transfer
times into account the Phi had lowest execution times when used independently
and the GPU along with a host in a hybrid platform yielded best performance.Comment: A modified version of this article is accepted to the Computers and
Electrical Engineering Journal under the title - "The Hardware Accelerator
Debate: A Financial Risk Case Study Using Many-Core Computing"; Blesson
Varghese, "The Hardware Accelerator Debate: A Financial Risk Case Study Using
Many-Core Computing," Computers and Electrical Engineering, 201
Parallel computation of optimized arrays for 2-D electrical imaging surveys
Modern automatic multi-electrode survey instruments have made it possible to use non-traditional arrays to maximize the subsurface resolution from electrical imaging surveys. Previous studies have shown that one of the best methods for generating optimized arrays is to select the set of array configurations that maximizes the model resolution for a homogeneous earth model. The Sherman–Morrison Rank-1 update is used to calculate the change in the model resolution when a new array is added to a selected set of array configurations. This method had the disadvantage that it required several hours of computer time even for short 2-D survey lines. The algorithm was modified to calculate the change in the model resolution rather than the entire resolution matrix. This reduces the computer time and memory required as well as the computational round-off errors. The matrix–vector multiplications for a single add-on array were replaced with matrix–matrix multiplications for 28 add-on arrays to further reduce the computer time. The temporary variables were stored in the double-precision Single Instruction Multiple Data (SIMD) registers within the CPU to minimize computer memory access. A further reduction in the computer time is achieved by using the computer graphics card Graphics Processor Unit (GPU) as a highly parallel mathematical coprocessor. This makes it possible to carry out the calculations for 512 add-on arrays in parallel using the GPU. The changes reduce the computer time by more than two orders of magnitude. The algorithm used to generate an optimized data set adds a specified number of new array configurations after each iteration to the existing set. The resolution of the optimized data set can be increased by adding a smaller number of new array configurations after each iteration. Although this increases the computer time required to generate an optimized data set with the same number of data points, the new fast numerical routines has made this practical on commonly available microcomputers
- …