988 research outputs found

GPU optimizations for a production molecular docking code

Author: Landaverde Raphael J.
Publication venue: Boston University
Publication date: 01/01/2014

Thesis (M.Sc.Eng.) -- Boston University. Scientists have always felt the desire to perform computationally intensive tasks that surpass the capabilities of conventional single core computers. As a result of this trend, Graphics Processing Units (GPUs) have come to be increasingly used for general computation in scientific research. This field of GPU acceleration is now a vast and mature discipline. Molecular docking, the modeling of the interactions between two molecules, is a particularly computationally intensive task that has been the subject of research for many years. It is a critical simulation tool used for the screening of protein compounds for drug design and in research of the nature of life itself. The PIPER molecular docking program was previously accelerated using GPUs, achieving a notable speedup over conventional single core implementation. Since its original release the development of the CPU based PIPER has not ceased, and it is now a mature and fast parallel code. The GPU version, however, still contains many potential points for optimization. In the current work, we present a new version of GPU PIPER that attains a 3.3x speedup over a parallel MPI version of PIPER running on an 8 core machine and using the optimized Intel Math Kernel Library. We achieve this speedup by optimizing existing kernels for modern GPU architectures and migrating critical code segments to the GPU. In particular, we both improve the runtime of the filtering and scoring stages by more than an order of magnitude, and move all molecular data permanently to the GPU to improve data locality. This new speedup is obtained while retaining a computational accuracy virtually identical to the CPU based version. We also demonstrate that, due to the algorithmic dependencies of the PIPER algorithm on the 3D Fast Fourier Transform, our GPU PIPER will likely remain proportionally faster than equivalent CPU based implementations, and with little room for further optimizations.
This new GPU accelerated version of PIPER is integrated as part of the ClusPro molecular docking and analysis server at Boston University. ClusPro has over 4000 registered users and more than 50000 jobs run over the past 4 years.

Boston University Institutional Repository (OpenBU)

Comparison between Famous Game Engines and Eminent Games

Author: Mishra Prerna
Shrawankar Urmila
Publication venue: Universidad Internacional de La Rioja
Publication date: 15/07/2021

Nowadays game engines are imperative for building 3D applications and games, because they appreciably reduce the resources needed to implement obligatory but intricate utilities. This paper describes game engines, their foremost elements, and popular games developed with them. It portrays several kinds of contemporary games developed with these engines in terms of their aspects and procedures, and discusses their requirements by comparison.

Evaluation and tuning of the Level 3 CUBLAS for graphics processors

Author: Enrique S. Quintana-Ortí
Francisco D. Igual
Maribel Castillo
Rafael Mayo
Sergio Barrachina
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2008

The increase in performance of the last generations of graphics processors (GPUs) has made this class of platform a coprocessing tool with remarkable success in certain types of operations. In this paper we evaluate the performance of the Level 3 operations in CUBLAS, the implementation of BLAS for NVIDIA® GPUs with unified architecture. From this study, we gain insights on the quality of the kernels in the library and we propose several alternative implementations that are competitive with those in CUBLAS. Experimental results on a GeForce 8800 Ultra compare the performance of CUBLAS and the new variants.

CiteSeerX

Enabling CUDA acceleration within virtual machines using rCUDA

Author: Duato José
Fernández Juan C.
Mayo Rafael
Peña Antonio J.
Quintana Ortí Enrique Salvador
Silla Federico
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 16/02/2012

The hardware and software advances of Graphics Processing Units (GPUs) have favored the development of GPGPU (General-Purpose Computation on GPUs) and its adoption in many scientific, engineering, and industrial areas. Thus, GPUs are increasingly being introduced in high-performance computing systems as well as in datacenters. On the other hand, virtualization technologies are also receiving rising interest in these domains, because of their many benefits in acquisition and maintenance savings. There are currently several works on GPU virtualization. However, there is no standard solution allowing access to GPGPU capabilities from virtual machine environments such as VMware, Xen, VirtualBox, or KVM. This lack of a standard solution is delaying the integration of GPGPU into these domains. In this paper, we propose a first step towards a general and open source approach for using GPGPU features within virtual machines (VMs). In particular, we describe the use of rCUDA, a GPGPU virtualization framework, to permit the execution of GPU-accelerated applications within VMs, thus enabling GPGPU capabilities on any virtualized environment. Our experiments with rCUDA in the context of KVM and VirtualBox on a system equipped with two NVIDIA GeForce 9800 GX2 cards illustrate the overhead introduced by the rCUDA middleware and prove the feasibility and scalability of this general virtualizing solution. Experimental results show that the overhead is proportional to the dataset size, while the scalability is similar to that of the native environment. Peer Reviewed. Postprint (author's final draft).