14,595 research outputs found

Parallel Computing on a PC Cluster

Authors: Chang D., Gregory E. B., Lin Y., Luo X. Q., Wang Y. L., Yang J. C.
Publication venue: 'AIP Publishing'
Publication date: 01/01/2001

The tremendous advance in computer technology in the past decade has made it possible to achieve the performance of a supercomputer on a very small budget. We have built a multi-CPU cluster of Pentium PCs capable of parallel computation using the Message Passing Interface (MPI). We discuss the configuration, performance, and application of the cluster to our work in physics. Comment: 3 pages, uses LaTeX and aipproc.cl

arXiv.org e-Print Archive

CiteSeerX

Crossref

Achieving High Speed CFD simulations: Optimization, Parallelization, and FPGA Acceleration for the unstructured DLR TAU Code

Authors: Andres-Perez Esther, Caloto Aitor, Widhalm Markus
Publication date: 01/01/2009

Today, large-scale parallel simulations are fundamental tools for handling complex problems. The number of processors in current computation platforms has recently increased, so it is necessary to optimize application performance and to enhance the scalability of massively parallel systems. In addition, new heterogeneous architectures, which combine conventional processors with specific hardware such as FPGAs to accelerate the most time-consuming functions, are considered a strong alternative for boosting performance. In this paper, the performance of the DLR TAU code is analyzed and optimized. The improvement of the code efficiency is addressed through three key activities: optimization, parallelization, and hardware acceleration. First, a profiling analysis of the most time-consuming processes of the Reynolds-averaged Navier-Stokes flow solver on a three-dimensional unstructured mesh is performed. Then, the scalability of the code is studied with new partitioning algorithms to identify the most suitable ones for the selected applications. Finally, a feasibility study on the application of FPGAs and GPUs for the hardware acceleration of CFD simulations is presented.

Institute of Transport Research:Publications

Graphic-Card Cluster for Astrophysics (GraCCA) -- Performance Tests

Authors: Chien Chia-Hung, Chiueh Tzihong, Schive Hsi-Yu, Tsai Yu-Chih, Wong Shing-Kwong
Publication venue: 'Elsevier BV'
Publication date: 20/01/2008

In this paper, we describe the architecture and performance of the GraCCA system, a Graphic-Card Cluster for Astrophysics simulations. It consists of 16 nodes, each equipped with 2 modern graphics cards, the NVIDIA GeForce 8800 GTX. This computing cluster provides a theoretical performance of 16.2 TFLOPS. To demonstrate its performance in astrophysics computation, we have implemented a parallel direct N-body simulation program with a shared time-step algorithm on this system. Our system achieves a measured performance of 7.1 TFLOPS and a parallel efficiency of 90% when simulating a globular cluster of 1024K particles. Compared with the GRAPE-6A cluster at RIT (Rochester Institute of Technology), the GraCCA system achieves more than twice the measured speed and an even higher performance-per-dollar ratio. Moreover, our system can handle up to 320M particles and can serve as a general-purpose computing cluster for a wide range of astrophysics problems. Comment: Accepted for publication in New Astronom

arXiv.org e-Print Archive

Crossref
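
The figures quoted in the GraCCA abstract above can be related to one another with some simple arithmetic. The sketch below is only an illustration: it uses the four numbers stated in the abstract (16 nodes, 2 cards per node, 16.2 TFLOPS theoretical, 7.1 TFLOPS measured), and the function and constant names are hypothetical, not taken from the paper. The fraction of peak it computes is a different quantity from the 90% parallel efficiency, which the abstract does not define.

```python
# Figures as quoted in the GraCCA abstract; all names here are illustrative.
NODES = 16
GPUS_PER_NODE = 2
THEORETICAL_TFLOPS = 16.2
MEASURED_TFLOPS = 7.1


def gracca_summary():
    """Return (GPU count, theoretical GFLOPS per GPU, measured fraction of peak)."""
    gpus = NODES * GPUS_PER_NODE
    per_gpu_gflops = THEORETICAL_TFLOPS * 1000 / gpus
    fraction_of_peak = MEASURED_TFLOPS / THEORETICAL_TFLOPS
    return gpus, per_gpu_gflops, fraction_of_peak


if __name__ == "__main__":
    gpus, per_gpu, frac = gracca_summary()
    print(f"{gpus} GPUs, {per_gpu:.2f} GFLOPS theoretical per GPU, "
          f"{frac:.1%} of theoretical peak sustained")
```

Under these assumptions the cluster has 32 cards at roughly 506 GFLOPS each, and the N-body run sustains a little under 44% of the theoretical peak.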