Search CORE

3 research outputs found

Efficient Parallel Kernel Solvers for Computational Fluid Dynamics Applications

Author: Sun Xian-He
Publication venue
Publication date
Field of study

Distributed-memory parallel computers dominate today's parallel computing arena. These machines, such as Intel Paragon, IBM SP2, and Cray Origin2OO, have successfully delivered high performance computing power for solving some of the so-called "grand-challenge" problems. Despite initial success, parallel machines have not been widely accepted in production engineering environments due to the complexity of parallel programming. On a parallel computing system, a task has to be partitioned and distributed appropriately among processors to reduce communication cost and to attain load balance. More importantly, even with careful partitioning and mapping, the performance of an algorithm may still be unsatisfactory, since conventional sequential algorithms may be serial in nature and may not be implemented efficiently on parallel machines. In many cases, new algorithms have to be introduced to increase parallel performance. In order to achieve optimal performance, in addition to partitioning and mapping, a careful performance study should be conducted for a given application to find a good algorithm-machine combination. This process, however, is usually painful and elusive. The goal of this project is to design and develop efficient parallel algorithms for highly accurate Computational Fluid Dynamics (CFD) simulations and other engineering applications. The work plan is 1) developing highly accurate parallel numerical algorithms, 2) conduct preliminary testing to verify the effectiveness and potential of these algorithms, 3) incorporate newly developed algorithms into actual simulation packages. The work plan has well achieved. Two highly accurate, efficient Poisson solvers have been developed and tested based on two different approaches: (1) Adopting a mathematical geometry which has a better capacity to describe the fluid, (2) Using compact scheme to gain high order accuracy in numerical discretization. The previously developed Parallel Diagonal Dominant (PDD) algorithm and Reduced Parallel Diagonal Dominant (RPDD) algorithm have been carefully studied on different parallel platforms for different applications, and a NASA simulation code developed by Man M. Rai and his colleagues has been parallelized and implemented based on data dependency analysis. These achievements are addressed in detail in the paper

NASA Technical Reports Server

Procesamiento paralelo : Balance de carga dinámico en algoritmo de sorting

Author: Naiouf Marcelo
Publication venue: 'Universidad Nacional de La Plata'
Publication date: 01/01/2004
Field of study

Algunas técnicas de sorting intentan balancear la carga mediante un muestreo inicial de los datos a ordenar y una distribución de los mismos de acuerdo a pivots. Otras redistribuyen listas parcialmente ordenadas de modo que cada procesador almacene un número aproximadamente igual de claves, y todos tomen parte del proceso de merge durante la ejecución. Esta Tesis presenta un nuevo método que balancea dinámicamente la carga basado en un enfoque diferente, buscando realizar una distribución del trabajo utilizando un estimador que permita predecir la carga de trabajo pendiente. El método propuesto es una variante de Sorting by Merging Paralelo, esto es, una técnica basada en comparación. Las ordenaciones en los bloques se realizan mediante el método de Burbuja o Bubble Sort con centinela. En este caso, el trabajo a realizar -en términos de comparaciones e intercambios- se encuentra afectada por el grado de desorden de los datos. Se estudió la evolución de la cantidad de trabajo en cada iteración del algoritmo para diferentes tipos de secuencias de entrada, n datos con valores de a n sin repetición, datos al azar con distribución normal, observándose que el trabajo disminuye en cada iteración. Esto se utilizó para obtener una estimación del trabajo restante esperado a partir de una iteración determinada, y basarse en el mismo para corregir la distribución de la carga. Con esta idea, el métoEs revisado por: http://sedici.unlp.edu.ar/handle/10915/9500Facultad de Ciencias Exacta

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Servicio de Difusión de la Creación Intelectual

The relation of scalability and execution time

Author: Xian-he Sun
Publication venue
Publication date: 01/09/1995
Field of study

Scalability has been used extensively as a de facto performance criterion for evaluating parallel algorithms and architectures. However, for many, scalability has theoretical interests only since it does not reveal execution time. In this paper, the relation between scalability and execution time is carefully studied. Results show that the isospeed scalability well characterizes the variation of execution time: smaller scalability leads to larger execution time, the same scalability leads to the same execution time, etc. Three algorithms from scienti c computing are implemented on an Intel Paragon and an IBM SP2 parallel computer. Experimental and theoretical results show that scalability is an important, distinct metric for parallel and distributed systems, and may be as important as execution time in a scalable parallel and distributed environment

CiteSeerX

NASA Technical Reports Server