3 research outputs found

    Efficient Parallel Kernel Solvers for Computational Fluid Dynamics Applications

    Distributed-memory parallel computers dominate today's parallel computing arena. These machines, such as the Intel Paragon, IBM SP2, and Cray Origin2000, have successfully delivered high-performance computing power for solving some of the so-called "grand-challenge" problems. Despite this initial success, parallel machines have not been widely accepted in production engineering environments because of the complexity of parallel programming. On a parallel computing system, a task has to be partitioned and distributed appropriately among processors to reduce communication cost and to attain load balance. More importantly, even with careful partitioning and mapping, the performance of an algorithm may still be unsatisfactory, since conventional sequential algorithms may be serial in nature and may not be implemented efficiently on parallel machines. In many cases, new algorithms have to be introduced to increase parallel performance. To achieve optimal performance, in addition to partitioning and mapping, a careful performance study should be conducted for a given application to find a good algorithm-machine combination. This process, however, is usually painful and elusive. The goal of this project is to design and develop efficient parallel algorithms for highly accurate Computational Fluid Dynamics (CFD) simulations and other engineering applications. The work plan is to 1) develop highly accurate parallel numerical algorithms, 2) conduct preliminary testing to verify the effectiveness and potential of these algorithms, and 3) incorporate the newly developed algorithms into actual simulation packages. The work plan has been achieved. Two highly accurate, efficient Poisson solvers have been developed and tested based on two different approaches: (1) adopting a mathematical geometry that is better able to describe the fluid, and (2) using a compact scheme to gain high-order accuracy in the numerical discretization. The previously developed Parallel Diagonal Dominant (PDD) algorithm and Reduced Parallel Diagonal Dominant (RPDD) algorithm have been carefully studied on different parallel platforms for different applications, and a NASA simulation code developed by Man M. Rai and his colleagues has been parallelized and implemented based on data dependency analysis. These achievements are addressed in detail in the paper.

    Parallel processing: Dynamic load balancing in a sorting algorithm

    Some sorting techniques try to balance the load through an initial sampling of the data to be sorted and a distribution of that data according to pivots. Others redistribute partially sorted lists so that each processor stores an approximately equal number of keys and all processors take part in the merge process during execution. This thesis presents a new method that balances the load dynamically using a different approach: it distributes the work using an estimator that predicts the pending workload. The proposed method is a variant of Parallel Sorting by Merging, that is, a comparison-based technique. The blocks are sorted with Bubble Sort with a sentinel. In this case, the work to be done, in terms of comparisons and swaps, is affected by the degree of disorder in the data. The evolution of the amount of work in each iteration of the algorithm was studied for different types of input sequences (n data items with non-repeating values up to n, random data with a normal distribution), and the work was observed to decrease in each iteration. This observation was used to obtain an estimate of the expected remaining work from a given iteration onward, and that estimate was used to correct the load distribution. With this idea, the method…
    Reviewed by: http://sedici.unlp.edu.ar/handle/10915/9500 Facultad de Ciencias Exacta
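The per-iteration work described in this abstract can be observed with a short sketch. It is an illustration only, not the thesis's implementation: the function name `bubble_sort_work` is hypothetical, and the "sentinel" is assumed here to be the position of the last swap, which bounds the next pass. The sketch records comparisons and swaps for each pass so that the decreasing trend can be inspected.

```python
def bubble_sort_work(data):
    """Sort a copy of `data` with bubble sort and record the work of each pass.

    Returns (sorted_list, work), where work[k] = (comparisons, swaps) in pass k.
    Assumption: the sentinel is the index of the last swap; everything to the
    right of it is already in its final position and is skipped next pass.
    """
    a = list(data)
    work = []
    limit = len(a) - 1  # last index pair that may still be out of order
    while limit > 0:
        comparisons = 0
        swaps = 0
        last_swap = 0
        for i in range(limit):
            comparisons += 1
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
                swaps += 1
                last_swap = i
        work.append((comparisons, swaps))
        limit = last_swap  # shrink the unsorted prefix to the sentinel
    return a, work


if __name__ == "__main__":
    result, work = bubble_sort_work([5, 1, 4, 2, 3])
    print(result)
    for k, (comparisons, swaps) in enumerate(work, start=1):
        print(f"pass {k}: {comparisons} comparisons, {swaps} swaps")
```

With this sentinel the number of comparisons strictly decreases from one pass to the next. A remaining-work estimator such as the one the abstract mentions could be fitted to these per-pass counts, but its form is not given in the abstract and is not reproduced here.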

    The relation of scalability and execution time

    Scalability has been used extensively as a de facto performance criterion for evaluating parallel algorithms and architectures. However, for many, scalability is of theoretical interest only, since it does not reveal execution time. In this paper, the relation between scalability and execution time is carefully studied. Results show that isospeed scalability characterizes the variation of execution time well: smaller scalability leads to larger execution time, the same scalability leads to the same execution time, etc. Three algorithms from scientific computing are implemented on an Intel Paragon and an IBM SP2 parallel computer. Experimental and theoretical results show that scalability is an important, distinct metric for parallel and distributed systems, and may be as important as execution time in a scalable parallel and distributed environment.
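The link between isospeed scalability and execution time can be made concrete with a small sketch. The abstract does not state the formula, so the code assumes the commonly used isospeed definition: if work W on p processors and work W' on p' processors give the same average speed per processor, then scalability is ψ(p, p') = (p'·W) / (p·W'). The function and parameter names are hypothetical.

```python
def isospeed_scalability(p, work, p_scaled, work_scaled):
    """psi(p, p') = (p' * W) / (p * W').

    Assumes W' is the work needed on p' processors to keep the same
    average speed (work per processor per unit time) as W on p processors.
    """
    return (p_scaled * work) / (p * work_scaled)


def execution_time_ratio(p, work, p_scaled, work_scaled):
    """T'/T under constant average speed a.

    With T = W / (p * a) and T' = W' / (p' * a), the speed a cancels:
    T'/T = (W' * p) / (W * p'), which is the reciprocal of psi.
    """
    return (work_scaled * p) / (work * p_scaled)


if __name__ == "__main__":
    # Hypothetical numbers: doubling processors requires 2.5x the work
    # to hold the average speed constant.
    psi = isospeed_scalability(4, 100.0, 8, 250.0)
    ratio = execution_time_ratio(4, 100.0, 8, 250.0)
    print(psi, ratio)
```

Under these assumptions the execution-time ratio is the reciprocal of ψ, which matches the abstract's claim that smaller scalability leads to larger execution time and equal scalability to equal execution time.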