787 research outputs found

    HARP: A Dynamic Inertial Spectral Partitioner

    Partitioning unstructured graphs is central to the parallel solution of computational science and engineering problems. Spectral partitioners, such as recursive spectral bisection (RSB), have proven effective in generating high-quality partitions of realistically sized meshes. The major obstacle to their widespread use has been their long execution times. This paper presents a new inertial spectral partitioner, called HARP. The main objective of the proposed approach is to partition meshes quickly at runtime in a manner that works efficiently for real applications on distributed-memory machines. The underlying principle of HARP is to find the eigenvectors of the unpartitioned vertices and then project them onto the eigenvectors of the original mesh. Results for various meshes ranging in size from 1000 to 100,000 vertices indicate that HARP can indeed partition meshes rapidly at runtime. Experimental results show that our largest mesh can be partitioned sequentially in only a few seconds on an SP2, which is several times faster than other spectral partitioners, while maintaining the solution quality of the proven RSB method. A parallel MPI version of HARP has also been implemented on the IBM SP2 and Cray T3E. Parallel HARP, running on 64 processors of the SP2 and T3E, can partition a mesh containing more than 100,000 vertices into 64 subgrids in about half a second. These results indicate that graph partitioning can now be truly embedded in dynamically changing real-world applications.
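For readers unfamiliar with spectral partitioning, the following is a minimal sketch of one bisection step in the style of RSB, the baseline the abstract compares against. It is not HARP's algorithm. The function names, the adjacency-list input, and the use of shifted power iteration to approximate the Fiedler vector are all assumptions made for illustration, and the sketch assumes a connected graph with at least two vertices.

```python
def fiedler_vector(adj, iters=500):
    """Approximate the Fiedler vector of a connected graph (illustrative helper).

    adj[i] lists the neighbours of vertex i. Power iteration is applied to
    (shift*I - L), where L = D - A is the graph Laplacian, while projecting
    out the constant vector (the eigenvector for eigenvalue 0).
    """
    n = len(adj)
    # 2 * max degree bounds the Laplacian eigenvalues from above (Gershgorin).
    shift = 2.0 * max(len(nbrs) for nbrs in adj)
    # Deterministic, non-constant start vector with zero mean.
    x = [i - (n - 1) / 2.0 for i in range(n)]
    for _ in range(iters):
        # y = (shift*I - L) x, computed row by row.
        y = [(shift - len(adj[i])) * x[i] + sum(x[j] for j in adj[i])
             for i in range(n)]
        mean = sum(y) / n
        y = [v - mean for v in y]            # remove the constant component
        norm = sum(v * v for v in y) ** 0.5
        x = [v / norm for v in y]
    return x


def spectral_bisect(adj):
    """Split vertices into two halves by sorting on the Fiedler vector."""
    x = fiedler_vector(adj)
    order = sorted(range(len(adj)), key=lambda i: x[i])
    half = len(adj) // 2
    return sorted(order[:half]), sorted(order[half:])


# Example: a path graph on 8 vertices, 0 - 1 - 2 - ... - 7.
path = [[1]] + [[i - 1, i + 1] for i in range(1, 7)] + [[6]]
left, right = spectral_bisect(path)
print(left, right)
```

Applying the same step recursively to each half yields 4, 8, ... parts. A production partitioner would use a Lanczos-type eigensolver rather than power iteration, which converges slowly on large meshes.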

    A New Paradigm for Parallel Adaptive Meshing Algorithms


    Decomposition of unstructured meshes for efficient parallel computation


    Impact of Load Balancing on Unstructured Adaptive Grid Computations for Distributed-Memory Multiprocessors

    The computational requirements for an adaptive solution of unsteady problems change as the simulation progresses. This causes workload imbalance among processors on a parallel machine which, in turn, requires significant data movement at runtime. We present a new dynamic load-balancing framework, called JOVE, that balances the workload across all processors with a global view. Whenever the computational mesh is adapted, JOVE is activated to eliminate the load imbalance. JOVE has been implemented on an IBM SP2 distributed-memory machine in MPI for portability. Experimental results for two model meshes demonstrate that mesh adaption with load balancing gives more than a sixfold improvement over mesh adaption without load balancing. We also show that JOVE gives a 24-fold speedup on 64 processors compared to sequential execution.
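The quantities discussed above can be made concrete with two standard measures. This is only an illustrative sketch: the function names and the sample per-processor loads are hypothetical, and the abstract does not say which imbalance metric JOVE itself uses. The only numbers taken from the abstract are the 24-fold speedup and the 64 processors.

```python
def imbalance_factor(loads):
    """Ratio of the heaviest processor load to the mean load (1.0 = perfect balance)."""
    mean = sum(loads) / len(loads)
    return max(loads) / mean


def parallel_efficiency(speedup, processors):
    """Fraction of ideal linear speedup actually achieved."""
    return speedup / processors


# Hypothetical per-processor workloads after a mesh adaption step.
loads = [10, 10, 10, 30]
print(imbalance_factor(loads))        # heaviest load relative to the mean

# Figures quoted in the abstract: 24-fold speedup on 64 processors.
print(parallel_efficiency(24, 64))
```

With the quoted figures, the parallel efficiency works out to 24/64 = 0.375.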

    MRRR-based Eigensolvers for Multi-core Processors and Supercomputers

    The real symmetric tridiagonal eigenproblem is of outstanding importance in numerical computations; it arises frequently as part of eigensolvers for standard and generalized dense Hermitian eigenproblems that are based on a reduction to tridiagonal form. For its solution, the algorithm of Multiple Relatively Robust Representations (MRRR or MR3 in short) - introduced in the late 1990s - is among the fastest methods. To compute k eigenpairs of a real n-by-n tridiagonal T, MRRR only requires O(kn) arithmetic operations; in contrast, all the other practical methods require O(k^2 n) or O(n^3) operations in the worst case. This thesis centers around the performance and accuracy of MRRR. Comment: PhD thesis

    High order resolution and parallel implementation on unstructured grids

    The numerical solution of the two-dimensional inviscid Euler flow equations is given. The unstructured mesh is generated by the advancing front technique. A cell-centred upwind finite volume method has been adopted to discretize the Euler equations. Both explicit and point implicit time stepping algorithms are derived. Flux calculations using Roe's and Osher's approximate Riemann solvers are studied. It is shown that both Roe's and Osher's schemes produce an accurate representation of discontinuities (e.g. shock waves). It is also shown that the point implicit scheme achieves better convergence performance than the explicit scheme. Validations have been done for subsonic and transonic flow over airfoils, supersonic flow past a compression corner and hypersonic flow past cylinder and blunt body geometries. An adaptive remeshing procedure is also applied to the numerical solution with the objective of getting improved results. The issue of high order reconstruction on unstructured grids has been discussed. The methodology of the Taylor series expansion is adopted. The calculation of the gradient at a reference point is carried out by the use of either the Green-Gauss integral formula or the least-squares method. Some recently developed limiter construction methods have been used and their performance has been demonstrated using the test example of the transonic flow over a RAE 2822 airfoil. It has been shown that similar pressure distributions are obtained by all limiters except for shock wave regions where the limiter is active. The convergence problem is illustrated by the minmod-type limiter. It seems only the Venkatakrishnan limiter provides improved convergence. Other limiters do not appear to work as well as shown in their original publications. Also the convergence history given by the least-squares method appears better than that by the Green-Gauss method in the test
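As an illustration of the Green-Gauss gradient evaluation mentioned in this abstract, the sketch below approximates the gradient of a scalar field over a single polygonal cell from the boundary integral (1/A) * sum over faces of (face value * outward normal * face length). The function name and the choice to take each face value as the average of its two vertex values are assumptions for this example. A cell-centred solver such as the one described would typically build face values from neighbouring cell averages instead.

```python
def green_gauss_gradient(vertices, values):
    """Green-Gauss gradient over one polygonal cell (illustrative sketch).

    vertices: list of (x, y) tuples in counter-clockwise order.
    values:   scalar field sampled at those vertices.
    Returns (d/dx, d/dy) of the field, averaged over the cell.
    """
    n = len(vertices)
    area = 0.0
    gx = 0.0
    gy = 0.0
    for k in range(n):
        x0, y0 = vertices[k]
        x1, y1 = vertices[(k + 1) % n]
        # Shoelace formula accumulates the signed cell area.
        area += 0.5 * (x0 * y1 - x1 * y0)
        # Face value: average of the two end-point values (an assumption).
        face_value = 0.5 * (values[k] + values[(k + 1) % n])
        # For a counter-clockwise edge (dx, dy), outward normal * length = (dy, -dx).
        gx += face_value * (y1 - y0)
        gy -= face_value * (x1 - x0)
    return gx / area, gy / area


# Linear field phi = 2x + 3y sampled on a right triangle.
triangle = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
phi = [2.0 * x + 3.0 * y for x, y in triangle]
print(green_gauss_gradient(triangle, phi))
```

For a linear field the edge-midpoint average integrates each face exactly, so the sketch recovers the exact gradient. For nonlinear fields the result is only an approximation.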