770 research outputs found
Preconditioned Spectral Clustering for Stochastic Block Partition Streaming Graph Challenge
Locally Optimal Block Preconditioned Conjugate Gradient (LOBPCG) is
demonstrated to efficiently solve eigenvalue problems for graph Laplacians that
appear in spectral clustering. For static graph partitioning, 10-20 iterations
of LOBPCG without preconditioning result in ~10x error reduction, enough to
achieve 100% correctness for all Challenge datasets with known truth
partitions, e.g., for graphs with 5K/.1M (50K/1M) Vertices/Edges in 2 (7)
seconds, compared to over 5,000 (30,000) seconds needed by the baseline Python
code. Our Python code 100% correctly determines 98 (160) clusters from the
Challenge static graphs with 0.5M (2M) vertices in 270 (1,700) seconds using
10GB (50GB) of memory. Our single-precision MATLAB code calculates the same
clusters at half time and memory. For streaming graph partitioning, LOBPCG is
initiated with approximate eigenvectors of the graph Laplacian already computed
for the previous graph, in many cases reducing 2-3 times the number of required
LOBPCG iterations, compared to the static case. Our spectral clustering is
generic, i.e. assuming nothing specific of the block model or streaming, used
to generate the graphs for the Challenge, in contrast to the base code.
Nevertheless, in 10-stage streaming comparison with the base code for the 5K
graph, the quality of our clusters is similar or better starting at stage 4 (7)
for emerging edging (snowballing) streaming, while the computations are over
100-1000 faster.Comment: 6 pages. To appear in Proceedings of the 2017 IEEE High Performance
Extreme Computing Conference. Student Innovation Award Streaming Graph
Challenge: Stochastic Block Partition, see
http://graphchallenge.mit.edu/champion
Scharz Preconditioners for Krylov Methods: Theory and Practice
Several numerical methods were produced and analyzed. The main thrust of the work relates to inexact Krylov subspace methods for the solution of linear systems of equations arising from the discretization of partial di#11;erential equa- tions. These are iterative methods, i.e., where an approximation is obtained and at each step. Usually, a matrix-vector product is needed at each iteration. In the inexact methods, this product (or the application of a preconditioner) can be done inexactly. Schwarz methods, based on domain decompositions, are excellent preconditioners for thise systems. We contributed towards their under- standing from an algebraic point of view, developed new ones, and studied their performance in the inexact setting. We also worked on combinatorial problems to help de#12;ne the algebraic partition of the domains, with the needed overlap, as well as PDE-constraint optimization using the above-mentioned inexact Krylov subspace methods
A matrix-free parallel solution method for the three-dimensional heterogeneous Helmholtz equation
The Helmholtz equation is related to seismic exploration, sonar, antennas,
and medical imaging applications. It is one of the most challenging problems to
solve in terms of accuracy and convergence due to the scalability issues of the
numerical solvers. For 3D large-scale applications, high-performance parallel
solvers are also needed. In this paper, a matrix-free parallel iterative solver
is presented for the three-dimensional (3D) heterogeneous Helmholtz equation.
We consider the preconditioned Krylov subspace methods for solving the linear
system obtained from finite-difference discretization. The Complex Shifted
Laplace Preconditioner (CSLP) is employed since it results in a linear increase
in the number of iterations as a function of the wavenumber. The preconditioner
is approximately inverted using one parallel 3D multigrid cycle. For parallel
computing, the global domain is partitioned blockwise. The matrix-vector
multiplication and preconditioning operator are implemented in a matrix-free
way instead of constructing large, memory-consuming coefficient matrices.
Numerical experiments of 3D model problems demonstrate the robustness and
outstanding strong scaling of our matrix-free parallel solution method.
Moreover, the weak parallel scalability indicates our approach is suitable for
realistic 3D heterogeneous Helmholtz problems with minimized pollution error.Comment: 25 pages, 15 figures, manuscript submitted to a special issue of
conference NMLSP202
Towards Efficient and Scalable Discontinuous Galerkin Methods for Unsteady Flows
openNegli ultimi anni, la crescente disponibilit`a di risorse computazionali ha contribuito alla diffusione della fluidodinamica computazionale per la ricerca e per la progettazione industriale. Uno degli approcci pi promettenti si basa sul metodo agli elementi finiti discontinui di Galerkin (dG).
Nellâambito di queste metodologie, il contributo della tesi e' triplice. Innanzi- tutto, il lavoro introduce un algoritmo di parallelizzazione ibrida MPI/OpenMP per lâutilizzo efficiente di risorse di super calcolo. In secondo luogo, propone strategie di soluzione efficienti, scalabili e con limitata allocazione di memoria per la soluzione di problemi complessi. Infine, confronta le strategie di soluzione introdotte con nuove tecniche di discretizzazione dette âibridizzabiliâ, su problemi riguardanti la soluzione delle equazioni di NavierâStokes non stazionarie.
Lâefficienza computazionale e' stata valutata su casi di crescente complessita' riguardanti la simulazione della turbolenza. In primo luogo, e' stata considerata la convezione naturale di Rayleigh-Benard e il flusso turbolento in un canale a numeri di Reynolds moderatamente alti. Le strategie di soluzione proposte sono risultate fino a cinque volte piu` veloci rispetto ai metodi standard allocando solamente il 7% della memoria. In secondo luogo, e' stato analizzato il flusso attorno ad una piastra piana con bordo arrotondato sottoposta a diversi livelli di turbolenza in ingresso. Nonostante la maggiore complessitĂ ' dovuta allâuso di elementi curvi ed anisotropi, lâalgoritmo proposto e' risultato oltre tre volte piu` veloce allocando il 15% della memoria rispetto ad un metodo standard. Concludendo, viene riportata la simulazione del âBoeing Rudimentary Landing Gearâ a Re = 10^6. In tutti i casi i risultati ottenuti sono in ottimo accordo con i dati sperimentali e con precedenti simulazioni numeriche pubblicate in letteratura.In recent years the increasing availability of High Performance Computing (HPC) resources strongly promoted the widespread of high fidelity simulations, such as the Large Eddy Simulation (LES), for industrial research and design. One of the most promising approaches to those kind of simulations is based on the discontinuous Galerkin (dG) discretization method.
The contribution of the thesis towards this research area is three-fold. First, the work introduces an efficient hybrid MPI/OpenMP parallelisation paradigm to fruitfully exploit large HPC facilities. Second, it reports efficient, scalable and memory saving solution strategies for stiff dG discretisations. Third, it compares those solution strategies, for the first time using the same numerical framework, to hybridizable discontinuous Galerkin (HDG) methods, including a novel implementation of a p-multigrid preconditioning approach, on unsteady flow problems involving the solution of the NavierStokes equations.
The improvements in computational efficiency have been evaluated on cases of growing complexity involving large eddy simulations of turbulent flows. First, the Rayleigh-Benard convection problem and the turbulent channel flow at moderately high Reynolds numbers is presented. The solution strategies proposed resulted up to five times faster than standard matrix-based methods while al- locating the 7% of the memory. A second family of test cases involve the LES simulation of a rounded leading edge flat plate under different levels of free-stream turbulence. Although the increased stiffness of the iteration matrix due to the use of curved and stretched elements, the solver resulted more than three times faster while allocating the 15% of the memory if compared to standard methods. Finally, the large eddy simulation of the Boeing Rudimentary Landing Gear at Re = 10^6 is reported. In all the cases, a remarkable agreement with experimental data as well as previous numerical simulations is documented.INGEGNERIA INDUSTRIALEopenFranciolini, Matte
- âŠ