1,824 research outputs found

    PT-Scotch: A tool for efficient parallel graph ordering

    Get PDF
    The parallel ordering of large graphs is a difficult problem, because on the one hand minimum degree algorithms do not parallelize well, and on the other hand the obtainment of high quality orderings with the nested dissection algorithm requires efficient graph bipartitioning heuristics, the best sequential implementations of which are also hard to parallelize. This paper presents a set of algorithms, implemented in the PT-Scotch software package, which allows one to order large graphs in parallel, yielding orderings the quality of which is only slightly worse than the one of state-of-the-art sequential algorithms. Our implementation uses the classical nested dissection approach but relies on several novel features to solve the parallel graph bipartitioning problem. Thanks to these improvements, PT-Scotch produces consistently better orderings than ParMeTiS on large numbers of processors

    State of the Art in the Optimisation of Wind Turbine Performance Using CFD

    Get PDF
    Wind energy has received increasing attention in recent years due to its sustainability and geographically wide availability. The efficiency of wind energy utilisation highly depends on the performance of wind turbines, which convert the kinetic energy in wind into electrical energy. In order to optimise wind turbine performance and reduce the cost of next-generation wind turbines, it is crucial to have a view of the state of the art in the key aspects on the performance optimisation of wind turbines using Computational Fluid Dynamics (CFD), which has attracted enormous interest in the development of next-generation wind turbines in recent years. This paper presents a comprehensive review of the state-of-the-art progress on optimisation of wind turbine performance using CFD, reviewing the objective functions to judge the performance of wind turbine, CFD approaches applied in the simulation of wind turbines and optimisation algorithms for wind turbine performance. This paper has been written for both researchers new to this research area by summarising underlying theory whilst presenting a comprehensive review on the up-to-date studies, and experts in the field of study by collecting a comprehensive list of related references where the details of computational methods that have been employed lately can be obtained

    Parallel algorithms for computational fluid dynamics on unstructured meshes

    Get PDF
    La simulació numèrica directa (DNS) de fluxos complexes és actualment una utopia per la majoria d'aplicacions industrials ja que els requeriments computacionals son massa elevats. Donat un flux, la diferència entre els recursos computacionals necessaris i els disponibles és cobreix mitjançant la modelització/simplificació d'alguns termes de les equacions originals que regeixen el seu comportament. El creixement continuat dels recursos computacionals disponibles, principalment en forma de super-ordinadors, contribueix a reduir la part del flux que és necessari aproximar. De totes maneres, obtenir la eficiència esperada dels nous super-ordinadors no és una tasca senzilla i, per aquest motiu, part de la recerca en el camp de la Mecànica de Fluids Computacional es centra en aquest objectiu. En aquest sentit, algunes contribucions s'han presentat en el marc d'aquesta tesis. El primer objectiu va ser el desenvolupament d'un codi de CFD de propòsit general i paral·lel, basat en la metodologia de volums finits en malles no estructurades, per resoldre problemes de multi-física. Aquest codi, anomenat TermoFluids (TF), té un disseny orientat a objectes i pensat per ser usat de forma altament eficient en els super-ordinadors actuals. Amb el temps, ha esdevingut pel grup una eina fonamental en projectes tant de recerca bàsica com d'interès industrial. En el context d'aquesta tesis, el treball s'ha focalitzat en el desenvolupament de dos de les llibreries més bàsiques de TermoFluids: i) La Basics Objects Library (BOL), que es una plataforma de software sobre la qual estan programades la resta de llibreries del codi, i que conté els mètodes algebraics i geomètrics fonamentals per la implementació paral·lela dels algoritmes de discretització, ii) la Linear Solvers Library (LSL), que conté un gran nombre de mètodes per resoldre els sistemes d'equacions lineals derivats de les discretitzacions. El primer capítol d'aquesta tesi conté les principals idees subjacents al disseny i la implementació de la BOL i la LSL, juntament amb alguns exemples i algunes aplicacions industrials. En els capítols posteriors hi ha una explicació detallada de solvers específics per algunes aplicacions concretes. En el segon capítol, es presenta un solver paral·lel i directe per la resolució de l'equació de Poisson per casos en els quals una de les direccions del domini té condicions d'homogeneïtat. En la simulació de fluxos incompressibles, l'equació de Poisson es resol almenys una vegada en cada pas de temps, convertint-se en una de les parts més costoses i difícils de paral·lelitzar del codi. El mètode que proposem és una combinació d'una descomposició directa de Schur (DDS) i una diagonalització de Fourier. La darrera descompon el sistema original en un conjunt de sub-sistemes 2D independents que es resolen mitjançant l'algorisme DDS. Atès que no s'imposen restriccions a les direccions no periòdiques del domini, aquest mètode és aplicable a la resolució de problemes discretitzats mitjançat l'extrusió de malles 2D no estructurades. L'escalabilitat d'aquest mètode ha estat provada amb èxit amb un màxim de 8192 CPU per malles de fins a ~10⁹ volums de control. En el darrer capitol capítol, es presenta un mètode de resolució per l'equació de Transport de Boltzmann (BTE). La estratègia emprada es basa en el mètode d'Ordenades Discretes i pot ser aplicat en discretitzacions no estructurades. El flux per a cada ordenada angular es resol amb un mètode de substitució equivalent a la resolució d'un sistema lineal triangular. La naturalesa seqüencial d'aquest procés fa de la paral·lelització de l'algoritme el principal repte. Diversos algorismes de substitució han estat analitzats, esdevenint una de les heurístiques proposades la millor opció en totes les situacions analitzades, amb excel·lents resultats. Els testos d'eficiència paral·lela s'han realitzat usant fins a 2560 CPU.Direct Numerical Simulation (DNS) of complex flows is currently an utopia for most of industrial applications because computational requirements are too high. For a given flow, the gap between the required and the available computing resources is covered by modeling/simplifying of some terms of the original equations. On the other hand, the continuous growth of the computing power of modern supercomputers contributes to reduce this gap, reducing hence the unresolved physics that need to be attempted with approximated models. This growth, widely relies on parallel computing technologies. However, getting the expected performance from new complex computing systems is becoming more and more difficult, and therefore part of the CFD research is focused on this goal. Regarding to it, some contributions are presented in this thesis. The first objective was to contribute to the development of a general purpose multi-physics CFD code. referred to as TermoFluids (TF). TF is programmed following the object oriented paradigm and designed to run in modern parallel computing systems. It is also intensively involved in many different projects ranging from basic research to industry applications. Besides, one of the strengths of TF is its good parallel performance demonstrated in several supercomputers. In the context of this thesis, the work was focused on the development of two of the most basic libraries that compose TF: I) the Basic Objects Library (BOL), which is a parallel unstructured CFD application programming interface, on the top of which the rest of libraries that compose TF are written, ii) the Linear Solvers Library (LSL) containing many different algorithms to solve the linear systems arising from the discretization of the equations. The first chapter of this thesis contains the main ideas underlying the design and the implementation of the BOL and LSL libraries, together with some examples and some industrial applications. A detailed description of some application-specific linear solvers included in the LSL is carried out in the following chapters. In the second chapter, a parallel direct Poisson solver restricted to problems with one uniform periodic direction is presented. The Poisson equation is solved, at least, once per time-step when modeling incompressible flows, becoming one of the most time consuming and difficult to parallelize parts of the code. The solver here proposed is a combination of a direct Schur-complement based decomposition (DSD) and a Fourier diagonalization. The latter decomposes the original system into a set of mutually independent 2D sub-systems which are solved by means of the DSD algorithm. Since no restrictions are imposed in the non-periodic directions, the overall algorithm is well-suited for solving problems discretized on extruded 2D unstructured meshes. The scalability of the solver has been successfully tested using up to 8192 CPU cores for meshes with up to 10 9 grid points. In the last chapter, a solver for the Boltzmann Transport Equation (BTE) is presented. It can be used to solve radiation phenomena interacting with flows. The solver is based on the Discrete Ordinates Method and can be applied to unstructured discretizations. The flux for each angular ordinate is swept across the computational grid, within a source iteration loop that accounts for the coupling between the different ordinates. The sequential nature of the sweep process makes the parallelization of the overall algorithm the most challenging aspect. Several parallel sweep algorithms, which represent different options of interleaving communications and calculations, are analyzed. One of the heuristics proposed consistently stands out as the best option in all the situations analyzed. With this algorithm, good scalability results have been achieved regarding both weak and strong speedup tests with up to 2560 CPUs

    Optimal Placement of Valves in a Water Distribution Network with CLP(FD)

    Full text link
    This paper presents a new application of logic programming to a real-life problem in hydraulic engineering. The work is developed as a collaboration of computer scientists and hydraulic engineers, and applies Constraint Logic Programming to solve a hard combinatorial problem. This application deals with one aspect of the design of a water distribution network, i.e., the valve isolation system design. We take the formulation of the problem by Giustolisi and Savic (2008) and show how, thanks to constraint propagation, we can get better solutions than the best solution known in the literature for the Apulian distribution network. We believe that the area of the so-called hydroinformatics can benefit from the techniques developed in Constraint Logic Programming and possibly from other areas of logic programming, such as Answer Set Programming.Comment: Best paper award at the 27th International Conference on Logic Programming - ICLP 2011; Theory and Practice of Logic Programming, (ICLP'11) Special Issue, volume 11, issue 4-5, 201

    LAPSES: A Recipe for High-Performance Adaptive Router Design

    Get PDF
    Earlier research has shown that adaptive routing can help in improving network performance. However, it has not received adequate attention in commercial routers mainly due to the additional hardware complexity, and the perceived cost and performance degradation that may result from this complexity. These concerns can be mitigated if one can design a cost-effective router that can support adaptive routing. This paper proposes a three step recipe — Look-Ahead routing, intelligent Path Selection, and an Economic Storage implementation, called the LAPSES approach — for cost-effective high performance pipelined adaptive router design. The first step, look-ahead routing, reduces a pipeline stage in the router by making table lookup and arbitration concurrent. Next, three new traffic-sensitive path selection heuristics (LRU, LFU and MAX-CREDIT) are proposed to select one of the available alternate paths. Finally, two techniques for reducing routing table size of the adaptive router are presented. These are called meta-table routing and economical storage. The proposed economical storage needs a routing table with only 9 and 27 entries for two and three dimensional meshes, respectively. All these design ideas are evaluated on a (16 16) mesh network via simulation. A fully adaptive algorithm and various traffic patterns are used to examine the performance benefits. Performance results show that the look-ahead design as well as the path selection heuristics boost network performance, while the economical storage approach turns out to be an ideal choice in comparison to full-table and meta-table options. We believe the router resulting from these three design enhancements can make adaptive routing a viable choice for interconnects.

    Book of Abstracts of the Sixth SIAM Workshop on Combinatorial Scientific Computing

    Get PDF
    Book of Abstracts of CSC14 edited by Bora UçarInternational audienceThe Sixth SIAM Workshop on Combinatorial Scientific Computing, CSC14, was organized at the Ecole Normale Supérieure de Lyon, France on 21st to 23rd July, 2014. This two and a half day event marked the sixth in a series that started ten years ago in San Francisco, USA. The CSC14 Workshop's focus was on combinatorial mathematics and algorithms in high performance computing, broadly interpreted. The workshop featured three invited talks, 27 contributed talks and eight poster presentations. All three invited talks were focused on two interesting fields of research specifically: randomized algorithms for numerical linear algebra and network analysis. The contributed talks and the posters targeted modeling, analysis, bisection, clustering, and partitioning of graphs, applied in the context of networks, sparse matrix factorizations, iterative solvers, fast multi-pole methods, automatic differentiation, high-performance computing, and linear programming. The workshop was held at the premises of the LIP laboratory of ENS Lyon and was generously supported by the LABEX MILYON (ANR-10-LABX-0070, Université de Lyon, within the program ''Investissements d'Avenir'' ANR-11-IDEX-0007 operated by the French National Research Agency), and by SIAM
    corecore