
11 research outputs found

Volumetric image classification using homogeneous decomposition and dictionary learning: A study using retinal optical coherence tomography for detecting age-related macular degeneration

Author: Albarrak Abdulrahman
Coenen Frans
Zheng Yalin
Publication venue: Elsevier BV
Publication date: 01/01/2017

University of Liverpool Repository

Crossref

A scalable software framework for solving PDEs on distributed octree meshes using finite element methods

Author: Lofquist Alec
Publication venue: Iowa State University Digital Repository
Publication date: 01/01/2018

Tracking particle motion in inertial flows (especially in obstructed geometries) is a computationally daunting proposition. This is further complicated by the fact that constructing migration maps for particles (as a function of particle location, flow conditions, and particle size) requires several thousand simulations tracking individual particles. This calls for the development of an efficient, scalable approach for single-particle tracking in fluids. We bring together three distinct elements to accomplish this: (a) a parallel octree-based adaptive mesh generation framework, (b) a variational multiscale (VMS) based treatment that enables flow-condition-agnostic simulations (laminar or turbulent) [Bazilevs07b], and (c) a variationally consistent immersed boundary method (IBM) to efficiently track moving particles in a background octree mesh [Xu:2015ig]. This project builds on our existing codes for adaptive meshing (Dendro) and finite elements (TalyFEM). We present our adaptive meshing framework, which is tailored for the immersed boundary method, and experiments demonstrating the scalability of our code to over 1k compute nodes.

Digital Repository @ Iowa State University (ISU)

Low-constant parallel algorithms for finite element simulations using linear octrees

Author: Christos Davatzikos
George Biros
Hari Sundar
Rahul S. Sampath
Santi S. Adavani
Publication venue: ACM Press
Publication date: 01/01/2007

In this article we propose parallel algorithms for the construction of conforming finite-element discretizations on linear octrees. Existing octree-based discretizations scale to billions of elements, but the complexity constants can be high. In our approach we use several techniques to minimize overhead: a novel bottom-up tree construction and 2:1 balance constraint enforcement; a Golomb-Rice encoding for compression, representing the octree and element connectivity as a Uniquely Decodable Code (UDC); overlapping communication and computation; and byte alignment for cache efficiency. The cost of applying the Laplacian is comparable to that of applying it using a direct-indexing regular grid discretization with the same number of elements. Our algorithm has scaled up to four billion octants on 4096 processors on a Cray XT3 at the Pittsburgh Supercomputing Center. The overall tree construction time is under a minute, in contrast to previous implementations that required several minutes; evaluating the discretization of a variable-coefficient Laplacian takes only a few seconds.
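As an illustration of the Golomb-Rice encoding named in this abstract, the following is a minimal sketch of Rice coding (the power-of-two special case of Golomb coding) for non-negative integers. The abstract does not say which quantities the authors encode or which parameter they choose, so the function names `rice_encode` and `rice_decode`, the parameter `k`, and the bit-string representation are all assumptions made for this example.

```python
def rice_encode(values, k):
    """Encode non-negative integers as a bit string with Rice parameter k.

    Each value n is split into a quotient q = n >> k (written in unary as
    q ones followed by a zero) and a remainder r (written in exactly k bits).
    """
    mask = (1 << k) - 1
    parts = []
    for n in values:
        if n < 0:
            raise ValueError("Rice coding handles non-negative integers only")
        q, r = n >> k, n & mask
        parts.append("1" * q + "0")            # unary quotient, 0-terminated
        if k:
            parts.append(format(r, f"0{k}b"))  # fixed-width remainder
    return "".join(parts)


def rice_decode(bits, k):
    """Invert rice_encode: recover the list of integers from the bit string."""
    values, i = [], 0
    while i < len(bits):
        q = 0
        while bits[i] == "1":                  # count the unary quotient
            q += 1
            i += 1
        i += 1                                 # skip the terminating zero
        r = int(bits[i:i + k], 2) if k else 0  # read the k-bit remainder
        i += k
        values.append((q << k) | r)
    return values
```

Because every codeword ends its unary part with a zero and has a fixed-width remainder, the stream can be decoded without separators, which is what makes the code uniquely decodable. Small values (for example, small differences between neighbouring indices) get short codewords.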

CiteSeerX

Crossref

A generic finite element framework on parallel tree-based adaptive meshes

Author: Badia Santiago
Martín Alberto F.
Neiva Eric
Verdugo Francesc
Publication date: 01/01/2020

We present highly scalable parallel distributed-memory algorithms and associated data structures for a generic finite element framework that supports h-adaptivity on computational domains represented as multiple connected adaptive trees (a forest-of-trees), thus providing multi-scale resolution on problems governed by partial differential equations. The framework is grounded on a rich representation of the adaptive mesh suitable for generic finite elements that is built on top of a low-level, lightweight forest-of-trees data structure handled by a specialized, highly parallel adaptive meshing engine. Along the way, we have identified the requirements that the forest-of-trees layer must fulfill to be coupled into our framework. Essentially, it must be able to describe neighboring relationships between cells in the adapted mesh (apart from hierarchical relationships) across the lower-dimensional objects at the boundary of the cells. Atop this two-layered mesh representation, we build the rest of the data structures required for the numerical integration and assembly of the discrete system of linear equations. We consider algorithms that are suitable for both subassembled and fully assembled distributed data layouts of linear system matrices. The proposed framework has been implemented within the FEMPAR scientific software library, using p4est as a practical forest-of-octrees demonstrator. A comprehensive strong scaling study of this implementation when applied to Poisson and Maxwell problems reveals remarkable scalability up to 32.2K CPU cores and 482.2M degrees of freedom. In addition, the implementation of the proposed approach in FEMPAR is up to 2.6 and 3.4 times faster than the state-of-the-art deal.II finite element software in the h-adaptive approximation of a Poisson problem with first- and second-order Lagrangian finite elements, respectively (excluding the linear solver step from the comparison).

arXiv.org e-Print Archive

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Scipedia

Buoyancy-driven flow and fluid-structure interaction with moving boundaries

Author: Xu Songzhe
Publication venue: Iowa State University Digital Repository
Publication date: 01/01/2018

We deploy the residual-based variational multi-scale (VMS) method, in the sense of large-eddy simulation (LES) within the finite element method, for buoyancy-driven flow in enclosures and consider an extensive range of Rayleigh numbers from laminar ($10^3$) to turbulent ($10^{10}$) in a 2D benchmark Rayleigh-Bénard problem. 3D simulations for a laminar and a turbulent case are performed, and comparisons of mean profiles as well as fluctuation profiles with other numerical and experimental results are successfully carried out. A method of weakly imposed boundary conditions is employed for both velocity and temperature, and it produces reasonable results with a much coarser mesh compared with the traditional imposition of boundary conditions. This suggests that the VMS framework with the weak imposition of boundary conditions is a computationally efficient approach to model buoyancy-driven flows in complex indoor environments. In addition to the flow fields, we deploy the immersogeometric analysis (IMGA) method, in the sense of the immersed boundary method (IBM), for objects moving in fluids on an unstructured framework. The finite element formulation is stabilized by the VMS method in an unstructured background mesh. Weak imposition of boundary conditions is used to impose the no-slip boundary condition on the immersed boundary. Adaptively refined quadrature rules are used to better capture the geometry of the immersed boundary and accurately integrate the background elements that intersect the immersed boundary. Treatment of the freshly cleared nodes is considered. We assess the accuracy of the moving IMGA framework by analyzing object motion in a variety of flow structures, including a freely dropping cylinder/sphere in viscous fluids and particle focusing in (un)obstructed channels. We show that the quantities of interest are in good agreement with other analytical, numerical and experimental solutions. Advantages of this moving IMGA framework in computational cost and efficiency are indicated by the comparison with the body-fitted method using commercial computational fluid dynamics (CFD) software. The moving IMGA framework can be deployed in applications of particle control and manipulation in microfluidic channels. 
The moving IMGA on the unstructured framework is further deployed in a scalable, adaptively refined, octree-based finite element approach for better computational performance in tracking object motion. This enables using a parallel, hierarchically refined octree mesh as the background mesh, with a variationally consistent IMGA formulation on this background mesh. We integrate the unstructured framework of moving IMGA into the octree-based framework. We show good scaling results of the coupled framework on Stampede2 at TACC. This illustrates the potential of the moving IMGA on the coupled framework to efficiently track complex particles in flows.

Digital Repository @ Iowa State University (ISU)

Die Finite-Elemente-Methode mit dynamisch-adaptiven kartesischen Gittern (The finite element method with dynamically adaptive Cartesian grids)

Author: Maier Benjamin
Publication date: 01/01/2015

In this work, a two-dimensional flow problem described by the Navier-Stokes equations is computed on a dynamically adaptive grid using the finite element method. The complete computational procedure is presented by means of an implementation. Quadtrees are used as the data structure; they can be generated in parallel with a bottom-up algorithm following Sundar et al. The grid is refined or coarsened during the simulation based on the vorticity. Parallel scalability is examined, and for a regular grid a runtime comparison is made against a reference implementation without quadtrees.
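A vorticity-driven refine/coarsen decision of the kind this abstract mentions could look roughly like the sketch below. It is not taken from the thesis: the uniform sampling grid, the central-difference stencil, the function names `vorticity` and `flag_cells`, and the two thresholds are all assumptions chosen for illustration, and a real quadtree code would evaluate the indicator per leaf cell.

```python
def vorticity(u, v, h):
    """Central-difference vorticity dv/dx - du/dy on a uniform grid.

    u[j][i] and v[j][i] are the velocity components at row j (y direction)
    and column i (x direction); h is the grid spacing. Boundary entries are
    left at 0.0 because the stencil needs neighbours on both sides.
    """
    ny, nx = len(u), len(u[0])
    w = [[0.0] * nx for _ in range(ny)]
    for j in range(1, ny - 1):
        for i in range(1, nx - 1):
            dvdx = (v[j][i + 1] - v[j][i - 1]) / (2 * h)
            dudy = (u[j + 1][i] - u[j - 1][i]) / (2 * h)
            w[j][i] = dvdx - dudy
    return w


def flag_cells(w, refine_above, coarsen_below):
    """Label each sample 'refine', 'coarsen' or 'keep' from |vorticity|.

    Both thresholds are hypothetical tuning parameters.
    """
    def label(x):
        if abs(x) > refine_above:
            return "refine"
        if abs(x) < coarsen_below:
            return "coarsen"
        return "keep"

    return [[label(x) for x in row] for row in w]
```

For a rigid rotation with u = -y and v = x the vorticity is 2 everywhere, so interior samples would be flagged for refinement whenever `refine_above` is below 2.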

X10 for high-performance scientific computing

Author: Milthorpe Joshua John
Publication date: 01/01/2015

High performance computing is a key technology that enables large-scale physical simulation in modern science. While great advances have been made in methods and algorithms for scientific computing, the most commonly used programming models encourage a fragmented view of computation that maps poorly to the underlying computer architecture. Scientific applications typically manifest physical locality, which means that interactions between entities or events that are nearby in space or time are stronger than more distant interactions. Linear-scaling methods exploit physical locality by approximating distant interactions, to reduce computational complexity so that cost is proportional to system size. In these methods, the computation required for each portion of the system is different depending on that portion’s contribution to the overall result. To support productive development, application programmers need programming models that cleanly map aspects of the physical system being simulated to the underlying computer architecture while also supporting the irregular workloads that arise from the fragmentation of a physical system. X10 is a new programming language for high-performance computing that uses the asynchronous partitioned global address space (APGAS) model, which combines explicit representation of locality with asynchronous task parallelism. This thesis argues that the X10 language is well suited to expressing the algorithmic properties of locality and irregular parallelism that are common to many methods for physical simulation. The work reported in this thesis was part of a co-design effort involving researchers at IBM and ANU in which two significant computational chemistry codes were developed in X10, with an aim to improve the expressiveness and performance of the language. The first is a Hartree–Fock electronic structure code, implemented using the novel Resolution of the Coulomb Operator approach. 
The second evaluates electrostatic interactions between point charges, using either the smooth particle mesh Ewald method or the fast multipole method, with the latter used to simulate ion interactions in a Fourier Transform Ion Cyclotron Resonance mass spectrometer. We compare the performance of both X10 applications to state-of-the-art software packages written in other languages. This thesis presents improvements to the X10 language and runtime libraries for managing and visualizing the data locality of parallel tasks, communication using active messages, and efficient implementation of distributed arrays. We evaluate these improvements in the context of computational chemistry application examples. This work demonstrates that X10 can achieve performance comparable to established programming languages when running on a single core. More importantly, X10 programs can achieve high parallel efficiency on a multithreaded architecture, given a divide-and-conquer pattern of parallel tasks and appropriate use of worker-local data. For distributed memory architectures, X10 supports the use of active messages to construct local, asynchronous communication patterns which outperform global, synchronous patterns. Although point-to-point active messages may be implemented efficiently, productive application development also requires collective communications; more work is required to integrate both forms of communication in the X10 language. The exploitation of locality is the key insight in both linear-scaling methods and the APGAS programming model; their combination represents an attractive opportunity for future co-design efforts.

The Australian National University

Large-scale tree-based unfitted finite elements for metal additive manufacturing

Author: Miranda Neiva Eric
Publication venue: Universitat Politècnica de Catalunya
Publication date: 07/10/2020

This thesis addresses large-scale numerical simulations of partial differential equations posed on evolving geometries. Our target application is the simulation of metal additive manufacturing (or 3D printing) with powder-bed fusion methods, such as Selective Laser Melting (SLM), Direct Metal Laser Sintering (DMLS) or Electron-Beam Melting (EBM). The simulation of metal additive manufacturing processes is a remarkable computational challenge, because processes are characterised by multiple scales in space and time and multiple complex physics that occur in intricate three-dimensional growing-in-time geometries. Only the synergy of advanced numerical algorithms and high-performance scientific computing tools can fully resolve, in the short run, the simulation needs in the area. The main goal of this Thesis is to design a novel, highly scalable numerical framework with multi-resolution capability in arbitrarily complex evolving geometries. To this end, the framework is built by combining three computational tools: (1) parallel mesh generation and adaptation with forest-of-trees meshes, (2) robust unfitted finite element methods and (3) parallel finite element modelling of the geometry evolution in time. Our numerical research is driven by several limitations and open questions in the state of the art of the three aforementioned areas, which are vital to achieve our main objective. All our developments are deployed with high-end distributed-memory implementations in the large-scale open-source software project FEMPAR. In considering our target application, (4) temporal and spatial model reduction strategies for thermal finite element models are investigated. They are coupled to our new large-scale computational framework to simplify optimisation of the manufacturing process. The contributions of this Thesis span the four ingredients above. 
Current understanding of (1) is substantially improved with rigorous proofs of the computational benefits of the 2:1 k-balance (ease of parallel implementation and high scalability) and the minimum requirements a parallel tree-based mesh must fulfil to yield correct parallel finite element solvers atop them. Concerning (2), a robust, optimal and scalable formulation of the aggregated unfitted finite element method is proposed on parallel tree-based meshes for elliptic problems with unfitted external contour or unfitted interfaces. To the author’s best knowledge, this marks the first time techniques (1) and (2) are brought together. After enhancing (1)+(2) with a novel parallel approach for (3), the resulting framework is able to mitigate a major performance bottleneck in large-scale simulations of metal additive manufacturing processes by powder-bed fusion: scalable adaptive (re)meshing in arbitrarily complex geometries that grow in time. During the development of this Thesis, our application problem (4) is investigated in two joint collaborations with the Monash Centre for Additive Manufacturing and Monash University in Melbourne, Australia. The first contribution is an experimentally supported, thorough numerical assessment of time-lumping methods; the second is a novel, experimentally validated formulation of a new physics-based thermal contact model, accounting for thermal inertia and suitable for model localisation, the so-called virtual domain approximation. By efficiently exploiting high-performance computing resources, our new computational framework enables large-scale finite element analysis of metal additive manufacturing processes, with increased fidelity of predictions and dramatic reductions of computing times. It can also be combined with the proposed model reductions for fast thermal optimisation of the manufacturing process. 
These tools open the path to accelerate the understanding of the process-to-performance link and digital product design and certification in metal additive manufacturing, two milestones that are vital to exploit the technology for mass production.

Postprint (published version)

UPCommons. Portal del coneixement obert de la UPC

Tesis Doctorals en Xarxa