259 research outputs found

    Scalable Parallel Delaunay Image-to-Mesh Conversion for Shared and Distributed Memory Architectures

    Get PDF
    Mesh generation is an essential component for many engineering applications. The ability to generate meshes in parallel is critical for the scalability of the entire Finite Element Method (FEM) pipeline. However, parallel mesh generation applications belong to the broader class of adaptive and irregular problems, and are among the most complex, challenging, and labor intensive to develop and maintain. In this thesis, we summarize several years of the progress that we made in a novel framework for highly scalable and guaranteed quality mesh generation for finite element analysis in three dimensions. We studied and developed parallel mesh generation algorithms on both shared and distributed memory architectures. In this thesis we present a novel two-level parallel tetrahedral mesh generation framework capable of delivering and sustaining close to 6000 of concurrent work units (cores). We achieve this by leveraging concurrency at two different granularity levels by using a hybrid message passing and multi-threaded execution model which is suitable to the hierarchy of the hardware architecture of the distributed memory clusters. An end-user productivity and scalability study was performed on up to 6000 cores, and indicated very good end-user productivity with about 300 million tets per second and about 3600 weak scaling speedup. Both of these results suggest that: compared to the best previous algorithm, we have seen an improvement of more than 7000 times in performance, measured in terms of speed (elements per second) by using about 180 times more CPUs, for geometries that are by many orders of magnitude more complex

    Parallel Anisotropic Unstructured Grid Adaptation

    Get PDF
    Computational Fluid Dynamics (CFD) has become critical to the design and analysis of aerospace vehicles. Parallel grid adaptation that resolves multiple scales with anisotropy is identified as one of the challenges in the CFD Vision 2030 Study to increase the capacity and capability of CFD simulation. The Study also cautions that computer architectures are undergoing a radical change and dramatic increases in algorithm concurrency will be required to exploit full performance. This paper reviews four different methods to parallel anisotropic grid generation. They cover both ends of the spectrum: (i) using existing state-of-the-art software optimized for a single core and modifying it for parallel platforms and (ii) designing and implementing scalable software with incomplete, but rapidly maturating functionality. A brief overview for each grid adaptation system is presented in the context of a telescopic approach for multilevel concurrency. These methods employ different approaches to enable parallel execution, which provides a unique opportunity to illustrate the relative behavior of each approach. Qualitative and quantitative metric evaluations are used to draw lessons for future developments in this critical area for parallel CFD simulation

    Extreme-Scale Parallel Mesh Generation: Telescopic Approach

    Get PDF
    In this poster we focus and present our preliminary results pertinent to the integration of multiple parallel Delaunay mesh generation methods into a coherent hierarchical framework. The goal of this project is to study our telescopic approach and to develop Delaunay-based methods to explore concurrency at all hardware layers using abstractions at (a) medium-grain level for many cores within a single chip and (b) coarse-grain level, i.e., sub-domain level using proper error metric- and application-specific continuous decomposition methods

    Parallel Anisotropic Unstructured Grid Adaptation

    Get PDF
    Computational fluid dynamics (CFD) has become critical to the design and analysis of aerospace vehicles. Parallel grid adaptation that resolves multiple scales with anisotropy is identified as one of the challenges in the CFD Vision 2030 Study to increase the capacity and capability of CFD simulation. The study also cautions that computer architectures are undergoing a radical change, and dramatic increases in algorithm concurrency will be required to exploit full performance. This paper reviews four different methods to parallel anisotropic grid adaptation. They cover both ends of the spectrum: 1) using existing state-of-the-art software optimized for a single core and modifying it for parallel platforms, and 2) designing and implementing scalable software with incomplete but rapidly maturing functionality. A brief overview for each grid adaptation system is presented in the context of a telescopic approach for multilevel concurrency. These methods employ different approaches to enable parallel execution, which provides a unique opportunity to illustrate the relative behavior of each approach. Qualitative and quantitative metric evaluations are used to draw lessons for future developments in this critical area for parallel CFD simulation

    A Unified Framework for Parallel Anisotropic Mesh Adaptation

    Get PDF
    Finite-element methods are a critical component of the design and analysis procedures of many (bio-)engineering applications. Mesh adaptation is one of the most crucial components since it discretizes the physics of the application at a relatively low cost to the solver. Highly scalable parallel mesh adaptation methods for High-Performance Computing (HPC) are essential to meet the ever-growing demand for higher fidelity simulations. Moreover, the continuous growth of the complexity of the HPC systems requires a systematic approach to exploit their full potential. Anisotropic mesh adaptation captures features of the solution at multiple scales while, minimizing the required number of elements. However, it also introduces new challenges on top of mesh generation. Also, the increased complexity of the targeted cases requires departing from traditional surface-constrained approaches to utilizing CAD (Computer-Aided Design) kernels. Alongside the functionality requirements, is the need of taking advantage of the ubiquitous multi-core machines. More importantly, the parallel implementation needs to handle the ever-increasing complexity of the mesh adaptation code. In this work, we develop a parallel mesh adaptation method that utilizes a metric-based approach for generating anisotropic meshes. Moreover, we enhance our method by interfacing with a CAD kernel, thus enabling its use on complex geometries. We evaluate our method both with fixed-resolution benchmarks and within a simulation pipeline, where the resolution of the discretization increases incrementally. With the Telescopic Approach for scalable mesh generation as a guide, we propose a parallel method at the node (multi-core) for mesh adaptation that is expected to scale up efficiently to the upcoming exascale machines. To facilitate an effective implementation, we introduce an abstract layer between the application and the runtime system that enables the use of task-based parallelism for concurrent mesh operations. Our evaluation indicates results comparable to state-of-the-art methods for fixed-resolution meshes both in terms of performance and quality. The integration with an adaptive pipeline offers promising results for the capability of the proposed method to function as part of an adaptive simulation. Moreover, our abstract tasking layer allows the separation of different aspects of the implementation without any impact on the functionality of the method

    Efficient Generating And Processing Of Large-Scale Unstructured Meshes

    Get PDF
    Unstructured meshes are used in a variety of disciplines to represent simulations and experimental data. Scientists who want to increase accuracy of simulations by increasing resolution must also increase the size of the resulting dataset. However, generating and processing a extremely large unstructured meshes remains a barrier. Researchers have published many parallel Delaunay triangulation (DT) algorithms, often focusing on partitioning the initial mesh domain, so that each rectangular partition can be triangulated in parallel. However, the comproblems for this method is how to merge all triangulated partitions into a single domain-wide mesh or the significant cost for communication the sub-region borders. We devised a novel algorithm --Triangulation of Independent Partitions in Parallel (TIPP) to deal with very large DT problems without requiring inter-processor communication while still guaranteeing the Delaunay criteria. The core of the algorithm is to find a set of independent} partitions such that the circumcircles of triangles in one partition do not enclose any vertex in other partitions. For this reason, this set of independent partitions can be triangulated in parallel without affecting each other. The results of mesh generation is the large unstructured meshes including vertex index and vertex coordinate files which introduce a new challenge \-- locality. Partitioning unstructured meshes to improve locality is a key part of our own approach. Elements that were widely scattered in the original dataset are grouped together, speeding data access. For further improve unstructured mesh partitioning, we also described our new approach. Direct Load which mitigates the challenges of unstructured meshes by maximizing the proportion of useful data retrieved during each read from disk, which in turn reduces the total number of read operations, boosting performance

    Real-Time High-Quality Image to Mesh Conversion for Finite Element Simulations

    Get PDF
    Technological Advances in Medical Imaging have enabled the acquisition of images accurately describing biological tissues. Finite Element (FE) methods on these images provide the means to simulate biological phenomena such as brain shift registration, respiratory organ motion, blood flow pressure in vessels, etc. FE methods require the domain of tissues be discretized by simpler geometric elements, such as triangles in two dimensions, tetrahedra in three, and pentatopes in four. This exact discretization is called a mesh . The accuracy and speed of FE methods depend on the quality and fidelity of the mesh used to describe the biological object. Elements with bad quality introduce numerical errors and slower solver convergence. Also, analysis based on poor fidelity meshes do not yield accurate results specially near the surface. In this dissertation, we present the theory and the implementation of both a sequential and a parallel Delaunay meshing technique for 3D and ---for the first time--- 4D space-time domains. Our method provably guarantees that the mesh is a faithful representation of the multi-tissue domain in topological and geometric sense. Moreover, we show that our method generates graded elements of bounded radius-edge and aspect ratio, which renders our technique suitable for Finite Element analysis. A notable feature of our implementation is speed and scalability. The single-threaded performance of our 3D code is faster than the state of the art open source meshing tools. Experimental evaluation shows a more than 82% weak scaling efficiency for up to 144 cores, reaching a rate of more than 14.3 million elements per second. This is the first 3D parallel Delaunay refinement method to achieve such a performance, on either distributed or shared-memory architectures. Lastly, this dissertation is the first to develop and examine the sequential and parallel high-quality and fidelity meshing of general space-time 4D multi-tissue domains

    Characterization and surface reconstruction of objects in tomographic images of composite materials

    Get PDF
    Dissertação para obtenção do Grau de Mestre em Engenharia InformáticaIn the scope of the project Tomo-GPU supported by FCT / MCTES the aim is to build an interactive graphical environment that allows a Materials specialist to define their own programs for analysis of 3D tomographic images. This project aims to build a tool to characterize and investigate the identified objects, where the user can define search criteria such as size, orientation, bounding boxes, among others. All this processing will be done on a desktop computer equipped with a graphics card with some processing power. On the proposed solution the modules for characterizing objects, received from the identification phase, will be implemented using some existing software libraries, most notably the CGAL library. The characterization modules with bigger execution times will be implemented using OpenCL and GPUs. With this work the characterization and reconstruction of objects and their research can now be done on conventional machines, using GPUs to accelerate the most time-consuming computations. After the conclusion of this thesis, new tools that will help to improve the current development cycle of new materials will be available for Materials Science specialists

    Finite Element Modeling Driven by Health Care and Aerospace Applications

    Get PDF
    This thesis concerns the development, analysis, and computer implementation of mesh generation algorithms encountered in finite element modeling in health care and aerospace. The finite element method can reduce a continuous system to a discrete idealization that can be solved in the same manner as a discrete system, provided the continuum is discretized into a finite number of simple geometric shapes (e.g., triangles in two dimensions or tetrahedrons in three dimensions). In health care, namely anatomic modeling, a discretization of the biological object is essential to compute tissue deformation for physics-based simulations. This thesis proposes an efficient procedure to convert 3-dimensional imaging data into adaptive lattice-based discretizations of well-shaped tetrahedra or mixed elements (i.e., tetrahedra, pentahedra and hexahedra). This method operates directly on segmented images, thus skipping a surface reconstruction that is required by traditional Computer-Aided Design (CAD)-based meshing techniques and is convoluted, especially in complex anatomic geometries. Our approach utilizes proper mesh gradation and tissue-specific multi-resolution, without sacrificing the fidelity and while maintaining a smooth surface to reflect a certain degree of visual reality. Image-to-mesh conversion can facilitate accurate computational modeling for biomechanical registration of Magnetic Resonance Imaging (MRI) in image-guided neurosurgery. Neuronavigation with deformable registration of preoperative MRI to intraoperative MRI allows the surgeon to view the location of surgical tools relative to the preoperative anatomical (MRI) or functional data (DT-MRI, fMRI), thereby avoiding damage to eloquent areas during tumor resection. This thesis presents a deformable registration framework that utilizes multi-tissue mesh adaptation to map preoperative MRI to intraoperative MRI of patients who have undergone a brain tumor resection. Our enhancements with mesh adaptation improve the accuracy of the registration by more than 5 times compared to rigid and traditional physics-based non-rigid registration, and by more than 4 times compared to publicly available B-Spline interpolation methods. The adaptive framework is parallelized for shared memory multiprocessor architectures. Performance analysis shows that this method could be applied, on average, in less than two minutes, achieving desirable speed for use in a clinical setting. The last part of this thesis focuses on finite element modeling of CAD data. This is an integral part of the design and optimization of components and assemblies in industry. We propose a new parallel mesh generator for efficient tetrahedralization of piecewise linear complex domains in aerospace. CAD-based meshing algorithms typically improve the shape of the elements in a post-processing step due to high complexity and cost of the operations involved. On the contrary, our method optimizes the shape of the elements throughout the generation process to obtain a maximum quality and utilizes high performance computing to reduce the overheads and improve end-user productivity. The proposed mesh generation technique is a combination of Advancing Front type point placement, direct point insertion, and parallel multi-threaded connectivity optimization schemes. The mesh optimization is based on a speculative (optimistic) approach that has been proven to perform well on hardware-shared memory. The experimental evaluation indicates that the high quality and performance attributes of this method see substantial improvement over existing state-of-the-art unstructured grid technology currently incorporated in several commercial systems. The proposed mesh generator will be part of an Extreme-Scale Anisotropic Mesh Generation Environment to meet industries expectations and NASA\u27s CFD visio
    • …
    corecore