
    Large Deformation Diffeomorphic Metric Mapping And Fast-Multipole Boundary Element Method Provide New Insights For Binaural Acoustics

    This paper describes how Large Deformation Diffeomorphic Metric Mapping (LDDMM) can be coupled with a Fast Multipole (FM) Boundary Element Method (BEM) to investigate the relationship between morphological changes in the head, torso, and outer ears and their acoustic filtering (described by Head Related Transfer Functions, HRTFs). The LDDMM technique provides the ability to study and implement morphological changes in ear, head and torso shapes. The FM-BEM technique provides numerical simulations of the acoustic properties of an individual's head, torso, and outer ears. This paper describes the first application of LDDMM to the study of the relationship between a listener's morphology and a listener's HRTFs. To demonstrate some of the new capabilities provided by the coupling of these powerful tools, we examine the classical question of what it means to "listen through another individual's outer ears." This work utilizes the data provided by the Sydney York Morphological and Acoustic Recordings of Ears (SYMARE) database. Comment: Submitted as a conference paper to IEEE ICASSP 201

    Modelling Fluid Structure Interaction problems using Boundary Element Method

    This dissertation investigates the application of Boundary Element Methods (BEM) to Fluid Structure Interaction (FSI) problems from three main perspectives. The work is divided into three main parts: i) the derivation of a BEM for the Laplace equation and its application to the analysis of ship-wave interaction problems, ii) the implementation of efficient and parallel BEM solvers addressing the newest challenges of High Performance Computing, iii) the development of a BEM for the Stokes system and its application to the study of micro-swimmers. First we develop a BEM for the Laplace equation and apply it to predict ship-wave interactions, making use of an innovative coupling with Finite Element Method stabilization techniques. As is well known, the wave pattern around a body depends on the Froude number associated with the flow. Thus, we thoroughly investigate the robustness and accuracy of the developed methodology, assessing the dependence of the solution on this parameter. To improve the performance and tackle problems with a higher number of unknowns, the BEM developed for the Laplace equation is parallelized using OpenSOURCE techniques in a hybrid distributed-shared memory environment. We perform several tests to demonstrate both the accuracy and the performance of the parallel BEM developed. In addition, we explore two different possibilities to reduce the overall computational cost from O(N^2) to O(N). First, we couple the library with a Fast Multipole Method that allows us to reach a higher order of complexity and efficiency. Then we perform a preliminary study on the implementation of a parallel Non Uniform Fast Fourier Transform to be coupled with the newly developed Sparse Cardinal Sine Decomposition (SCSD) algorithm. Finally we consider the application of the BEM framework to a different kind of FSI problem, represented by the Stokes flow of a liquid medium surrounding swimming micro-organisms. We maintain the parallel structure derived for the Laplace equation even in the Stokes setting. Our implementation is able to simulate both prokaryotic and eukaryotic organisms, matching literature and experimental benchmarks. We finally present a deep analysis of the importance of hydrodynamic interactions between the different parts of micro-swimmers in the prediction of optimal swimming conditions, focusing our attention on the study of flagellated “robotic” composite swimmers.

    GPU Acceleration of a Non-Standard Finite Element Mesh Truncation Technique for Electromagnetics

    The emergence of General Purpose Graphics Processing Units (GPGPUs) provides new opportunities to accelerate applications involving a large number of regular computations. However, properly leveraging the computational resources of graphical processors is a very challenging task. In this paper, we use this kind of device to parallelize FE-IIEE (Finite Element-Iterative Integral Equation Evaluation), a non-standard finite element mesh truncation technique introduced by two of the authors. This application is computationally very demanding due to the amount, size and complexity of the data involved in the procedure. Besides, an efficient implementation becomes even more difficult if the parallelization has to maintain the complex workflow of the original code. The proposed implementation using CUDA applies different optimization techniques to improve performance. These include leveraging the fastest memories of the GPU and increasing the granularity of the computations to reduce the impact of memory access. We have applied our parallel algorithm to two real radiation and scattering problems, demonstrating speedups higher than 140 on a state-of-the-art GPU. This work was supported in part by the Spanish Government under Grant TEC2016-80386-P, Grant TIN2017-82972-R, and Grant ESP2015-68245-C4-1-P, and in part by the Valencian Regional Government under Grant PROMETEO/2019/109.

    Custom Integrated Circuits

    Contains table of contents for Part III, table of contents for Section 1 and reports on eleven research projects. IBM Corporation; MIT School of Engineering; National Science Foundation Grant MIP 94-23221; Defense Advanced Research Projects Agency/U.S. Army Intelligence Center Contract DABT63-94-C-0053; Mitsubishi Corporation; National Science Foundation Young Investigator Award Fellowship MIP 92-58376; Joint Industry Program on Offshore Structure Analysis; Analog Devices; Defense Advanced Research Projects Agency; Cadence Design Systems; MAFET Consortium; Consortium for Superconducting Electronics; National Defense Science and Engineering Graduate Fellowship; Digital Equipment Corporation; MIT Lincoln Laboratory; Semiconductor Research Corporation; Multiuniversity Research Initiative; National Science Foundation

    Scalable Fast Multipole Methods on Heterogeneous Architecture

    The N-body problem appears in many computational physics simulations. At each time step the computation involves an all-pairs sum whose complexity is quadratic, followed by an update of particle positions. This cost means that it is not practical to solve such dynamic N-body problems on a large scale. To improve this situation, we use both algorithmic and hardware approaches. Our algorithmic approach is to use the Fast Multipole Method (FMM), a divide-and-conquer algorithm that performs a fast N-body sum using a spatial decomposition and is often used in a time-stepping or iterative loop, to reduce this quadratic complexity to linear with guaranteed accuracy. Our hardware approach is to use heterogeneous clusters, which are composed of nodes that contain multi-core CPUs tightly coupled with accelerators such as graphics processing units (GPUs), as our underlying parallel processing hardware, on which efficient implementations require highly non-trivial re-designed algorithms. In this dissertation, we fundamentally reconsider the FMM algorithms on heterogeneous architectures to achieve a significant improvement over previous implementations in the literature and to make the algorithm ready for use as a workhorse simulation tool both for time-dependent vortex flow problems and for boundary element methods. Our major contributions include: 1. Novel FMM data structures using parallel construction algorithms for dynamic problems. 2. A fast heterogeneous FMM algorithm for both single and multiple computing nodes. 3. An efficient inter-node communication management using fast parallel data structures. 4. A scalable FMM algorithm using a novel Helmholtz decomposition for Vortex Methods (VM). The proposed algorithms can handle non-uniform distributions with irregular partition shapes to achieve workload balance, and their MPI-CUDA implementations are highly tuned and demonstrate state-of-the-art performance.
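
    The quadratic all-pairs sum mentioned in this abstract can be illustrated with a minimal sketch. It is not taken from the dissertation: the 1/r kernel, the function name `direct_potentials`, and the omission of physical constants are assumptions made only for illustration. It shows the direct summation whose cost the FMM is designed to reduce.

```python
import math


def direct_potentials(positions, charges):
    """Direct all-pairs sum: phi[i] = sum over j != i of q[j] / |x[i] - x[j]|.

    Hypothetical helper for illustration; the 1/r kernel is an assumption
    and physical constants are omitted.
    """
    n = len(positions)
    phi = [0.0] * n
    for i in range(n):          # every target point ...
        for j in range(n):      # ... interacts with every source point
            if i == j:
                continue        # skip the singular self-interaction
            phi[i] += charges[j] / math.dist(positions[i], positions[j])
    return phi


# Three point sources: the nested loops evaluate n * (n - 1) = 6 kernel terms.
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
strengths = [1.0, 2.0, 4.0]
potentials = direct_potentials(points, strengths)
```

    Doubling the number of points roughly quadruples the work in the nested loops, which is why this direct evaluation becomes impractical for large dynamic problems.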

    GPU fast multipole method with lambda-dynamics features

    A significant and computationally most demanding part of molecular dynamics simulations is the calculation of long-range electrostatic interactions. Such interactions can be evaluated directly by the naïve pairwise summation algorithm, which is a ubiquitous showcase example for the compute power of graphics processing units (GPUs). However, the pairwise summation has O(N^2) computational complexity for N interacting particles; thus, an approximation method with better scaling is required. Today, the prevalent method for such approximation in the field is particle mesh Ewald (PME). PME takes advantage of fast Fourier transforms (FFTs) to approximate the solution efficiently. However, as the underlying FFTs require all-to-all communication between ranks, PME runs into a communication bottleneck. Such communication overhead is negligible only for moderate parallelization. With increased parallelization, as needed for high-performance applications, the usage of PME becomes unprofitable. Another PME drawback is its inability to perform constant pH simulations efficiently. In such simulations, the protonation states of a protein are allowed to change dynamically during the simulation. The description of this process requires a separate evaluation of the energies for each protonation state. This cannot be calculated efficiently with PME, as the algorithm requires a repeated FFT for each state, which leads to a linear overhead with respect to the number of states. For a fast approximation of pairwise Coulombic interactions that does not suffer from the PME drawbacks, the Fast Multipole Method (FMM) has been implemented and fully parallelized with CUDA. To ensure optimal FMM performance for diverse MD systems, multiple parallelization strategies have been developed. The algorithm has been efficiently incorporated into GROMACS and subsequently tested to determine the optimal FMM parameter set for MD simulations. Finally, the FMM has been incorporated into GROMACS to allow for out-of-the-box electrostatic calculations. The performance of the single-GPU FMM implementation, tested in GROMACS 2019, achieves about a third of highly optimized CUDA PME performance when simulating systems with uniform particle distributions. However, the FMM is expected to outperform PME at high parallelization because the FMM global communication overhead is minimal compared to that of PME. Further, the FMM has been enhanced to provide the energies of an arbitrary number of titratable sites as needed in the constant-pH method. The extension is not fully optimized yet, but the first results show the strength of the FMM for constant pH simulations. For a relatively large system with half a million particles and more than a hundred titratable sites, a straightforward approach to compute alternative energies requires the repetition of a simulation for each state of the sites. The FMM calculates all energy terms only a factor of 1.5 slower than a single simulation step. Further improvements of the GPU implementation are expected to yield even more speedup compared to the current implementation. 2021-11-1

    Perceptually Driven Interactive Sound Propagation for Virtual Environments

    Sound simulation and rendering can significantly augment a user's sense of presence in virtual environments. Many techniques for sound propagation have been proposed that predict the behavior of sound as it interacts with the environment and is received by the user. At a broad level, the propagation algorithms can be classified into reverberation filters, geometric methods, and wave-based methods. In practice, heuristic methods based on reverberation filters are simple to implement and have a low computational overhead, while wave-based algorithms are limited to static scenes and involve extensive precomputation. However, relatively little work has been done on the psychoacoustic characterization of different propagation algorithms, and evaluating the relationship between scientific accuracy and perceptual benefits. In this dissertation, we present perceptual evaluations of sound propagation methods and their ability to model complex acoustic effects for virtual environments. Our results indicate that scientifically accurate methods for reverberation and diffraction do result in increased perceptual differentiation. Based on these evaluations, we present two novel hybrid sound propagation methods that combine the accuracy of wave-based methods with the speed of geometric methods for interactive sound propagation in dynamic scenes. Our first algorithm couples modal sound synthesis with geometric sound propagation using wave-based sound radiation to perform mode-aware sound propagation. We introduce diffraction kernels of rigid objects, which encapsulate the sound diffraction behaviors of individual objects in the free space and are then used to simulate plausible diffraction effects using an interactive path tracing algorithm. Finally, we present a novel perceptually driven metric that can be used to accelerate the computation of late reverberation to enable plausible simulation of reverberation with a low runtime overhead. We highlight the benefits of our novel propagation algorithms in different scenarios. Doctor of Philosophy

    Multi-solver schemes for electromagnetic modeling of large and complex objects

    The work in this dissertation primarily focuses on the development of numerical algorithms for electromagnetic modeling of large and complex objects. First, a GPU-accelerated multilevel fast multipole algorithm (MLFMA) is presented to improve the efficiency of the traditional MLFMA by taking advantage of GPU hardware advancement. The proposed hierarchical parallelization strategy ensures a high computational throughput for the GPU calculation. The resulting OpenMP-based multi-GPU implementation is capable of solving real-life problems with over one million unknowns with a remarkable speedup. The radar cross sections (RCS) of a few benchmark objects are calculated to demonstrate the accuracy of the solution. The results are compared with those from the CPU-based MLFMA and measurements. The capability and efficiency of the presented method are analyzed through the examples of a sphere, an aircraft, and a missile-like object. Compared with the 8-threaded CPU-based MLFMA, the OpenMP-CUDA-MLFMA method can achieve a total speedup of 5 to 20 times. Second, an efficient and accurate finite element-boundary integral (FE-BI) method is proposed for solving electromagnetic scattering and radiation problems. A mixed testing scheme, in which the Rao-Wilton-Glisson and the Buffa-Christiansen functions are both employed as the testing functions, is first presented to improve the accuracy of the FE-BI method. An efficient absorbing boundary condition (ABC)-based preconditioner is then proposed to accelerate the convergence of the iterative solution. To further improve the efficiency of the total computation, a GPU-accelerated MLFMA is applied to the iterative solution. The RCSs of several benchmark objects are calculated to demonstrate the numerical accuracy of the solution and also to show that the proposed method is not only free of interior resonance corruption but also converges better than the conventional FE-BI methods. 
The capability and efficiency of the proposed method are analyzed through several numerical examples, including a large dielectric coated sphere, a partial human body, and a coated missile-like object. Compared with the 8-threaded CPU-based algorithm, the GPU-accelerated FE-BI-MLFMA algorithm can achieve a total speedup of up to 25.5 times. Third, a multi-solver (MS) scheme based on the combined field integral equation (CFIE) is proposed. In this scheme, an object is decomposed into multiple bodies based on its material property and geometry. To model bodies with complicated materials, the FE-BI method is applied. To model bodies with homogeneous or conducting materials, the method of moments is employed. Specifically, three solvers are integrated in this multi-solver scheme: the FE-BI(CFIE) for inhomogeneous objects, the CFIE for dielectric objects, and the CFIE for conducting objects. A mixed testing scheme that utilizes both the Rao-Wilton-Glisson and the Buffa-Christiansen functions is adopted to obtain good accuracy of the proposed multi-solver algorithm. In the iterative solution of the combined system, the MLFMA is applied to accelerate computation and reduce memory costs, and an ABC-based preconditioner is employed to speed up the convergence. In the numerical examples, the individual solvers are first demonstrated to be well conditioned and highly accurate. Then the validity of the proposed multi-solver scheme is demonstrated and its capability is shown by solving scattering problems of electrically large missile-like objects. Fourth, an MS scheme based on the Robin transmission condition (RTC) is proposed. Different from the FE-BI method that applies BI equations to truncate the FE domain, this proposed multi-solver scheme employs both FE and BI equations to model an object along with its background. 
To be specific, the entire computational domain consisting of the object and its background is first decomposed into multiple non-overlapping subdomains with each modeled by either an FE or BI equation. The equations in the subdomains are then coupled into a multi-solver system by enforcing the RTC at the subdomain interfaces. Finally, the combined system is solved iteratively with the application of an extended ABC-based preconditioner and the MLFMA. To obtain an accurate solution, both the Rao-Wilton-Glisson and the Buffa-Christiansen functions are employed as the testing functions to discretize the BI equations. This scheme is applied to a variety of benchmark problems and the scattering from an aircraft with a launched missile to demonstrate its accuracy, versatility, and capability. The proposed scheme is compared with the MS-CFIE to illustrate the differences between the two schemes. Fifth, to further improve the modeling capability, an accelerated MS method is developed on distributed computing systems to simulate the scattering from very large and complex objects. The parallelization strategy is to parallelize different subdomains individually, which is different from the parallelized domain decomposition methods, where the subdomains are handled in parallel. The multilevel fast multipole algorithm is parallelized to enable computation on many processors. The modeling strategy using the MS-RTC method is also discussed so that one can easily follow the guideline to model large and complex objects. Numerical examples are given to show the parallel efficiency of the proposed strategy and the modeling capability of the proposed method. Finally, the specific absorption rate (SAR) in a human head at 5G frequencies is simulated by taking advantage of the MS-RTC method. Based on the strong skin effect, the human head model is first simplified to reduce the computation cost. Then the MS-RTC method is applied to model the human head. 
Numerical examples show that the MS method is very efficient in solving for the electromagnetic fields in the human head and that the simplified human head model can be used in the SAR simulation with acceptable accuracy.

    Generation of highly-energetic electrons in laser-plasma wakefields
