Randomized Computations for Efficient and Robust Finite Element Domain Decomposition Methods in Electromagnetics
Numerical modeling of electromagnetic (EM) phenomena has proved to be an effective and efficient tool in the design and optimization of modern electronic devices, integrated circuits (ICs) and RF systems. However, the generality, efficiency and reliability/resilience of computational EM (CEM) solvers are often criticised because the underlying characteristics of the simulated problems usually differ, which makes developing a general, "black-box" EM solver a difficult task.
In this work, we propose a reliable/resilient, scalable and efficient finite element based domain decomposition method (FE-DDM) as a general CEM solver that addresses such problems to some extent. We recognize the rank deficiency of the Dirichlet-to-Neumann (DtN) operators involved in the previously proposed FETI-2 DDM formulation and exploit it to improve the computational efficiency and robustness of FETI-2 DDM. Specifically, the rank-deficient DtN operator is computed by a randomized method that was originally proposed to approximate the matrix singular value decomposition (SVD). Numerical results show that savings of up to 35% in run time and 75% in memory can be achieved for the DtN operator computation on a realistic example. This rank deficiency principle is then incorporated into a new global DDM preconditioner (W-FETI) inspired by the Woodbury matrix identity. A numerical study of the eigenspectrum shows the validity of the proposed W-FETI global preconditioner. Several industrial-scale examples show a significant iterative convergence advantage for W-FETI, with a 35%-80% difference in matrix-vector products (MxVs) relative to state-of-the-art DDM solvers.
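The abstract above relies on a randomized method for approximating the SVD of a rank-deficient operator. The following is a minimal sketch of one common randomized range-finder variant, not the thesis's implementation. The function names, the oversampling parameter and the dense test matrix standing in for a DtN operator are all assumptions made for illustration.

```python
import numpy as np


def randomized_svd(A, rank, oversample=10, seed=0):
    """Approximate a rank-`rank` SVD of A by random range sampling."""
    rng = np.random.default_rng(seed)
    n = A.shape[1]
    k = min(n, rank + oversample)          # number of random probe vectors
    omega = rng.standard_normal((n, k))    # Gaussian test matrix
    Q, _ = np.linalg.qr(A @ omega)         # orthonormal basis for the sampled range
    B = Q.conj().T @ A                     # small (k x n) projected matrix
    Ub, s, Vh = np.linalg.svd(B, full_matrices=False)
    U = Q @ Ub                             # lift left singular vectors back
    return U[:, :rank], s[:rank], Vh[:rank, :]


def demo_relative_error(m=200, n=150, true_rank=8, seed=1):
    """Relative Frobenius error on a synthetic matrix of exact rank `true_rank`."""
    rng = np.random.default_rng(seed)
    # Hypothetical stand-in for a rank-deficient operator.
    A = rng.standard_normal((m, true_rank)) @ rng.standard_normal((true_rank, n))
    U, s, Vh = randomized_svd(A, rank=true_rank)
    return np.linalg.norm(A - (U * s) @ Vh) / np.linalg.norm(A)
```

Only the products `A @ omega` and `Q.conj().T @ A` touch the full operator, so the cost is driven by the target rank rather than by the full matrix dimension. That is the general reason such methods can save run time and memory when the operator is strongly rank deficient.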
Numerical aspects of the PUFEM for efficient solution of Helmholtz problems
Conventional finite element methods (FEM) have been used for many years for the
solution of harmonic wave problems. To ensure accurate simulation, each wavelength
is discretised into around ten nodal points, with the finite element mesh being updated
for each frequency to maintain adequate resolution of the wave pattern. This
technique works well when the wavelength is long or the model domain is small.
However, when the converse applies and the wavelength is small or the domain of
interest is large, the finite element mesh requires a large number of elements, and
the procedure becomes computationally expensive and impractical.
The principal objective of this work is to accurately model two-dimensional Helmholtz
problems with the Partition of Unity Finite Element Method (PUFEM). This will
be achieved by applying the plane wave basis decomposition to the wave field. These
elements allow us to relax the traditional requirement of around ten nodal points per
wavelength and therefore solve Helmholtz wave problems without refining the mesh
of the computational domain at each frequency.
Various numerical aspects affecting the efficiency of the PUFEM are analysed in
order to improve its potential. The accuracy and effectiveness of the method are
investigated by comparing solutions for selected problems with available analytical
solutions or with high-resolution numerical solutions using conventional finite elements.
First, the use of plane waves or cylindrical waves in the enrichment process is assessed
for wave scattering problems involving a rigid circular cylinder in both near field and
far field. In the far field, the cylindrical waves proved to be more effective in reducing
the computational effort. However, because plane waves are simpler to integrate analytically
for straight-edge elements during the finite element assembly process,
they are retained for the remainder of the thesis.
The analysed numerical aspects, which may affect the PUFEM performance, include
the conjugated and unconjugated weighting, the geometry description, the use of
non-reflecting boundary conditions, and the h-, p- and q-convergence. To speed up
the element assembling process at high wave numbers, an exact integration procedure
is implemented.
The PUFEM is also assessed on multiple scattering problems, involving sets of circular
cylinders, and on exterior wave problems presenting singularities in the geometry
of the scatterer. Large and small elements, in comparison to the wavelength, are
used with both constant and variable numbers of enriching plane waves.
Last, the system resulting from the PUFEM is solved iteratively using a preconditioner based on
incomplete lower-upper (ILU) factorization. To further enhance the efficiency of the iterative
solution, the resulting system is solved in the wavelet domain.
Overall, compared to the FEM, the PUFEM leads to drastic reduction of the total
number of degrees of freedom required to solve a wave problem. It also leads to very
good performance when large elements, compared to the wavelength, are used with
high numbers of enriching plane waves, rather than small elements with low numbers
of plane waves. Because geometric details must be described, it is practical to use both large
and small elements. In this case, to keep the conditioning within acceptable limits,
it is necessary to vary the number of enriching plane waves with the element size.
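The plane wave enrichment described in this abstract can be sketched as follows. In the usual PUFEM form, the field is approximated as u(x) ≈ Σ_j N_j(x) Σ_l A_jl exp(i k d_l · x), where N_j are the standard shape functions and d_l are unit propagation directions. The code below only evaluates the enrichment functions exp(i k d_l · x). The equally spaced directions and the function name are assumptions, not details taken from the thesis.

```python
import numpy as np


def plane_wave_basis(points, wavenumber, num_directions):
    """Evaluate q plane waves exp(i k d_l . x) at 2D points.

    points: array of shape (npts, 2)
    returns: complex array of shape (npts, num_directions)
    """
    # Assumption: directions are equally spaced on the unit circle.
    angles = 2.0 * np.pi * np.arange(num_directions) / num_directions
    directions = np.stack([np.cos(angles), np.sin(angles)], axis=1)
    phase = wavenumber * (np.asarray(points, dtype=float) @ directions.T)
    return np.exp(1j * phase)
```

Each node then carries `num_directions` unknown amplitudes instead of a single nodal value, which is why an element can span several wavelengths without the mesh being refined for each frequency.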
Investigation of general-purpose computing on graphics processing units and its application to the finite element analysis of electromagnetic problems
In this dissertation, the hardware and API architectures of GPUs are investigated, and the corresponding acceleration techniques are applied to the traditional frequency-domain finite element method (FEM), the element-level time-domain methods, and the nonlinear discontinuous Galerkin method. First, the assembly and the solution phases of the FEM are parallelized and mapped onto the granular GPU processors. Efficient parallelization strategies for the finite element matrix assembly on a single GPU and on multiple GPUs are proposed. The parallelization strategies for the finite element matrix solution, in conjunction with parallelizable preconditioners, are investigated to reduce the total solution time. Second, the element-level dual-field domain decomposition (DFDD-ELD) method is parallelized on the GPU. The element-level algorithms treat each finite element as a subdomain, where the elements march the fields in time by exchanging fields and fluxes on the element boundary interfaces with the neighboring elements. The proposed parallelization framework is readily applicable to similar element-level algorithms, and its application to the discontinuous Galerkin time-domain (DGTD) methods shows good acceleration results. Third, the element-level parallelization framework is further adapted to accelerate a nonlinear DGTD algorithm, which has potential applications in the field of optics. The proposed nonlinear DGTD algorithm describes the third-order instantaneous nonlinear effect between the electromagnetic field and the medium permittivity. The Newton-Raphson method is incorporated to reduce the number of nonlinear iterations through its quadratic convergence. Various nonlinear examples are presented to show the different Kerr effects observed through the third-order nonlinearity. With the acceleration using MPI+GPU in large cluster environments, the solution times for the various linear and nonlinear examples are significantly reduced.
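The role of Newton-Raphson iteration for an instantaneous third-order (Kerr-type) nonlinearity can be illustrated on a scalar toy problem. This is a sketch under stated assumptions, not the dissertation's DGTD formulation: it assumes a normalized constitutive relation d = eps_r * e + chi3 * e**3 and solves it for the field e given the flux density d. All names and values are hypothetical.

```python
def solve_kerr_field(d, eps_r, chi3, tol=1e-12, max_iter=50):
    """Solve eps_r*e + chi3*e**3 = d for e by Newton-Raphson.

    Returns (e, iterations). Assumes eps_r > 0 and chi3 >= 0, so the
    residual is monotone in e and Newton's method is well behaved.
    """
    e = d / eps_r                              # linear-medium initial guess
    for iteration in range(max_iter):
        residual = eps_r * e + chi3 * e**3 - d
        if abs(residual) < tol:
            return e, iteration
        slope = eps_r + 3.0 * chi3 * e**2      # derivative of the residual
        e -= residual / slope                  # Newton update
    raise RuntimeError("Newton-Raphson did not converge")
```

Near the solution the error is roughly squared at each step, so only a handful of iterations are typically needed. In a time-domain solver a solve of this kind would be repeated at every time step, which is why reducing the iteration count matters.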
The Unified-FFT Method for Fast Solution of Integral Equations as Applied to Shielded-Domain Electromagnetics
Electromagnetic (EM) solvers are widely used within computer-aided design (CAD) to improve and ensure the success of circuit designs. Unfortunately, due to the complexity of Maxwell's equations, they are often computationally expensive. While considerable progress has been made in the realm of speed-enhanced EM solvers, these fast solvers generally achieve their results through methods that introduce additional error components by way of geometric approximations, sparse-matrix approximations, multilevel decomposition of interactions, and more. This work introduces a new method, Unified-FFT (UFFT). A derivative of the method of moments, UFFT scales as O(N log N) and achieves fast analysis through the unique combination of FFT-enhanced matrix fill operations (MFO) with FFT-enhanced matrix solve operations (MSO).
In this work, two versions of UFFT are developed, UFFT-Precorrected (UFFT-P) and UFFT-Grid Totalizing (UFFT-GT). UFFT-P uses precorrected FFT for MSO and allows the use of basis functions that do not conform to a regular grid. UFFT-GT uses conjugate gradient FFT for MSO and features the capability of reducing the error of the solution down to machine precision. The main contribution of UFFT-P is a fast solver, which utilizes FFT for both MFO and MSO. It is demonstrated in this work to not only provide simulation results for large problems considerably faster than state of the art commercial tools, but also to be capable of simulating geometries which are too complex for conventional simulation. In UFFT-P these benefits come at the expense of a minor penalty to accuracy.
UFFT-GT contains further contributions, as it demonstrates that such a fast solver can be accurate to numerical precision as compared to a full, direct analysis. It is shown to provide even more algorithmic efficiency and faster performance than UFFT-P. UFFT-GT makes an additional contribution in that it is developed not only for planar geometries, but also for the case of multilayered dielectrics and metallization. This functionality is particularly useful for multilayered printed circuit boards (PCBs) and integrated circuits (ICs). Finally, UFFT-GT contributes a 3D planar solver, which allows current to be discretized in the z-direction. This allows similarly fast and accurate simulation with the inclusion of some 3D features, such as vias connecting metallization planes.
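The O(N log N) scaling of FFT-based solve operations comes from the fact that, on a regular grid, the interaction matrix has Toeplitz (convolution) structure, so each matrix-vector product inside an iterative solver can be done with FFTs. The sketch below shows that generic building block in one dimension using circulant embedding. It is not the UFFT implementation, and the function names are assumptions.

```python
import numpy as np


def toeplitz_matvec_fft(first_col, first_row, x):
    """Multiply a Toeplitz matrix by x in O(n log n) via circulant embedding."""
    n = len(x)
    # First column of the 2n x 2n circulant that contains the Toeplitz block.
    c = np.concatenate([first_col, [0.0], first_row[:0:-1]])
    y = np.fft.ifft(np.fft.fft(c) * np.fft.fft(x, 2 * n))  # circular convolution
    return y[:n]                                           # keep the Toeplitz part


def dense_toeplitz(first_col, first_row):
    """O(n^2) reference construction, used only to check the fast product."""
    n = len(first_col)
    T = np.empty((n, n), dtype=np.result_type(first_col, first_row))
    for i in range(n):
        for j in range(n):
            T[i, j] = first_col[i - j] if i >= j else first_row[j - i]
    return T
```

A conjugate gradient iteration built on `toeplitz_matvec_fft` never forms the dense matrix, so memory grows linearly with the number of unknowns. The product itself is exact up to rounding, which is consistent with the abstract's point that an FFT-based solve need not sacrifice accuracy.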
Multi-solver schemes for electromagnetic modeling of large and complex objects
The work in this dissertation primarily focuses on the development of numerical algorithms for electromagnetic modeling of large and complex objects.
First, a GPU-accelerated multilevel fast multipole algorithm (MLFMA) is presented to improve the efficiency of the traditional MLFMA by taking advantage of GPU hardware advancement. The proposed hierarchical parallelization strategy ensures a high computational throughput for the GPU calculation. The resulting OpenMP-based multi-GPU implementation is capable of solving real-life problems with over one million unknowns with a remarkable speedup. The radar cross sections (RCS) of a few benchmark objects are calculated to demonstrate the accuracy of the solution. The results are compared with those from the CPU-based MLFMA and measurements. The capability and efficiency of the presented method are analyzed through the examples of a sphere, an aircraft, and a missile-like object. Compared with the 8-threaded CPU-based MLFMA, the OpenMP-CUDA-MLFMA method can achieve from 5 to 20 times total speedup.
Second, an efficient and accurate finite element-boundary integral (FE-BI) method is proposed for solving electromagnetic scattering and radiation problems. A mixed testing scheme, in which the Rao-Wilton-Glisson and the Buffa-Christiansen functions are both employed as the testing functions, is first presented to improve the accuracy of the FE-BI method. An efficient absorbing boundary condition (ABC)-based preconditioner is then proposed to accelerate the convergence of the iterative solution. To further improve the efficiency of the total computation, a GPU-accelerated MLFMA is applied to the iterative solution. The RCSs of several benchmark objects are calculated to demonstrate the numerical accuracy of the solution and also to show that the proposed method is not only free of interior resonance corruption, but also converges better than the conventional FE-BI methods. The capability and efficiency of the proposed method are analyzed through several numerical examples, including a large dielectric-coated sphere, a partial human body, and a coated missile-like object. Compared with the 8-threaded CPU-based algorithm, the GPU-accelerated FE-BI-MLFMA algorithm can achieve a total speedup of up to 25.5 times.
Third, a multi-solver (MS) scheme based on combined field integral equation (CFIE) is proposed. In this scheme, an object is decomposed into multiple bodies based on its material property and geometry. To model bodies with complicated materials, the FE-BI method is applied. To model bodies with homogeneous or conducting materials, the method of moments is employed. Specifically, three solvers are integrated in this multi-solver scheme: the FE-BI(CFIE) for inhomogeneous objects, the CFIE for dielectric objects, and the CFIE for conducting objects. A mixed testing scheme that utilizes both the Rao-Wilton-Glisson and the Buffa-Christiansen functions is adopted to obtain a good accuracy of the proposed multi-solver algorithm. In the iterative solution of the combined system, the MLFMA is applied to accelerate computation and reduce memory costs, and an ABC-based preconditioner is employed to speed up the convergence. In the numerical examples, the individual solvers are first demonstrated to be well conditioned and highly accurate. Then the validity of the proposed multi-solver scheme is demonstrated and its capability is shown by solving scattering problems of electrically large missile-like objects.
Fourth, an MS scheme based on the Robin transmission condition (RTC) is proposed. Unlike the FE-BI method, which applies BI equations to truncate the FE domain, this multi-solver scheme employs both FE and BI equations to model an object along with its background. Specifically, the entire computational domain consisting of the object and its background is first decomposed into multiple non-overlapping subdomains, each modeled by either an FE or a BI equation. The equations in the subdomains are then coupled into a multi-solver system by enforcing the RTC at the subdomain interfaces. Finally, the combined system is solved iteratively with the application of an extended ABC-based preconditioner and the MLFMA. To obtain an accurate solution, both the Rao-Wilton-Glisson and the Buffa-Christiansen functions are employed as the testing functions to discretize the BI equations. This scheme is applied to a variety of benchmark problems and to the scattering from an aircraft with a launched missile to demonstrate its accuracy, versatility, and capability. The proposed scheme is compared with the MS-CFIE to illustrate the differences between the two schemes.
Fifth, to further improve the modeling capability, an accelerated MS method is developed on distributed computing systems to simulate the scattering from very large and complex objects. The parallelization strategy is to parallelize different subdomains individually, which is different from the parallelized domain decomposition methods, where the subdomains are handled in parallel. The multilevel fast multipole algorithm is parallelized to enable computation on many processors. The modeling strategy using the MS-RTC method is also discussed so that one can easily follow the guideline to model large and complex objects. Numerical examples are given to show the parallel efficiency of the proposed strategy and the modeling capability of the proposed method.
Finally, the specific absorption rate (SAR) in a human head at 5G frequencies is simulated by taking advantage of the MS-RTC method. Based on the strong skin effect, the human head model is first simplified to reduce the computational cost. Then the MS-RTC method is applied to model the human head. Numerical examples show that the MS method is very efficient in solving for the electromagnetic fields in the human head and that the simplified human head model can be used in the SAR simulation with acceptable accuracy.
Scalable domain decomposition methods for finite element approximations of transient and electromagnetic problems
The main object of study of this thesis is the development of scalable and robust solvers based on domain decomposition (DD) methods for the linear systems arising from the finite element (FE) discretization of transient and electromagnetic problems.
The thesis commences with a theoretical review of the curl-conforming edge (or Nédélec) FEs of the first kind and a comprehensive description of a general implementation strategy for h- and p-adaptive elements of arbitrary order on tetrahedral and hexahedral non-conforming meshes. Then, a novel balancing domain decomposition by constraints (BDDC) preconditioner that is robust for multi-material and/or heterogeneous problems posed in curl-conforming spaces is presented. The new method, in contrast to existing approaches, defines the ingredients of the preconditioner according to the physical coefficients of the problem and does not require spectral information. The result is a robust and highly scalable preconditioner that preserves the simplicity of the original BDDC method.
When dealing with transient problems, the time direction itself offers an opportunity for further parallelization. Aiming to design scalable space-time solvers, first, parallel-in-time methods for linear and non-linear ordinary differential equations (ODEs) are proposed, based on efficient (non-linear) Schur complement solvers for a multilevel partition of the time interval. Then, these ideas are combined with DD concepts in order to design a two-level preconditioner as an extension of the BDDC method to space-time. The key ingredients of these new methods are defined such that they preserve time causality, i.e., information only travels from the past to the future. The proposed schemes are weakly scalable in time and space-time, i.e., one can efficiently exploit increasing computational resources to solve more time steps in (approximately) the same time-to-solution.
All the developments presented herein are motivated by the driving application of the thesis, the 3D simulation of the low-frequency electromagnetic response of High Temperature Superconductors (HTS). Throughout the document, an exhaustive set of numerical experiments, which includes the simulation of a realistic 3D HTS problem, is performed in order to validate the suitability and assess the parallel performance of the High Performance Computing (HPC) implementation of the proposed algorithms.
Numerical and experimental modelling of microwave applicators
This thesis presents a time domain finite element method for the solution of microwave
heating problems. This is the first time that this particular technique has been applied
to microwave heating. It is found that the standard frequency domain finite element
method is unsuitable for analysing multimode applicators containing food-like materials
due to a severe ill-conditioning of the matrix equations. The field distribution in multimode
applicators loaded with low loss materials is found to be very sensitive to small
frequency changes. Several solutions at different frequencies are therefore required to
characterise the behaviour of the loaded applicator. The time domain finite element
method is capable of producing multiple solutions at different frequencies when used
with Gaussian pulse excitation; it is therefore ideally suited to the analysis of multimode
applicators. A brief survey of the methods available for the solution of the linear
equations is provided. The performance of these techniques with both the frequency
domain and time domain finite element methods is then studied.
Single mode applicators are also analysed and it is found that the frequency domain
method is superior in these cases. Comparisons are given between the calculated results
and experimental data for both single mode and multimode systems. The importance
of experimental verification is stressed.
The choice of element type is an important consideration for the finite element
method. Three basic types of element are considered: nodal elements, Whitney edge elements
and linear edge elements. Comparisons of the errors with these elements show that
Whitney elements produce a consistently lower error when post-processing is used to
smooth the solution.
The coupled thermal-electromagnetic problem is investigated, and many difficulties
are identified in its application to multimode cavity problems.
SCEE 2008 book of abstracts : the 7th International Conference on Scientific Computing in Electrical Engineering (SCEE 2008), September 28 – October 3, 2008, Helsinki University of Technology, Espoo, Finland
This report contains abstracts of presentations given at the SCEE 2008 conference.