71 research outputs found
A Cholesky-based SGM-MLFMM for stochastic full-wave problems described by correlated random variables
In this letter, the multilevel fast multipole method (MLFMM) is combined with the polynomial chaos expansion (PCE)-based stochastic Galerkin method (SGM) to stochastically model scatterers with geometrical variations that need to be described by a set of correlated random variables (RVs). It is demonstrated that Cholesky decomposition is the appropriate choice for the RV transformation, leading to an efficient SGM-MLFMM algorithm. The novel method is applied to the uncertainty quantification of the currents induced on a rough surface, a classic example of a scatterer described by means of correlated RVs, and the results clearly demonstrate its superiority compared to non-intrusive PCE methods and to the standard Monte Carlo method.
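The RV transformation mentioned in the abstract above can be illustrated with a minimal sketch. It is not the letter's implementation: it assumes zero-mean Gaussian surface heights with an exponential covariance kernel, and the function names, grid size, and parameter values are hypothetical choices made for illustration. The Cholesky factor of the covariance matrix maps independent standard normal RVs to correlated ones.

```python
import numpy as np


def exponential_covariance(n, sigma, corr_len, spacing):
    """Covariance of n equally spaced surface heights (assumed kernel)."""
    pos = np.arange(n) * spacing
    dist = np.abs(pos[:, None] - pos[None, :])
    return sigma**2 * np.exp(-dist / corr_len)


def correlate(cov, xi):
    """Map independent standard normal samples xi to correlated samples.

    cov = chol @ chol.T with chol lower triangular, so each row of
    xi @ chol.T has covariance cov.
    """
    chol = np.linalg.cholesky(cov)
    return xi @ chol.T


# Hypothetical example values: 8 surface points, standard deviation 0.1.
rng = np.random.default_rng(0)
cov = exponential_covariance(n=8, sigma=0.1, corr_len=2.0, spacing=1.0)
xi = rng.standard_normal((50_000, 8))      # independent RVs
heights = correlate(cov, xi)               # correlated RVs
sample_cov = np.cov(heights, rowvar=False)  # should approximate cov
```

In a PCE setting the expansion would be built in the independent variables `xi`, with the correlated geometry recovered through the triangular factor.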
Weak Scalability Analysis of the Distributed-Memory Parallel MLFMA
Distributed-memory parallelization of the multilevel fast multipole algorithm (MLFMA) relies on the partitioning of the internal data structures of the MLFMA among the local memories of networked machines. For three existing data partitioning schemes (spatial, hybrid and hierarchical partitioning), the weak scalability, i.e., the asymptotic behavior for proportionally increasing problem size and number of parallel processes, is analyzed. It is demonstrated that none of these schemes is weakly scalable. A nontrivial change to the hierarchical scheme is proposed, yielding a parallel MLFMA that does exhibit weak scalability. It is shown that, even for modest problem sizes and a modest number of parallel processes, the memory requirements of the proposed scheme are already significantly lower than those of existing schemes. Additionally, the proposed scheme is used to perform full-wave simulations of a canonical example, where the number of unknowns and the number of CPU cores are proportionally increased up to more than 200 million unknowns and 1024 CPU cores. The time per matrix-vector multiplication for an increasing number of unknowns and CPU cores corresponds very well to the theoretical time complexity.
A Well-Scaling Parallel Algorithm for the Computation of the Translation Operator in the MLFMA
This paper investigates the parallel, distributed-memory computation of the translation operator with L + 1 multipoles in the three-dimensional Multilevel Fast Multipole Algorithm (MLFMA). A baseline, communication-free parallel algorithm can compute such a translation operator in O(L) time, using O(L^2) processes. We propose a parallel algorithm that reduces this complexity to O(log L) time. This complexity is theoretically supported and experimentally validated up to 16 384 parallel processes. For realistic cases, the implementation of the proposed algorithm proves to be up to ten times faster than the baseline algorithm. For a large-scale parallel MLFMA simulation with 4096 parallel processes, the runtime for the computation of all translation operators during the setup stage is reduced from roughly one hour to only a few minutes.
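For orientation, the translation operator referred to in the abstract above is commonly written as a truncated multipole series. The form below is one widely used convention, assuming an e^{jωt} time dependence; prefactors and signs differ between references, and the symbols k (wavenumber), X (translation vector) and \hat{k} (sampling direction) are notation introduced here, not taken from the paper.

```latex
T_L(\hat{k}) \;=\; \sum_{l=0}^{L} (-j)^{l}\,(2l+1)\,
  h_l^{(2)}\!\left(k\,|\mathbf{X}|\right)\,
  P_l\!\left(\hat{k}\cdot\hat{\mathbf{X}}\right)
```

Here h_l^{(2)} is the spherical Hankel function of the second kind and P_l the Legendre polynomial of degree l. The sum has L + 1 terms, so evaluating it directly costs O(L) operations per sampling direction. The operator is sampled in O(L^2) directions on the unit sphere, which is consistent with the baseline figure of O(L) time on O(L^2) processes quoted above.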
Improved polynomial chaos discretization schemes to integrate interconnects into design environments
- …