VoxCap: FFT-Accelerated and Tucker-Enhanced Capacitance Extraction
  Simulator for Voxelized Structures by Wang, Mingyu et al.
TMTT-2020-06-0637 1
 
Abstract—VoxCap, a fast Fourier transform (FFT)-accelerated 
and Tucker-enhanced integral equation simulator for capacitance 
extraction of voxelized structures, is proposed. VoxCap solves 
the surface integral equations (SIEs) for conductor and dielectric 
surfaces and has three key attributes that make it highly 
CPU and memory efficient for the capacitance extraction of 
voxelized structures: (i) VoxCap exploits FFTs to 
accelerate the matrix-vector multiplications during the iterative 
solution of the linear system of equations arising from the 
discretization of the SIEs. (ii) During the iterative solution, VoxCap 
uses a highly effective and memory-efficient preconditioner that 
significantly reduces the number of iterations. (iii) VoxCap 
employs Tucker decompositions to compress the block Toeplitz 
and circulant tensors, which require the largest memory in the 
simulator. By doing so, it reduces the memory requirement of 
these tensors from hundreds of gigabytes to a few megabytes and 
the CPU time required to obtain Toeplitz tensors from tens of 
minutes (even hours) to a few seconds for very large scale 
problems. VoxCap is capable of accurately computing 
capacitance of arbitrarily shaped and large-scale voxelized 
structures on a desktop computer. VoxCap’s accuracy, efficiency, 
and capability are demonstrated through capacitance extraction 
of various large-scale structures, including the parallel meander 
lines discretized by more than a hundred million panels and 
analyzed on a commodity desktop computer.  
Index Terms—Capacitance extraction, electrostatic analysis, 
fast Fourier Transform (FFT), fast simulators, Tucker 
decomposition, surface integral equation (SIE), voxelized 
structures. 
I.  INTRODUCTION 
Recent developments in three-dimensional (3D) printing technology have allowed designers to produce their own prototypes conveniently and to reduce prototype development time dramatically. Today’s 3D printers build integrated circuits, packages, and components voxel by voxel (i.e., cube by cube) [1], [2]. Furthermore, virtual fabrication environments for the process modeling of semiconductors and microelectromechanical systems (MEMS) leverage voxels [3], [4]. By using these environments, designers can iteratively explore their designs; they can alter a design and check whether it meets the design specifications. During the design of 3D printed integrated circuits, as well as of semiconductor devices and MEMS via virtual fabrication environments, designers need fast and accurate parameter extraction simulators that operate directly on structures discretized by voxels (i.e., voxelized structures). Unfortunately, the literature on such simulators is sparse: an inductance extraction simulator for voxelized structures, called VoxHenry [5], was recently proposed, but no study on capacitance extraction for voxelized structures exists so far.

Manuscript received June 20, 2020. This work was supported by Ministry of 
Education, Singapore, under grant AcRF TIER 1-2018-T1-002-077 (RG 
176/18), and the Nanyang Technological University under a Start-Up Grant. 
(Corresponding author: Abdulkadir C. Yucel.) 
M. Wang, C. Qian and A. C. Yucel are with the School of Electrical and 
Electronic Engineering, Nanyang Technological University, Singapore 
639798. (e-mails: mingyu003@e.ntu.edu.sg, cqian@ntu.edu.sg, 
acyucel@ntu.edu.sg). 
J. K. White is with the Department of Electrical Engineering and Computer 
Science, Massachusetts Institute of Technology, Cambridge, MA 02139, USA. 
(email: white@mit.edu) 

The literature abounds with capacitance extraction simulators based on finite-difference [6]-[8], finite-element [9]-[11], and integral equation [12]-[20] methods. These methods are no panacea: none of them was developed for, or is directly applicable to, voxelized structures, and none can readily extract capacitances using only the voxel coordinates provided by the virtual fabrication environment. When used for voxelized structures, all current simulators require the user's intervention, such as the inclusion of artificial absorbing boundaries or the generation of a complicated mesh. Furthermore, existing integral equation simulators (leveraging multipole expansions [19], pre-corrected fast Fourier transform (FFT) [20], or H/H2 matrices [15]) were developed to operate on meshed structures with non-uniform elements, which do not (necessarily) reside on a structured grid. These simulators do not exploit the distinct feature of a voxelized structure, namely uniform elements residing on a structured grid, and are therefore not sufficiently efficient when applied to voxelized structures. Aside from that, these simulators cannot be easily coupled with VoxHenry to extract impedances of voxelized structures. To this end, a capacitance extraction simulator is called for that can exploit all distinct features of voxelized structures, be used in conjunction with voxel-based virtual fabrication environments without the user's intervention, and be easily coupled with the VoxHenry simulator.
Mingyu Wang, Student Member, IEEE, Cheng Qian, Jacob K. White, Fellow, IEEE, and Abdulkadir C. Yucel, Senior Member, IEEE

In this study, a capacitance extraction simulator for 
voxelized structures, called VoxCap, is proposed. The VoxCap 
solves the surface integral equations (SIEs) for the conductor 
and dielectric surfaces after discretizing the charges on these 
surfaces via piecewise constant basis functions and obtaining a 
linear system of equations (LSE) via Galerkin testing. It 
leverages the following three features, which make the VoxCap 
highly CPU and memory-efficient while solving the LSE and 
obtaining the capacitances of voxelized structures:  
1. The VoxCap exploits the FFTs to accelerate the 
matrix-vector multiplications (MVMs) during the 
iterative solution of LSE.  
2.  The VoxCap uses an effective and memory-efficient 
block-diagonal-diagonal preconditioner to ensure the 
fast convergence of the iterative solution. 
3. The VoxCap leverages Tucker decompositions to 
dramatically reduce the memory requirement of the 
block circulant tensors as well as the CPU time required 
to obtain Toeplitz tensors. 
All these features make the contribution of this study threefold, as described below. FFTs have been widely applied in integral equation simulators to accelerate the MVMs by exploiting the translational invariance of the integral kernels sampled on structured grids [5], [21]-[24]. Although FFT acceleration was applied to 2D static problems in [25], to the best of our knowledge no study in the literature provides an implementation of FFT acceleration for the capacitance extraction of 3D voxelized structures. Along with the FFT acceleration, a memory-efficient preconditioning technique hybridizing block-diagonal and diagonal preconditioners is proposed in this study, and its effectiveness is shown in comparison with traditional diagonal and block-diagonal preconditioners. Moreover, Tucker decompositions are proposed for the first time in this study to compress the block circulant and Toeplitz tensors of static integral kernels, which require the largest memory in the simulator. The proposed Tucker decomposition reduces the memory requirement of these tensors by more than four orders of magnitude (from tens/hundreds of gigabytes to megabytes). The 
Tucker-compressed Toeplitz tensors are obtained once and for 
all and stored on hard-disk during the installation stage of the 
simulator. During the setup stage of each execution of the 
VoxCap, the compressed Toeplitz tensors are read from the 
hard disk in seconds, resized/updated by accounting for the 
sizes of computational domain and voxels, and used to obtain 
the block circulant tensors. Doing so reduces the setup stage of 
the proposed simulator from tens of minutes (even hours) to a 
few seconds for large-scale problems. During the iterative 
solution stage of the VoxCap, the compressed circulant tensors 
are restored/decompressed to their original format one-by-one 
and used in MVMs. Doing so significantly reduces the overall 
memory cost of the proposed simulator while imposing 
negligible computational overhead arising from the 
restoration/decompression operation.  
The accuracy and efficiency of the proposed VoxCap 
simulator are demonstrated (and compared with FastCap [19] 
when applicable) through its application to the capacitance 
extraction of various structures including a dielectric-coated 
perfect electric conductor (PEC) sphere, a parallel interconnect 
structure, a crossing bus structure, and a parallel meander line 
structure. The capability of the VoxCap simulator for solving 
very large-scale problems on a desktop computer is shown by 
its application to the capacitance extraction of a very 
large-scale parallel meander line structure discretized by more 
than one hundred million panels. Moreover, the memory saving 
achieved by and computational overhead imposed by the 
Tucker enhancement are extensively quantified for 3D and 
2D-like structures. The numerical results show that the 
proposed VoxCap outperforms the FastCap for many practical 
scenarios comprising densely packed interconnects. In 
particular, the VoxCap requires 23x and 47x less memory as 
well as 11x and 11.9x less CPU time compared to FastCap for 
the same level of accuracy in the analyses of parallel 
interconnect structure and parallel meander line structure, 
respectively. For the analysis of a dielectric-coated cube discretized by 400 voxels along the x-, y-, and z-directions, the Tucker enhancement reduces the memory requirement of the circulant tensors by more than 10,000x while imposing a negligible computational overhead, one-sixth of the CPU time required for one convolution. For the same problem, the Tucker enhancement yields a speed-up of more than 4655x in obtaining the Toeplitz tensors. Moreover, the memory requirement of the Tucker-compressed tensors is found to scale sub-linearly with increasing computational domain size, while that of the original circulant and Toeplitz tensors scales linearly. It should be noted here 
that a presentation, which sketches the basic principles of the 
proposed simulator, has been given at an earlier symposium 
[26]. The detailed information on the simulator is presented for 
the first time in this paper. 
The proposed VoxCap simulator requires $O(N_t \log N_t)$ computational and $O(N_t)$ memory resources, where $N_t$ denotes the number of voxels. Although these cost estimates appear unfavorable compared to the $O(N)$ computational and memory cost estimates of FastCap, where $N$ is the number of boundary panels, the proposed VoxCap simulator is much faster and less memory demanding than FastCap for many practical scenarios. This is because the multiplicative factors inherent in the cost estimates of VoxCap are much smaller than those in the cost estimates of FastCap. Consequently, when $N_t$ is comparable to $N$ (as in the practical interconnect scenarios in Sections III.C, III.D, and III.E), VoxCap is faster than FastCap. On the other hand, when $N_t$ is much larger than $N$ (as in the sphere example in Section III.A), FastCap is expected to be faster than VoxCap. Note that these observations, which hold for FFT-accelerated simulators in general, were also discussed in [20]. 
II. FORMULATION 
In this section, the SIEs solved by VoxCap, their 
discretization, and resulting LSE are explained first. Next, the 
implementation details of the FFTs for the MVMs required 
during the iterative solution of LSE are provided. Then the 
proposed preconditioner for reducing the number of iterations 
during the iterative solution of LSE is described. Finally, the 
proposed Tucker decomposition/compression scheme is 
expounded. 
A. SIEs and Their Discretization 
Let $S_c$ and $S_d$ denote the PEC and dielectric surfaces with the charge densities $\rho_c$ and $\rho_d$, respectively [Fig. 1(a)]. The dielectric surface with outward normal $\hat{\mathbf{n}}_d$ separates the dielectric region with permittivity $\varepsilon_d$ from the background medium with permittivity $\varepsilon_b$. The PEC and dielectric surfaces are assumed to be enclosed by a bounding box [Fig. 1(b)] that consists of $N_t$ voxels with the edge size $\Delta v$, residing on a structured grid; $N_t = N_x \times N_y \times N_z$, where $N_x$, $N_y$, and $N_z$ denote the number of voxels along the x-, y-, and z-directions, respectively. In this setting, the capacitance of the structure is computed by solving the SIEs [27], which read

$$ \phi(\mathbf{r}) = \int_{S_c} G(\mathbf{r},\mathbf{r}')\,\rho_c(\mathbf{r}')\,dS' + \int_{S_d} G(\mathbf{r},\mathbf{r}')\,\rho_d(\mathbf{r}')\,dS', \quad \mathbf{r} \in S_c, \tag{1} $$

$$ \varepsilon_d \frac{\partial \phi_+(\mathbf{r})}{\partial n_d} = \varepsilon_b \frac{\partial \phi_-(\mathbf{r})}{\partial n_d}, \quad \mathbf{r} \in S_d, \tag{2} $$

where $G(\mathbf{r},\mathbf{r}') = 1/(4\pi\varepsilon_0|\mathbf{r}-\mathbf{r}'|)$ is the free-space Green’s function, $\mathbf{r}$ and $\mathbf{r}'$ denote the observer and source points on the surfaces, respectively, and $\varepsilon_0$ is the permittivity of free space. While $\phi(\mathbf{r})$ denotes the potential on the surface, $\phi_+(\mathbf{r})$ and $\phi_-(\mathbf{r})$ represent the potentials when approaching the surface from the dielectric and background media, respectively. (Note: $\rho_c$ of the PEC surfaces embedded in a dielectric medium is to be scaled with the permittivity of the dielectric medium, as discussed below.) To solve (1) and (2), $\rho_c$ and $\rho_d$ are discretized using the piecewise constant basis functions $w_l$ defined on $N_c$ conductor and $N_d$ dielectric panels as  
        , ,
1 1
,
c c d
c
N N N
c l c l d l d l
l l N
w w   

  
     r r r r   (3) 
where ( ) 1rlw    for r lS  ; ( ) 0rlw   , otherwise. lS  denotes 
the surface of thl  panel and ,c l  and ,d l are the unknown 
coefficients of the charge density on the PEC and dielectric 
panels, respectively. Substituting (3) into (1)-(2), applying 
Galerkin testing to the resulting equations with ( )rkw , 
1, ,k N  , and evaluating limits for (2) when l k  [28] 
yields an N N  LSE as ( c dN N N  ) 
 
0
0
ρV P
ρI E
c
d
                       
 , (4) 
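where $\boldsymbol{\rho}_c = [\rho_{c,1}, \rho_{c,2}, \dots, \rho_{c,N_c}]^T$ and $\boldsymbol{\rho}_d = [\rho_{d,1}, \rho_{d,2}, \dots, \rho_{d,N_d}]^T$ are the charge coefficient vectors, and the matrices $\mathbf{P}$ and $\mathbf{E}$ with dimensions $N_c \times N$ and $N_d \times N$ relate the potentials and electric fields tested on the panels with the charges, respectively. The entries of $\mathbf{P}_{kl}$, $k = 1, \dots, N_c$, $l = 1, \dots, N$, are   

$$ \mathbf{P}_{kl} = \int_{S_k} \int_{S_l} w_k(\mathbf{r})\, w_l(\mathbf{r}')\, G(\mathbf{r},\mathbf{r}')\, dS'\, dS \tag{5} $$

and those of $\mathbf{E}_{k'l}$ and the diagonal matrix $\mathbf{I}_{k'l}$, $k = 1, \dots, N_d$, $k' = k + N_c$, $l = 1, \dots, N$, are obtained for non-overlapping and overlapping panels, respectively, via 

$$ \mathbf{E}_{k'l} = \frac{\partial}{\partial n_{k'}} \int_{S_{k'}} \int_{S_l} w_{k'}(\mathbf{r})\, w_l(\mathbf{r}')\, G(\mathbf{r},\mathbf{r}')\, dS'\, dS, \tag{6} $$

$$ \mathbf{I}_{k'l} = \frac{A_{k'}\,(\varepsilon_d + \varepsilon_b)}{2\varepsilon_0\,(\varepsilon_d - \varepsilon_b)}. \tag{7} $$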
Here $A_{k'}$ and $\partial/\partial n_{k'}$ represent the area of $S_{k'}$ and the partial derivative along the normal to $S_{k'}$, respectively. $\mathbf{V}_k = A_k \phi_k$, $k = 1, \dots, N_c$, where $\phi_k$ is the potential applied to the panel on the conductors. For near panel interactions, the integrals in (5) and (6) are evaluated via the analytical formulae in Appendix C of [29] and Appendix A, respectively. For far panel interactions, those integrals are evaluated using numerical quadrature and differentiation [28]. (Note: Although the method is explained for a single dielectric medium, it is straightforwardly extended to multi-dielectric media by properly setting the permittivity values in (7).)  
To compute the self and mutual capacitances of $m$ conductors, a unit potential is applied to each conductor separately while the potential of the remaining conductors is set to zero, and the LSE in (4) is solved for the charge densities. The resulting $m$ vectors $\mathbf{V}$ and $\boldsymbol{\rho}_c$ are stored in matrices as $\bar{\mathbf{V}} = [\mathbf{V}_1, \mathbf{V}_2, \dots, \mathbf{V}_m]$ and $\bar{\boldsymbol{\rho}}_c = [\boldsymbol{\rho}_c^1, \boldsymbol{\rho}_c^2, \dots, \boldsymbol{\rho}_c^m]$, which are then used to compute the $m \times m$ capacitance matrix $\mathbf{C}$ as  

$$ \mathbf{C} = \bar{\mathbf{V}}^T \bar{\boldsymbol{\rho}}_c. \tag{8} $$

For a conductor embedded in a dielectric medium, its surface charge is converted to free charge density by multiplying it with $\varepsilon_d/\varepsilon_0$ and used in (8) [28].  
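The excitation-and-solve procedure behind (8) can be sketched in a few lines. The following is a minimal illustration for PEC-only panels, not the VoxCap implementation: the function name `capacitance_matrix`, the toy matrix `P`, and the dense direct solve (used here in place of the preconditioned iterative solver) are assumptions made for the example.

```python
import numpy as np

def capacitance_matrix(P, areas, conductor_of_panel):
    """Return the m x m capacitance matrix for PEC-only panels.

    P                  : (N, N) Galerkin potential matrix, cf. (5)
    areas              : (N,) panel areas A_k
    conductor_of_panel : (N,) index in 0..m-1 of the conductor owning each panel
    """
    areas = np.asarray(areas, dtype=float)
    owner = np.asarray(conductor_of_panel)
    m = int(owner.max()) + 1
    # Column j holds V_k = A_k * phi_k with phi = 1 V on conductor j, 0 V elsewhere.
    V = np.zeros((len(areas), m))
    V[np.arange(len(areas)), owner] = areas
    # Dense direct solve; a stand-in for the iterative solution of (4).
    rho = np.linalg.solve(P, V)
    return V.T @ rho  # equation (8)

# Hypothetical 3-panel system: panels 0 and 1 form conductor 0, panel 2 is conductor 1.
P = np.array([[2.0, 0.5, 0.2],
              [0.5, 2.0, 0.2],
              [0.2, 0.2, 1.5]])
C = capacitance_matrix(P, areas=[1.0, 1.0, 2.0], conductor_of_panel=[0, 0, 1])
```

For a symmetric `P`, the resulting `C` is symmetric, with positive self terms on the diagonal and negative mutual terms off the diagonal.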
The LSE in (4) is iteratively solved for $\boldsymbol{\rho}_c$ and $\boldsymbol{\rho}_d$ via a sequence of MVMs. During the iterative solution, the computational cost of MVMs and the memory requirement of the system matrices scale with $O(N^2)$; the former and the latter are reduced to $O(N_t \log N_t)$ and $O(N_t)$, respectively, via the FFT method explained next. Such computational cost and memory requirements are highly favorable for the capacitance extraction of densely packed interconnects [20]. 
 
Fig. 1. An example scenario: (a) A structure consisting of interconnects in a 
dielectric substrate and (b) its discretization via voxels.  
B. FFT Acceleration 
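In the FFT acceleration technique, the multiplications of matrices $\mathbf{P}$ and $\mathbf{E}$ with the vectors $\boldsymbol{\rho}_c$ and $\boldsymbol{\rho}_d$ are considered by grouping the panel interactions with respect to their unit normal as  

$$
\begin{bmatrix} \mathbf{V}^{x} \\ \mathbf{0} \\ \mathbf{V}^{y} \\ \mathbf{0} \\ \mathbf{V}^{z} \\ \mathbf{0} \end{bmatrix}
=
\begin{bmatrix}
\mathbf{P}^{x,x} & \mathbf{P}^{x,y} & \mathbf{P}^{x,z} \\
\mathbf{I} + \mathbf{E}^{x,x} & \mathbf{E}^{x,y} & \mathbf{E}^{x,z} \\
\mathbf{P}^{y,x} & \mathbf{P}^{y,y} & \mathbf{P}^{y,z} \\
\mathbf{E}^{y,x} & \mathbf{I} + \mathbf{E}^{y,y} & \mathbf{E}^{y,z} \\
\mathbf{P}^{z,x} & \mathbf{P}^{z,y} & \mathbf{P}^{z,z} \\
\mathbf{E}^{z,x} & \mathbf{E}^{z,y} & \mathbf{I} + \mathbf{E}^{z,z}
\end{bmatrix}
\begin{bmatrix} \boldsymbol{\rho}^{x} \\ \boldsymbol{\rho}^{y} \\ \boldsymbol{\rho}^{z} \end{bmatrix}, \tag{9}
$$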
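where $\boldsymbol{\rho}^x = [\boldsymbol{\rho}_c^x; \boldsymbol{\rho}_d^x]$, $\boldsymbol{\rho}^y = [\boldsymbol{\rho}_c^y; \boldsymbol{\rho}_d^y]$, and $\boldsymbol{\rho}^z = [\boldsymbol{\rho}_c^z; \boldsymbol{\rho}_d^z]$ are the vectors of the charge coefficients of the panels with unit normal pointing along the x-, y-, and z-directions. Here the block matrix $\mathbf{P}^{y,x}$, for example, stores the potentials generated by the charges on the panels with unit normal pointing along the x-direction and tested on panels with unit normal pointing along the y-direction. The full block matrices are multiplied with charge coefficient vectors as  

$$ \mathbf{C}^{\alpha} = \sum_{\beta} \mathbf{P}^{\alpha,\beta} \boldsymbol{\rho}^{\beta}, \quad \mathbf{D}^{\alpha} = \sum_{\beta} \mathbf{E}^{\alpha,\beta} \boldsymbol{\rho}^{\beta}, \tag{10} $$

where $\alpha, \beta \in \{x, y, z\}$. Note that the multiplications of $\boldsymbol{\rho}^{\alpha}$ with the diagonal matrix $\mathbf{I}$ in (9) are computationally cheap and performed traditionally. However, the multiplications of $\boldsymbol{\rho}^{\beta}$ with the full block matrices $\mathbf{P}^{\alpha,\beta}$ and $\mathbf{E}^{\alpha,\beta}$ are computationally expensive and thereby accelerated via the FFT technique. In this technique, the multiplications are performed by taking into account the panels of $N_t$ voxels instead of the $N$ boundary panels on the PEC and dielectric surfaces. To do that, first, $(N_x+1) \times N_y \times N_z$, $N_x \times (N_y+1) \times N_z$, and $N_x \times N_y \times (N_z+1)$ voxel panels with unit normal pointing along the x-, y-, and z-directions are identified, respectively. These voxel panels lie on structured grids and thereby their interactions are characterized by block Toeplitz tensors. The block Toeplitz tensors $\mathcal{T}_P^{\alpha,\beta}$ and $\mathcal{T}_E^{\alpha,\beta}$ are used to obtain the block circulant tensors $\mathcal{G}_P^{\alpha,\beta}$ and $\mathcal{G}_E^{\alpha,\beta}$ corresponding to the blocks $\mathbf{P}^{\alpha,\beta}$ and $\mathbf{E}^{\alpha,\beta}$, $\alpha, \beta \in \{x, y, z\}$. The procedure to obtain the block Toeplitz tensors and the circulant tensors is explained in Appendix B. The block circulant tensors are Fourier transformed and stored as $\tilde{\mathcal{G}}_P^{\alpha,\beta}$ and $\tilde{\mathcal{G}}_E^{\alpha,\beta}$ during the setup stage of the simulator. During the iterative solution stage, they are used to perform the MVMs corresponding to each block as  

$$ \mathcal{C}^{\alpha} = \mathrm{IFFT}\Big\{ \sum_{\beta} \tilde{\mathcal{G}}_P^{\alpha,\beta} * \tilde{\mathcal{X}}^{\beta} \Big\}, \quad \mathcal{D}^{\alpha} = \mathrm{IFFT}\Big\{ \sum_{\beta} \tilde{\mathcal{G}}_E^{\alpha,\beta} * \tilde{\mathcal{X}}^{\beta} \Big\}, \tag{11} $$

where $*$ denotes the tensor-tensor multiplication, $\mathrm{IFFT}\{\cdot\}$ is the inverse FFT operator, and $\tilde{\mathcal{X}}^{\beta} = \mathrm{FFT}\{\mathcal{X}^{\beta}\}$. The tensor $\mathcal{X}^{\beta}$ with the dimensions $2(N_x+1) \times 2(N_y+1) \times 2(N_z+1)$ is filled by the samples from $\boldsymbol{\rho}^{\beta}$ and zeros [30]. The results of MVMs, $\mathbf{C}^{\alpha}$ and $\mathbf{D}^{\alpha}$, are obtained from the entries of $\mathcal{C}^{\alpha}$ and $\mathcal{D}^{\alpha}$.  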
A few notes regarding the computation of circulant tensors and their deployment in FFT operations are in order: 
1) Although the dimensions of each circulant tensor vary due to the different numbers of voxel panels aligned along the x-, y-, and z-directions, the dimensions of all circulant tensors are enlarged to $2(N_x+1) \times 2(N_y+1) \times 2(N_z+1)$ by carefully padding zeros, as explained in Appendix B. Such zero-padding reduces the number of FFT operations from nine to three while computing $\tilde{\mathcal{X}}^{\beta}$ and the number of IFFT operations from nine to three while computing each of $\mathcal{C}^{\alpha}$ and $\mathcal{D}^{\alpha}$, $\alpha \in \{x, y, z\}$.  
2) By exploiting the symmetry and invoking the properties of the Fourier transform [25], some of the FFT’ed circulant tensors are obtained from others via complex conjugation as  

$$ \tilde{\mathcal{G}}_P^{y,x} = \mathrm{conj}(\tilde{\mathcal{G}}_P^{x,y}), \quad \tilde{\mathcal{G}}_P^{z,x} = \mathrm{conj}(\tilde{\mathcal{G}}_P^{x,z}), \quad \tilde{\mathcal{G}}_P^{z,y} = \mathrm{conj}(\tilde{\mathcal{G}}_P^{y,z}). \tag{12} $$

During the tensor-tensor multiplication in (11), the conjugation is performed on the fly, and thereby no memory is required to store $\tilde{\mathcal{G}}_P^{y,x}$, $\tilde{\mathcal{G}}_P^{z,x}$, and $\tilde{\mathcal{G}}_P^{z,y}$. 
3) The directions of the panels’ unit normal should be carefully taken into account while obtaining the entries of $\mathbf{D}^{\alpha}$ from $\mathcal{D}^{\alpha}$. The entries of $\mathbf{D}^{\alpha}$ for boundary panels with unit normal pointing along the positive x-, y-, and z-directions are directly retrieved from the entries of $\mathcal{D}^{\alpha}$, whereas those for boundary panels with unit normal pointing along the negative x-, y-, and z-directions are obtained from the entries of $\mathcal{D}^{\alpha}$ after flipping the signs of the entries. 
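The Fourier-transform property invoked in the second note can be checked numerically. The sketch below assumes, for illustration only, that the tensor of a transposed block is the index-reversed copy of a real circulant tensor; the array `g` is random stand-in data, not an actual kernel.

```python
import numpy as np

rng = np.random.default_rng(1)
g = rng.standard_normal((6, 4, 8))  # real stand-in for a circulant tensor

# Index reversal modulo the tensor size: g_rev[i, j, k] = g[-i, -j, -k].
axes = (0, 1, 2)
g_rev = np.roll(np.flip(g, axis=axes), shift=(1, 1, 1), axis=axes)

# For real data, the DFT of the reversed tensor is the conjugate of the original DFT.
lhs = np.fft.fftn(g_rev)
rhs = np.conj(np.fft.fftn(g))
conjugation_holds = np.allclose(lhs, rhs)
```

Because the conjugate can be formed entry by entry when needed, only one tensor of each such pair has to be kept in memory.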
C. Tucker Enhancement 
The proposed VoxCap simulator uses Tucker decompositions to compress the Toeplitz tensors $\mathcal{T}_P^{\alpha,\beta}$ and $\mathcal{T}_E^{\alpha,\beta}$, $\alpha, \beta \in \{x, y, z\}$, once and for all during the installation stage. During the setup stage of the simulator’s execution, the compressed Toeplitz tensors are read from the hard disk and used to obtain the circulant tensors. Doing so reduces the setup stage time of the simulator from tens of minutes (and even hours) to seconds, as shown in Table I of Section III.B.3. To do that, during the installation stage of the simulator, the Toeplitz tensors $\mathcal{T}_P^{\alpha,\beta}$ and $\mathcal{T}_E^{\alpha,\beta}$ are computed for a large computational domain (say $N_x = N_y = N_z = 1{,}000$) by setting $\Delta v = 1$ m, compressed by Tucker decompositions, and stored on the hard disk. During the setup stage of each execution, the compressed tensors requiring megabytes of memory are read from the hard disk and restored to their original format. The restored Toeplitz tensors are resized with respect to the dimensions of the computational domain required for the structure being analyzed. Next, the restored Toeplitz tensors are multiplied with scaling factors related to the voxel size. These scaling factors, derived in Appendix C, are $(\Delta v)^3$ and $(\Delta v)^2$ for $\mathcal{T}_P^{\alpha,\beta}$ and $\mathcal{T}_E^{\alpha,\beta}$, respectively. Finally, the restored Toeplitz tensors are used in the embedding procedure explained in Appendix B to obtain the circulant tensors.  
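The two scaling factors can be motivated by a short change of variables applied to (5) and (6). The derivation below is a sketch under the assumption that hatted quantities are evaluated on the reference grid with unit voxel size; it is not a reproduction of Appendix C.

```latex
% Assumption: hats mark quantities on the reference grid (unit voxel size).
\begin{align*}
\mathbf{r} &= \Delta v\,\hat{\mathbf{r}}, \qquad
dS = (\Delta v)^2\, d\hat{S}, \qquad
G(\mathbf{r},\mathbf{r}') = \frac{1}{\Delta v}\, G(\hat{\mathbf{r}},\hat{\mathbf{r}}'), \qquad
\frac{\partial}{\partial n} = \frac{1}{\Delta v}\,\frac{\partial}{\partial \hat{n}}, \\
\mathbf{P}_{kl} &= (\Delta v)^2\,(\Delta v)^2\,\frac{1}{\Delta v}\;\hat{\mathbf{P}}_{kl}
 = (\Delta v)^3\,\hat{\mathbf{P}}_{kl}, \\
\mathbf{E}_{k'l} &= \frac{1}{\Delta v}\,(\Delta v)^3\;\hat{\mathbf{E}}_{k'l}
 = (\Delta v)^2\,\hat{\mathbf{E}}_{k'l}.
\end{align*}
```

The two surface integrals each contribute $(\Delta v)^2$, the Green's function contributes $1/\Delta v$, and the normal derivative in (6) removes one further power of $\Delta v$.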
In addition, the proposed VoxCap simulator leverages Tucker decompositions to reduce the memory requirement of the FFT’ed circulant tensors $\tilde{\mathcal{G}}_P^{\alpha,\beta}$ and $\tilde{\mathcal{G}}_E^{\alpha,\beta}$, $\alpha, \beta \in \{x, y, z\}$. Doing so reduces the memory footprint of the simulator by around a factor of three and a half for the capacitance extraction of dielectric-coated PEC structures, as shown in the numerical example in Section III.A. To do that, during the setup stage of the simulator, all $\tilde{\mathcal{G}}_P^{\alpha,\beta}$ and $\tilde{\mathcal{G}}_E^{\alpha,\beta}$, $\alpha, \beta \in \{x, y, z\}$, are compressed by Tucker decomposition. During the iterative solution stage, each compressed tensor is restored/decompressed to its original format one-by-one. The computational penalty associated with the restoration operation is negligible compared to one convolution operation performed during MVMs, as shown in the numerical results section. It should be noted that the abovementioned Tucker enhancement strategies have recently been applied to the data structures involving the magneto-quasi-static kernels in VoxHenry as well [31].  
The Toeplitz and FFT’ed circulant tensors are compressed by Tucker decompositions as [24], [32]-[34]  

$$ \mathcal{A} \approx \mathcal{S} \times_1 \mathbf{U}^1 \times_2 \mathbf{U}^2 \times_3 \mathbf{U}^3, \tag{13} $$

where the tensor $\mathcal{A}$ with dimensions $D_1 \times D_2 \times D_3$ represents $\mathcal{T}_P^{\alpha,\beta}$, $\mathcal{T}_E^{\alpha,\beta}$, $\tilde{\mathcal{G}}_P^{\alpha,\beta}$, or $\tilde{\mathcal{G}}_E^{\alpha,\beta}$, $\alpha, \beta \in \{x, y, z\}$, $\mathcal{S}$ is the core tensor with dimensions $r_1 \times r_2 \times r_3$, $\mathbf{U}^i$, $i = 1, \dots, 3$, represents the factor matrices with dimensions $D_i \times r_i$, and $r_i$ is the multilinear rank pertinent to the $i$th dimension. A tensor is Tucker compressible in case $(r_1 r_2 r_3 + \sum_{i=1}^{3} D_i r_i) \ll D_1 D_2 D_3$. The symbol $\times_i$, $i = 1, \dots, 3$, stands for the $i$-mode matrix-tensor multiplication. The procedure to obtain $\mathcal{S}$ and $\mathbf{U}^i$ for a given tolerance, $tol$, is provided in [32], [33]. Typically, $tol$ is set to $10^{-8}$ in this study to achieve high compressibility without sacrificing accuracy, unless stated otherwise.  
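A Tucker compression of the form (13) can be sketched with a truncated higher-order SVD. This is one standard way to obtain a core tensor and factor matrices for a prescribed tolerance and is used here as an assumption; it is not necessarily the procedure of [32], [33]. The test tensor is a sampled $1/r$ function chosen for illustration, and all function names are hypothetical.

```python
import numpy as np

def unfold(t, mode):
    """Mode unfolding: rows indexed by `mode`, columns by the remaining modes."""
    return np.moveaxis(t, mode, 0).reshape(t.shape[mode], -1)

def mode_multiply(t, mat, mode):
    """i-mode product; `mat` has shape (new_dim, t.shape[mode])."""
    out = np.tensordot(mat, t, axes=(1, mode))  # contracted mode lands on axis 0
    return np.moveaxis(out, 0, mode)

def tucker_hosvd(t, tol):
    """Truncated HOSVD with relative Frobenius error bounded by `tol`."""
    # Split the squared error budget evenly over the modes.
    budget = (tol * np.linalg.norm(t)) ** 2 / t.ndim
    factors = []
    for mode in range(t.ndim):
        u, s, _ = np.linalg.svd(unfold(t, mode), full_matrices=False)
        tail = np.cumsum((s ** 2)[::-1])[::-1]  # tail[r] = sum of s_j^2 for j >= r
        keep = np.nonzero(tail > budget)[0]
        rank = int(keep[-1]) + 1 if keep.size else 1
        factors.append(u[:, :rank])
    core = t
    for mode, u in enumerate(factors):
        core = mode_multiply(core, u.T, mode)
    return core, factors

def tucker_restore(core, factors):
    """Rebuild the full tensor from the core tensor and factor matrices."""
    t = core
    for mode, u in enumerate(factors):
        t = mode_multiply(t, u, mode)
    return t

n = 60
x = np.arange(1, n + 1, dtype=float)
t = 1.0 / np.sqrt(x[:, None, None]**2 + x[None, :, None]**2 + x[None, None, :]**2)
core, factors = tucker_hosvd(t, tol=1e-8)
rel_err = np.linalg.norm(tucker_restore(core, factors) - t) / np.linalg.norm(t)
compression_ratio = t.size / (core.size + sum(u.size for u in factors))
```

Here `compression_ratio` is the storage of the full tensor divided by the storage of the core tensor plus the factor matrices, which mirrors the compressibility condition stated above.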
D.  Block-Diagonal-Diagonal Preconditioner 
The proposed VoxCap simulator uses a block-diagonal-diagonal preconditioner to ensure the rapid convergence of the iterative solution of (4). To this end, during each MVM, the block diagonal preconditioner $\mathbf{R}_{BD}$ and the diagonal preconditioner $\mathbf{R}_{D}$ are applied to (4) as  

$$ \begin{bmatrix} \mathbf{R}_{BD}\mathbf{V} \\ \mathbf{0} \end{bmatrix} = \left( \begin{bmatrix} \mathbf{0} \\ \mathbf{R}_{D}\mathbf{I} \end{bmatrix} + \begin{bmatrix} \mathbf{R}_{BD}\mathbf{P} \\ \mathbf{R}_{D}\mathbf{E} \end{bmatrix} \right) \begin{bmatrix} \boldsymbol{\rho}_c \\ \boldsymbol{\rho}_d \end{bmatrix}. \tag{14} $$

The entries of the diagonal matrix $\mathbf{R}_{D}$ are directly obtained from the inverse of the diagonal matrix $\mathbf{I}_{k'l}$. The blocks of the block-diagonal matrix $\mathbf{R}_{BD}$ are formed by (i) splitting the bounding box enclosing the structure into small boxes, (ii) computing the interactions between basis functions in each small box separately, and (iii) assigning the inverses of the submatrices storing the interactions as blocks of $\mathbf{R}_{BD}$. Each small box comprises $N_{vx}$, $N_{vy}$, and $N_{vz}$ voxels along the x-, y-, and z-directions. For an example PEC interconnect scenario, this partitioning and the basis functions/panels used to form the blocks are shown in Fig. 2 ($N_{vx} = N_{vy} = 16$ and $N_{vz} = 1$). Two important points regarding the construction and storage of the blocks are in order: (i) The interactions between basis functions in each small box are directly obtained from one-time generated circulant tensors $\mathcal{G}_P^{\alpha,\beta}$ and $\mathcal{G}_E^{\alpha,\beta}$, $\alpha, \beta \in \{x, y, z\}$, for a small box consisting of $N_{vx}$, $N_{vy}$, and $N_{vz}$ voxels. This yields a significant computational saving while constructing the preconditioner. Note that the circulant tensors store all possible basis function interactions for a small box. By using these tensors, the computational time required to compute (5) for similar basis function pairs in the blocks is avoided. (ii) Many blocks in $\mathbf{R}_{BD}$ are replicas of each other for a voxelized structure. Therefore, only the unique blocks are stored, and the memory requirement of the preconditioner is minimized.  
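The structure of the hybrid preconditioner in (14) can be illustrated on a small dense system. The matrix below is a hand-made toy with made-up entries (three small boxes with two conductor unknowns each, plus three dielectric unknowns), not a discretized SIE; it only shows how the block inverses and the diagonal inverse are combined and applied from the left.

```python
import numpy as np

scales = [1.0, 10.0, 100.0]            # one 2x2 conductor block per small box
i_diag = np.array([5.0, 50.0, 500.0])  # diagonal entries of I, cf. (7)
n_c, n_d = 2 * len(scales), len(i_diag)
n = n_c + n_d
diel = np.arange(n_c, n)

# Toy system: strong in-box interactions, weak coupling everywhere else.
A = np.full((n, n), 0.01)
for b, s in enumerate(scales):
    A[2 * b:2 * b + 2, 2 * b:2 * b + 2] = s * np.array([[2.0, 1.0], [1.0, 2.0]])
A[diel, diel] = i_diag

R = np.zeros((n, n))
for b in range(len(scales)):
    blk = slice(2 * b, 2 * b + 2)
    R[blk, blk] = np.linalg.inv(A[blk, blk])  # R_BD: inverse of each box's submatrix
R[diel, diel] = 1.0 / i_diag                  # R_D: inverse of the diagonal matrix I

cond_before = np.linalg.cond(A)
cond_after = np.linalg.cond(R @ A)
```

In this toy case the preconditioned matrix is close to the identity, so its condition number drops from several hundred to nearly one. In an actual simulator the blocks would be applied box by box rather than assembled into a dense `R`.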
 
 
Fig. 2. The partitioning of a voxelized structure (left) via three and two small 
boxes along x- and y-directions (right). Each block of $\mathbf{R}_{BD}$ is formed by the 
interactions between boundary panels (indicated by red color) within each 
small box; the boundary panels with unit normal pointing along z-direction are 
excluded from the right figure for illustration purposes.  
III. NUMERICAL RESULTS 
In this section, several numerical examples that show the applicability, memory and CPU efficiency, and accuracy of the proposed VoxCap simulator are presented. In the following analysis, when applicable, the capacitance values obtained by VoxCap and FastCap are compared with each other or with those obtained by an analytical formula. The discrepancy between the results is quantified through the relative error defined as $err = |(\tilde{F} - F)/F|$, where $\tilde{F}$ and $F$ denote the capacitance values obtained by VoxCap/FastCap and the analytical formula, respectively. Furthermore, the charge distributions on the structures are plotted in the logarithmic (dB) scale after all charge values are normalized by the maximum charge value on the structure and the logarithm of the normalized values is multiplied by twenty. The proposed VoxCap simulator was implemented in Matlab, while the FastCap simulator executes a C code. The linear system of equations in (4) is iteratively solved by the generalized minimal residual method (GMRES) with a restart every 35 iterations until the relative residual error (RRE) reaches $10^{-4}$, unless stated otherwise. Likewise, FastCap also uses GMRES with the same restart to reach the same RRE; it uses an overlapped diagonal preconditioner to reduce the number of iterations [28]. All simulations are executed on an Intel Xeon Gold 6412 CPU with 384 GB RAM.  
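The two post-processing quantities defined above are simple enough to state as code. The helper names below are illustrative, and the sample values are made up; the dB conversion assumes that no charge value is exactly zero.

```python
import math

def relative_error(computed, reference):
    """err = |(F_computed - F_reference) / F_reference|."""
    return abs((computed - reference) / reference)

def charge_in_db(charges):
    """Normalize by the largest magnitude, then take 20*log10 of each value."""
    peak = max(abs(q) for q in charges)
    return [20.0 * math.log10(abs(q) / peak) for q in charges]

err = relative_error(1.02, 1.0)
levels = charge_in_db([1.0, 0.1, -0.01])
```

With these sample inputs the error is 2% and the three charge levels map to 0 dB, -20 dB, and -40 dB.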
A. The Dielectric-Coated PEC Sphere  
First, the proposed VoxCap simulator and the FastCap simulator (with 4th order multipole expansion) are used to obtain the self-capacitance of a 0.25 m-radius PEC sphere coated by a dielectric shell with relative permittivity $\varepsilon_r$ and radius of 0.5 m [Fig. 3(a)]. The structure is discretized by voxels of size $\Delta v = \{0.05, 0.025, 0.02, 0.01\}$ m, which gives rise to $N_t = \{8{,}000, 64{,}000, 125{,}000, 1{,}000{,}000\}$ and $N = \{2{,}376, 9{,}480, 14{,}760, 59{,}016\}$, respectively. For $\varepsilon_r = 2$ and all voxel sizes, the self-capacitance values computed by the VoxCap and FastCap simulators are compared with those obtained by the analytical formula, which reads $C = 4\pi\varepsilon_0\varepsilon_r r_d r_c / ((r_d - r_c) + \varepsilon_r r_c)$, where $r_c$ and $r_d$ are the radii of the PEC sphere and the dielectric shell, respectively; the relative difference between the results, $err$, is plotted [Fig. 3(a)]. Evidently, $err$ decreases with increasing $1/\Delta v$, and the accuracy achieved by the VoxCap and FastCap simulators is nearly the same. Their accuracy stagnates at the same level due to the staircase approximation of the spherical shape, which clearly appears in the normalized charge distribution plots for the structure discretized with $\Delta v = \{0.025, 0.01\}$ m [Fig. 3(b)]. For the simulations of the structures discretized with $\Delta v = \{0.05, 0.025, 0.02, 0.01\}$ m, the proposed VoxCap and FastCap simulators required CPU times of $\{1.54, 10.30, 11.75, 155.52\}$ s and $\{0.48, 2.02, 3.51, 14.26\}$ s, $\{7, 7, 10, 10\}$ and $\{7, 7, 7, 9\}$ iterations during the iterative solution, and memory of $\{8, 52, 97, 681\}$ MB and $\{135, 612, 992, 3009\}$ MB, respectively. Note that the Tucker enhancement reduces the memory requirement from 2,440 MB to 681 MB (by more than a factor of three and a half) for the analysis of the structure discretized with $\Delta v = 0.01$ m while imposing a computational penalty of 33%. As expected, VoxCap is much more memory efficient than FastCap, while FastCap is faster than VoxCap for this validation example with well-separated panels and a high $N_t/N$ ratio. It is shown below in the examples with densely packed interconnects and low $N_t/N$ ratios that VoxCap is indeed much faster than FastCap for practical scenarios.  
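The analytical reference value and the reported voxel counts can be reproduced with a few lines of arithmetic. The sketch assumes that the bounding box is a cube with 1 m edges (twice the 0.5 m shell radius), which is consistent with the reported $N_t$ values; the function name and the numerical value of $\varepsilon_0$ are supplied for the example.

```python
import math

EPS0 = 8.8541878128e-12  # permittivity of free space in F/m

def coated_sphere_capacitance(r_c, r_d, eps_r):
    """Self-capacitance of a PEC sphere (radius r_c) coated up to radius r_d."""
    return 4.0 * math.pi * EPS0 * eps_r * r_d * r_c / ((r_d - r_c) + eps_r * r_c)

C_ref = coated_sphere_capacitance(r_c=0.25, r_d=0.5, eps_r=2.0)

# Voxel counts for a 1 m cubic bounding box (assumed) at each voxel size.
voxel_sizes = (0.05, 0.025, 0.02, 0.01)
n_t = [round(1.0 / dv) ** 3 for dv in voxel_sizes]

# Memory reduction of the Tucker enhancement reported for the finest grid.
memory_factor = 2440 / 681
```

For $\varepsilon_r = 2$ the formula gives $4\pi\varepsilon_0/3$, roughly 37.1 pF, and for $\varepsilon_r = 1$ it reduces to the capacitance of an isolated sphere, $4\pi\varepsilon_0 r_c$.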
Next, $\varepsilon_r$ is swept from 2 to $2 \times 10^7$ with a one-decade increment at each simulation of the structure discretized by $\Delta v = 0.05$ m [Fig. 3(c)]. For each simulation, an RRE of $10^{-8}$ is achieved during the iterative solution of (4) to avoid any inaccuracy stemming from the iterative solution. If such an RRE is not achieved, accurate results cannot be obtained for large $\varepsilon_r$ values. Clearly, the proposed VoxCap produces accurate results for all $\varepsilon_r$ values, while FastCap yields inaccurate results with increasing permittivity, as indicated in [35]. Finally, the effectiveness of the proposed preconditioner is examined for the structure discretized by $\Delta v = 0.01$ m and $\varepsilon_r$ set to 2. For this case, the proposed VoxCap simulator iteratively solves (4) without a preconditioner and with the proposed block-diagonal-diagonal preconditioner, a diagonal preconditioner, and a block-diagonal preconditioner. The blocks in the proposed and block-diagonal preconditioners are formed by setting $N_{vx} = N_{vy} = N_{vz} = 10$. For all these cases, the RRE at each iteration is plotted [Fig. 3(d)]. The number of iterations required to reach $10^{-8}$ is 22, 27, 41, and 142 when the proposed preconditioner, the block-diagonal preconditioner, the diagonal preconditioner, and no preconditioner are applied, respectively. The proposed preconditioner thus outperforms all other preconditioners and yields the fastest convergence. Furthermore, it requires 17.04 MB of memory, about a quarter of the 70.26 MB required by the conventional block-diagonal preconditioner.  
 
 
 
 
Fig. 3. Dielectric-coated PEC sphere example: (a) The err in the
self-capacitance computed by VoxCap and FastCap with decreasing voxel size
(for $\varepsilon_r = 2$). (b) The normalized charge distribution on the structure when
$\Delta v = 0.025$ m (left) and $\Delta v = 0.01$ m (right). (Note: The panels with
x-coordinate greater than 0.85 m are removed for illustration purposes.) (c) The
err in self-capacitance obtained by VoxCap and FastCap with increasing $\varepsilon_r$
(for $\Delta v = 0.05$ m). (d) The RRE history of the LSE's iterative solution when
the proposed preconditioner, no preconditioner, and conventional
preconditioners are applied.
 
B. Numerical Tests on Tucker Enhancement 
In this part, the memory saving achieved by and 
computational overhead introduced by Tucker decomposition 
are investigated while compressing and restoring the circulant 
tensors. In addition, the computational saving achieved by 
obtaining the Toeplitz tensors from their compressed 
representations is demonstrated. To this end, the memory 
saving is quantified via compression ratio (CR), which is the 
ratio of memory requirement of original tensors to the memory 
requirement of the compressed tensor. Furthermore, the 
computational overhead (CO) is quantified by taking the ratio 
of the time for restoring the tensor to the time for performing 
one convolution with that tensor. 
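The CR defined above can be illustrated with a minimal sketch. It assumes a truncated higher-order SVD as the Tucker algorithm and a smooth synthetic $32^3$ tensor as a stand-in for a circulant tensor; the function names `unfold`, `hosvd`, and `restore` and the test tensor are illustrative assumptions, not the simulator's implementation. CO is not timed here.

```python
import numpy as np

def unfold(t, mode):
    """Mode-n unfolding: bring `mode` to the front and flatten the other axes."""
    return np.moveaxis(t, mode, 0).reshape(t.shape[mode], -1)

def hosvd(t, tol):
    """Truncated higher-order SVD, one common way to obtain a Tucker decomposition."""
    factors = []
    for mode in range(t.ndim):
        u, s, _ = np.linalg.svd(unfold(t, mode), full_matrices=False)
        rank = max(1, int(np.sum(s > tol * s[0])))  # keep singular values above tol * largest
        factors.append(u[:, :rank])
    core = t
    for mode, u in enumerate(factors):
        # Project axis `mode` onto the retained basis vectors.
        core = np.moveaxis(np.tensordot(u.conj().T, core, axes=(1, mode)), 0, mode)
    return core, factors

def restore(core, factors):
    """Rebuild the full tensor from the Tucker core and factor matrices."""
    t = core
    for mode, u in enumerate(factors):
        t = np.moveaxis(np.tensordot(u, t, axes=(1, mode)), 0, mode)
    return t

# Synthetic, smooth 1/R-like tensor (an assumption made only for this illustration).
n = 32
i, j, k = np.meshgrid(np.arange(n), np.arange(n), np.arange(n), indexing="ij")
tensor = 1.0 / np.sqrt(i**2 + j**2 + k**2 + 1.0)

core, factors = hosvd(tensor, tol=1e-4)
approx = restore(core, factors)

stored = core.size + sum(u.size for u in factors)  # entries kept after compression
cr = tensor.size / stored                          # compression ratio (CR)
rel_err = np.linalg.norm(approx - tensor) / np.linalg.norm(tensor)
print(f"multilinear ranks = {core.shape}, CR = {cr:.1f}, relative error = {rel_err:.2e}")
```

A CO estimate could be obtained in the same spirit by timing `restore` against one FFT-based convolution with the restored tensor.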
    1) A dielectric coated PEC plate: First, a dielectric-coated
PEC plate with varying width and length is considered to test
the performance of Tucker compression/decompression for a
2D-like structure [Fig. 4(a)]. The structure, fully enclosed by
the computational domain, has a length and width varying
from 100 μm to 1500 μm while $\Delta v = 1$ μm. For each
structure size, Tucker decomposition is
used to compress the circulant tensors; the CR achieved and the
CO imposed by the Tucker decomposition are plotted in Figs.
4(a) and (b). Clearly, CR increases with increasing tolerance
and structure size (or number of voxels $N_t$). Tucker
enhancement achieves more than 700x reduction (for
$tol = 10^{-4}$) in the memory of circulant tensors for the largest
structure. Moreover, CO decreases with increasing
computational domain size and reaches 0.05 for the largest
structure size, which shows that the computational penalty
associated with the tensor restoration operation is negligible.
This negligible penalty is due to the fact that the multilinear ranks of
the compressed tensors remain nearly constant with increasing
computational domain size. Fig. 4(c) shows the
maximum multilinear rank of the circulant tensor with superscript
$\{x,x\}$ (as an example) versus
$N_t$ for different $tol$ values. Clearly, the maximum
rank is nearly constant in $N_t$ and stays between
10 and 30. Fig. 4(d) shows the memory requirement of the
original tensors as well as the compressed tensors with
increasing structure size. The memory of the
original tensors scales with $O(N_t)$ while the memory of the
compressed tensors scales with $O(N_t^{0.44} \log(N_t))$.
 
 
 
 
Fig. 4. The performance of Tucker compression/decompression on a 2D-like
dielectric-coated PEC plate. (a) CR achieved by and (b) CO introduced by the
Tucker enhancement with increasing $N_t$. (c) The maximum multilinear rank of
the circulant tensor with superscript $\{x,x\}$ and (d) the memory scaling of the
original and compressed FFT'ed circulant tensors with increasing $N_t$.
 
    2) A dielectric coated cube: Next, a dielectric-coated PEC
cube with varying edge length is considered to test the
performance of Tucker compression/decompression for a 3D
structure [Fig. 5(a)]. The cube, fully enclosed by the
computational domain, has an edge length varying from
63 μm to 400 μm while $\Delta v = 1$ μm. For each
edge length, the circulant tensors are compressed by
Tucker decompositions. The CR achieved and the CO
introduced by Tucker compression and decompression are
plotted in Figs. 5(a) and (b), respectively. Again, the CR
increases with increasing tolerance and structure size (or
number of voxels $N_t$). Tucker enhancement achieves more
than 10000x (four orders of magnitude) reduction (for
$tol = 10^{-4}$) in the memory of circulant tensors for the largest
structure. This reduction is much larger than in the dielectric-coated
PEC plate case, as expected: tensor methods compress
tensors with many elements
along all three dimensions much better than tensors with many
elements along two dimensions and fewer elements along the
remaining dimension, as in the dielectric-coated PEC plate case.
Moreover, CO decreases with increasing structure size and
reaches 0.154 for the largest structure size for different $tol$
values. This again shows that the computational penalty
associated with the tensor restoration operation is negligible.
Fig. 5(c) shows the maximum multilinear rank of
the circulant tensor with superscript $\{x,x\}$ versus $N_t$ for
different $tol$ values. Clearly, the
maximum rank increases only slightly, staying in the range from 10 to 20.
Fig. 5(d) shows the memory requirement of the original
tensors as well as the compressed tensors with increasing
structure size. While the memory of the original tensors scales with
$O(N_t)$, the memory of the compressed tensors scales with
$O(N_t^{0.26} \log(N_t))$, which is better than the
$O(N_t^{0.44} \log(N_t))$ scaling
obtained for the dielectric-coated PEC plate.
 
 
 
 
Fig. 5. The performance of Tucker compression/decompression on a 3D
dielectric-coated PEC cube. (a) CR achieved by and (b) CO introduced by the
Tucker enhancement with increasing $N_t$. (c) The maximum multilinear rank of
the circulant tensor with superscript $\{x,x\}$ and (d) the memory scaling of the
original and compressed FFT'ed circulant tensors with increasing $N_t$.
 
3) Performance on Toeplitz Tensor: Next, the computational
time saving achieved by obtaining Toeplitz tensors from their
Tucker-compressed versions is quantified. To this end, the
dielectric-coated cube in the previous analysis is considered. Its
edge length is changed from 100 μm to 400 μm in
increments of 100 μm while $\Delta v = 1$ μm. For the
cube with different edge lengths, the memory requirements of
the original Toeplitz tensors and the Tucker-compressed Toeplitz
tensors are tabulated in Table I. Furthermore, the CPU time
required to generate the original Toeplitz tensors and the CPU time
required to obtain the Toeplitz tensors from the Tucker-compressed
Toeplitz tensors are provided. The compressed
tensors stored on the hard disk require a few MB of memory for
$tol = 10^{-8}$ while the original ones require around 1 GB of memory
for the largest structure. While the CPU time to generate the
Toeplitz tensors is around 1.3 hours for the largest structure, the
total CPU time to obtain the Toeplitz tensors from the compressed
tensors (including the CPU time for reading from the hard disk and
restoration from the compressed tensors) is 1.04 s for the
same structure. By obtaining the Toeplitz tensors from the
Tucker-compressed ones, a 4655x speed-up is achieved for the largest
structure.
TABLE I
MEMORY AND CPU TIME REQUIREMENT FOR OBTAINING TOEPLITZ TENSORS
BY COMPUTATION AND FROM TUCKER-COMPRESSED TENSORS

| Edge length of cube (μm) | Memory of original Toeplitz tensors (MB) | Memory for compressed tensors (MB) | CPU time to generate original Toeplitz tensors (s) | CPU time to obtain Toeplitz tensors from compressed tensors (s) |
|---|---|---|---|---|
| 100 | 15.72 | 1.57 | 85.27 | 0.21 |
| 200 | 123.91 | 2.76 | 616.39 | 0.58 |
| 300 | 416.12 | 3.91 | 1858.59 | 0.822 |
| 400 | 983.91 | 5.10 | 4841.59 | 1.04 |
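The memory ratios and speed-ups implied by Table I can be recomputed directly from its entries. The short script below is only an arithmetic check on the tabulated values; the names `table_i` and `ratios` are chosen here for illustration.

```python
# Rows of Table I: edge length (um), memory of original Toeplitz tensors (MB),
# memory of compressed tensors (MB), CPU time to generate (s),
# CPU time to obtain from compressed tensors (s).
table_i = [
    (100, 15.72, 1.57, 85.27, 0.21),
    (200, 123.91, 2.76, 616.39, 0.58),
    (300, 416.12, 3.91, 1858.59, 0.822),
    (400, 983.91, 5.10, 4841.59, 1.04),
]

def ratios(row):
    """Return (edge length, memory ratio, speed-up) for one table row."""
    edge, mem_orig, mem_comp, t_generate, t_restore = row
    return edge, mem_orig / mem_comp, t_generate / t_restore

summary = [ratios(row) for row in table_i]
for edge, mem_ratio, speedup in summary:
    print(f"{edge} um: memory ratio {mem_ratio:.1f}, speed-up {speedup:.0f}x")
```

For the 400 μm cube the speed-up evaluates to about 4655, matching the figure quoted in the text.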
 
C. Parallel Interconnects 
In this practical example, a five-by-twelve parallel
interconnect array embedded in a dielectric substrate ($\varepsilon_r = 7$)
with 0.02 mm spacing from the surfaces of the substrate is
considered [Fig. 6(a)]. The width, length, and height of each
interconnect are 0.08, 0.8, and 0.02 mm, respectively, while the
center-to-center distance between interconnects is 0.1 mm. The
discretization of the structure with $\Delta v = 0.005$ mm yields
$N_t = 1,780,800$ and $N = 481,024$ (note that $N_t/N = 3.7$).
VoxCap with $N_{vx} = N_{vy} = N_{vz} = 10$ and FastCap with 2nd,
4th, and 6th order multipole expansions are used to obtain the
capacitance matrix for the structure. Figs. 6(a)-(c) compare the
values in the first column of the capacitance matrix obtained by
VoxCap and FastCap with 2nd, 4th, and 6th order multipole
expansions. Clearly, only FastCap with the 6th order multipole
expansion provides accurate results, while the results obtained
by FastCap with the 2nd and 4th order multipole expansions are
highly inaccurate. The maximum relative differences between the
results of VoxCap and FastCap with 2nd, 4th, and 6th order
multipole expansions are 15.96, 7.54, and 0.15, respectively.
For this example, VoxCap and FastCap with 2nd, 4th, and 6th
order multipole expansions required 2413, 2843, 2064, and
2063 iterations during the iterative solution, respectively. Fig.
6(c) also shows the normalized charge distribution
obtained by VoxCap when the structure is excited from the 45th
conductor (the conductors are numbered from left to right
starting from the bottom-left corner). Table II compares the
memory and computational time requirements of the VoxCap
simulator and the FastCap simulator with 2nd, 4th, and 6th order
multipole expansions. For the same level of accuracy,
VoxCap with Tucker enhancement required 23x less memory
and 11x less CPU time than FastCap with the 6th order
multipole expansion. In this example, Tucker enhancement
achieves more than 502x reduction in the memory of circulant
tensors. Furthermore, it reduces the memory requirement of the
simulator from 5.54 GB to 2.48 GB, a factor of 2.23, while
increasing the CPU time requirement by 10%.
 
 
 
Fig. 6. Parallel interconnect example. The comparison of the values in the first
column of the capacitance matrix obtained by VoxCap and FastCap with
(a) 2nd order and (b) 4th order multipole expansions, as well as (c) 6th order
multipole expansions and normalized charge distribution on the structure.
(Note: For illustration purposes, the panels on the dielectric substrate with
x-coordinates larger than 0.83 mm were removed.)
TABLE II
MEMORY AND CPU TIME REQUIREMENTS OF VOXCAP AND FASTCAP
SIMULATORS FOR PARALLEL INTERCONNECT EXAMPLE

| Simulator | Memory requirement (MB) | CPU time requirement (s) |
|---|---|---|
| VoxCap (w/out Tucker) | 5,536 | 4,613 |
| VoxCap (w/ Tucker) | 2,473 | 5,080 |
| FastCap w/ 2nd order expansion | 17,200 | 14,647 |
| FastCap w/ 4th order expansion | 33,400 | 29,352 |
| FastCap w/ 6th order expansion | 56,900 | 57,644 |
 
D. Crossing Buses  
      In this example, a crossing bus structure with six layers of
dielectrics [36] is considered [Fig. 7(a)]. In this structure, the 2nd,
4th, and 6th layers consist of a 40 nm-thick dielectric with
$\varepsilon_r = 5.0$, while the 1st, 3rd, and 5th layers, each of which houses
15 buses, are filled with a 220 nm-thick dielectric with $\varepsilon_r = 2.6$.
Moreover, the buses in the 3rd layer are coated by a 10 nm-thick
dielectric with $\varepsilon_r = 3.7$. The width, height, and length of each
bus are 70 nm, 140 nm, and 2,030 nm, respectively, while their
center-to-center distance is 140 nm. The distance between the
leftmost/rightmost buses and the surfaces of the dielectrics is 70
nm. The structure is discretized with $\Delta v = 10$ nm. This
discretization yields $N_t = 3,672,942$ and $N = 849,847$
($N_t/N = 4.32$). VoxCap with $N_{vx} = N_{vy} = N_{vz} = 10$ and
FastCap with 2nd and 4th order multipole expansions are used to
obtain the capacitance matrix for this structure. Figs. 7(a)-(b)
compare the values in the first column of the capacitance matrix
obtained by VoxCap and FastCap with 2nd and 4th order
multipole expansions. Fig. 7(b) also shows the normalized
charge distribution obtained by VoxCap when the structure is
excited from the foremost conductor (with center
coordinates of (1085,105,590) nm) in the 5th layer. Clearly,
FastCap with the 4th order multipole expansion provides accurate
results, while the results obtained by FastCap with the 2nd order
multipole expansion are inaccurate. The maximum relative
differences between the results of VoxCap and FastCap with 2nd
and 4th order multipole expansions are 1.0069 and 0.091,
respectively. For this example, VoxCap and FastCap with 2nd
and 4th order multipole expansions required 1546, 2512, and
1637 iterations, respectively. Table III compares the memory
and computational time requirements of the VoxCap simulator and
the FastCap simulator with 2nd and 4th order multipole expansions.
For the same level of accuracy, VoxCap with Tucker
enhancement required 23x less memory and 10x less CPU time
than FastCap with the 4th order multipole expansion. In
this example, Tucker enhancement achieves more than 801x
reduction in the memory of circulant tensors. Furthermore, it
reduces the memory requirement of the simulator from 10.37
GB to 3.75 GB, a factor of 2.76, while increasing the CPU time
requirement by 29.8%. Although VoxCap permits efficient and
accurate analysis of structures consisting of multiple
thin/thick layers and conductors, its performance could decline
for structures with very thin
dielectrics/conductors and multiscale features due to the increasing
$N_t/N$ ratio, as is the case for all voxel-based techniques.
 
 
 
 
Fig. 7. Crossing bus example. The comparison of the values in the first column
of the capacitance matrix obtained by VoxCap and FastCap with (a) 2nd
order and (b) 4th order multipole expansions and the normalized charge
distribution on the structure. (Note: For illustration purposes, the panels on the
dielectric substrate with x-coordinates larger than 2105 nm, y-coordinates less
than 65 nm, and z-coordinates larger than 750 nm were removed.)
 
TABLE III
MEMORY AND CPU TIME REQUIREMENTS OF VOXCAP AND FASTCAP
SIMULATORS FOR CROSSING BUS EXAMPLE

| Simulator | Memory requirement (MB) | CPU time requirement (s) |
|---|---|---|
| VoxCap (w/out Tucker) | 10,372 | 5,729 |
| VoxCap (w/ Tucker) | 3,755 | 7,437 |
| FastCap w/ 2nd order expansion | 36,500 | 43,396 |
| FastCap w/ 4th order expansion | 86,900 | 77,021 |
 
E. Parallel Meander Lines  
Next, the parallel meander line structure is considered to
check the accuracy and capabilities of the proposed VoxCap
simulator [Fig. 8]. The structure is formed by interconnects
with width $w_b$, length $l_b$, height $h_b$, and spacing $d$. There
are $n$ interconnects in each of $n$ layers.
For the first analysis, after setting $w_b = h_b = 0.5$ mm,
$d = 0.1$ mm, $l_b = 10$ mm, and $n = 20$, the structure is
discretized with $\Delta v = 0.1$ mm, which yields $N_t = 1,416,100$
and $N = 808,600$ (note that $N_t/N = 1.7513$). For this case,
the values in the first column of the capacitance matrix obtained
by VoxCap with $N_{vx} = N_{vy} = N_{vz} = 5$ and FastCap with 2nd, 4th,
and 6th order multipole expansions are compared in Figs.
9(a)-(c). Furthermore, Fig. 9(c) shows the normalized
charge distribution on the structure obtained by VoxCap. The
results obtained by FastCap with the 6th order multipole
expansion closely match the results obtained by
VoxCap. The maximum relative differences between the results
obtained by VoxCap and FastCap with 2nd, 4th, and 6th order
multipole expansions are 0.30, 0.048, and 0.011, respectively.
VoxCap and FastCap with 2nd, 4th, and 6th order
multipole expansions obtained the results after 1343, 600, 600,
and 259 iterations, respectively. The memory and CPU time
requirements of both simulators are tabulated in Table IV. For
the same level of accuracy, VoxCap required 47x less
memory and 11.9x less CPU time than FastCap
with the 6th order multipole expansion. For this problem, Tucker
enhancement achieves more than 844x reduction (for
$tol = 10^{-4}$) in the memory of circulant tensors. Furthermore, it
reduces the memory requirement of the simulator from 2.74 GB
to 1.92 GB, a factor of 1.43, while increasing the CPU time
requirement by 27%.
 
 
Fig. 8. Parallel meander line structure.  
 
TABLE IV
MEMORY AND CPU TIME REQUIREMENTS OF VOXCAP AND FASTCAP
SIMULATORS FOR PARALLEL MEANDER LINE EXAMPLE

| Simulator | Memory requirement (MB) | CPU time requirement (s) |
|---|---|---|
| VoxCap (w/out Tucker) | 2,744 | 1,906 |
| VoxCap (w/ Tucker) | 1,918 | 2,426 |
| FastCap w/ 2nd order expansion | 22,400 | 5,077 |
| FastCap w/ 4th order expansion | 46,800 | 12,246 |
| FastCap w/ 6th order expansion | 90,400 | 28,808 |
 
Finally, a very large-scale parallel meander line structure is
formed by setting $w_b = h_b = 0.5$ mm, $d = 0.1$ mm,
$l_b = 50$ mm, and $n = 100$. The discretization of this structure
with $\Delta v = 0.1$ mm yields $N_t = 179,400,500$ and
$N = 100,203,000$ ($N_t/N = 1.79$). For this example, the
detailed breakdown of the memory and CPU usage of the
VoxCap simulator is provided in Table V. Table V shows
that the Tucker decomposition reduces the memory
requirement of the circulant tensors from 132 GB to 6.8627 MB
(for $tol = 10^{-4}$); it achieves a CR of more than 19,000 while
imposing a CO of around 0.01. While compressing the circulant
tensors via Tucker decomposition, the required peak memory is
94.6 GB. Compared to the 22 GB required by one
circulant tensor, the memory penalty is 72.6 GB.
Furthermore, by reading the Tucker-compressed Toeplitz
tensors and restoring them, VoxCap reduces the time for
obtaining the circulant tensors from 3832.63 s to 83.1
s, a 46x speed-up. Tucker decomposition
reduces the memory requirement (peak memory) from 350 GB
to 240 GB, a factor of 1.46, while introducing a computational
penalty of 25%. When the number of voxels in each small box
for the preconditioner is increased from $N_{vx} = N_{vy} = N_{vz} = 10$
to $N_{vx} = N_{vy} = N_{vz} = 20$, the memory requirement of the
preconditioner increases from 320 MB to 19.5 GB, but the
solution time is reduced by a factor of nearly 1.8 (from 47,620
s to 26,773 s) (Table V). Fig. 9(d) shows the values in
one column of the capacitance matrix obtained by the VoxCap
simulator. Note that FastCap cannot be executed for this
large-scale example due to its high memory cost. Fig. 9(d) also
presents the normalized charge distribution on the structure by
zooming into the opposite corners of the structure.
 
 
  
 
 
Fig. 9. Parallel meander lines example. For $n = 20$ and $l_b = 10$ mm, the
comparison of the values in the first column of the capacitance matrix obtained
by VoxCap and FastCap with (a) 2nd order and (b) 4th order multipole
expansions, as well as (c) 6th order multipole expansions and normalized charge
distribution on the structure. (d) For $n = 100$ and $l_b = 50$ mm, the values in the
first column of the capacitance matrix obtained by VoxCap and the
normalized charge distribution on the structure and on its lower-left and
upper-right corners.
 
TABLE V
THE DETAILED BREAKDOWN OF THE CPU TIME AND MEMORY USAGE FOR THE
PARALLEL MEANDER LINE. UNITS FOR MEMORY AND CPU TIME ARE MB AND
S.

| Quantity | Value |
|---|---|
| CPU time for pre-processing | 1365.8 |
| Memory of original circulant tensors | 132,099.6 |
| Memory of compressed circulant tensors | 6.8627 |
| CPU time for reading compressed Toeplitz tensors from hard disk | 0.10129 |
| CPU time for restoring compressed Toeplitz tensors and embedding circulant tensors | 83 |
| CPU time for filling Toeplitz tensors and embedding circulant tensors | 3,832.63 |
| CPU time for compressing the circulant tensors | 1,295.53 |
| CPU time for fast Fourier transforming the circulant tensors | 262.47 |
| CPU time / memory for the preconditioner ($N_{vx} = N_{vy} = N_{vz} = 10$) | 195.33 / 320.27 |
| CPU time / memory for the preconditioner ($N_{vx} = N_{vy} = N_{vz} = 20$) | 245.88 / 19,500.1 |
| Number of iterations / CPU time for the solution when bottom conductor is excited (for $N_{vx} = N_{vy} = N_{vz} = 10$) | 242 / 47,620.00 |
| Number of iterations / CPU time for the solution when bottom conductor is excited (for $N_{vx} = N_{vy} = N_{vz} = 20$) | 157 / 26,773.02 |
 
IV. CONCLUSION 
In this paper, VoxCap, a Tucker-enhanced and
FFT-accelerated SIE simulator for electrostatic analysis and
capacitance extraction of voxelized structures, was introduced.
VoxCap solves SIEs after discretizing the charge densities
on panels by piecewise constant basis functions, applying
Galerkin testing, and obtaining an LSE. The proposed VoxCap
simulator uses FFTs to accelerate the MVMs during the
iterative solution of the LSE. Furthermore, it makes use of a highly
effective and memory-efficient block-diagonal-diagonal
preconditioner to reduce the number of iterations. It exploits
Tucker decompositions to reduce its setup time and memory
footprint. The proposed VoxCap simulator can solve problems
with hundreds of millions of unknowns on a desktop computer. For
many practical scenarios comprising densely packed
interconnects, it is much faster and more memory efficient than
FastCap. For example, for the parallel interconnect scenario
considered in the numerical examples section, VoxCap is
11x faster than FastCap. At the same time, VoxCap
requires 23x less memory than FastCap for the
same level of accuracy.
 
APPENDIX A 
ANALYTICAL EXPRESSION FOR THE RESULT OF INTEGRAL IN (6) 
The analytical expression for the result of the integral in (5)
is given in Appendix C of [29]. However, no analytical
expression for the result of the integral in (6) was found in the
literature. We therefore derived the analytical expression for
the result of the integral in (6) by computing the derivative of
the analytical expressions in [29] and provide it here. For the
parallel panel interactions, the result of the integral in (6), I, is
obtained as
 
    
   
 
2 24 4
1 1
2 2
2 2 2
3 2
2 2
2 2
0.5
1
r r
0.5
log
( )
2 2
log
3 6
b
arctan +
1
k mm k
k m k km km
m k
k k km
m km km
km k m
m m km
km
k m k m
k m
km kmk m
k m
km k m
km
a z b z
I
a
b z a z
a z a r
b + r r
zr a b z
b z b + r z
r
a b a ba z
r z ra ba b
zr a b
z r





 
 
 
 

   

       
 
       
  

,





  (15) 
where 1 2 1s ea x x  , 2 2 1e ea x x  , 3 2 1e sa x x  , 
4 2 1s sa x x  , 1 2 1s eb y y  , 2 2 1b e ey y  , 3 2 1b e sy y  , 
4 2 1b s sy y  , 2 1z z z    , 2 2 2km k mr a b z   , and 
3710  . The geometrical quantities in these expressions are 
given in Fig. A(a). For the orthogonal interactions, the result of 
the integral in (6), I , is obtained as 
 
 
 
2 24 2 2
1 1 1
2 2 4 2 2 2
2
2 2 4 2 3
2 2
(3 )
1
(3 ) 3( )( ) log
6 ( )
log
( )
( )
3 6 ( ) 2
m k l m k m
k m l m kml
k l l k l kml m kml m kml
kml m kml
k m l
k m k kml
kml k kml
m kml l k m k m
kml kml k l km
b a b
I
b r
a c c a c r b r b r
r b r
a b c
a b a r
r a r
b r c a b a b
r r a c r



 
  
   

     
 
   

  


2 2
2 2
2 2 2 2
( )
( )( 2arctan( )) ,
2 ( )( )
l m l
k l k m l kml l k m
l kmlkml k l m l
b c
a c a b c r c a b
c rr a c b c


     
  (16) 
where 1 2 1s ea x x  , 2 2 1e ea x x  , 3 2 1e sa x x  , 
4 2 1s sa x x  , 1 2 1sb y y  , 2 2 1eb y y  , 1 2 1ec z z  , 
2 2 1sc z z  , 
2 2 2
kml k m lr a b c   , and 
3710  . The 
geometrical quantities in these expressions are provided in Fig. 
A(b). 
 
Fig. A.  The sketches show the geometrical quantities used in (15) and (16) for 
evaluating the interactions between (a) parallel and (b) orthogonal panels. 
APPENDIX B 
BLOCK TOEPLITZ AND CIRCULANT TENSORS  
The block Toeplitz tensors corresponding to the block matrices
P and E in (9) are computed to obtain the corresponding
block circulant tensors. The procedure to obtain the block
Toeplitz tensors is given in Algorithm
1. In this procedure, the source basis function is defined on one
of the three orthogonal panels of the first voxel with index
$(1,1,1)$, which has its unit normal pointing along the x-, y-, or
z-direction. Next, the testing functions are defined on voxel
panels with unit normals pointing along the x-, y-, and z-directions.
By fixing the basis function on one orthogonal panel and
sweeping over the testing functions, the entries of the two sets of
Toeplitz tensors are obtained by evaluating the integrals in (5) and
(6), respectively, for each basis function-testing function pair.
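The sweep described above can be sketched as a triple loop. In this sketch the extents `P` and the source-panel center `S` are passed in by the caller rather than read from Tables VI and VII, the testing-panel center is assumed to be $(\mathbf{m}-1)\Delta v$, and `point_kernel` is a hypothetical centroid approximation standing in for the analytical evaluation of the integral in (5). It is meaningful only for well-separated panels and is not the expression used by the simulator.

```python
import numpy as np

EPS0 = 8.8541878128e-12  # vacuum permittivity (F/m)

def point_kernel(src, obs, dv):
    """Hypothetical stand-in for the integral in (5): (panel area)^2 / (4*pi*eps0*R).
    The self term is left undefined because it needs the analytical expression."""
    dist = np.linalg.norm(obs - src)
    if dist == 0.0:
        return np.nan
    return dv**4 / (4.0 * np.pi * EPS0 * dist)

def fill_toeplitz(P, S, dv, kernel):
    """Fix the source panel at S and sweep the testing panel over a P_x x P_y x P_z grid."""
    Px, Py, Pz = P
    T = np.empty((Px, Py, Pz))
    for mx in range(1, Px + 1):
        for my in range(1, Py + 1):
            for mz in range(1, Pz + 1):
                m = np.array([mx, my, mz], dtype=float)
                O = (m - 1.0) * dv  # assumed testing-panel center
                T[mx - 1, my - 1, mz - 1] = kernel(S, O, dv)
    return T

dv = 1e-6                                    # voxel size chosen for the example
T = fill_toeplitz((4, 3, 2), np.zeros(3), dv, point_kernel)
print(T.shape, T[1, 0, 0])
```

Passing the kernel as an argument keeps the loop independent of how the integrals are evaluated, so the same routine could be reused with an evaluator for the integral in (6).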
  
TABLE VI 
THE VALUE OF xP , yP , AND zP  FOR DIFFERENT ( , )   PAIRS 
 
{ , }   xP  yP  zP  
{ , }x x  1xN    yN  zN  
{ , }y y  xN  1yN   zN  
{ , }z z  xN  yN  1zN   
{ , }x y , { , }y x  2xN   2yN   zN  
{ , }x z , { , }z x  2xN   yN  2zN   
{ , }y z , { , }z y  xN  2yN   2zN   
 
TABLE VII 
THE VALUE OF xS , yS , AND zS  FOR DIFFERENT ( , )   PAIRS 
 
{ , }   xS  yS  zS  
{ , }x x , { , }y y , { , }z z   0 0 0 
{ , }x y  / 2v  / 2v  0 
{ , }x z  / 2v  0 / 2v  
{ , }y z  0 / 2v  / 2v  
{ , }y x  / 2v  / 2v  0 
{ , }z x  / 2v  0 / 2v  
{ , }z y  0 / 2v  / 2v  
 
The block circulant tensors are obtained by
properly embedding the corresponding block Toeplitz tensors.
To this end, the procedure explained in
Appendix B of [21] is followed. The signs of the Toeplitz
blocks used to construct the blocks of the circulant tensors are
assigned as in Table VIII. Note that this table corresponds to
Table 5 of [21], and the blocks in the circulant tensors are labeled
L, M, N, LM, LN, MN, and LMN, as in [21].
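The idea behind the embedding can be illustrated in one dimension. The sketch below is a simplified analogue under stated assumptions: it embeds an n-by-n Toeplitz matrix in a 2n-by-2n circulant matrix and applies it with FFTs. It does not reproduce the three-dimensional block procedure of [21] or the sign assignments of Table VIII, and the function name `toeplitz_matvec_fft` is chosen here for illustration.

```python
import numpy as np

def toeplitz_matvec_fft(col, row, x):
    """Multiply the Toeplitz matrix with first column `col` and first row `row` by x."""
    n = len(x)
    # First column of the 2n-by-2n circulant matrix whose leading n-by-n block
    # is the Toeplitz matrix: [col, 0, row[n-1], ..., row[1]].
    c = np.concatenate([col, [0.0], row[:0:-1]])
    # A circulant matrix is diagonalized by the FFT; x is zero-padded to length 2n.
    y = np.fft.ifft(np.fft.fft(c) * np.fft.fft(x, 2 * n))
    return y[:n].real

rng = np.random.default_rng(0)
n = 6
col = rng.standard_normal(n)
row = rng.standard_normal(n)
row[0] = col[0]                      # both share the diagonal entry
x = rng.standard_normal(n)

# Dense reference built entry by entry for comparison.
dense = np.array([[col[i - j] if i >= j else row[j - i] for j in range(n)]
                  for i in range(n)])
y_dense = dense @ x
y_fft = toeplitz_matvec_fft(col, row, x)
print(np.max(np.abs(y_fft - y_dense)))
```

Only the first column and first row are stored, which is the storage saving that the Toeplitz structure provides before any further compression.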
 
 
 
 
 
 
TABLE VIII 
THE SIGNS OF BLOCK TOEPLITZ TENSOR FOR CONSTRUCTING EACH BLOCK IN 
EACH CIRCULANT TENSOR 
 
Block 
Toeplitz tensor L M N LM LN MN LMN 
,x x , ,y y , ,z z , 
,x y , ,x z , or ,y z  
+ + + + + + + 
,x x , ,x y , ,x z  - + + - - + - 
,y x , ,y y , ,y z  + - + - + - - 
,z x , ,z y , ,z z  + + - + - - - 
 
The embedding process generates the block circulant
tensors with dimensions given in Table IX.
These dimensions are enlarged to
$2(N_x+1) \times 2(N_y+1) \times 2(N_z+1)$ by proper zero-padding to
reduce the number of FFT and IFFT operations, as mentioned
above. The locations of tensor entries for padding zeros are
provided in Table X.
 
TABLE IX
THE ORIGINAL DIMENSIONS OF THE BLOCK CIRCULANT TENSORS

| Superscript of tensor | Dimensions | Superscript of tensor | Dimensions |
|---|---|---|---|
| $\{x,x\}$ | $2(N_x+1) \times 2N_y \times 2N_z$ | $\{x,y\}$, $\{y,x\}$ | $2(N_x+1) \times 2(N_y+1) \times 2N_z$ |
| $\{y,y\}$ | $2N_x \times 2(N_y+1) \times 2N_z$ | $\{x,z\}$, $\{z,x\}$ | $2(N_x+1) \times 2N_y \times 2(N_z+1)$ |
| $\{z,z\}$ | $2N_x \times 2N_y \times 2(N_z+1)$ | $\{y,z\}$, $\{z,y\}$ | $2N_x \times 2(N_y+1) \times 2(N_z+1)$ |
 
 
TABLE X
LOCATIONS OF ENTRIES IN THE BLOCK CIRCULANT TENSORS FOR ZERO PADDING

| Superscript of tensor | 1st dimension | 2nd dimension | 3rd dimension |
|---|---|---|---|
| $\{x,x\}$ | - | $(N_y+1):(N_y+2)$ | $(N_z+1):(N_z+2)$ |
| $\{y,y\}$ | $(N_x+1):(N_x+2)$ | - | $(N_z+1):(N_z+2)$ |
| $\{z,z\}$ | $(N_x+1):(N_x+2)$ | $(N_y+1):(N_y+2)$ | - |
| $\{x,y\}$, $\{y,x\}$ | - | - | $(N_z+1):(N_z+2)$ |
| $\{x,z\}$, $\{z,x\}$ | - | $(N_y+1):(N_y+2)$ | - |
| $\{y,z\}$, $\{z,y\}$ | $(N_x+1):(N_x+2)$ | - | - |
 
APPENDIX C 
SCALING FACTORS 
To reuse the same Toeplitz tensors stored on
the hard disk for computational domains with different
voxel sizes, they are computed and stored for unit voxel
size $\Delta v = 1$. For the given voxel size of the problem under
investigation, the Toeplitz tensors should be multiplied by the
scaling factors after they are read from the hard disk. These
scaling factors are derived by a change of variables that makes
the integrals in (5) and (6) independent of $\Delta v$. To this end,
consider the parallel panel configuration shown in Fig.
A(a). For this configuration, the integral in (5) is written as

$$ I = \frac{1}{4\pi\varepsilon_0} \int_{y_{1s}}^{y_{1e}} \int_{x_{1s}}^{x_{1e}} \int_{y_{2s}}^{y_{2e}} \int_{x_{2s}}^{x_{2e}} \frac{dx'\,dy'\,dx\,dy}{\sqrt{(x-x')^2 + (y-y')^2 + (z-z')^2}}, \qquad (17) $$

where $z = z_1$ and $z' = z_2$. The integrals on primed coordinates
are evaluated on the source panel (top panel in Fig. A(a)) for the
basis function while those on the non-primed coordinates are
evaluated on the observer panel (bottom panel in Fig. A(a)) for the
testing function. In the voxelized setting, the source and
observer panels sit on grid points with indices $(m'_x, m'_y, m'_z)$
and $(m_x, m_y, m_z)$. To this end, first the bounds are changed as
$x_{2s} = m'_x \Delta v$, $x_{2e} = (m'_x+1) \Delta v$, $y_{2s} = m'_y \Delta v$,
$y_{2e} = (m'_y+1) \Delta v$, $x_{1s} = m_x \Delta v$, $x_{1e} = (m_x+1) \Delta v$, $y_{1s} = m_y \Delta v$,
and $y_{1e} = (m_y+1) \Delta v$. Then the variables and constants are
changed as $x' = a' \Delta v$, $y' = b' \Delta v$, $z_1 = c \Delta v$, $x = a \Delta v$,
$y = b \Delta v$, and $z_2 = c' \Delta v$ in (17), which yields

$$ I = \frac{\Delta v^3}{4\pi\varepsilon_0} \int_{m_y}^{m_y+1} \int_{m_x}^{m_x+1} \int_{m'_y}^{m'_y+1} \int_{m'_x}^{m'_x+1} \frac{da'\,db'\,da\,db}{\sqrt{(a-a')^2 + (b-b')^2 + (c-c')^2}}. \qquad (18) $$

The integral in (18) is independent of the voxel size, and
thereby the scaling factor is $\Delta v^3$ for the parallel panel interactions.
Similarly, the scaling factor is obtained as $\Delta v^3$ for the
orthogonal panel interactions by applying the same procedure.
A similar procedure is applied for the integral in (6). The
scaling factor $\Delta v^2$ is obtained for the parallel and orthogonal
panel interactions.
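The cubic scaling for the integral in (5) can be checked numerically. The sketch below is a midpoint-rule estimate under stated assumptions: two parallel panels separated by one voxel along z, the $1/(4\pi\varepsilon_0)$ prefactor omitted, and a coarse quadrature that is adequate only because the panels do not touch. The function name `panel_integral` and its parameters are illustrative.

```python
import numpy as np

def panel_integral(dv, mx=0, my=0, mxp=0, myp=0, gap=1, n=6):
    """Midpoint-rule estimate of the 4-D integral in (17) without the prefactor.
    Observer panel: [mx, mx+1] x [my, my+1] at z = 0 (in units of dv).
    Source panel:   [mxp, mxp+1] x [myp, myp+1] at z = gap (in units of dv)."""
    t = (np.arange(n) + 0.5) / n * dv          # midpoints along one panel edge
    x, y = mx * dv + t, my * dv + t
    xp, yp = mxp * dv + t, myp * dv + t
    X, Y, XP, YP = np.meshgrid(x, y, xp, yp, indexing="ij")
    R = np.sqrt((X - XP) ** 2 + (Y - YP) ** 2 + (gap * dv) ** 2)
    w = (dv / n) ** 4                          # weight of each quadrature cell
    return np.sum(w / R)

dv = 0.005
ratio = panel_integral(dv) / panel_integral(1.0)
print(ratio, dv**3)
```

The ratio of the integral at voxel size `dv` to the one at unit voxel size reproduces `dv**3` to rounding error, which is the factor applied to the stored unit-voxel tensors.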
Algorithm 1 Procedure for generating the block Toeplitz tensors.
Preprocessing: For a given pair of panel-normal directions and $\Delta v$,
- Set $P_x$, $P_y$, and $P_z$ by using Table VI and the panels' unit normals.
- Set the panel center for the basis function as $\mathbf{S} = (S_x, S_y, S_z)$ by
using Table VII.
Execution:
for $m_x = 1 : P_x$ do
    for $m_y = 1 : P_y$ do
        for $m_z = 1 : P_z$ do
            Set $\mathbf{m} = (m_x, m_y, m_z)$
            Set the panel center for the testing function as $\mathbf{O} = (\mathbf{m} - 1) \Delta v$
            Evaluate (5) for given $\mathbf{S}$ and $\mathbf{O}$ to obtain the entry with index $\mathbf{m}$ of the Toeplitz tensor associated with (5), or
            Evaluate (6) for given $\mathbf{S}$ and $\mathbf{O}$ to obtain the entry with index $\mathbf{m}$ of the Toeplitz tensor associated with (6)
        end for
    end for
end for

REFERENCES

[1] R. Bahr, B. Tehrani, and M. M. Tentzeris, "Exploring 3-D printing for 
new applications: Novel inkjet- and 3-D-printed millimeter-wave 
components, interconnects, and systems," IEEE Microw. Mag., vol. 19, 
no. 1, pp. 57-66, Jan.-Feb 2018. 
[2] B. Lu, H. Lan, and H. Liu, "Additive manufacturing frontier: 3D printing 
electronics," Opto-Electronic Advances, vol. 1, no. 1, pp. 170004-1 - 
170004-10, 2018. 
[3] G. Sun, X. Zhao, and G. Lu, "Voxel-based modeling and rendering for 
virtual MEMS fabrication process," presented at the IEEE/RSJ Int. Conf. 
on Intelligent Robots and Systems, Beijing, 2006.  
[4] L. R. C. Coventor SEMulator3D. Available: 
https://www.coventor.com/products/semulator3d/ 
[5] A. C. Yucel, I. P. Georgakis, A. G. Polimeridis, H. Bagci, and J. K. White, 
"VoxHenry: FFT-accelerated inductance extraction for voxelized 
geometries," IEEE Trans. Microw. Theory Techn., vol. 66, no. 4, pp. 
1723–1735, Apr. 2018. 
[6] M. Naghed and I. Wolff, "A three-dimensional finite-difference 
calculation of equivalent capacitances of coplanar waveguide 
discontinuities," presented at the IEEE International Digest on 
Microwave Symposium, Dallas, TX, USA, May, 1990.  
[7] A. H. Zemanian, "A finite-difference procedure for the exterior problem 
inherent in capacitance computations for VLSI interconnections," IEEE 
Trans. Electron Devices, vol. 35, no. 7, pp. 985-992, July 1988. 
[8] Z. Zhu and W. Hong, "A generalized algorithm for the capacitance
extraction of 3D VLSI interconnects," IEEE Trans. Microw. Theory
Techn., vol. 47, no. 10, pp. 2027-2030, Oct. 1999.
[9] G. Chen, H. Zhu, T. Cui, Z. Chen, X. Zeng, and W. Cai, "ParAFEMCap: 
A parallel adaptive finite-element method for 3-D VLSI interconnect 
capacitance extraction," IEEE Trans. Microw. Theory Techn., vol. 60, no. 
2, pp. 218–231, Feb. 2012. 
[10] R. Sabelka and S. Selberherr, "A finite element simulator for three 
dimensional analysis of interconnect structures," Microelectron. J., vol.
32, no. 2, pp. 163-171, Feb. 2001. 
[11] W. Proskurowski and O. Widlund, "A Finite Element-Capacitance Matrix 
Method for the Neumann Problem for Laplace’s Equation," SIAM J. Sci. 
and Stat. Comput., vol. 1, no. 4, pp. 410–425, July 2006. 
[12] A. Heldring, J. M. Rius, J. M. Tamayo, and J. Parron,
"Compressed Block-Decomposition Algorithm for Fast Capacitance
Extraction," IEEE Trans. Comput.-Aided Design Integr. Circuits Syst.,
vol. 27, no. 2, pp. 265–271, Feb. 2008.
[13] S. Kapur and D. Long, "N-body problems: IES3: Efficient electrostatic 
and electromagnetic simulation," IEEE Comput. Sci. Eng., vol. 5, no. 4, 
pp. 60-67, Oct.-Dec. 1998. 
[14] W. Chai and D. Jiao, "An LU Decomposition Based Direct Integral 
Equation Solver of Linear Complexity and Higher-Order Accuracy for 
Large-Scale Interconnect Extraction," IEEE Trans. Adv. Packag., vol. 33, 
no. 4, pp. 794-803, Nov. 2010. 
[15] W. Chai and D. Jiao, "Dense Matrix Inversion of Linear Complexity for 
Integral-Equation-Based Large-Scale 3-D Capacitance Extraction," IEEE 
Trans. Microw. Theory Techn., vol. 59, no. 10, pp. 2404–2421, Oct.
2011. 
[16] W. Shi, J. Liu, N. Kakani, and T. Yu, "A fast hierarchical algorithm for 
three-dimensional capacitance extraction," IEEE Trans. Comput.-Aided 
Design Integr. Circuits Syst., vol. 21, no. 3, pp. 330–336, Mar. 2002. 
[17] D. Gope and V. Jandhyala, "PILOT: a fast algorithm for enhanced 3D 
parasitic capacitance extraction efficiency," Microwave Opt. Tech. Lett., 
vol. 41, no. 3, pp. 169-173, May 2004. 
[18] D. Gope and V. Jandhyala, "Oct-tree-based multilevel low-rank 
decomposition algorithm for rapid 3-D parasitic extraction," IEEE Trans. 
Comput. Aided Design Integr. Circuits Syst., vol. 23, no. 11, pp. 
1575–1580, Nov. 2004. 
[19] K. Nabors and J. White, "FastCap: A multipole accelerated 3-D 
capacitance extraction program," IEEE Trans. Computer-Aided Design, 
vol. 11, no. 11, pp. 1447-1459, Nov. 1991. 
[20] J. R. Phillips and J. K. White, "A precorrected-FFT method for 
electrostatic analysis of complicated 3-D structures," IEEE Trans. 
Computer-Aided Design, vol. 16, no. 10, pp. 1059-1072, Oct. 1997. 
[21] A. G. Polimeridis, J. F. Villena, L. Daniel, and J. K. White, "Stable 
FFT-JVIE solvers for fast analysis of highly inhomogeneous dielectric 
objects," J. Comp. Phys., vol. 269, pp. 280-296, 2014. 
[22] M. F. Catedra, E. Gago, and L. Nuno, "A numerical scheme to obtain the 
RCS of three-dimensional bodies of resonant size using the conjugate 
gradient method and the fast Fourier transform," IEEE Trans. Antennas 
Propag., vol. 37, no. 5, pp. 528–537, May 1989. 
[23] H. Gan and W. C. Chew, "A discrete BCG-FFT algorithm for solving 3D 
inhomogeneous scatterer problems," J. Electromagn. Waves Appl., vol. 9, 
pp. 1339–1357, Apr. 1995. 
[24] A. C. Yucel, W. Sheng, C. Zhou, Y. Liu, H. Bagci, and E. Michielssen, 
"An FMM-FFT accelerated SIE simulator for analyzing EM wave 
propagation in mine environments loaded with conductors," IEEE 
Journal on Multiscale and Multiphysics Comp. Techn., vol. 3, pp. 3-15, 
2018. 
[25] M. F. Catedra, R. F. Torres, J. Basterrechea, and E. Gago, The CG-FFT 
Method: Application of Signal Processing Techniques to 
Electromagnetics. Norwood, MA: Artech House, 1995. 
[26] M. Wang, C. Qian, and A. C. Yucel, "Tucker-enhanced VoxCap 
simulator for electrostatic analysis of voxelized structures," in Proc. IEEE 
MTT-S Int. Conf. on Num. EM and Multiphysics Modeling and Opt.,
Boston, 2019.
[27] K. Nabors and J. White, "Multipole-accelerated capacitance extraction
algorithms for 3-D structures with multiple dielectrics," IEEE Trans. 
Circuits Syst., vol. 39, no. 11, pp. 946-954, Nov. 1992. 
[28] K. S. Nabors, "Efficient Three-Dimensional Capacitance Calculation," 
Ph.D. dissertation, Massachusetts Institute of Technology, Cambridge, MA, USA,
1993. 
[29] A. E. Ruehli, G. Antonini, and L. Jiang, Circuit Oriented Electromagnetic 
Modeling Using the PEEC Techniques. John Wiley & Sons, 2017. 
[30] A. C. Yucel, "Uncertainty quantification for electromagnetic analysis via 
efficient collocation methods," Ph.D. dissertation, Univ. Michigan, Ann 
Arbor, MI, USA, 2013.
[31] M. Wang et al., "An FFT-accelerated and Tucker-enhanced inductance 
extraction for voxelized superconducting structures," to be presented at 
the CNC-USNC/URSI National Radio Sci. Meet., Montreal, CA, 2020.  
[32] L. D. Lathauwer, B. D. Moor, and J. Vandewalle, "A multilinear singular 
value decomposition," SIAM J. Matrix Anal. Appl., vol. 21, no. 4, pp. 
1253-1278, 2000. 
[33] A. C. Yucel, L. J. Gomez, and E. Michielssen, "Compression of 
translation operator tensors in FMM-FFT accelerated SIE solvers via 
Tucker decomposition," IEEE Antennas Wireless Propag. Lett., vol. 16, 
pp. 2667-2670, 2017. 
[34] C. Qian, Z. Chen, and A. C. Yucel, "Tensor decompositions for reducing 
the memory requirement of translation operator tensors in FMM-FFT 
accelerated IE solvers," presented at the Applied Computational EM
Society (ACES) Symp., Miami, FL, USA, Apr. 2019.
[35] Y. Massoud, J. Wang, and J. White, "Accurate inductance extraction with 
permeable materials using qualocation," presented at the Proceedings of
the International Conference on Modeling and Simulation of
Microsystems, April 1999.
[36] W. Yu, H. Zhuang, C. Zhang, G. Hu, and Z. Liu, "RWCap: A floating 
random walk solver for 3-D capacitance extraction of VLSI 
interconnects," IEEE Trans. Computer-Aided Design, vol. 32, no. 3, pp. 
353–366, March 2013. 
 
 
Mingyu Wang received the B.S. degree in 
electrical engineering and automation from 
Northeast Forestry University, China, in 
2016 and the M.S. degree in electronics from Nanyang
Technological University, Singapore, in
2018. He is currently working toward the
Ph.D. degree at the School of Electrical and
Electronic Engineering, Nanyang
Technological University, Singapore, where he works in the
Computational Electromagnetics Group.
   His research is focused on computational electromagnetics 
and fast parameter extraction of voxelized interconnects and 
circuits. 
 
Cheng Qian received the B.S. and Ph.D.
degrees in electronics engineering from
Nanjing University of Science and
Technology, Jiangsu, China, in 2009 and
2015, respectively. From 2016 to 2018, he was a
Research Associate with the Department of Applied Physics, 
The Hong Kong Polytechnic University, Hong Kong. Since 
2018, he has been a Post-Doctoral Researcher with the School 
of Electrical and Electronic Engineering, Nanyang 
Technological University, Singapore. His current research 
interests include computational electromagnetics and nonlinear 
plasmonics. 
 
 
Jacob K. White (F’08) received his B.S. 
in Electrical Engineering and Computer 
Science (EECS) from the Massachusetts 
Institute of Technology (MIT) in 1980, 
and his S.M. and Ph.D. in EECS from the
University of California, Berkeley in 1983 
and 1985, respectively.  
After two years at the IBM T. J. Watson Research Center, he
joined EECS at MIT as an Analog Devices Career 
Development Assistant Professor in 1987, became a 
Presidential Young Investigator in 1988, was an associate 
editor for the IEEE Transactions on Computer-Aided Design 
from 1992 until 1996, was a member of the Spectre/SpectreRF 
development team from 1989 until 1999, chaired the 
International Conference on Computer-Aided Design in 1999, 
served as an Associate Director of MIT's Research Laboratory 
of Electronics from 2001 until 2006, was an academic research 
fellow at Ansoft/Ansys from 2010 until 2016, and served as the 
MIT EECS co-education officer from 2011 until 2014. He 
became an IEEE Fellow in 2008 for his group's work on fast
interconnect analysis (e.g., FastHenry), and shared the 2013 A.
R. Newton Technical Impact Award in EDA with Keith Nabors 
for their fast capacitance extraction program FastCap. Jacob 
White is currently the C. H. Green Professor in EECS at MIT, 
where he is researching simulation and optimization techniques 
for problems in medical technology, nano-photonics, and 
electrical circuits and interconnect; and experimenting with 
blended computation- and maker-centric strategies for teaching 
control, machine learning, and electromagnetics. 
 
Abdulkadir C. Yucel (M’19-SM’20) received the B.S. degree 
in electronics engineering (Summa Cum Laude) from Gebze 
Institute of Technology, Kocaeli, Turkey, in 2005, and the M.S. 
and Ph.D. degrees in electrical engineering from the University 
of Michigan, Ann Arbor, MI, USA, in 2008 and 2013, 
respectively. 
From September 2005 to August 2006, he worked as a 
Research and Teaching Assistant at Gebze Institute of 
Technology. From August 2006 to April 2013, he was a 
Graduate Student Research Assistant at the University of 
Michigan. Between May 2013 and December 2017, he worked 
as a Postdoctoral Research Fellow at the University of 
Michigan, Massachusetts Institute of Technology, and King 
Abdullah University of Science and Technology. Since 2018, 
he has been working as an Assistant Professor at the School of 
Electrical and Electronic Engineering, Nanyang Technological 
University, Singapore.  
Dr. Yucel received the Fulbright Fellowship in 2006, 
Electrical Engineering and Computer Science Departmental 
Fellowship of the University of Michigan in 2007, and Student 
Paper Competition Honorable Mention Award at IEEE AP-S in 
2009. He has been serving as an Associate Editor for the 
International Journal of Numerical Modelling: Electronic 
Networks, Devices and Fields and as a reviewer for various 
technical journals. His research interests include various 
aspects of computational electromagnetics with emphasis on 
analytical and numerical electromagnetic modelling and the 
applications of uncertainty quantification and deep/machine 
learning techniques to electromagnetic analyses.
 
 
 
 
