Transmission line inspires a new distributed algorithm to solve linear
  system of circuit by Wei, Fei & Yang, Huazhong
> REPLACE THIS LINE WITH YOUR PAPER IDENTIFICATION NUMBER (DOUBLE-CLICK HERE TO EDIT) < 
 
1
 
Abstract—Transmission line, or wire, is always troublesome to 
integrated circuits designers, but it could be helpful to parallel 
computing researchers. This paper proposes the Virtual 
Transmission Method (VTM), which is a new distributed and 
stationary iterative algorithm to solve the linear system extracted 
from circuit. It tears the circuit by virtual transmission lines to 
achieve distributed computing. For the symmetric positive 
definite (SPD) linear system, VTM is proved to be convergent. For 
the unsymmetrical linear system, numerical experiments show 
that VTM is possible to achieve better convergence property than 
the traditional stationary algorithms. VTM could be accelerated 
by some preconditioning techniques, and the convergence speed of 
VTM is fast when its preconditioner is properly chosen. 
 
Index Terms—Circuit simulation, distributed computing, 
numerical analysis, transmission line, wire.  
 
I. INTRODUCTION 
RANSMISSION line, or interconnect, or wire, is one of the 
main determinants of chip performance, because of the 
long interconnect latency and large power dissipation [37]. 
However, the circuit could not work without transmission lines, 
because the transmission line plays an irreplaceable role to 
make the distributed circuit stable and scalable [13]. In this 
paper, transmission lines are used to assist the distributed 
computation of sparse linear system of circuit. 
   The basic problem of the distributed circuit simulation is to 
distributedly solve the sparse linear system extracted from 
circuit [34, 41, 46]. Generally speaking, the linear system of 
circuit is unsymmetrical, because of the existence of controlled 
sources. Restricted to the resistor-capacitor (RC) network, the 
linear system would be symmetrical [39]. 
Many numerical algorithms are employed to solve the linear 
system from circuit. KLU is an optimized LU factorization 
algorithm for circuit, and it is broadly used in commercial and 
non-commercial circuit simulators [49]. KLU is sequential. 
SuperLU is an outstanding distributed algorithm to solve the 
general unsymmetrical sparse linear system on a large number 
of processors, and it has been widely adopted for circuit 
simulation [15]. Xyce is a parallel circuit simulator designed 
for supercomputers, and it uses the Block Jacobi 
preconditioned GMRES method [50, 4, 3]. Schur complement 
method is a non-overlapping domain decomposition method 
making use of the master-slave model. It is simple to be 
comprehended and implemented, and was frequently used [29, 
65, 20]. [7] presents a parallel direct/iterative mixed approach, 
based on the Schur complement method and preconditioned 
GMRES method. [41] adopted the preconditioned overlapping 
domain decomposition method, which is supported by PETSc, 
a distributed scientific computing package [1]. The highlight of 
[41] is that it improves the matrix property by exploring the 
characters of transistor device models. 
The symmetric linear system of circuit is mainly extracted 
from the RC networks. Solving RC network is the key for the 
power grid analysis and thermal analysis [30, 26]. Meanwhile, 
the sparse linear system generated by the finite element method 
from elliptic partial differential equations (PDE) is equivalent 
to a resistor network.  
The simulation of power grid has been intensively 
researched in the past 10 years, and a variety of numerical 
algorithms had been employed to solve the RC network, such 
as stationary iterative methods [67], Krylov-subspace iterative 
method [12],  algebraic multigrid (AMG) [31, 69], coupled 
iterative/direct method [32] and FFT method [48]. [42] 
proposed a stochastic method, and this method was further 
expanded as a preconditioning method for the diagonal 
dominant matrix [43].  
To compute the RC network on the parallel computing 
platform, [68] makes use of the domain decomposition method 
(DDM). [11] adopts a distributed matrix inversion algorithm. 
The potential of GPU to solve the RC network was illustrated in 
[6, 16, 17, 48].  
VTM is a new distributed and stationary iterative algorithm 
to solve the sparse linear systems [59]. VTM is similar to Block 
Jacobi or Overlapped Block Jacobi method [56, 66].  
VTM is inspired by the behavior of transmission line [40, 13]. 
It inserts virtual transmission lines into the circuit to achieve 
distributed computing [58, 59]. The idea of VTM is derived 
from the pseudo transient method [57]. Pseudo transient 
method inserts pseudo capacitors or inductors into the circuit to 
approve the convergence property of DC analysis. Similarly, 
VTM inserts pseudo transmission lines into the circuit to 
approve the convergence property of distributed simulation. 
Inserting virtual or pseudo electrical element to assist the circuit 
simulation is a common approach, and the advantage is that the 
physical property of circuit could be utilized during numerical 
computation. 
VTM draws a parallel between distributed computing of 
linear system and distributed simulation of linear circuit, and it 
interprets the relationship between numerical analysis and 
circuit analysis. To solve a sparse linear system in parallel is 
equalized to simulate a linear circuit in parallel, and vice versa, 
Transmission Line Inspires A New Distributed 
Algorithm to Solve the Linear System of Circuit 
Fei Wei, Huazhong Yang, Senior Member, IEEE 
T 
> REPLACE THIS LINE WITH YOUR PAPER IDENTIFICATION NUMBER (DOUBLE-CLICK HERE TO EDIT) < 
 
2
we might comprehend the distributed numerical algorithm from 
the viewpoint of circuit theory and microwave network.  
To partition a circuit, node tearing and branch tearing are two 
important methods [64, 47]. This paper proposes the third way, 
which is called wire tearing. Tearing the circuit by wires is not a 
new idea, since all the subcircuits are connected by wires, but 
we first apply this concept into the distributed numerical 
algorithms [58, 59]. Wire tearing can be considered as the 
insertion of transmission line to connect the torn nodes [61]. 
The advantage of wire tearing over the branch tearing and node 
tearing is that it does not bring in any extra energy, because all 
the wires are passive [27].  
As a distributed numerical algorithm, VTM is similar to 
overlapped Block Jacobi or Block SOR method [2, 45]. We 
have proved that VTM is convergent to solve arbitrarily-large 
SPD linear system on arbitrary number of processors [59, 62]. 
VTM could also be considered as a new type of algebraic 
domain decomposition method [55]. VTM is similar to the 
additive Schwarz method with Robin condition, i.e. 
Schwarz-Robin method [35, 21, 22, 36]. Additive Schwarz 
method is mainly used to solve the sparse linear system from 
elliptic partial differential equations (PDE), and VTM is used to 
solve the general sparse linear system. The partitioning method 
for VTM, i.e. wire tearing, is different from the traditional 
partitioning methods for algebraic additive Schwarz method 
[22, 36]. Furthermore, the preconditioning technique for VTM 
is more flexible than Optimized Schwarz Method [21]. 
This paper is organized as follows. Section 2 presents a brief 
introduction to the linear system of circuit. Section 3 introduces 
the mathematical description of transmission line. Section 4 
proposes the physical background of VTM. Section 5 describes 
how to partition the circuit by wire tearing. Section 6 details the 
algorithm of VTM. Section 7 presents the convergence theory. 
Section 8 provides a simple example. Section 9 discusses the 
preconditioning techniques for VTM. Numerical experiments 
are shown in Section 10. We conclude this work in Section 11. 
In the appendix, we present a proof for the convergence theory.  
II. LINEAR SYSTEM OF CIRCUIT 
In virtue of the nodal analysis, one circuit could be described 
by a sparse linear system [39]. 
Ax = b                                       (1) 
Here x represents the nodes’ voltages, and b is the current 
sources flowing into the nodes. A is the coefficient matrix, 
which might be symmetrical, or unsymmetrical, depending on 
the property of circuit. 
If the circuit contains no voltage sources but only current 
sources, we call it current-driven circuit. To construct the linear 
system of current-driven circuit, we only need to use the nodal 
analysis technique, and do not need the modified nodal analysis 
technique [39]. If there were voltage sources in the circuit, they 
might be equalized into current sources, according to the 
Norton equivalent theory [18].  
 
Lemma 1: For a current-driven linear resistor network, its 
coefficient matrix, A, is weakly diagonally dominant, and thus 
symmetrical-non-negative-definite (SNND). Further, if there is 
at least one resistor connecting to the ground, A is strictly 
diagonally dominant, and thus symmetric-positive-definite 
(SPD). 
The physical insight of this lemma is that the energy of a 
resistor network is always non-negative. 
III. TRANSMISSION LINE 
Transmission line is an important element in electrical and 
microwave engineering. It is also called cable, wire or 
interconnect in different contexts.  
The circuit diagram of the transmission line is illustrated in 
Fig. 1. The analytical description of the lossless or ideal 
transmission line is the wave equation [40, 13]. 
2 2
2 2
( , ) ( , )u x t u x tLC
x t
∂ ∂
=
∂ ∂
                      (2) 
Here L  is the inductance per unit length, C  is the capacitance 
per unit length.  
The time domain mathematical description of the lossless 
transmission line is called Transmission Delay Equations, as 
shown in (3) [40, 8, 59]. 
1 1 2 2
2 2 1 1
( ) ( ) ( ) ( )
( ) ( ) ( ) ( )
u t Z i t u t Z i t
u t Z i t u t Z i t
τ τ
τ τ
+ ⋅ = − − ⋅ −
+ ⋅ = − − ⋅ −
          (3) 
Here u1(t) and u2(t) represent the port voltages. i1(t) and i2(t) 
represent port inflow currents. t is the time variable. τ is the 
propagation delay. Z is the characteristic impedance, which is 
positive.  
Fig. 1.  Transmission line. (A) The circuit diagram of the transmission line. (B) 
The equivalent circuit of the lossless transmission line.  
> REPLACE THIS LINE WITH YOUR PAPER IDENTIFICATION NUMBER (DOUBLE-CLICK HERE TO EDIT) < 
 
3
/Z L C=  
The propagation delay of the transmission line is: 
l LCτ =  
Where l  is the length of the transmission line.  
The iterative formula of Transmission Delay Equations is 
similar to the Robin boundary condition in Partial Differential 
Equations and Schwarz-Robin method [35, 21]. There are two 
main differences: First, the normal derivative in the robin 
condition is replaced by a current; Second, (3) expresses the 
time delay explicitly. 
   For a number of coupled transmission lines, (3) could be 
re-expressed in the matrix-vector form: 
⋅ ⋅
⋅ ⋅
1 1 2 2
2 2 1 1
u (t) + Z i (t) = u (t - τ) - Z i (t - τ)
u (t) + Z i (t) = u (t - τ) - Z i (t - τ)         (4) 
Here Z is the characteristic impedance matrix of the 
transmission lines. If there is no coupling among transmission 
lines, Z is diagonal; if there is coupling, Z is a sparse matrix. 
   (4) could be re-expressed as (5): 
+ ⋅ − + ⋅
+ ⋅ − + ⋅
1 1 2 2
2 2 1 1
i (t) W u (t) = i (t - τ) W u (t - τ)
i (t) W u (t) = i (t - τ) W u (t - τ)                  (5) 
Here W is the characteristic admittance matrix of the wires. 
-1W = Z  
Associated with the principle of parallel computer, we come 
to realize that, there is conceptual similarity between the 
distributed computer and the distributed circuit, as shown in 
Fig. 2 [61]. 
First, different circuits are connected by wires, which is 
similar to the situation where different processors are 
connected by the digital data link. 
Second, the wire has transmission delay, and there is also 
communication delay between the processors. 
As the result, if we assimilate each circuit to a processor, and 
consider the wire as the data link between processors, then the 
distributed circuit would be kind of a distributed computer.  
IV. PHYSICAL BACKGROUND 
In this section, we present an observation from circuit and 
microwave network. This lemma presents the stability property 
of the resistor network with wires. Fig. 3 presents an illustration 
for this lemma. Lemma 2 could be validated in SPICE [38, 44].  
Lemma 2: A linear resistor network with wires inside never 
oscillates unendingly, and it will finally settle down to the 
steady state, which is the steady state of the resistor network 
eliminating all the wires. 
The physical insight of this lemma is that, the energy of a 
resistor network is limited. Divergence means infinite energy, 
which is impossible for the resistor network. 
Lemma 2 is also valid for the resistor-capacitor network. But 
for the general circuit with wires, it is not always valid. A 
distributed circuit with wires inside might go unconvergent. 
The stability is depending on the property of circuit and the 
characteristic impedance of the wires. 
Fig. 3.  Illustration of the stability theory of resistor network with wires. (a) The resistor network with wires inside. (b) The resistor network without wires. (c) The 
voltage of node 2a and node 2b in (a) woud converge to the steady state, which is the voltage of node 2 in (b). 
Fig. 2.  Similarity between the distribute circuit and distributed computer. (a) 
Distributed circuit, two subcircuits are connected by a wire. (b) Distributed 
computer, two processors are connected by a digital data link. 
> REPLACE THIS LINE WITH YOUR PAPER IDENTIFICATION NUMBER (DOUBLE-CLICK HERE TO EDIT) < 
 
4
V. WIRE TEARING 
To solve the sparse linear system of a circuit, originally there 
is no transmission line in it, but we might figure out some way 
to insert transmission lines into the circuit by splitting the nodes. 
This partitioning technique is called wire tearing.  
By means of wire tearing, the original circuit is turned into a 
distributed circuit. It is partitioned into a number of separate 
sub-circuits by virtual transmission line. Later, we locate each 
sub-circuit into one processor, and use the digital data link to 
imitate the behavior of the transmission line. As the result, the 
distributed circuit is simulated in a distributed way.  
According to Lemma 2, the resistor network with wires 
would finally go to the steady state, which is exactly the steady 
state of the resistor network without wires. This lemma 
explains the reason that VTM could always be convergent to 
solve the resistor network. For the sparse linear system of a 
general circuit, VTM might be unconvergent.  
There are 4 steps to perform wire tearing for the circuit G . 
Γ  denotes the set of all the nodes in G . 
Step 1. Set the splitting interface  interfaceΓ ⊆ Γ . V is called 
interfacial node if  interfaceV ∈Γ ; otherwise, V is called inner 
node. 
Step 2. Split each interfacial node into a pair of nodes, which 
are called twin nodes. 
Step 3. Split the resistors and current sources connected to 
each interfacial node. 
Step 4. Connect each pair of twin nodes by a length of 
lossless wire. Add inflow currents to the twin nodes. 
 
Example 1. Fig. 4 illustrates an example, in which a simple 
resistor network is split into two sub-circuits.  
First we define the set of inner nodes in Sub-circuit 1 as 
1,innerΓ , the set of inner nodes in Sub-circuit 2 as 2,innerΓ . 
By reordering the rows, the sparse linear system of circuit (1) 
could be re-formatted as below by row reordering:  
         
=              
1 2
1 11 1
2 2 2 2
u fC E E
y gF D 0
F 0 D y g
                    (6) 
Here u is the voltage vector of interfaceΓ . y1 and y2 are the 
voltage vector of 1,innerΓ  and 2,innerΓ , respectively. 
After wire tearing, the interfacial nodes are split, and 
interfaceΓ  is split into two sets, the twin nodes in Sub-circuit 1, 
1,twinΓ , and the twin nodes in Sub-circuit 2, 2,twinΓ . The 
system of sub-circuit 1 is: 
   
1
       
= +             
1 1 1 1 1
1 1 1
C E u f i
F D y g 0
                  (7) 
1u  represents the voltage vector of 1,twinΓ . 1i  represents the 
currents flowing into 1,twinΓ . 
After wire tearing, the system of sub-circuit 2 is: 
2 2 2 2 2
2 2 2
       
= +             2
C E u f i
F D y g 0
                 (8) 
Here 2u  represents the voltages vector of 2,twinΓ . 2i  
represents the currents flowing into each node of 2,twinΓ .  
  Also we have, 
Fig. 4.  Illustration of the wire tearing. (a) The original resistor network, and node 2 is set to be the interfacial node. (b) Split 
node 2 into a pair of twin nodes 2a and 2b, and insert a length of transmission line between them. 1/R2=1/R2a+1/R2b. I2=I2a+I2b. 
(c) Replace the virtual transmission line with its equivalent circuit, and the original resistor network is partitioned into two 
sub-networks.  
> REPLACE THIS LINE WITH YOUR PAPER IDENTIFICATION NUMBER (DOUBLE-CLICK HERE TO EDIT) < 
 
5
1 2
1 2
C + C = C
f + f = f
 
1u , 1i , 2u , 2i  are called boundary variables, and they 
satisfy the Transmission Delay Equations (5). 
+ ⋅ − + ⋅
+ ⋅ − + ⋅
1 1 2 2
2 2 1 1
i (t) W u (t) = i (t - τ) W u (t - τ)
i (t) W u (t) = i (t - τ) W u (t - τ)             
   If we set the boundary condition as below:  
1 2
1 2
u (t) = u (t) = u(t)
i (t) = -i (t) = i(t)
 
Then (6) could be re-obtained from (7) and (8).  
The above splitting technique is called level-one wire tearing, 
since the interfacial node is split once. If the interfacial node 
was split for more than one times, this splitting technique is 
called multilevel wire tearing [59]. 
VI. VIRTUAL TRANSMISSION METHOD 
After partitioning of the circuit, there are 3 steps to perform 
the parallel computing by Virtual Transmission Method.  
Step 1. Replace the lossless transmission line with its 
equivalent circuit.  
Step 2. Load each sub-circuit into a processor. 
Step 3. Use the data link to transfer the previous interfacial 
variables from one processor to another by message 
passing, and perform the distributed computing. 
Example 2. Continuing with Example 1, the analytic 
expression of sub-circuit 1 is obtained by merging (5) and (7):      
                    
+ ⋅ − + ⋅
1 1 1 1 1
1 1 1 1
1 1 2 2
C E u (t) f i (t)
= +
F D y (t) g 0
i (t) W u (t) = i (t - τ) W u (t - τ)
      (9) 
Eliminate 1i (t) , and we obtain: 
+         
+ ⋅   
= − ⋅ + ⋅
1 1 1
1 1 1
1 2 2
1
1 1 2 2
C W E u (t)
=
F D y (t)
f W u (t - τ) - i (t - τ)
g
i (t) W u (t) W u (t - τ) - i (t - τ)
        (10) 
 
Set all the delays of transmission lines to be 1, then, 
1 1
1 2 2
1
1 1
1 1 2 2
k k k
k
k k k k
− −
− −
+    + ⋅           
= − ⋅ + ⋅
1 1 1
1 1 1
C W E u f W u - i
=
F D y g
i W u W u - i
 (11) 
 
(11) is an SPD sparse linear system, and it could be solved by 
any sequential algorithms, such as cholesky factorization, 
algebraic multigrid method (AMG) and conjugate gradient 
method (CG).  
Similarly, the analytic expression of sub-circuit 2 is obtained 
by merging (5) and (8): 
2 2 2 2 2
2 2 2 2
2 2 1 1
                    
+ ⋅ − + ⋅
C E u (t) f i (t)
= +
F D y (t) g 0
i (t) Z u (t) = i (t - τ) W u (t - τ)
     (12) 
 
Eliminate 2i (t) , and we obtain: 
2 2 2
2 2 2
2 1 1
2
2 2 1 1
+         
+ ⋅   
= − ⋅ + ⋅
C W E u (t)
=
F D y (t)
f W u (t - τ) - i (t - τ)
g
i (t) W u (t) W u (t - τ) - i (t - τ)
  (13) 
If all the delays of lines are 1, we obtain: 
2 2 2
2 2 2
1 1
2 1 1
2
1 1
2 2 1 1
k
k
k k
k k k k
− −
− −
+         
 + ⋅  
= − ⋅ + ⋅
C W E u
=
F D y
f W u - i
g
i W u W u - i
           (14) 
  Later, we locate (11) on Processor 1, and locate (14) on 
Processor 2. After that, we set the initial values for boundary 
variables: 
0 0
1 2
0 0
1 2
u = u = 0
i = i = 0
 
At last, we do the distributed computing.  
VII. CONVERGENCE THEORY 
Theorem 1 (Convergence Theory): Assume a linear resistor 
network is partitioned into N sub-circuits by wire tearing, if the 
characteristic impedance matrix of the virtual transmission lines 
is symmetrical-positive-definite (SPD), then VTM converges at 
the answer to the resistor network. 
This conclusion is supported by Lemma 2, and the 
mathematical proof is given in the appendix.  
According to the proof of convergence theory, we know that 
the convergence speed of VTM is depending on the choice of 
the characteristic admittance matrix W. So W is also called the 
preconditioner for VTM.  
VIII. PRECONDITIONING 
In this section, we discuss the preconditioning techniques for 
VTM. We first define the input admittance matrix of a circuit or 
microwave network [18, 13]. From the view of numerical 
analysis, input admittance matrix is an alias of Schur 
complement matrix associated with the interface variables u [45, 
68].  
 
Example 3. Refer to (7), for Sub-circuit 1, the Schur 
complement matrix associated with the interface variables u1 is: 
> REPLACE THIS LINE WITH YOUR PAPER IDENTIFICATION NUMBER (DOUBLE-CLICK HERE TO EDIT) < 
 
6
1 1=
-1
1 1 1S C - E D F  
S1 is also the input admittance matrix associated with the 
interface variables u1 in Sub-circuit 1. 
Similarly, in Sub-circuit 2, the Schur complement matrix 
associated with the interface variables u2 is: 
2 2 2 2 2=
-1S C - E D F  
S2 is also the input admittance matrix associated with the 
interface variables u2 in Sub-circuit 2. 
Then, the voltage reflection matrix on the interface of 
Sub-circuit 1 is defined as: 
( ) ( ) 11 1 −= −1T W S W + S  
The voltage reflection matrix on the interface of Sub-circuit 2 
is: 
( )( ) 12 2 2 −= −T W S W + S  
According to the convergence proof in the appendix, the 
convergence factor of VTM is: 
( )CF ρ= 1 2T T  
Here ( )ρ A  is the spectral radius of the square matrix A. 
Consequently, we know that the convergence speed of VTM is 
depending on the choice of the characteristic admittance matrix 
W. This viewpoint is identical to the impedance matching 
technique in microwave engineering [13]. 
 In the ideal case, if we set the preconditioner, 
                   = 1W S  or 2S  
Then, 
            ( ) 0ρ =1 2T T  
As the result, only 1 iteration is needed to achieve convergence. 
Based on the knowledge of microwave network, we know that, if 
one port is precisely matched, there would be no energy reflection 
on this port and the network would settle down after 1 reflection. 
In practice, it is impossible to obtain the accurate input 
admittance matrix of a large circuit, because the dimension of D1 
or D2 might be very large, and it is impossible to obtain 
1
1
−D  and 
1
2
−D . So we need to figure out some practical way to approximate 
W to make 1≈W S  or 2S .  
If the circuit is a resistor network, first we need to assure that 
W is SPD. Continue with Example 2, there are several ways to 
choose W. 
1. , 0α α= ⋅ >W I . I is the identity matrix.  
2. ( ) ( )2diag or diag= 1W C C . We call it diagonal 
preconditioner. 
3. 1 2or=W C C . This is called overlapped block 
preconditioner. It is similar to the overlapped block Jacobi 
method.  
4. 2orα α= ⋅ ⋅1W C C , 0α > . This is called weighted 
overlapped block preconditioner (WOB).  
5. 1 2orW S S  . 1S  or 2S  are called the partial input 
admittance matrix of the large circuit, which are defined as 
below:  
1 1=
-1
1 1 1S C - E D F     
2 2 2 2 2=
-1S C - E D F     
Here 
   
1 1
1 1
C E
F D

   is a submatrix of 
   
1 1
1 1
C E
F D
. Similarly, 
2 2
2 2
   
C E
F D

   is a submatrix of 
2 2
2 2
   
C E
F D
.  
   This preconditioning technique is called Schur complement 
approximation method (SCA). Usually 1S  or 2S  is dense 
matrix, and we need to drop the small elements inside the matrix. 
After that, the fill-in pattern of W is similar to C1 or C2. 
   If the circuit is a general circuit with controlled sources, i.e. 
the linear system of circuit is unsymmetrical, the above 5 
preconditioning techniques could also be used, but VTM might 
be unconvergent.  
We have to say that, it is still an open problem to choose the 
preconditioner for the unsymmetrical linear system. In the next 
section, we illustrate a simple example to show the potential of 
VTM to solve the unsymmetrical system. 
IX. EXAMPLE 
For the symmetrical linear system, a simple example could 
be found in [59]. Here we illustrate another example to solve 
the unsymmetrical linear system of the circuit. Fig. 5 proposes a 
tightly-coupled circuit with 4 operational amplifiers connected 
end-to-end. The linear system of this circuit is: 
1 21 1 1
2 42 2 2
13 3 3 3
34 4 4 4
0 0
0 0
0 0
0 0
g g u I
g g u I
g g u I
g g u I
               
=              
            (15) 
Here we set: 
1 2 3 4 1g g g g= = = =  
13 34 42 21 10g g g g= = = − =  
The loop gain of this circuit is: 
      1 1 1 1 413 34 42 21 1 3 4 2 10Gain g g g g g g g g
− − − −
= = −  
Then we partition this linear system by wire tearing, 
 
As the result, we obtain 2 sub-systems: 
> REPLACE THIS LINE WITH YOUR PAPER IDENTIFICATION NUMBER (DOUBLE-CLICK HERE TO EDIT) < 
 
7
1 21 1 1
2 2 2 2
13 3 3 3 2
0
0 0
0
a a a a
a a a b
g g u I
g u I i
g g u I i
          
= +          +     
           (16) 
 
2 42 2 2 2
3 3 3 3
34 4 4 4
0
0 0
0
b b b b
b b b b
g g u I i
g u I i
g g u I
+          
= +               
               (17) 
Here: 
        2 2 2 3 3 3
2 2 2 3 3 3
,
,
a b a b
a b a b
g g g g g g
I I I I I I
+ = + =
+ = + =
 
For simplicity, we set: 
2 2 3 3 0.5a b a bg g g g= = = =  
The transmission delay equations of T2 are: 
2 2 2 2 2 2
2 2 2 2 2 2
( ) ( ) ( ) ( )
( ) ( ) ( ) ( )
a a b b
b b a a
i t W u t i t W u t
i t W u t i t W u t
τ τ
τ τ
+ ⋅ = − − + ⋅ −
+ ⋅ = − − + ⋅ −
           
 (18) 
The transmission delay equations of T3 are: 
Fig. 5.  Solving the unsymmetric linear system by VTM . (a) The original circuit with operational amplifiers. (b) Equivalent 
circuit of (a), with voltage controlled current sources (VCCS). (c) Partition the circuit by wire tearing. Node 2 and 3 is set to be 
the interfacial node. (d) Partition the equivalent circuit by wire tearing. 1/R2=1/R2a+1/R2b. I2=I2a+I2b. 1/R3=1/R3a+1/R3b. 
I3=I3a+I3b. (e) Replace the virtual transmission line with its equivalent circuit, and thus the original circuit is partitioned into 
two sub-circuits.  
> REPLACE THIS LINE WITH YOUR PAPER IDENTIFICATION NUMBER (DOUBLE-CLICK HERE TO EDIT) < 
 
8
3 3 3 3 3 3
3 3 3 3 3 3
( ) ( ) ( ) ( )
( ) ( ) ( ) ( )
a a b b
b b a a
i t W u t i t W u t
i t W u t i t W u t
τ τ
τ τ
+ ⋅ = − − + ⋅ −
+ ⋅ = − − + ⋅ −
 
(19) 
We set: 
2 0.5W = , 3 20000W =  
Merge (16), (18) and (19), we obtain the linear system for 
Sub-circuit 1: 
 
2 2 2 2
1 1
13 1 21 3 3 3 13 1 1 3
2 2 2 2 2 2
3 3 3 3 3 3
0 ( ) ( )
( ) ( )
( ) ( ) ( ) ( )
( ) ( ) ( ) ( )
a a a a
a a a a
a a b b
a a b b
g u t I i t
g g g g u t I g g I i t
i t W u t i t W u t
i t W u t i t W u t
τ τ
τ τ
− −
+     
=     
− +     
+ ⋅ = − − + ⋅ −
+ ⋅ = − − + ⋅ −
 
(20) 
Similarly, we get the linear system for Sub-circuit 2: 
1 1
22 42 4 34 2 42 4 4 2
33 3 3
2 2 2 2 2 2
3 3 3 3 3 3
( ) ( )
( )0 ( )
( ) ( ) ( ) ( )
( ) ( ) ( ) ( )
bb b b
bb b b
b b a a
b b a a
u tg g g g I g g I i t
u tg I i t
i t W u t i t W u t
i t W u t i t W u t
τ τ
τ τ
− −   − + 
=     +    
+ ⋅ = − − + ⋅ −
+ ⋅ = − − + ⋅ −
 
(21) 
   Finally we solve this example on 2 processors by VTM, and 
the computing result is compared with a number of well-known 
stationary iterative algorithms, e.g. Jacobi, Gauss Seidel (GS), 
Successive Overrelaxation Method (SOR), Symmetric SOR 
(SSOR), Block Jacobi (BJ) [45]. Here the global iterative index 
k increases to k+1 until all the variables are updated.  
   As shown in Fig. 6, the traditional stationary algorithms are 
unconvergent for this circuit, but VTM is convergent. The key 
is to choose proper characteristic admittances for the virtual 
transmission lines.  
According to this example, we conclude that there exists a 
parallel between the stability of VTM and the stability of 
distributed linear circuit (or microwave network). The 
traditional techniques to stabilize a distributed circuit might be 
transplanted to stabilize VTM. 
However, for the large unsymmetrical linear system, we 
have not yet figure out an effective method to choose W. The 
convergence property of VTM is depending on the property of 
linear circuit, as well as the preconditioner W. 
X. NUMERICAL EXPERIMENTS 
In this section we compare VTM with the traditional 
stationary algorithms for both the symmetrical matrix and 
unsymmetrical matrix. We use MATLAB and SIMULINK as 
our test environment [53, 54], and use METIS and 
MESHPART as our partitioning methods [28, 51, 19]. 
The test matrices are described in Table 1. Grid2d and 
Grid3d are structured resistor networks [51, 43]. Fv2 is a 2D 
mesh constructed by finite element method (FEM) [14]. Pchip 
is extracted from a MOS integrated circuit of MCNC’90 
testbench [52]. Wang is a sparse matrix from semiconductor 
device simulation [14].  
Table 2 illustrates the computing results of the stationery 
iterative algorithms. BJ, OBJ and VTM are tested on 2 
processors. Jacobi, GS, Symmetrical GS (SGS), SOR and 
SSOR are tested on 1 processor.  
VTM_Diag means VTM with diagonal preconditioner. 
VTM_OB represents VTM with overlapped block 
preconditioner. VTM_WOB is VTM with weighted overlapped 
block preconditioner. VTM_SCA means VTM with Schur 
complement approximation preconditioner. All these 
preconditioning methods were previously described in Section 
8.  
According to Table 2, we conclude that VTM is convergent 
to solve the SPD system, and its convergence speed is fast when 
using the SCA or WOB preconditioner. The convergence speed 
TABLE I 
SPARSE MATRICES PROPERTY 
Name  Interface nodes Total nodes Property Source 
Grid2d 100 10000 SPD MESHPART 
Grid3d 900 27000 SPD H. Qian  
Fv2 129 9801 SPD UFpack   
Wang 870 26064 Unsymm. UFpack  
Pchip 13 2298 Unsymm. MCNC’90  
TABLE II 
ITERATION NUMBER TO ACHIEVE CONVERGENCE 
Method Grid2d Grid3d Fv2 Wang Pchip
Jacobi 27759 5238 212 50773 -- 
GS 18373 2950 109 27423 -- 
SGS 11773 1571 79 26842 -- 
SOR 14253 2049 66 16289 -- 
SSOR 8839 1073 74 29135 -- 
BJ 966 227 50 563 952 
OBJ 490 114 23 287 162 
VTM_Diag 963 335 42 854 160 
VTM_OB 491 115 23 287 161 
VTM_WOB 296 68 22 175 -- 
VTM_SCE 275 75 14 186 -- 
Fig. 6.  Comparison of the convergence speed of iterative algorithms for 
resistor network. 
> REPLACE THIS LINE WITH YOUR PAPER IDENTIFICATION NUMBER (DOUBLE-CLICK HERE TO EDIT) < 
 
9
of OBJ and VTM_OB are same. For the unsymmetrical linear 
systems, VTM might be unconvergent.  
At last, we test VTM on a various number of processors, as 
shown in Fig. 7. In this case, we use Grid3d as the testbench. 
The test result indicates that the convergence speed of VTM is 
slightly relative to the number of processors. 
XI. CONCLUSION AND FUTURE WORK 
In this paper, we introduce VTM to solve the sparse linear 
system of circuit in parallel. VTM is a distributed and stationary 
iterative numerical algorithm, and it is efficient to solve the 
resistor networks.  
   VTM proves many insightful conclusions linking numerics 
and electrics. The physical background of VTM is electric 
circuit, and this makes this algorithm different from the 
traditional numerical algorithms. As presented in this paper, we 
have borrowed many concepts from circuit theory to describe 
this algorithm. 
According to the simple example presented in Section 9, we 
conclude that there exists similarity between the stability of 
VTM and the stability of distributed linear circuit (or 
microwave network). 
The convergence speed of VTM is highly depending on the 
characteristic admittance matrix of the virtual transmission 
lines, i.e. its preconditioner W. A number of preconditioning 
techniques are implemented to accelerate VTM. Experiments 
indicate that, if W is properly chosen, the performance of VTM 
would be appreciable; if not, VTM would be plain. 
Nevertheless, the computation cost to design the preconditioner 
W should be aware of [5]. 
For the SPD linear system, we have proved that VTM is 
convergent; for the large unsymmetrical linear system, we have 
not yet figure out an effective way to choose the characteristic 
admittance matrix to guarantee convergence. This is an open 
problem. 
VTM could also be used to solve the nonlinear system, as we 
pointed out in [59]. However, the convergence analysis of 
nonlinear system is more complicated than the linear system. 
This problem should be further studied. 
If the delays of virtual transmission lines are different and 
stochastic, VTM would turn to be a fully asynchronous and 
chaotic numerical algorithm [58].  If we replace each variable 
by a piece of waveform, VTM would become a 
waveform-relaxation algorithm [60].  
To simulate the post-layout integrated circuit in parallel, we 
have figured out another method to distributedly solve the 
nonlinear delay differential equations extracted from the 
distributed circuits [61]. 
APPENDIX 
Here we prove that, if the graph of the resistor network is 
2-partete, VTM converges.  
Lemma 3: Assume a linear resistor network is partitioned into 
2 sub-circuits by level-one wire tearing, if the characteristic 
impedance matrix Z of the virtual transmission lines is SPD, 
VTM converges at the answer to the resistor network. 
Proof:   
First, we consider the resistor network with inner nodes, and 
eliminate these inner nodes by Schur complement technique. 
Then, we prove Lemma 3 for the resistor network without inner 
nodes. 
The sparse linear system of the resistor network is:  
         
=              
1 2
1 11 1
2 2 2 2
u fC E E
y gF D 0
F 0 D y g
           
Eliminate the inner variables y1 and y2, we obtain: 
( )-1 -11 1 1 2 2 2
-1 -1
1 1 1 2 2 2
-1 -1
1 1 1 1 1
-1 -1
2 2 2 2 2
C - E D F - E D F u
= f - E D g - E D g
y = -D F u + D g
y = -D F u + D g
 
Set, 
=
-1 -1
1 1 1 2 2 2S C - E D F - E D F ,                  
=
-1 -1
1 1 1 2 2 2r f - E D g - E D g ,                       
then,  
⋅ =S u r                                       (22)   
S is called the Schur complement matrix associated with the 
interface variables u. 
 
Lemma 4: If 
   
C E
A =
F D
 is SPD, then the Schur 
complement matrix 1−=S C - ED F  is SPD.  
 
If the circuit is split into two sub-circuits by level-one wire 
tearing, the system of Sub-circuit 1 is (7): 
       
= +             
1 1 1 1 1
1 1 1 1
C E u f i
F D y g 0
                           
Eliminate the inner variables y1: 
Fig. 7.  Test VTM on a number of processors. 
> REPLACE THIS LINE WITH YOUR PAPER IDENTIFICATION NUMBER (DOUBLE-CLICK HERE TO EDIT) < 
 
10
( )
+
-1 -1
1 1 1 1 1 1 1 1 1 1
-1 -1
1 1 1 1 1 1
C - E D F u = f - E D g + i
y = -D F u D g
 
Set 1 1=
-1
1 1 1S C - E D F , 1 1=
-1
1 1 1r f - E D g , then,  
1 1 1⋅ = + 1S u r i                                        (23) 
S1 is called the Schur complement matrix associated with the 
interface variables u1 in Sub-circuit 1. 
The system of Sub-circuit 2 is (8): 
2 2 2 2 2
2 2 2
       
= +             2
C E u f i
F D y g 0
                    
Eliminate the inner variables y2: 
( )2 2 2 2 2 2 2 2 2
2 2 2 2 2 2+
-1 -1
2
-1 -1
C - E D F u = f - E D g + i
y = -D F u D g
 
Set 2 2 2 2 2=
-1S C - E D F , 2 2 2 2 2=
-1r f - E D g , then,  
2 2 2⋅ = + 2S u r i                                    (24) 
S2 is called the Schur complement matrix associated with the 
interface variables u2 in Sub-circuit 2. 
Consequently, we have: 
1 2 1 2 2 2 2
2 2 2
+ = +
= −
=
-1 -1
1 1 1
-1 -1
1 1 1
S S C - E D F C - E D F
C - E D F E D F
S
              
   
1 2 1 1 1 1 2 2 2 2
1 1 1 2 2 2
+ = +
=
=
-1 -1
-1 -1
r r f - E D g f - E D g
f - E D g - E D g
r
                     
Until now, we have eliminated all the inner nodes in the 
resistor network by Schur complement technique, and all the 
nodes are interfacial nodes. The linear system of (22) is 
partitioned into two sub-systems of (23) and (24) by wire 
tearing. 
The iterative formula is Transmission Delay Equations (5), 
1 1
1 1 2 2
1 1
2 2 1 1
k k k k
k k k k
− −
− −
+ ⋅ − + ⋅
+ ⋅ − + ⋅
i W u = i W u
i W u = i W u
 
Here W is the characteristic admittance matrix of the 
transmission lines. W is SPD. 1−=Z W . 
    Eliminate 1
ki  and 2
ki  by combine (23), (24) and (5), we find, 
( ) ( )1 11 2 2 2k k k k− −⋅ ⋅1 1 1 2W u + S u - r = W u - S u - r  
( ) ( ) 12 2k k −− − +1 1 1 2W + S u r = W S u r  
Thus, 
( ) ( )
( )
1 1
2
1
k k− −
−
−
+
1 1 2
1
u = W + S W S u
W + S r
                     (25) 
Similarly, we obtain: 
( ) ( )
( )
1 1
2 2 1
1
2
k k− −
−
−
+
1u = W + S W S u
W + S r
                     (26) 
Merge (25) and (26), we come to know the iterative formula for 
k
1u , 
( ) ( )
( ) ( )
( ) ( ) ( )
( )
1
2
1 2
2 1 1
1 1
2 2
1
k
k
−
−
−
− −
−
−
× −
+ −
+
1 1
1
1
u = W + S W S
W + S W S u
W + S W S W + S r
W + S r
       (27) 
Set: 
( )( ) 1−= −1 1 1T W S W + S  
( )( ) 12 2 2 −= −T W S W + S  
Then, 
( ) ( )
( )
( )
1 2
1
1
1
k k− −
−
−
− −
+ − ⋅
+ − ⋅
1 1 1 2 1
1 1 2
1 1
u = W S T T W S u
W S T T r
W S T r
                 (28) 
Reformat (28) as (29): 
( ) ( ) 21k k−− −
+ ⋅ + ⋅
1 1 1 2 1
1 2 1
W S u = T T W S u
T T r T r
                   (29) 
If we make use a new variable:  
( )k k= −1 1 1a W S u  
then, (29) could be expressed as (30): 
2
1 1
k k −
⋅ +1 1 2a = T T a h                                (30) 
where = ⋅ ⋅ + ⋅1 1 2 1h T T r T r . 
In the following text, we will prove that (30) is convergent. 
Here we use Z-1 instead of W. 
( )( )
( ) ( )
11 1
11
−
− −
−
−
= −
= − ⋅ ⋅
1 1 1
1 1
T Z S Z + S
Z I Z S I + Z S Z
              (31) 
( )( )
( ) ( )
11 1
2 2 2
11
2 2
−
− −
−
−
= −
= − ⋅ ⋅
T Z S Z + S
Z I Z S I + Z S Z
              (32) 
 
Lemma 5: If Z is an SPD matrix of dimension n, then there 
exists a matrix Z , which satisfies: 
T
Z Z = Z . 
 
Proof: Because Z is SPD, 
TT TZ = Q ΛQ = Q Λ ΛQ = Z Z  
> REPLACE THIS LINE WITH YOUR PAPER IDENTIFICATION NUMBER (DOUBLE-CLICK HERE TO EDIT) < 
 
11
where TQ Q = I . I is the identity matrix. 
1 2( , , , )ndiag α α α=Λ  , iα +∈ . 
1 2( , , , )ndiag α α α=Λ  .  
=Z ΛQ . 
End of proof [25].  
 
Lemma 6: If both Z and S are SPD matrix of dimension n, 
then ⋅Z S  has the same eigenvalues as ⋅ ⋅
T
Z S Z . 
 
Proof: Because S and Z are SPD, ⋅Z S  is positive-definite, 
and 
T
Z S Z  is SPD.  
Assume ⋅ ⋅ =
T TZ S Z QTQ , where TQQ = I , 
),,,( 21 ntttdiag =T , niti ,,2,1,0 => . Then, 
( )
( )
( ) ( )
⋅ = ⋅ ⋅ ⋅ ⋅
= ⋅ ⋅
= ⋅ ⋅
-1T T T
-1T TT
-1T T
Z S Z Z S Z Z
Z QTQ Z
Z Q T Z Q
 
End of proof. 
 
According to Lemma 5 and 6, we come to know that, 
⋅ ⋅
T T
1 1 1 1Z S Z = Q Λ Q   
where 1 2( , , , )ndiag λ λ λ=1Λ  , 0iλ > , 1, ,i n=  , 
=
T
1 1Q Q I . ( ) ( )⋅ -1T T1 1 1 1Z S = Z Q Λ Z Q . 
Similarly, we obtain, 
2 2 2 2⋅ ⋅
T TZ S Z = Q Λ Q  
where 2 1 2( , , , )ndiag μ μ μ=Λ  , 0iμ > , 1, ,i n=  , 
2 2 =
TQ Q I . ( ) ( )2 2 2 2⋅ -1T TZ S = Z Q Λ Z Q . 
 
Then, (31) could be decomposed as below: 
( ) ( )
( )( )
( ) ( )
( )( )
( ) ( )
( )( )( )
( )( ) ( )
( )( )( ) ( )
( )
11
1
1
11
1
1
11
1 1
1
11 , ,
1 1
n
n
diag λλλ λ
−
−
−
−
−
−
−
−
−
−
−
= − ⋅ ⋅
  
=   
−  
  
×    
= −
× +
= − +
 
−−
= ⋅ ⋅ 
+ + 
1 1 1
T T
1 1
-1T T
1 1 1
T T
1 1
-1T T
1 1 1
-1T T
1 1 1
-1T T
1 1 1
-1T T
1 1 1 1
T
1
T Z I Z S I + Z S Z
Z Q Z Q
Z
Z Q Λ Z Q
Z Q Z Q
Z
+ Z Q Λ Z Q
Z Z Q I Λ Z Q
Z Q I Λ Z Q Z
Z Z Q I Λ I Λ Z Q Z
Z Z Q Z ( )-1T 1Q Z
 
Similarly, (32) could be decomposed as: 
( )( )
( )( )( ) ( )
( ) ( )
11
2 2 2
11
2 2 2 2
1 1
2 2
1
11 , ,
1 1
n
n
diag μμ
μ μ
−
−
−
−
−
= − ⋅ ⋅
= − +
 
−−
= ⋅ ⋅ 
+ + 
-1T T
-1T T
T Z I Z S I + Z S Z
Z Z Q I Λ I Λ Z Q Z
Z Z Q Z Q Z
The spectral radius of T1T2 is defined as:                                   
( ) 1/( ) lim kk
k
ρ
→∞
=1 2 1 2T T T T  
So we first calculate ( )2 k1T T . Here A  is the spectral 
norm of the square matrix A. 
> REPLACE THIS LINE WITH YOUR PAPER IDENTIFICATION NUMBER (DOUBLE-CLICK HERE TO EDIT) < 
 
12
( )
( )( )( ) ( )
( )( )( ) ( )
( )( )
( ) ( )
( ) ( )
( )( )
( )( )
( )( )
2
11
1 1 1 1
11
2 2 2 2
1
1 1 1 1 1
1
2 2 2 2
1
1 1 1 1 1
1
2 2 2 2
1
1 1 1 11
1
2 2 2 2
k
k
k
k
−
−
−
−
−
−
−
−
−
−
−
−
−
 
− + 
=   × − +  
 
− + =  × − + 
− +
≤ ⋅ ⋅
× − +
 
⋅ − + ⋅ ≤ ⋅ 
× ⋅ − + ⋅ 
1
-1T T
-1T T
T
T
T
T
T
T
T T
Z Z Q I Λ I Λ Z Q Z
Z Z Q I Λ I Λ Z Q Z
Q I Λ I Λ Q
Z Z
Q I Λ I Λ Q
Q I Λ I Λ Q
Z Z
Q I Λ I Λ Q
Q I Λ I Λ Q
Z
Q I Λ I Λ Q
( )( )
( ) ( )
1
1 11
1
2 2
1 1
1
1
1
1 1
1
1
1
11 , ,
1 1
11 , ,
1 1
11max , ,
1 1
11max , ,
1 1
k
k
k
n
n
k
n
n
k n
n
k n
n
diag
diag
λλ
λ λ
μμ
μ μ
λλ
λ λ
μμ
μ μ
−
−
−
−
−

×
 
− + ⋅ ≤ ⋅ ⋅ 
× − +  
 
−−
= ⋅  
+ + 
 
−−
× ⋅ 
+ + 
 
−−≤ ⋅   + + 
 
−−
× ⋅  + + 
Z
I Λ I Λ
Z Z
I Λ I Λ
Z
Z
Z
Z




 
As the result, 
( ) 1/
1
1
1
1
2
( ) lim
11
max , ,
1 1
11
max , ,
1 1
( ) ( )
kk
k
n
n
n
n
ρ
λλ
λ λ
μμ
μ μ
ρ ρ
→∞
=
 
−−
≤   + + 
 
−−
×   + + 
=
1 2 1 2
1
T T T T
T T


 
Because 0iλ > , 1, ,i n=  , 
1
1
11
( ) max , , 1
1 1
n
n
λλρ λ λ
 
−−
= <  + + 1
T  ;  
Because 0iμ > , 1, ,i n=  , 
1
2
1
11
( ) max , , 1
1 1
n
n
μμρ
μ μ
 
−−
= <  + + 
T  . 
Consequently,  
( ) 1ρ <1 2T T  
So we conclude that VTM converges for 2-partite resistor 
network [45]. 
According to (27), when this algorithm is convergent, 
2
1 1 1
k k− ∞
= =u u u , then, 
( ) ( )
( ) ( )
( ) ( ) ( )
( )
1
2
1 2
2 1 1
1 1
2 2
1
k
k
−
−
−
− −
−
−
× −
+ −
+
1 1
1
1
u = W + S W S
W + S W S u
W + S W S W + S r
W + S r
 
( ) ( )( ) ( )
( ) ( ) ( ) ( )
( ) ( )( ) ( )
( ) ( )
( )( )( )
( )( )( )
( ) ( )
( )( )( )
( )( )( )
11 1
2 2 1
1 1 1
2 2
11
2 2 1
1
2 2
11
2 2
1
2 2
1
2 2
11
2 2
1
2 2
−
− −∞
− − −
−
−
−
−
−
−
−
−
−
−
 
− − − 
 × + − ⋅ 
 = − − − 
 × + − ⋅ 
 
− − + 
=  
+ −  
 × + − ⋅ 
 + −
=
× − − +
1 1
1 1
1
1
1
u = I W + S W S W + S W S
W + S W + S W S W + S r
W + S W S W + S W S
I W S W + S r
I W S W + S W
I W S W + S S
I W S W + S r
I W S W + S
I W S W + S W S
( ) ( ) ( )( )
( ) ( )( )( )
( )( ) ( )( )
( ) ( )
( ) ( )
( )( )
( )( )
1
11
2 2 2
1
2 2 2
11 1
2 2 2
111
2 2 2
111 1
2 2 2
111 1
2 2 2 2
11 1
2 2 2
2 2
−
−
−
−
−
− −
−
−
−
−
−
− −
−
−
− −
−
− −
    
 + − =  × − − + 
 = + 
 = + 
 
= +  
 
= +  
= +
1
1
1
1
1
1
r
W + S W + S W S
r
W + S W S W + S W S
W + S W S W + S W S r
W + S W S W + S W S r
I + S W S I + W S S r
S + S W S I + W S S r
S I + W S I + W S S
[ ]
1
1
2
1
−
−
−
   
= +
= ⋅
1
r
S S r
S r
 
 
Until now, we have proved that, VTM is convergent for 
2-partite resistor network when using level-one wire tearing. 
> REPLACE THIS LINE WITH YOUR PAPER IDENTIFICATION NUMBER (DOUBLE-CLICK HERE TO EDIT) < 
 
13
For the more general cases, the resistor network is k-partite, and 
the wire tearing might be level-one or multilevel. A general 
proof was presented in [62]. 
 
ACKNOWLEDGMENT 
Discussions with Hao Zhang, Yi Su, Bin Niu, Yu Wang, Wei 
Xue, Peng Zhang and Chun Xia were very helpful. Thanks are 
due to Qi Wei, Bo Zhao, Xia Wei, Xiaojian Mao, Yongpan Liu, 
Fei Qiao and Rong Luo for encouragement and support. We are 
very grateful for the public domain softwares provided by John 
R. Gilbert, Tim A. Davis, Robert Bridson, Haifeng Qian and et 
al.  
REFERENCES 
[1] S. Balay, K. Buschelman, V. Eijkhout, W. D. Gropp, D. Kaushik, M. G. 
Knepley, L. Curfman McInnes, B. F. Smith and H. Zhang, PETSc Users 
Manual, ANL-95/11 - Revision 2.1.5, Argonne National Laboratory. 
[2] R. Barrett, M. Berry, T. Chan, J. Demmel, J. Donato, J. Dongarra, V. 
Eijkhout, R. Pozo, C. Romine and H. Van der Vorst. Templates for the 
solution of Linear Systems: Building Blocks for Iterative Methods, 2nd 
Edition, SIAM, 1994. 
[3] A. Basermann, U. Jaekel, and K. Hachiy. Preconditioning parallel sparse 
iterative solvers for circuit simulation. In Proceedings of the 8th SIAM 
Proceedings on Applied Linear Algebra, Williamsburg VA, 2003. 
[4] A. Basermann, U. Jaekel, and M. Nordhausen. Parallel iterative solvers for 
sparse linear systems in circuit simulation. Fut. Gen. Fut. Gen.. Comput. 
Sys., 21(8):1275–1284, 2005. 
[5] M. Benzi. Preconditioning techniques for large linear systems: A survey. 
Journal of Computational Physics, 182:418–477, 2002. 
[6] J. Bolz, I. Farmer, E. Grinspun, and P. Schröoder. Sparse matrix solvers on 
the GPU: conjugate gradients and multigrid. In ACM SIGGRAPH 2003 
(San Diego, California, July 27 - 31, 2003).  
[7] C. W. Bomhof and H. A. Van der Vorst, A parallel linear system solver for 
circuit simulation problems, Numerical Linear Algebra with Applications, 
7 (2000). 
[8] F. H. Branin. Transient analysis of lossless transmission lines, 
Proceedings of the IEEE, vol. 55, pp. 2012–2013, Nov. 1967. 
[9] R. Bridson. A MATLAB CMEX interface to the Metis library. Available 
at http://www.stanford.edu/~rbridson/download/metismex.c 
[10] R. Bridson and W.-P. Tang. Refining an approximate inverse. Journal on 
Computational and Applied Math, 123 (2000), Numerical Analysis 2000 
vol. III: Linear Algebra, pp. 293-306. 
[11] S. Cauley, V. Balakrishnan, Cheng-Kok Koh, A parallel direct solver for 
the simulation of large-scale power/ground networks, IEEE Transactions 
on Computer-Aided Design of Integrated Circuits and Systems, v.29 n.4, 
p.636-641, April 2010. 
[12] T. H. Chen and C. C.-P Chen. Efficient large-scale power grid analysis 
based on preconditioned Krylov-subspace iterative methods. In Proc. 
IEEE/ACM DAC, pages 559--562, 2001. 
[13] R. E. Collin. Foundations for microwave engineering, 2nd edition, 
Wiley-IEEE Press, 2000. 
[14] T. A. Davis and Y. F. Hu. The University of Florida Sparse Matrix 
Collection. Submitted to ACM Transactions on Mathematical Software. 
Available at: http://www.cise.ufl.edu/research/sparse/matrices/ 
[15] J. Demmel, J. Gilbert, and X. Li. An asynchronous parallel supernodal 
algorithm for sparse gaussian elimination. SIAM J. Matrix Analysis and 
Applications, 20(4): 915–952, 1999. 
[16] Zhuo Feng, Peng Li. Multigrid on GPU: tackling power grid analysis on 
parallel SIMT platforms, Proceedings of the 2008 IEEE/ACM 
International Conference on Computer-Aided Design, November 10-13, 
2008, San Jose, California.  
[17] Zhuo Feng, Zhiyu Zeng. Parallel multigrid preconditioning on graphics 
processing units (GPUs) for robust power grid analysis. In Proceedings of 
the 47th Design Automation Conference (Anaheim, California, June 13 - 
18, 2010). DAC '10. ACM, New York, NY, 661-666.   
[18] T. L. Floyd. Principles of electric circuits, 6th edition, Prentice Hall, 1999. 
[19] David Fritzsche, Andreas Frommer, and Daniel B. Szyld,  Overlapping 
blocks by growing a partition with applications to preconditioning, 
Research Report 10-07-26, Department of Mathematics, Temple 
University, July 2010. 
[20] N. Frohlich, B. M. Riess, U. A. Wever, and Q. Zheng A New Approach for 
Parallel Simulation of VLSI Circuits on a Transsitor Level. TCAD, June 
1998.  
[21] M. J. Gander, L. Halpern, and F. Nataf. Optimized Schwarz Methods. In T. 
Chan, T. Kako, H. Kawarada, O. Pironneau (eds.), Proceedings of the 
Twelveth International Conference on Domain Decomposition, DDM 
press, 2001, pp. 15–27. 
[22] M. J. Gander, Schwarz Methods in the Course of Time, Electronic 
Transactions on Numerical Analysis, 31:228–255, 2008. 
[23] A. George and J. W. Liu. Computer Solution of Large Sparse Positive 
Definite Systems. Prentice-Hall, Englewood Cliffs, New Jersey, 1981. 
[24] John R. Gilbert, Gary L. Miller, and Shang-Hua Teng. Geometric mesh 
partitioning: Implementation and experiments. SIAM J. Scientific 
Computing 19:2091-2110, 1998. 
[25] G. H. Golub and C. F. Van Loan, Matrix computations. Johns Hopkins 
University Press, 1989. 
[26] Wei Huang. HotSpot: A chip and package compact thermal modeling 
methodology for VLSI design. Ph.D. Dissertation, University of Virginia, 
2007. 
[27] Russell Kao. Piecewise Linear Models for Switch-Level Simulation. 
Chapter 5.6.1, Node and Branch Tearing. Technical Report, 
CSL-TR-92-532, Stanford University, 1992. 
[28] G. Karypis and V. Kumar. Metis 4.0: Unstructured graph partitioning and 
sparse matrix ordering system, Technical report, Department of Computer 
Science, University of Minnesota, 1998. Available at 
http://www.cs.umn.edu/metis 
[29] D. P. Koester. Parallel Block-Diagonal-Bordered Sparse Linear Solvers 
for Power Systems Applications. Ph.D dissertation, Syracuse University. 
1995. 
[30] J. N. Kozhaya and S. R. Nassif. Fast Power Grid Simulation. In 
Proceedings of the 37th Design Automation Conference, 2000. 
[31] J. N. Kozhaya, S. R. Nassif, and F. N. Najm, A multigrid-like technique for 
power grid analysis, IEEE Trans. Computer-aided Design, vol. 21, pp. 
1148--1160, Oct. 2002.  
[32] Zhao Li, C.-J. Richard Shi. A coupled iterative/direct method for efficient 
time-domain simulation of nonlinear circuits with power/ground networks. 
ISCAS (5) 2004: 165-168. 
[33] Xiaoye S. Li. An overview of SuperLU: Algorithms, implementation, and 
user interface. ACM Trans. Math. Softw. 2005. 
[34] Peng Li. What Is Parallel Circuit Simulation? ACM/SIGDA 
E-NEWSLETTER, vol. 40, No. 4, April 1, 2010 
[35] P. L. Lions, On the Schwarz alternating method III: a variant for 
nonoverlapping subdomains, Third International Symposium on Domain 
Decomposition Methods for Partial Differential Equations, 1989, Houston, 
Texas. 
[36] Sébastien Loisel and Daniel B. Szyld,  On the convergence of Optimized 
Schwarz Methods by way of Matrix Analysis , Domain Decomposition 
Methods in Science and Engineering XVIII, Michel Bercovier, Martin 
Gander, Ralf Kornhuber, and Olof B. Widlund, editors. Lecture Notes in 
Computational Science and Engineering, Vol. 70, Springer, 2009, pages 
363-370. 
> REPLACE THIS LINE WITH YOUR PAPER IDENTIFICATION NUMBER (DOUBLE-CLICK HERE TO EDIT) < 
 
14
[37] J. D. Meindl. Beyond Moore's Law: The Interconnect Era. Computing in 
Science and Engineering. 2003.  
[38] L. Nagel. SPICE2: A Computer Program to Simulate Semiconductor 
Circuits, Electronics Research Laboratory Report No. ERL-M520. 
University of California, Berkeley, 1975. 
[39] Jan Ogrodzki. Circuit simulation methods and algorithms. CRC Press, 
1994. 
[40] H. J. Pain. The physics of vibrations and waves, Wiley, 1976. 
[41] He Peng, Chung-Kuan Cheng. Parallel transistor level full-chip circuit 
simulation. DATE 2009: 304-307. 
[42] H. Qian, S. R. Nassif, and S. S. Sapatnekar, Power grid analysis using 
random walks, IEEE Trans. Computer-aided Design, vol. 24, pp. 
1204--1224, Aug. 2005.  
[43] H. Qian and S. S. Sapatnekar, Stochastic Preconditioning for Diagonally 
Dominant Matrices, SIAM Journal on Scientific Computing, Vol. 30, No. 
3, pp. 1178 – 1204, March, 2008.  
[44] T. L. Quarles. Analysis of Performance and Convergence Issues for 
Circuit Simulation. ERL Memo No. UCB/ERL M89/42 April 1989. 
[45] Y. Saad. Iterative Methods for Sparse Linear Systems. The PWS 
Publishing Company, Boston, 1996. Second edition, SIAM, Philadelphia, 
2003. 
[46] R. A. Saleh, K. A. Gallivan, M. C. Chang, et al. Parallel circuit simulation 
on supercomputers. Proceedings of the IEEE, 1989. 
[47] Alberto Sangiovanni-Vincentelli, Li-Kuan Chen, and Leon O. Chua. A 
new tearing approach – node-tearing nodal analysis. In IEEE International 
Symposium on Circuits and Systems, 1977. 
[48] Jin Shi, Yici Cai, Wenting Hou, Liwei Ma, Sheldon X.-D. Tan, Pei-Hsin 
Ho, Xiaoyi Wang. GPU friendly fast Poisson solver for structured power 
grid network analysis, Proceedings of the 46th Annual Design Automation 
Conference, July 26-31, 2009, San Francisco, California. 
[49] Ken Stanley, T. A. Davis. KLU: a "Clark Kent" sparse LU factorization 
algorithm for circuit matrices. 2004 SIAM Conference on Parallel 
Processing for Scientific Computing (PP04). Originally appeared in NA 
Digest, 1997. 
[50] Heidi K. Thornquist, Eric R. Keiter, Robert J. Hoekstra, David M. Day, 
Erik G. Boman, A parallel preconditioning strategy for efficient 
transistor-level circuit simulation, Proceedings of the 2009 International 
Conference on Computer-Aided Design, November 02-05, 2009, San Jose, 
California. 
[51] John R. Gilbert. Meshpart, a public domain matlab toolbox for sparse 
matrix partitioning. Available at  
http://www.cerfacs.fr/algor/Softs/MESHPART/ 
[52] CircuitSim90, 1990 Circuit Simulation and Modeling Workshop at 
MCNC, Available at http://www.cbl.ncsu.edu/CBL_Docs/csim90.html 
[53] Mathworks Corparation. Matlab User Manual. R13. 2002. 
[54] Mathworks Corparation. Simulink User Manual. R13. 2002. 
[55] A. Toselli and O. Widlund. Domain Decomposition Methods – 
Algorithms and Theory. Springer Series in Computational, Mathematics 
34, Springer, Berlin, Heidelberg, 2005. 
[56] R. S. Varga. Matrix Iterative Analysis. Prentice-Hall, Englewood Cliffs, 
New Jersey, 1962. Second Edition, Springer Series in Computational 
Mathematics 27, Springer, Berlin, Heidelberg, New York, 2000. 
[57] W. T. Weeks, A. J. Jimenez, G. W. Mahoney, and D. Mehta. Algorithms 
for ASTAP - A Network-Analysis Program. IEEE Trans.Circuit Theory, 
CT-20(4):628-634, 1973. 
[58] Fei Wei, Huazhong Yang. Directed Transmission Method, A Fully 
Asynchronous Approach to Solve Sparse Linear Systems in Parallel. In 
ACM Proceedings of the 20th Symposium on Parallelism in Algorithms 
and Architectures (Munich, Germany, June 14 - 16, 2008). SPAA 2008. 
[59] Fei Wei, Huazhong Yang, Virtual Transmission Method, A New 
Distributed Algorithm to Solve Sparse Linear Systems. The Fourth 
International Conference on Networked Computing and Advanced 
Information Management, vol. 1, pp.703-709, 2008. 
[60] Fei Wei, Huazhong Yang. Waveform Transmission Method, A New 
Waveform Relaxation Based Algorithm to Solve Ordinary Differential 
Equations. Preprinted. 2009. 
[61] Fei Wei, Huazhong Yang. Transmission Line Inspires a New Distributed 
Algorithm to Solve the Nonlinear Dynamical System of Physical Circuits. 
The 5th International Conference on Computer Sciences and Convergence 
Information Technology, Nov. 30, Seoul, Korea, 2010. 
[62] Fei Wei, Huazhong Yang. Virtual Transmission Method, Algorithm and 
Theory. Preprinted. 2009. 
[63] J. White, A. Sangiovanni-Vincentelli, Relaxation Techniques for the 
simulation of VLSI circuits. Kluwer Academic Publishers, 1986. 
[64] Felix F. Wu. Solution of large-scale networks by tearing. IEEE 
Transactions on Circuits and Systems, 1976. 
[65] Wei Xue, Jiwu Shu, Yongwei Wu, Weimin Zheng. Parallel Algorithm and 
Implementation for Realtime Dynamical Simulation of Power System. 
ICPP 2005: 137-144.  
[66] D. M. Young. Iterative Solution of Large Linear Systems. Academic Press, 
New York, 1971.  
[67] Yu Zhong, M. D. F. Wong, Fast algorithms for IR drop analysis in large 
power grid, Proceedings of the 2005 IEEE/ACM International conference 
on Computer-aided design, p.351-357, November 06-10, 2005. 
[68] Quming Zhou, Kai Sun, Kartik Mohanram, Danny C. Sorensen. Large 
power grid analysis using domain decomposition, Proceedings of the 
conference on Design, automation and test in Europe: Proceedings, March 
06-10, 2006, Munich, Germany. 
[69] C. Zhuo, J. Hu, M. Zhao, and K. Chen. Power grid analysis and 
optimization using algebraic multigrid. IEEE Transactions on 
Computer-aided Design, 27:738–751, April 2008. 
 
Fei Wei received the B.S. degree in electronic engineering from Tsinghua 
University, Beijing, China in 2004. He is currently a Ph.D student in 
Department of Electronic Engineering, Tsinghua University. His research 
interests are distributed numerical algorithms and transistor-level circuit 
simulation. 
 
Huazhong Yang (M’97–SM’00) received the B.S. degree in microelectronics 
and the M.S. and Ph.D. degrees in circuits and systems from Tsinghua 
University, Beijing, China, in 1989, 1993, and 1998, respectively. Since 1993, 
he has been with the Department of Electronic Engineering, Tsinghua 
University, where he has been a Full Professor since 1998. His research 
interests include CMOS radio-frequency integrated circuits, VLSI system 
structure for digital communications and media processing, low-voltage and 
low power integrated circuits, and computer-aided design methodologies for 
system integration. He has authored or coauthored 6 books and more than 180 
journal and conference papers in this area and holds 9 China patents. He is also 
a coeditor of the research monograph High-speed Optical 
Transceivers-Integrated Circuits Designs and Optical Devices Techniques 
(World Scientific, 2006). 
Dr. Yang was a recipient of the fund for Distinguished Young Scholars from 
NSFC in 2000, the outstanding researcher award of the National Keystone 
Basic Research Program of China in 2004, and the Special Government 
Allowance from the State Council of China in 2006.served as a TPC member of 
the Asia-Pacific Conference on Circuits and Systems, the International 
Conference on Communications, Circuits and Systems, and the Asia and South 
Pacific Design Automation Conference. He is an Associate Editor of the 
International Journal of Electronics. 
 
