E-BLOW: E-Beam Lithography Overlapping aware Stencil Planning for MCC
  System by Yu, Bei et al.
E-BLOW: E-Beam Lithography Overlapping aware Stencil
Planning for MCC System
Bei Yu, Kun Yuan†, Jhih-Rong Gao, David Z. Pan
ECE Department, Univ. of Texas at Austin, Austin, TX, USA
†Cadence Design Systems, Inc., San Jose, CA, USA
{bei, jrgao, dpan}@cerc.utexas.edu
ABSTRACT
Electron beam lithography (EBL) is a promising maskless
solution for technology beyond the 14nm logic node. To
overcome its throughput limitation, the traditional EBL system
has recently been extended into the multi-column cell (MCC) system.
In this paper, we present E-BLOW, a tool that solves the overlapping
aware stencil planning (OSP) problem in MCC systems. E-BLOW
integrates several novel speedup techniques, namely successive
relaxation, dynamic programming, and KD-Tree based clustering, to
achieve good runtime and solution quality. Experimental results
show that, compared with previous works, E-BLOW demonstrates
better performance for both the conventional EBL system and the
MCC system.
Categories and Subject Descriptors
B.7.2 [Hardware, Integrated Circuit]: Design Aids
General Terms
Algorithms, Design, Performance
Keywords
Electron Beam Lithography (EBL), Overlapping aware Stencil
Planning (OSP), Multi-Column Cell (MCC) System
1. INTRODUCTION
As the minimum feature size continues to scale to sub-22nm, the
conventional 193nm optical photolithography technology is facing
great challenges in manufacturing [1]. In the near term,
double/multiple patterning lithography (DPL/MPL) has become one of
the viable lithography techniques for the 22nm and 14nm logic
nodes [2–4]. In the longer term, i.e., for logic nodes beyond 14nm,
extreme ultraviolet (EUV) and electron beam lithography (EBL) are
promising candidates for lithographic processes. However, EUV has
been delayed by tremendous technical barriers such as the lack of
power sources, resists, and defect-free masks [5].
DAC’13, May 29–June 07 2013, Austin, TX, USA.
Copyright 2013 ACM 978-1-4503-2071-9/13/05 ...$15.00.
Figure 1: Printing process of MCC system. (Shown: electron guns, shaping apertures, stencils, and the four wafer regions w1, w2, w3, w4.)
The EBL system, on the other hand, has been developed for
several decades [6]. One of the conventional EBL systems is based
on character projection (CP) mode. Some complex shapes, called
characters, are prepared on the stencil. The key idea is that if a
pattern is pre-designed on the stencil, it can be printed in one
electron beam shot; otherwise it needs to be fractured into a set
of rectangles and printed one by one through variable shaped beam
(VSB). Compared with pure VSB mode, CP mode can improve the
throughput significantly. Compared with traditional lithographic
methodologies, EBL has several advantages. (1) An electron beam,
being a charged particle beam, can easily be focused to a nanometer
diameter, which avoids the diffraction limit of light. (2) The
price of a photomask set is becoming unaffordable. As a maskless
technology, EBL can reduce the manufacturing cost. (3) EBL allows
great flexibility for fast turnaround times and even late design
modifications to correct or adapt a given chip layout. Because of
all these advantages, EBL is being used in mask making, small
volume LSI production, and R&D to develop the technology nodes
ahead of mass production.
Even with decades of development, the key limitation of the EBL
system has been and still is its low throughput. Recently, the
multi-column cell (MCC) system was proposed as an extension of the
conventional EBL system, where several independent character
projections (CPs) are used to further speed up the writing
process [7, 8]. Each CP is applied to one section of the wafer, and
all CPs can work in parallel to achieve better throughput. Due to
design complexity and cost considerations, different CPs share one
stencil design [9]. One example of the MCC printing process is
illustrated in Fig. 1, where four CPs are bundled to form an MCC
system. The whole wafer is divided into four regions, w1, w2, w3
and w4, and each region is printed through one CP. Note that the
overall writing time of the MCC system is determined by the maximum
writing time of the four regions. For modern designs, because of
the numerous distinct circuit patterns, only a limited number of
patterns can be placed on the stencil. Therefore, the area
constraint of the stencil is the bottleneck. To improve the
throughput, the stencil should be carefully designed and
manufactured to contain the most frequently repeated cells or
patterns.
arXiv:1402.2435v1 [cs.AR] 11 Feb 2014
Figure 2: (a) 1D-OSP. (b) 2D-OSP. (Both panels show characters A to F placed on a stencil.)
Many previous works focus on design optimization for the
conventional EBL system [10–14]. Stencil planning, which is one of
the most challenging problems, has attracted much attention. When
blank overlapping is not considered, stencil planning can be
formulated as a character selection problem, where integer linear
programming (ILP) was applied to select a group of characters for
throughput maximization [10]. Recently, Yuan et al. [12]
investigated the overlapping aware stencil planning (OSP) problem.
However, no existing stencil planning work has been done for the
MCC system. Compared with the conventional EBL system, the MCC
system introduces two main challenges. First, the objective is new:
in an MCC system the wafer is divided into several regions, and
each region is written by one CP. Therefore the new OSP should
minimize the maximal writing time over all regions, whereas in the
conventional EBL system the objective is simply to minimize the
wafer writing time. Second, the stencil for an MCC system can
contain more than 4000 characters, so previous methodologies for
the EBL system may suffer from long runtime.
The OSP problem can be divided into two sub-problems: 1D-OSP and
2D-OSP [12]. When standard cells with the same height are selected
into the stencil, the problem is referred to as 1D-OSP. As shown in
Fig. 2(a), each character implements one standard cell, and the
enclosed circuit patterns of all the characters have the same
height. Note that here we only show the horizontal blanks; the
vertical blanks are not represented because they are identical. In
2D-OSP, the blanking spaces of characters are non-uniform along
both the horizontal and vertical directions. In this way, the
stencil can contain both complex via patterns and regular wires.
Fig. 2(b) illustrates a stencil design example for 2D-OSP.
This paper presents E-BLOW, the first study of the OSP problem in
MCC systems. E-BLOW integrates several novel techniques to achieve
near-optimal solutions in reasonable time. The main contributions
of this paper are as follows: (1) We present the first study of the
stencil planning problem in MCC systems. (2) We show that the OSP
problems for both EBL and MCC systems are NP-hard. (3) We propose a
simplified formulation for 1D-OSP and prove its rounding lower
bound theoretically. (4) We present a successive relaxation
algorithm to find a near-optimal solution. (5) We develop a KD-Tree
based clustering algorithm for speedup in 2D-OSP.
The remainder of this paper is organized as follows. Section 2
provides the problem formulation. Section 3 presents the
algorithmic details for the 1D-OSP problem in E-BLOW, while
Section 4 details the E-BLOW solution to the 2D-OSP problem.
Section 5 reports experimental results, followed by the conclusion
in Section 6.
2. PROBLEM FORMULATION
2.1 OSP Problem Formulation
In an MCC system with $P$ CPs, the whole wafer is divided into $P$
regions $\{w_1, w_2, \dots, w_P\}$, and each region is written by
one particular CP. We assume cell extraction [15] has been resolved
first. In other words, a set of character candidates
$C^C = \{c_1, \dots, c_n\}$ has already been given to the MCC
system. For each character candidate $c_i \in C^C$, its writing
time through VSB mode is denoted as $n_i$, while its writing time
through CP mode is 1.
The regions of the wafer have different layout patterns, and their
throughputs also differ. Suppose character candidate $c_i$ repeats
$t_{ic}$ times on region $w_c$. Let $a_i$ indicate the selection of
character candidate $c_i$ as follows.
$$
a_i =
\begin{cases}
1, & \text{candidate } c_i \text{ is selected on stencil} \\
0, & \text{otherwise}
\end{cases}
$$
If $c_i$ is prepared on the stencil, the total writing time of
pattern $c_i$ on region $w_c$ is $t_{ic} \cdot 1$. Otherwise, $c_i$
must be printed through VSB. Since region $w_c$ contains $t_{ic}$
instances of candidate $c_i$, the writing time would be
$t_{ic} \cdot n_i$. Therefore, for region $w_c$ the total writing
time $T_c$ is as follows:
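$$
\begin{aligned}
T_c &= \sum_{i=1}^{n} a_i \cdot (t_{ic} \cdot 1) + \sum_{i=1}^{n} (1 - a_i) \cdot (t_{ic} \cdot n_i) \\
    &= \sum_{i=1}^{n} t_{ic} \cdot n_i - \sum_{i=1}^{n} t_{ic} \cdot (n_i - 1) \cdot a_i
     = T^{VSB}_c - \sum_{i=1}^{n} R_{ic} \cdot a_i
\end{aligned}
$$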
where $T^{VSB}_c = \sum_{i=1}^{n} t_{ic} \cdot n_i$ and
$R_{ic} = t_{ic} \cdot (n_i - 1)$. $T^{VSB}_c$ represents the
writing time on $w_c$ when only VSB is applied, and $R_{ic}$ can be
viewed as the writing time reduction of candidate $c_i$ on region
$w_c$. In an MCC system, both $T^{VSB}_c$ and $R_{ic}$ are
constants. Therefore, the total writing time of the MCC system is
formulated as follows:
$$
T_{total} = \max\{T_c\} = \max\Big\{T^{VSB}_c - \sum_{i=1}^{n} R_{ic} \cdot a_i\Big\}, \quad \forall c \in P \tag{1}
$$
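The writing-time model of (1) can be checked numerically with a few lines of Python. This is only a minimal sketch: the function names and the toy numbers are made up for illustration and are not part of E-BLOW.

```python
from typing import List


def region_writing_times(t: List[List[int]], n: List[int], a: List[int]) -> List[int]:
    """Return [T_1, ..., T_P] with T_c = T_c^VSB - sum_i R_ic * a_i.

    t[i][c]: number of times candidate c_i repeats on region w_c
    n[i]   : VSB shot count of candidate c_i
    a[i]   : 1 if c_i is selected on the stencil, 0 otherwise
    """
    num_regions = len(t[0])
    times = []
    for c in range(num_regions):
        # Writing time of region c when everything is printed through VSB.
        t_vsb = sum(t[i][c] * n[i] for i in range(len(n)))
        # Reduction R_ic = t_ic * (n_i - 1), counted only for selected characters.
        reduction = sum(t[i][c] * (n[i] - 1) * a[i] for i in range(len(n)))
        times.append(t_vsb - reduction)
    return times


def total_writing_time(t: List[List[int]], n: List[int], a: List[int]) -> int:
    """T_total is the maximum writing time over all regions, as in (1)."""
    return max(region_writing_times(t, n, a))


# Toy instance (hypothetical numbers): 3 candidates, 2 regions.
n = [4, 6, 3]                    # VSB shots per candidate
t = [[10, 2], [1, 8], [5, 5]]    # t[i][c]
a = [1, 0, 1]                    # candidates 0 and 2 are on the stencil
times = region_writing_times(t, n, a)
total = total_writing_time(t, n, a)
print(times, total)
```

In this toy instance the second region dominates, so selecting a character that mostly appears in the first region would not reduce the total writing time. This is the min-max behavior that distinguishes the MCC objective from the conventional EBL objective.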
Problem 1. Overlapping aware Stencil Planning (OSP) for MCC system:
Given a set of character candidates $C^C$, select a subset $C^{CP}$
of $C^C$ as characters, and place them on the stencil. The
objective is to minimize the total writing time $T_{total}$
expressed by (1), while the placement of $C^{CP}$ is bounded by the
outline of the stencil. The width and height of the stencil are $W$
and $H$, respectively.
For convenience, we use the term OSP to refer to OSP for the MCC
system in the rest of this paper.
2.2 NP-Hardness
Lemma 1. 1D-OSP problem is NP-hard.
Let us consider a special and simpler case of 1D-OSP, where each
candidate $c_i$ has zero blank space and the CP number is 1. Then
the problem can be reduced from the multiple knapsack problem,
which is a well known NP-hard problem [16].
Lemma 2. 2D-OSP problem is NP-hard.
Let us consider a special case of 2D-OSP, where each can-
didate ci has zero blank space, and CP number is 1. The
2D-OSP problem includes two subproblems: candidate se-
lection and candidate packing. After some candidates are
selected on the stencil, the candidates packing problem can
be reduced from a strip packing problem [17], which is NP-
hard.
Combining Lemma 1 and Lemma 2, we conclude that the OSP problem,
even for the conventional EBL system, is NP-hard.
3. E-BLOW FOR 1D-OSP
When each character implements one standard cell, the enclosed
circuit patterns of all the characters have the same height. The
corresponding OSP problem is called 1D-OSP, which can be viewed as
a combination of character selection and single row ordering
problems [12]. Unlike the two heuristic steps proposed in [12], we
show that the two problems can be solved simultaneously through ILP
formulation (2).
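\begin{align}
\min \quad & T_{total} \tag{2} \\
\text{s.t.} \quad & T_{total} \ge T^{VSB}_c - \sum_{i=1}^{n} \Big( \sum_{k=1}^{M} R_{ic} \cdot a_{ik} \Big), \quad \forall c \in P \tag{2a} \\
& x_i + w_i \le W, \quad \forall i \in N \tag{2b} \\
& \sum_{k=1}^{M} a_{ik} \le 1, \quad \forall i \in N \tag{2c} \\
& x_i + w_{ij} - x_j \le W (2 + p_{ij} - a_{ik} - a_{jk}) \tag{2d} \\
& x_j + w_{ji} - x_i \le W (3 - p_{ij} - a_{ik} - a_{jk}) \tag{2e} \\
& a_{ik}, a_{jk}, p_{ij}: \text{0-1 variables} \tag{2f}
\end{align}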
In (2), $W$ is the width constraint of the stencil, $M$ is the
number of rows, $w_i$ is the width of character $c_i$, and $x_i$ is
the x-position of $c_i$. $a_{ik} = 1$ if and only if $c_i$ is
assigned to the $k$th row. In other words, $a_{ik}$ determines the
y-position of $c_i$. $w_{ij} = w_i - oh_{ij}$ and
$w_{ji} = w_j - oh_{ji}$, where $oh_{ij}$ is the overlap when
candidates $c_i$ and $c_j$ are packed together. Constraints (2d)
and (2e) check the relative position of $c_i$ and $c_j$. For the
$k$th row, it is easy to see that only when $a_{ik} = a_{jk} = 1$,
i.e., both $c_i$ and $c_j$ are assigned to row $k$, is exactly one
of the two constraints (2d) and (2e) active. If either character is
not assigned to the row, neither constraint is active. The number
of variables in (2) is $O(N^2)$, where $N$ is the number of
character candidates.
Since ILP is a well known NP-hard problem, solving it directly may
suffer from long runtime. One straightforward speedup method is to
relax ILP (2) into a linear program (LP) by replacing constraint
(2f) with $0 \le a_{ik}, a_{jk}, p_{ij} \le 1$. It is obvious that
the LP solution provides a lower bound on the ILP solution.
However, we observe that the solution of the relaxed LP looks like
this: for each $i$, $\sum_k a_{ik} = 1$, and all the $p_{ij}$ are
assigned 0.5. Although the objective function is minimized and all
the constraints are satisfied, this LP relaxation provides no
useful information to guide future rounding, i.e., all the
character candidates are selected and no ordering relationship is
determined.
Figure 3: E-BLOW overall flow for 1D-OSP. (Flowchart: regions info and characters info as inputs; apply S-Blank assumption; simplified LP formulation; successive rounding, which loops over solving and updating the LP until finished; refinement; output 1D-stencil.)
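A short calculation illustrates why such a fractional point is feasible yet uninformative. It is only a sketch under assumptions that the text does not state explicitly: $M \ge 2$, $x_j \ge 0$, $w_{ij} \le w_i$, and the particular fractional choice $a_{ik} = 1/M$.

```latex
% Assumed fractional point: a_{ik} = 1/M for every i, k, and p_{ij} = 1/2.
\begin{align*}
\sum_{k=1}^{M} a_{ik} &= M \cdot \tfrac{1}{M} = 1
  && \text{so (2c) holds and every candidate counts as fully selected,} \\
T^{VSB}_c - \sum_{i=1}^{n} \sum_{k=1}^{M} R_{ic}\, a_{ik}
  &= T^{VSB}_c - \sum_{i=1}^{n} R_{ic}
  && \text{the smallest possible right-hand side of (2a),} \\
W\big(2 + \tfrac{1}{2} - \tfrac{2}{M}\big) &\ge \tfrac{3}{2} W
  && \text{right-hand side of (2d); (2e) gives the same value,} \\
x_i + w_{ij} - x_j &\le x_i + w_i \le W < \tfrac{3}{2} W
  && \text{by (2b), so (2d) and (2e) are slack for any } x .
\end{align*}
```

Under these assumptions the ordering constraints never bind, so the relaxed LP carries no information about which characters to drop or how to order them.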
To overcome the limitation of the above rounding, E-BLOW proposes a
novel iterative solving framework to search for a near-optimal
solution in reasonable runtime. The main idea is to modify the ILP
formulation so that the corresponding LP relaxation provides a good
lower bound theoretically. As shown in Fig. 3, the overall flow
includes three parts: simplified ILP formulation, successive
rounding, and refinement. Section 3.2 discusses the simplified
formulation and its LP rounding lower bound. Function
SuccRounding() is the successive rounding method, which is
introduced in Section 3.3. Finally, Section 3.4 proposes a dynamic
programming based refinement.
3.1 Symmetrical Blank (S-Blank) Assumption
Our simplified formulation is based on a symmetrical blank
assumption: the blanks of each character are symmetric, i.e., the
left slack equals the right slack. Note that for different
characters $c_i$ and $c_j$, their slacks $s_i$ and $s_j$ can be
different.
At first glance this assumption may lose optimality; however, it
provides several practical and theoretical benefits. (1) Single row
ordering was transformed in [12] into a Hamiltonian cycle problem,
which is a well known NP-hard problem for which even a dedicated
solver is quite expensive. Under the assumption, this ordering
problem can be optimally solved in O(n). (2) The ILP formulation
can be simplified to provide a reasonable rounding bound
theoretically. Compared with the previous heuristic framework [12],
the proved rounding bound provides a better guideline for a search
with a global view. (3) To compensate for the inaccuracy in
asymmetrical blank cases, E-BLOW provides a refinement step to
further improve the throughput.
Given $p$ character candidates, the single row ordering problem
adjusts their relative locations to minimize the total width. Under
the symmetrical blank assumption, this problem can be optimally
solved by a two-step greedy approach. First, all characters are
sorted in decreasing order of blanking space $s_i$; second, they
are inserted one by one. Each character can be inserted at either
the left end or the right end.
Theorem 1: Under the S-Blank assumption, the greedy approach
achieves the maximum overlapping space $\sum_i s_i - \max\{s_i\}$.
In practice, we set $s_i = \lceil (sl_i + sr_i)/2 \rceil$, where
$sl_i$ and $sr_i$ are $c_i$'s left slack and right slack,
respectively.
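The two-step greedy ordering can be sketched in Python as follows. The sketch assumes that two adjacent characters overlap by the smaller of their facing blanks (the same rule as the min() terms in Algorithm 2); the function names and toy data are hypothetical.

```python
from collections import deque
from itertools import permutations


def greedy_row_order(chars):
    """chars: list of (width, slack) pairs with symmetric blanks.

    Returns (order, row_width) produced by the two-step greedy approach.
    """
    # Step 1: sort by blanking space, largest first.
    ordered = sorted(chars, key=lambda c: c[1], reverse=True)
    row = deque()
    width = 0
    for k, (w, s) in enumerate(ordered):
        if k == 0:
            width = w
        else:
            # Every character already in the row has slack >= s, so the new
            # character overlaps by exactly s at whichever end it joins.
            width += w - s
        # Step 2: insert at either end; here we simply alternate.
        if k % 2 == 0:
            row.append((w, s))
        else:
            row.appendleft((w, s))
    return list(row), width


def row_width(order):
    """Width of a row for an arbitrary left-to-right order."""
    width = order[0][0]
    for (_, s_prev), (w, s) in zip(order, order[1:]):
        width += w - min(s_prev, s)
    return width


chars = [(10, 2), (8, 3), (6, 1)]          # (width, slack), made-up values
order, width = greedy_row_order(chars)
best = min(row_width(list(p)) for p in permutations(chars))
print(order, width, best)
```

On this small instance the greedy width equals the brute-force optimum, and the saved space is 2 + 3 + 1 - 3 = 3, as Theorem 1 predicts.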
3.2 Simplified Formulation
To further simplify (2), we modify the objective function by
assigning each character $c_i$ a profit value $profit_i$. Then,
based on Theorem 1, formulation (2) can be simplified as follows:
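\begin{align}
\max \quad & \sum_i \sum_j a_{ij} \cdot profit_i \tag{3} \\
\text{s.t.} \quad & \sum_i (w_i - s_i) \cdot a_{ij} \le W - B_j, \quad \forall j \tag{3a} \\
& B_j \ge s_i \cdot a_{ij}, \quad \forall i \tag{3b} \\
& \sum_j a_{ij} \le 1, \quad \forall c_i \in C^C \tag{3c} \\
& a_{ij} = 0 \text{ or } 1 \tag{3d}
\end{align}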
(3a) and (3b) are based on Theorem 1 to calculate the row width,
where (3b) linearizes the max operation. Here $B_j$ can be viewed
as the maximum blank space of all the characters on row $r_j$. (3c)
means each character can be assigned to at most one row. It is easy
to see that the number of variables is $O(nm)$. Generally speaking,
the number of characters $n$ is much larger than the number of rows
$m$, so compared with the basic ILP formulation (2), the number of
variables in (3) is reduced dramatically.
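The step from Theorem 1 to constraints (3a) and (3b) can be written out explicitly. The following is a sketch under the S-Blank assumption; $\mathrm{width}(r_j)$ is a name introduced here for the packed width of a non-empty row $r_j$ and does not appear in the original formulation.

```latex
\begin{align*}
\mathrm{width}(r_j)
  &= \sum_i w_i\, a_{ij} - \Big( \sum_i s_i\, a_{ij} - \max_i \{ s_i\, a_{ij} \} \Big)
     && \text{total width minus the overlap of Theorem 1} \\
  &= \sum_i (w_i - s_i)\, a_{ij} + \max_i \{ s_i\, a_{ij} \} \\
  &\le \sum_i (w_i - s_i)\, a_{ij} + B_j
     && \text{since } B_j \ge s_i\, a_{ij} \text{ for all } i \text{ by (3b)} \\
  &\le W
     && \text{by (3a).}
\end{align*}
```

So any assignment that satisfies (3a) and (3b) yields rows that fit within the stencil width once the greedy ordering is applied.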
Furthermore, the simplified formulation (3) can theoretically
achieve a reasonable LP rounding lower bound. To explain this, let
us first look at a similar program (3′):
\begin{align}
\max \quad & \sum_i \sum_j (w_i - s_i) \cdot a_{ij} \cdot ratio_i \tag{3′} \\
\text{s.t.} \quad & \sum_i (w_i - s_i) \cdot a_{ij} \le W - max_s \tag{3a′} \\
& (3c) - (3d)
\end{align}
where $ratio_i = profit_i/(w_i - s_i)$, and $max_s$ is the maximum
horizontal slack length over all characters, i.e.,
$max_s = \max\{s_i \mid i = 1, 2, \dots, n\}$. Program (3′) is a
well known multiple knapsack problem [16].
Lemma 3. If all $ratio_i$ are the same, the multiple knapsack
problem (3′) admits a 1/2-approximation algorithm using the LP
rounding method.
For brevity we omit the proof; detailed explanations can be found
in [18]. It should be noted that if all $ratio_i$ are the same,
program (3′) can be approximated as a max-flow problem. Based on
Lemma 3, if we denote $\alpha = \max\{ratio_i\}/\min\{ratio_i\}$,
we obtain the following theorem:
Theorem 2: The LP rounding solution of (3) can be a
$0.5/\alpha$-approximation to program (3′).
Due to space limits, the detailed proof is omitted. The only
difference between (3) and (3′) is the right-hand side values of
(3a) and (3a′). Since the blank spacing is relatively small
compared with the row length, we have
$W - max_s \approx W - B_j$. Then, based on Theorem 2, we can
conclude that program (3) has a reasonable rounding bound.
3.3 Successive Relaxation
Because of the reasonable LP rounding property shown in Theorem 2,
we propose a successive relaxation algorithm to solve program (3)
iteratively. The ILP formulation (3) becomes an LP if we relax the
discrete constraint (3d) to the continuous constraint
$0 \le a_{ij} \le 1$. The successive relaxation algorithm is shown
in Algorithm 1. At first we set all $a_{ij}$ as variables, since no
$a_{ij}$ has been fixed to a row yet. The LP is updated and solved
iteratively. For each new LP solution, we search for the maximal
$a_{pq}$ (line 6). Then, for all $a_{ij}$ that are close to the
maximal value $a_{pq}$, we try to pack $c_i$ into row $r_j$ and set
$a_{ij}$ as a non-variable. Note that since several $a_{ij}$ are
assigned permanent values, the number of variables in the updated
LP formulation keeps decreasing. This procedure repeats until no
appropriate $a_{ij}$ can be found.
Algorithm 1 SuccRounding( th_inv )
Input: ILP Formulation (3)
1: set all a_ij to variables;
2: repeat
3:   update profit_i for all variables a_ij;
4:   solve relaxed LP of (3);
5:   repeat
6:     find a_pq = max{a_ij, and c_i can insert into row r_j};
7:     for all a_ij ≥ a_pq × th_inv do
8:       if c_i can be assigned to row r_j then
9:         a_ij = 1 and set it to a non-variable;
10:        Update capacity of row r_j;
11:      end if
12:    end for
13:  until cannot find a_pq
14: until
One key step of Algorithm 1 is the $profit_i$ update (line 3). For
each character $c_i$, we set its $profit_i$ as follows:
$$
profit_i = \sum_c \frac{t_c}{t_{max}} \cdot (n_i - 1) \cdot t_{ic} \tag{4}
$$
where $t_c$ is the current writing time of region $w_c$, and
$t_{max} = \max\{t_c, \forall c \in P\}$. By applying $profit_i$, a
region $w_c$ with a longer writing time is given more weight in the
LP formulation. During successive relaxation, as long as $c_i$ has
not been assigned to any row, $profit_i$ continues to be updated,
so that the total writing time of the whole MCC system can be
minimized.
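The profit update of (4) and the inner rounding loop of Algorithm 1 (lines 5 to 13) can be sketched as below. This is an illustrative sketch only: the LP solve itself is left out, the fractional values in `frac` are made up, skipping zero-valued $a_{ij}$ is an assumption of this sketch, and all function names are hypothetical.

```python
def update_profits(t, n, region_times):
    """profit_i = sum_c (t_c / t_max) * (n_i - 1) * t_ic, as in (4)."""
    t_max = max(region_times)
    return [
        sum(tc / t_max * (n[i] - 1) * t[i][c] for c, tc in enumerate(region_times))
        for i in range(len(n))
    ]


def fits(row, w, s, W):
    """Row-width test from (3a)-(3b): sum(w_i - s_i) + max(s_i) <= W."""
    body = sum(wi - si for wi, si in row) + (w - s)
    blank = max([si for _, si in row] + [s])
    return body + blank <= W


def rounding_pass(frac, chars, rows, W, th_inv):
    """Inner loop of Algorithm 1 applied to one fractional LP solution.

    frac : {(i, j): a_ij} fractional values from the relaxed LP
    chars: [(w_i, s_i)] widths and symmetric slacks
    rows : list of lists holding the (w, s) pairs already fixed on each row
    Returns {i: j} for the characters fixed during this pass.
    """
    fixed = {}
    while True:
        # Candidates that are still free and can be inserted into their row.
        cands = [(v, i, j) for (i, j), v in frac.items()
                 if v > 0 and i not in fixed and fits(rows[j], *chars[i], W)]
        if not cands:
            return fixed
        a_pq = max(v for v, _, _ in cands)
        for v, i, j in sorted(cands, reverse=True):
            # Fix every a_ij close to the maximum, re-checking the capacity.
            if v >= a_pq * th_inv and i not in fixed and fits(rows[j], *chars[i], W):
                rows[j].append(chars[i])   # updates the capacity of row j
                fixed[i] = j


profits = update_profits(t=[[4, 1]], n=[3], region_times=[50, 100])
frac = {(0, 0): 0.9, (1, 0): 0.85, (2, 0): 0.3, (2, 1): 0.7}
chars = [(10, 2), (8, 2), (6, 1)]
rows = [[], []]
fixed = rounding_pass(frac, chars, rows, W=15, th_inv=0.9)
print(profits, fixed, rows)
```

In a full implementation the characters left unfixed (here character 1) would remain variables, their profits would be refreshed from the new region writing times, and the LP would be solved again.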
3.4 Refinement
The simplified formulation and successive relaxation both rely on
the symmetrical blank assumption. Although the problem can then be
solved effectively, for asymmetrical cases some optimality is lost.
To compensate for this loss, we present a dynamic programming based
refinement procedure. As discussed above, for $k$ characters,
single row ordering can have $2^{k-1}$ possible solutions. Under
the symmetrical blank space assumption, all these orderings have
the same length, but for asymmetrical cases this no longer holds.
Our dynamic programming based algorithm Refine(k) finds the best
solution among these $2^{k-1}$ options. The details are shown in
Algorithm 2. At first, if $k > 1$, Refine(k) recursively calls
Refine(k-1) to generate all old partial solutions. All these
partial solutions are updated by adding candidate $c_k$ (lines
6-8). Note that maintaining all solutions is impractical and
unnecessary, because many of them are inferior to others. In
SolutionPruning(), all solutions are checked; if one solution $S_A$
is inferior to another solution $S_B$, $S_A$ is pruned to save
computation cost. For each solution a triplet $(w, l, r)$ is
constructed to store its width, left slack and right slack. We
define the inferior relationship as follows. For two solutions
$S_A = (w_a, l_a, r_a)$ and $S_B = (w_b, l_b, r_b)$, $S_A$ is
inferior to $S_B$ if and only if $w_a \ge w_b$, $l_a \le l_b$ and
$r_a \le r_b$.
After Refine(k) is applied to each row, if more space becomes
available, a greedy insertion approach similar to [12] is applied
to further improve the throughput.
Algorithm 2 Refine(k)
1: if k = 1 then
2:   Generate partial solution (w_1, sl_1, sr_1);
3: else
4:   Refine(k-1);
5:   for each partial solution (w, l, r) do
6:     (w1, l1, r1) = (w + w_k − min(sr_k, l), sl_k, r);
7:     (w2, l2, r2) = (w + w_k − min(sl_k, r), l, sr_k);
8:     Replace (w, l, r) by (w1, l1, r1) and (w2, l2, r2);
9:     if solution set size ≥ threshold then
10:      SolutionPruning();
11:    end if
12:  end for
13: end if
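The refinement can be sketched in Python as an iterative version of Refine(k). This is an illustration rather than the authors' implementation: the recursion is unrolled into a loop, pruning is applied once per stage instead of inside the inner loop, and the threshold value and toy data are assumptions.

```python
def prune(solutions):
    """Drop every triplet (w, l, r) that is inferior to another triplet,
    i.e. not narrower and with no larger slack on either side."""
    unique = list(set(solutions))
    kept = []
    for a in unique:
        dominated = any(
            b != a and b[0] <= a[0] and b[1] >= a[1] and b[2] >= a[2]
            for b in unique
        )
        if not dominated:
            kept.append(a)
    return kept


def refine(chars, threshold=64):
    """chars: [(w_k, sl_k, sr_k)] in insertion order.

    Returns the minimum row width over all left/right insertion choices.
    """
    w, sl, sr = chars[0]
    solutions = [(w, sl, sr)]
    for wk, slk, srk in chars[1:]:
        extended = []
        for w, l, r in solutions:
            # c_k placed at the left end: its right blank faces the old left blank.
            extended.append((w + wk - min(srk, l), slk, r))
            # c_k placed at the right end: its left blank faces the old right blank.
            extended.append((w + wk - min(slk, r), l, srk))
        solutions = prune(extended) if len(extended) >= threshold else extended
    return min(w for w, _, _ in solutions)


chars = [(10, 3, 1), (8, 2, 4), (6, 1, 2)]   # (width, left slack, right slack)
best_width = refine(chars)
pruned_width = refine(chars, threshold=2)    # force pruning at every stage
print(best_width, pruned_width)
```

Pruning is safe in this sketch because a partial solution that is narrower and has at least as much slack on both sides can only lead to rows that are at least as narrow after any further insertion.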
4. E-BLOW FOR 2D-OSP
Figure 4: E-BLOW overall flow for 2D-OSP. (Flowchart: regions info and characters info as inputs; pre-filter; KD-Tree based clustering; simulated annealing based packing; output 2D-stencil.)
Now we consider a more general case: the blanking spaces of
characters are non-uniform along both the horizontal and vertical
directions. This problem is referred to as the 2D-OSP problem. In
[12] the 2D-OSP problem was transformed into a floorplanning
problem. However, several key differences between traditional
floorplanning and OSP were ignored. (1) In OSP there is no
wirelength to be considered, while in floorplanning wirelength is a
major optimization objective. (2) Compared with complex IP cores,
many characters may have similar sizes. (3) Traditional
floorplanners cannot handle the problem size of modern MCC designs.
To deal with all these properties, an approximation packing
framework is proposed (see Fig. 4). Given the input character
candidates, a pre-filter process is first applied to remove
characters with bad profit (defined in (4)). The second step is a
KD-Tree based clustering algorithm that effectively speeds up the
design process. Finally, a floorplanner packs all candidates.
4.1 KD-Tree based Clustering
Clustering is a well studied problem, and there are many works and
applications in VLSI [19]. However, previous methodologies cannot
be directly applied here. (1) Traditional clustering is based on a
netlist, which provides all the clustering options. Generally
speaking, a netlist is sparse, but in OSP the connection
relationships are so complex that any two characters can be
clustered, and in total there are $O(n^2)$ clustering options. (2)
Given two candidates $c_i$ and $c_j$, there are several clustering
options. For example, horizontal clustering and vertical clustering
may have different overlapping space.
Algorithm 3 KD-Tree based Clustering
Input: set of candidates C^C.
1: repeat
2:   Sort all candidates by profit_i;
3:   Set each candidate c_i to unclustered;
4:   for all unclustered candidate c_i do
5:     Find pair (c_i, c_j) with similar blank spaces and profits;
6:     Cluster (c_i, c_j), label them as clustered;
7:   end for
8:   Update candidate information;
9: until reach clustering threshold
Figure 5: KD-Tree based region searching. (Panels (a) and (b) show candidates c1 to c9; the axes in (a) are Horizontal Space and Vertical Space.)
The details of our clustering procedure are shown in Algorithm 3.
The clustering is repeated until the number of clustered candidates
reaches the clustering threshold. Initially all the candidates are
sorted by $profit_i$, which means that candidates with a larger
shot number reduction tend to be clustered first. Then clustering
(lines 3-8) is carried out. For each candidate $c_i$, finding an
available $c_j$ may need $O(n)$, and the complexities of horizontal
clustering and vertical clustering are both $O(n^2)$. Hence the
complexity of the whole procedure is $O(n^2)$, where $n$ is the
number of candidates.
A KD-Tree [20] is used to speed up the process of finding an
available pair $(c_i, c_j)$. It provides fast $O(\log n)$ region
searching operations while keeping the time for insertion and
deletion small: insertion, $O(\log n)$; deletion of the root,
$O(n^{(k-1)/k})$; deletion of a random node, $O(\log n)$. Using the
KD-Tree, the complexity of Algorithm 3 can be reduced to
$O(n \log n)$. A simple example is shown in Fig. 5. For the sake of
convenience, here we only consider the horizontal and vertical
space of each candidate. Given candidate $c_2$, finding another
candidate with similar space may need $O(n)$ to scan all other
candidates. However, using the KD-Tree structure shown in
Fig. 5(a), this finding procedure can be viewed as a region search,
which can be resolved in $O(\log n)$. In particular, as shown in
Fig. 5(b), only candidates $c_1$ to $c_5$ are scanned.
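The region search can be sketched with a small two-dimensional KD-Tree in Python. This is a minimal illustration: the coordinates assigned to c1 through c9 are invented (they are not read off Fig. 5), the query box is an arbitrary tolerance around c2, and the class and function names are hypothetical.

```python
class KDNode:
    def __init__(self, name, point, axis, left, right):
        self.name = name      # candidate label
        self.point = point    # (horizontal space, vertical space)
        self.axis = axis      # splitting dimension: 0 or 1
        self.left = left
        self.right = right


def build(items, depth=0):
    """Build a KD-Tree from (name, point) pairs by median splitting."""
    if not items:
        return None
    axis = depth % 2
    items = sorted(items, key=lambda it: it[1][axis])
    mid = len(items) // 2
    name, point = items[mid]
    return KDNode(name, point, axis,
                  build(items[:mid], depth + 1),
                  build(items[mid + 1:], depth + 1))


def region_search(node, lo, hi, found):
    """Collect all names whose point lies inside the box [lo, hi]."""
    if node is None:
        return
    if all(lo[d] <= node.point[d] <= hi[d] for d in (0, 1)):
        found.append(node.name)
    split = node.point[node.axis]
    # Descend only into subtrees that can intersect the query box.
    if lo[node.axis] <= split:
        region_search(node.left, lo, hi, found)
    if split <= hi[node.axis]:
        region_search(node.right, lo, hi, found)


# Hypothetical blank spaces for nine candidates.
spaces = {"c1": (2, 8), "c2": (3, 7), "c3": (1, 5), "c4": (4, 6), "c5": (3, 9),
          "c6": (7, 3), "c7": (6, 8), "c8": (8, 2), "c9": (9, 6)}
tree = build(list(spaces.items()))

# Candidates whose spaces are within +/-1 of c2 = (3, 7).
found = []
region_search(tree, lo=(2, 6), hi=(4, 8), found=found)
partners = sorted(name for name in found if name != "c2")
print(partners)
```

Only subtrees that overlap the query box are visited, which is what lets the pair-finding step of Algorithm 3 avoid scanning every candidate.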
4.2 Approximation Framework for 2D-OSP
In E-BLOW we adopt a simulated annealing based framework similar to
that in [12]. To demonstrate the effectiveness of our pre-filter
and clustering methodologies, E-BLOW uses the same parameters as
[12]. Sequence Pair [21] is used as the topology representation.
5. EXPERIMENTAL RESULTS
E-BLOW is implemented in the C++ programming language and executed
on a Linux machine with two 3.0GHz CPUs and 32GB of memory.
GUROBI [22] is used to solve the linear programs. Eight benchmarks
from [12] are tested. In addition, eight benchmarks (1M-x) are
designed for 1D-OSP and another eight (2M-x) are generated for the
2D-OSP problem. The character projection (CP) number is set to 10
for all of them. For small cases (1M-1, . . . , 1M-4, 2M-1, . . . ,
2M-4) the character candidate number is 1000, and the stencil size
is set as 1000µm × 1000µm. For larger cases (1M-5, . . . , 1M-8,
2M-5, . . . , 2M-8) the character candidate number is 4000, and the
stencil size is set as 2000µm × 2000µm. The size and blank width of
each character are similar to those in [12].
5.1 Comparison for 1D-OSP
For 1D-OSP, Table 1 compares E-BLOW with the greedy method and the
heuristic framework in [12]. Note that the greedy method was also
described in [12]. Column "char #" is the number of character
candidates, and column "CP #" is the number of character
projections. For each algorithm, we record "shot #", "char #" and
"CPU(s)", where "shot #" is the final number of shots, "char #" is
the number of characters on the final stencil, and "CPU(s)" is the
runtime in seconds. From Table 1 we can see that E-BLOW achieves
better performance and runtime than [12]. Compared with E-BLOW, the
greedy algorithm introduces 47% more shots, and [12] introduces 19%
more shots. Note that, compared with the heuristic method in [12],
the mathematical formulation provides a global view, so even for
the traditional EBL system (1D-1, . . . , 1D-4) E-BLOW achieves a
better shot number. Besides, E-BLOW can reduce 34.3% of runtime.
5.2 Comparison for 2D-OSP
For 2D-OSP, Table 2 gives a similar comparison. For each algorithm,
we also record "shot #", "char #" and "CPU(s)", with the same
meanings as in Table 1. From the table we can see that for each
test case, although the greedy algorithm is faster, its results are
poor: it introduces 30% more shots. Besides, compared with the work
in [12], E-BLOW achieves better performance, reducing the shot
number by 14%. Meanwhile, because of the clustering method, E-BLOW
can reach 2.8× speed-up.
From both tables we can see that, compared with [12], E-BLOW
achieves a better tradeoff between runtime and performance.
6. CONCLUSION
In this paper, we have proposed E-BLOW, a tool to solve the OSP
problem in MCC systems. For 1D-OSP, a successive relaxation
algorithm and a dynamic programming based refinement are proposed.
For 2D-OSP, a KD-Tree based clustering method is integrated into a
simulated annealing framework. Experimental results show that,
compared with previous works, E-BLOW achieves better performance in
terms of shot number and runtime for both the MCC system and the
traditional EBL system. As EBL systems, including MCC systems, are
widely used for mask making and are also gaining momentum for
direct wafer writing, we believe much more research can be done not
only on stencil planning but also on EBL aware design.
Acknowledgment
This work is supported in part by NSF and NSFC.
7. REFERENCES
[1] B. Yu, J.-R. Gao, D. Ding, Y. Ban, J.-S. Yang, K. Yuan,
M. Cho, and D. Z. Pan, “Dealing with IC manufacturability in
extreme scaling,” in IEEE/ACM International Conference on
Computer-Aided Design (ICCAD), 2012, pp. 240–242.
[2] A. B. Kahng, C.-H. Park, X. Xu, and H. Yao, “Layout
decomposition for double patterning lithography,” in
IEEE/ACM International Conference on Computer-Aided
Design (ICCAD), 2008, pp. 465–472.
[3] B. Yu, K. Yuan, B. Zhang, D. Ding, and D. Z. Pan, “Layout
decomposition for triple patterning lithography,” in
IEEE/ACM International Conference on Computer-Aided
Design (ICCAD), 2011, pp. 1–8.
[4] K. Lucas, C. Cork, B. Yu, G. Luk-Pat, B. Painter, and D. Z.
Pan, “Implications of triple patterning for 14 nm node design
and patterning,” in Proc. of SPIE, vol. 8327, 2012.
[5] Y. Arisawa, H. Aoyama, T. Uno, and T. Tanaka, “EUV flare
correction for the half-pitch 22nm node,” in Proc. of SPIE, vol.
7636, 2010.
[6] H. C. Pfeiffer, “New prospects for electron beams as tools for
semiconductor lithography,” in Proc. of SPIE, 2009.
[7] H. Yasuda, T. Haraguchi, and A. Yamada, “A proposal for an
MCC (multi-column cell with lotus root lens) system to be used
as a mask-making e-beam tool,” in Proc. of SPIE, 2004.
[8] T. Maruyama, Y. Machida, S. Sugatani, H. Takita, H. Hoshino,
T. Hino, M. Ito, A. Yamada, T. Iizuka, S. Komatsue, M. Ikeda,
and K. Asada, “CP element based design for 14nm node EBDW
high volume manufacturing,” in Proc. of SPIE, 2012.
[9] M. Shoji, T. Inoue, and M. Yamabe, “Extraction and
utilization of the repeating patterns for CP writing in mask
making,” in Proc. of SPIE, 2010.
[10] M. Sugihara, T. Takata, K. Nakamura, R. Inanami, H. Hayashi,
K. Kishimoto, T. Hasebe, Y. Kawano, Y. Matsunaga,
K. Murakami, and K. Okumura, “Cell library development
methodology for throughput enhancement of character
projection equipment,” IEICE Transactions on Electronics,
vol. E89-C, pp. 377–383, 2006.
[11] K. Yuan and D. Z. Pan, “E-Beam lithography throughput
improvement with stencil planning and optimization,” in ACM
International Symposium on Physical Design (ISPD), 2011.
[12] K. Yuan, B. Yu, and D. Z. Pan, “E-Beam lithography stencil
planning and optimization with overlapped characters,” IEEE
Transactions on Computer-Aided Design of Integrated
Circuits and Systems (TCAD), vol. 31, no. 2, pp. 167–179,
Feb. 2012.
[13] P. Du, W. Zhao, S.-H. Weng, C.-K. Cheng, and R. Graham,
“Character design and stamp algorithms for character
projection electron-beam lithography,” in IEEE/ACM Asia
and South Pacific Design Automation Conference
(ASPDAC), 2012.
[14] B. Yu, J.-R. Gao, and D. Z. Pan, “L-Shape based layout
fracturing for e-beam lithography,” in IEEE/ACM Asia and
South Pacific Design Automation Conference (ASPDAC),
2013.
[15] S. Manakli, H. Komami, M. Takizawa, T. Mitsuhashi, and
L. Pain, “Cell projection use in mask-less lithography for 45nm
& 32nm logic nodes,” in Proc. of SPIE, 2009.
[16] S. Martello and P. Toth, Knapsack problems: algorithms and
computer implementations. New York, NY, USA: John Wiley
& Sons, Inc., 1990.
[17] C. Kenyon and E. Remila, “Approximate strip packing,” in
Foundations of Computer Science, 1996. Proceedings., 37th
Annual Symposium on, oct 1996, pp. 31 –36.
[18] M. Dawande, J. Kalagnanam, P. Keskinocak, F. Salman, and
R. Ravi, “Approximation algorithms for the multiple knapsack
problem with assignment restrictions,” Journal of
Combinatorial Optimization, vol. 4, pp. 171–186, 2000.
[19] C. J. Alpert and A. B. Kahng, “Recent directions in netlist
partitioning: a survey,” Integr. VLSI J., vol. 19, pp. 1–81,
August 1995.
[20] J. L. Bentley, “Multidimensional binary search trees used for
associative searching,” Commun. ACM, vol. 18, pp. 509–517,
September 1975.
[21] H. Murata, K. Fujiyoshi, S. Nakatake, and Y. Kajitani, “VLSI
module placement based on rectangle-packing by the
sequence-pair,” IEEE Transactions on Computer-Aided
Design of Integrated Circuits and Systems (TCAD), vol. 12,
pp. 1518–1524, 1996.
[22] “GUROBI,” http://www.gurobi.com/html/academic.html.
Table 1: Result Comparison for 1D-OSP
                |      Greedy in [12]      |           [12]           |          E-BLOW
Case  char# CP# |    shot # char # CPU(s)  |    shot # char # CPU(s)  |    shot # char # CPU(s)
1D-1   1000   1 |     79193    876    0.2  |     50809    926   13.5  |     29536    934    2.2
1D-2   1000   1 |    122259    806    0.2  |     93465    854   11.8  |     44544    863      2
1D-3   1000   1 |    179822    708    0.2  |    152376    749   9.13  |     78704    758    2.7
1D-4   1000   1 |    223420    645    0.2  |    193494    687    7.7  |    107460    699    3.4
1M-1   1000  10 |     83786    876    0.2  |     53333    926   13.5  |     45243    938    4.3
1M-2   1000  10 |    123048    806    0.2  |     95963    854   11.8  |     81636    868    5.4
1M-3   1000  10 |    184950    708    0.2  |    156700    749    9.2  |    140079    769   10.8
1M-4   1000  10 |    225468    645    0.2  |    196686    687    7.7  |    179890    707    7.6
1M-5   4000  10 |    377864   3417   1.02  |    255208   3629 1477.3  |    227456   3650   59.2
1M-6   4000  10 |    542627    315   1.02  |    417456   3346   1182  |    373324   3388   65.1
1M-7   4000  10 |    760650   2809   1.02  |    644288   2986    876  |    570730   3044  58.68
1M-8   4000  10 |    930368   2565   1.01  |    809721   2734  730.7  |    734411   2799   65.3
Avg.      -   - |  319454.6 1264.7   0.47  |  259958.3 1594.0  362.5  |  217751.1 1618.1   23.9
Ratio     -   - |      1.47   0.78   0.02  |      1.19   0.99   15.2  |       1.0    1.0    1.0
Table 2: Result Comparison for 2D-OSP
                |      Greedy in [12]      |           [12]           |          E-BLOW
Case  char# CP# |    shot # char # CPU(s)  |    shot # char # CPU(s)  |    shot # char # CPU(s)
2D-1   1000   1 |    159654    734    2.1  |    107876    826  329.6  |    105723    789   65.5
2D-2   1000   1 |    269940    576    2.4  |    166524    741  278.1  |    170934    657   52.5
2D-3   1000   1 |    290068    551    2.6  |    210496    686  296.7  |    178777    663   56.4
2D-4   1000   1 |    327890    499    2.7  |    240971    632  301.7  |    179981    605   54.7
2M-1   1000  10 |    168279    734    2.1  |    122017    811  313.7  |     91193    777   58.6
2M-2   1000  10 |    283702    576    2.4  |    187235    728  286.1  |    163327    661   48.7
2M-3   1000  10 |    298813    551    2.6  |    235788    653    289  |    162648    659   52.3
2M-4   1000  10 |    338610    499    2.7  |    270384    605  285.6  |    195469    590   53.3
2M-5   4000  10 |    824060   2704     19  |    700414   2913   3891  |    687287   2853     59
2M-6   4000  10 |   1044161   2388   20.2  |    898530   2624   4245  |    717236   2721   60.7
2M-7   4000  10 |   1264748   2101   21.9  |   1064789   2410 3925.5  |    921867   2409   57.1
2M-8   4000  10 |   1331457   2011   22.8  |   1176700   2259 4550.0  |   1104724   2119   57.7
Avg.      -   - |    550115 1218.1    8.3  |    448477   1324 1582.7  |  389930.5 1291.9 56.375
Ratio     -   - |      1.41   0.94   0.15  |      1.15   1.02   28.1  |       1.0    1.0    1.0
