Abstract: This paper describes a design methodology for determining the number of stages in a cascaded time amplifier to minimize the area consumption. The total area consumption is categorized into three parts, which allows mathematical analysis and optimization to be performed. A combination of the proposed mathematical analysis and 2D mapping can determine the number of stages to minimize the area consumption.
Introduction
CMOS device scaling has been improving the available time resolution, which raises the possibility of developing better time-resolved circuits. In a time-resolved circuit design, one of the most important components is the time amplifier (TA).
The principle of a TA was proposed in 2003 [1]. Since then, many techniques for the design of TAs have been reported, such as a cascaded architecture [2] and a closed-loop architecture [3]. The cascaded architecture is expected to be used for high-gain time amplification. For example, it has already been adopted for high-resolution on-chip jitter measurement in order to obtain high-gain characteristics [4]. Although a TA usually consists of only several gates and occupies a small area, high-gain TAs require a considerable footprint, which is fatal for cost-sensitive on-chip BIST applications. For example, the previous design [4] resulted in a 176% additional area overhead (from 490 μm² to 1350 μm²) due to the high-gain TA implementation. However, there has been no report on a design methodology for determining the optimal number of stages in a cascaded TA. This paper introduces design guidelines, grounded in mathematical analysis, for optimizing the design of a cascaded TA.
Here, g_m is the transconductance of the metastable NAND gate, C is the output capacitance of the NAND gate, and T_d is the delay of the delay cell. With the transistor sizes of the NAND and XOR gates held constant, the gain and input dynamic range can be changed by adjusting C and T_d. Figure 2 shows a schematic of a two-stage TA. The gains of the individual TAs are as follows:
The area consumption of a TA can be divided mainly into that of the delay cell, S(T_d); that of the NAND and XOR gates, S(Tr); and that of the output capacitor, S(C). S(T_d) is determined by the required input dynamic range and is constant for all stages. Meanwhile, S(Tr) is determined by the implementation process, and S(C) is determined by the gain. Under the above conditions, the total area occupied by a two-stage cascaded TA can be expressed as
The design consideration for minimizing the total area can be written as
The above expression suggests that the gain should be the same for every stage of the TA in order to minimize the area consumption. This gain design methodology can also be applied to cases with three or more stages.
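The equal-gain condition can be checked numerically. The following is a minimal sketch, assuming (as a simplification that is not stated in the text) that the capacitor area of each stage grows linearly with that stage's gain and that the remaining area terms do not depend on how the total gain is split. The function names and the unit coefficient `k` are illustrative only.

```python
import math

def capacitor_area_two_stage(g1, total_gain, k=1.0):
    """Hypothetical capacitor area k * (G1 + G2) with the product G1 * G2 fixed."""
    g2 = total_gain / g1
    return k * (g1 + g2)

def best_first_stage_gain(total_gain, steps=10001):
    """Sweep G1 on a logarithmic grid over [1, total_gain]; return the area minimizer."""
    candidates = [total_gain ** (i / (steps - 1)) for i in range(steps)]
    return min(candidates, key=lambda g1: capacitor_area_two_stage(g1, total_gain))

total_gain = 100.0
g1_opt = best_first_stage_gain(total_gain)
# Under the linear-area assumption, the minimizer should sit at sqrt(total_gain).
print(g1_opt, math.sqrt(total_gain))
```

Under this assumption the sweep settles on equal gains for the two stages, which is the arithmetic-geometric mean inequality applied to G1 + G2 with a fixed product.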
Design methodology for determining the number of stages in a cascaded TA
The occupied area of the TA, S, can be expressed as
Here, a_p is an index term for the implementation process and b_p is an index term for the area efficiency of implementing the capacitor at the output of the NAND gate. Since the delay lines consist of transistors, the area of the delay lines can be expressed as a multiple of that of the NAND and XOR gates. A schematic of an n-stage TA is shown in Fig. 3. As stated in the previous section, each stage's gain is designed to be α^(1/n), where the total gain is α. Since the input time differences of the later stages are amplified by the earlier stages, the delay cells of all TAs have to be α^((n−1)/n) times larger than the input time difference, T_d. In other words, the input dynamic range of each stage is α^(1/n) times smaller than the output time difference,
Under this condition, the total area consumption of an n-stage TA, S_n, can be expressed as
The derivative of the total area, S_n, can be expressed as
A conceptual image of the design methodology for determining the optimum number of stages is depicted in Fig. 4. The total area consumption is proportional to n·α^(1/n), and the number of stages producing the minimum value is thus the optimum number of stages.
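The search for the minimum can be illustrated with a short script. This is a sketch that uses only the proportionality to n·α^(1/n) stated above; it ignores the process indexes and the input dynamic range that appear in the full area expression, so it does not reproduce the 2D mappings discussed later. The function names and the upper bound `max_stages` are assumptions made for illustration.

```python
def relative_area(n, total_gain):
    """Relative area n * alpha**(1/n) of an n-stage TA with equal per-stage gain."""
    return n * total_gain ** (1.0 / n)

def optimum_stages(total_gain, max_stages=20):
    """Return the integer n in [1, max_stages] with the smallest relative area."""
    return min(range(1, max_stages + 1), key=lambda n: relative_area(n, total_gain))

for alpha in (10.0, 100.0, 1000.0):
    n_opt = optimum_stages(alpha)
    print(f"alpha={alpha:g}: n_opt={n_opt}, relative area={relative_area(n_opt, alpha):.3f}")
```

Because only integer stage counts are physically meaningful, the sketch compares integer candidates directly instead of solving for a continuous optimum.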
Optimization of the number of stages in a cascaded TA
This section introduces the design methodology for optimizing the number of stages in a cascaded TA in two situations: process porting (for example, from 90 nm CMOS to 65 nm CMOS) and modification of the specification.
Design optimization at process porting
In process porting, a_p and b_p change while the design specification (the input dynamic range and total gain) is assumed to be constant. From (10), the optimum point can be derived as
From (11), a 2D mapping of the design optimization after process porting can be depicted as shown in Fig. 5 . In this graph, the gain and input dynamic range are fixed at 100 and 10 ps, respectively. Finally, the number of stages can be optimized analytically by referring to 2D mappings such as that in Fig. 5 .
Design optimization for modification of the specification
When the specification is modified, the total gain α and the input dynamic range T_d change while a_p and b_p are assumed to be constant. From (10), the optimum point can be derived as
From (12), a 2D mapping of the design optimization under modification of the specification can be depicted as shown in Fig. 6. In this graph, a_p and b_p are fixed at 10^10 and 500, respectively. These indexes are approximate values determined with reference to a state-of-the-art 65-nm CMOS process. Finally, the number of stages after modification of the specification can be optimized analytically by referring to a 2D mapping such as Fig. 6. Fig. 6 is used as follows. For example, when the input dynamic range and gain are 10 ps and 10, respectively, the point marked by the red square is consulted. This point lies in the region where the optimum number of stages is two, which means that two stages minimize the area. By following the above steps, the area consumption of the cascaded open-loop time amplifier can be minimized.
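A 2D mapping of this kind can be generated by evaluating an area model on a grid of gains and input dynamic ranges and recording the best stage count at each point. The sketch below shows only that procedure. The function `placeholder_area` is a hypothetical stand-in and is not equation (10); its coefficients `s_fixed`, `k_delay`, and `k_cap` are arbitrary, so the resulting map does not correspond to Fig. 6.

```python
def optimum_stage_map(area_model, gains, ranges_ps, max_stages=10):
    """For each (gain, input range) grid point, return the n that minimizes area_model."""
    stages = range(1, max_stages + 1)
    return [
        [min(stages, key=lambda n: area_model(n, gain, t_d)) for t_d in ranges_ps]
        for gain in gains
    ]

def placeholder_area(n, gain, t_d_ps, s_fixed=1.0, k_delay=0.1, k_cap=1.0):
    """HYPOTHETICAL area model (not Eq. (10)); all coefficients are arbitrary."""
    gate_and_delay = s_fixed + k_delay * t_d_ps   # per-stage logic and delay-cell area
    capacitor = k_cap * gain ** (1.0 / n)         # per-stage capacitor area, grows with stage gain
    return n * (gate_and_delay + capacitor)

gains = [10.0, 1000.0]
ranges_ps = [10.0, 100.0]
stage_map = optimum_stage_map(placeholder_area, gains, ranges_ps)
for gain, row in zip(gains, stage_map):
    print(gain, row)
```

To build a map for a real design, `placeholder_area` would be replaced by the full area expression with the process indexes a_p and b_p of the target technology.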
Verification of the effectiveness of the proposed optimization
This subsection verifies the effectiveness of the proposed design optimization. For this purpose, we implemented time amplifiers with typical specifications for an on-chip jitter measurement application. Figure 7 shows the dependence of the occupied area on the number of stages. The characteristic curve indicates that an optimum number of stages exists and that it is two, as expected from the previous subsection.
Discussion
The combination of subsections 4.1 and 4.2 forms a feasible design methodology for minimizing the area. This section introduces a simplified approach to estimate the optimum number of stages in specific situations.
Assuming that the total gain, α, is sufficiently large, the area of the output capacitances of the NAND gates, S(C), is dominant. Under this assumption, the optimum number of stages can be determined theoretically using simple mathematical analysis.
Therefore, the optimum number of stages can be determined from only the total gain α when the area of the capacitance is dominant. In general, the scaling speed of the capacitance is slower than that of the transistors; therefore, this approximate analysis will be effective for future scaled technologies. 
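One way to see why only α matters is sketched below. It assumes, as stated earlier, that the capacitor-dominated area scales as n·α^(1/n), and it treats n as a continuous variable. The symbol f(n) is introduced here for illustration, and the form may differ from the paper's own expression.

```latex
\begin{align}
f(n) &= n\,\alpha^{1/n} = n\,e^{(\ln\alpha)/n}, \\
\frac{df}{dn} &= \alpha^{1/n}\left(1 - \frac{\ln\alpha}{n}\right) = 0
\quad\Longrightarrow\quad n_{\mathrm{opt}} = \ln\alpha .
\end{align}
```

The derivative is negative for n < ln α and positive for n > ln α, so this stationary point is a minimum, and the corresponding per-stage gain is α^(1/ln α) = e ≈ 2.72. Since n must be an integer, the two integers adjacent to ln α would be compared in practice.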
Conclusion
The design methodology for optimizing the number of stages in a cascaded TA was demonstrated. The total area consumption was categorized into three parts (delay cells, output capacitance, logic gates), which made mathematical analysis and design optimization feasible. Finally, a combination of the proposed mathematical analysis and 2D mapping yielded the number of stages that minimizes the area consumption.
