Abstract-The optimization-simulation loop-based method is popular and efficient for automating design migration and reuse. However, owing to the complexity of current mixed-signal systems, its use has been restricted to the block level. This paper presents a hierarchical methodology for efficiently migrating a mixed-signal circuit design from one technology node to another while keeping the same circuit and layout topologies. It uses two stages of optimization to automatically resize and refine device dimensions in the target technology. In the first stage, to avoid costly simulation time without sacrificing system-level functionality, only one block is represented at the transistor level (TL), while the other blocks are replaced with behavioral models. The multistart global optimization technique is applied to resize the TL block within the system-level connection. This stage provides a good initial point for the subsequent system-level refinement. Moreover, to obtain a process- and parasitic-closure solution, both parasitic and process variation effects are explored and used to constrain the schematic migration. A representative mixed-signal system, a charge-pump phase-locked loop, is used to validate the proposed methodology. The experimental results show that the proposed methodology efficiently generates quality designs in the target technology with far fewer simulation iterations than recently available approaches.
to another, while keeping the same schematic and layout topologies. During design migration, experienced designers manually readjust the device sizes to bring the design into compliance under all required conditions. However, given circuit complexity and tight time-to-market constraints, the current goal of technology migration is to reuse as many proven functional blocks as possible, rather than redesign every block from scratch.
With the aid of the standard cell-based design methodology and advanced electronic design automation tools, technology migration for digital circuits can be achieved by rerunning the fully automated synthesis flow [1] . Remapping analog/mixed-signal circuits, on the other hand, is still cumbersome and resource-intensive. Because of the highly nonlinear relationship between device parameters and circuit characteristics, most analog/mixed-signal circuits are designed for a specific technology, which greatly limits their reuse across process nodes. If the same shrink rule used for digital circuits is simply applied, the ported analog/mixed-signal circuit may not even function correctly. Moreover, the performance metrics usually change under the target technology. Therefore, a complete redesign is often unavoidable, and a tedious resizing task is inevitable.
In the literature, numerical optimization algorithms are intensively employed to automate technology migration and design reuse, covering topology synthesis, device resizing, and biasing to meet all the design specs [2] [3] [4] [5] [6] [7] , [15] , [17] [18] [19] [20] [21] [22] . However, the following shortcomings remain.
1) Most approaches treat a complex system block by block, without overall system-level consideration. Note that a well-designed block can still be incompatible with the whole system. The key hurdle that prevents the transition from cell level to system level is that system-level design involves a large number of design variables, resulting in an uncontrollably large search space.
2) The optimization algorithms used by most approaches suffer either from initial-point dependency, like gradient-based algorithms [6] , or from slow convergence, such as genetic algorithms [20] , [21] , geometric programming [22] , [23] , and simulated annealing [19] .
3) Most existing process migration approaches fail to consider circuit reliability and robustness. In fact, given process, voltage, and temperature (PVT) variations, circuit characteristics can easily deviate from their desired values.
4) Most existing methods for process migration focus on schematic-level migration and fail to account for layout parasitics.
Based on the abovementioned issues, this paper presents an efficient solution for migrating a mixed-signal design from one CMOS technology to another. The major characteristics of the proposed method are summarized as follows.
1) Systematic Evaluation: First, a transistor-level (TL) block is retargeted within the system-level configuration, while the other blocks are represented at the behavioral level (BL). On the one hand, this abstraction reduces the simulation time. On the other hand, it preserves the functional accuracy of the system as a whole. Then, starting from the result determined in stage 1, another system-level resizing is conducted.
2) Global Optimization: The multistart global optimization (MGO) framework is employed to automatically size TL blocks in the target technology in stage 1. This strategy converges quickly and overcomes the start-point dependency and local-optimum trapping problems.
3) Robustness Verification: The PVT corners are considered inside the migration process to achieve a robust design.
4) Parasitic Aware Evaluation: To obtain a parasitic-closure design, the impact of layout parasitics is explored, and the extracted parasitics are reused to constrain the optimization-based resizing procedure.
The remainder of this paper is organized as follows. In Section II, existing approaches for technology migration and the example mixed-signal circuit structure are reviewed. A detailed description of the proposed automatic technology migration methodology is presented in Section III. In Section IV, the experimental results that validate the proposed approach are provided. Finally, the conclusion is offered in Section V.
II. RELATED WORKS AND CASE STUDY

A. Related Works
There are many automatic and computer-aided technology migration approaches available in the literature, which can be broadly classified into three categories.
1) Analytical Equation/Linearized Equivalent Model-Based Approach:
In [4] , the equivalent linearized circuit model is first abstracted using operating-point analysis. Then, optimization iterations are used to minimize the differences in the equivalent circuit element parameters between the source and target technologies. In [7] , the transistor transconductance $g_m$ and output conductance $g_{ds}$ are kept as close as possible to those of the original design to preserve small-signal frequency behavior. Both scaling equations and numerical optimization iterations are used to adjust transistor dimensions so that the $g_m$ and $g_{ds}$ values match between the original and scaled circuits. In [8] , assuming the same supply voltage, initial scaling factors are first calculated based on the constant-transconductance constraint at minimal transistor length. Then, device sizes are tuned using a qualitative reasoning method, which requires a user-provided qualitative dependency matrix. The qualitative dependency matrix captures the relationship between important parameters and circuit features, based on the designers' experience or simple design equations. Hammouda et al. [9] assume that preserving the parameters of each individual component in a circuit preserves the overall circuit features. The device sizes are then tuned by keeping the relative bias conditions during the retargeting process. In [10] and [11] , scaling factors for some parameters, such as transistor dimension and transconductance, are computed based on an advanced compact model [12] of MOSFET transistors. In [13] , generalized scaling factor-based resizing rules are derived for maintaining the original dynamic range and gain-bandwidth specs. In [14] , scaling rules are proposed based on the simplest MOS transistor model (level 1 model) for keeping the same figure of merit while reducing area and power consumption when the target technology is smaller.
The analytical equation/linearized equivalent model-based approach attempts to achieve the same characteristics as the original design using simplified design equations or linearized models. It starts from the original design and gathers the necessary information connecting the target and source technologies. Its principal advantage is fast evaluation in terms of computational time. However, the complexity of both manufacturing processes and circuit designs has grown exponentially. The approach often faces accuracy challenges because the approximated and simplified equations/models describe only the essential dynamics of the circuit rather than the circuit as implemented in practice. For recent nanoscale integrated circuits, a complete performance evaluation often requires a combination of dc, ac, transient, and other analyses, which cannot realistically be captured through analytical equations. This type of method trades accuracy for speed, which has kept it from entering mainstream industrial use.
2) Template-Based Approach: In [15] , a generator approach is proposed. A set of basic analog building cells, such as the differential pair and current mirror, is implemented in a scalable and technology-independent manner using a generic model approach. The complex analog/mixed-signal design is first transferred into a compact symbolic description and then resized through optimization iterations. For a certain type of circuit, such as the digital/analog converter, a module generator-based approach is developed in [16] . Basic building blocks are first made into standard cells. Then, hierarchical sizing procedures are used to map system-level parameters into TL parameters. The generation of the system-level model is based on design experience, while the TL synthesis is based on analytical equations. In [17] , an analog IP database is first built, which includes adaptive models, circuit characteristic equations, and variational ranges. Then, an application-specific integrated circuit-like design flow for analog reuse is implemented based on the analog IP database.
The template-based method covers a substantial fraction of the circuits needed in most applications. The structures under migration are treated as a set of basic cells. However, a more general approach that can cope with arbitrary architectures is still in great demand.
3) Simulation-Optimization Loop-Based Approach: This approach, as shown in Fig. 1 , sets up a consistent platform for collaboration between a mixed-signal circuit simulator and a global optimizer. It adjusts the circuit parameters under the guidance of an optimization algorithm and evaluates each sized circuit, using the same circuit simulator with which it was designed, against the desired performance goals. This method requires little or no expert knowledge of the designed circuit for creating mathematical equations or equivalent linearized models. It is based on more reliable and practical Simulation Program with Integrated Circuit Emphasis (SPICE)-accurate simulations. Furthermore, this approach provides great generality and is applicable to different types of circuits, technology nodes, and design specs. Last but not least, the associated optimization procedure generates new designs automatically, which not only meet the design goals but can also achieve better overall performance. The works on this approach include AMODA [6] , ASTRX/OBLX [18] , STAR [19] , Anaconda [5] , as well as commercial synthesis tools [27] , [28] . Our proposed method falls into this category.
B. Case Study: Charge-Pump Phase-Locked Loop
In this paper, we use the charge-pump phase-locked loop (CPPLL) as a concrete example. The CPPLL is a representative mixed-signal circuit. It is widely used in a variety of fields, such as frequency synthesis, skew cancellation, clock generation and recovery, and chip synchronization [26] [27] [28] [29] . In this section, we briefly discuss the basic structure of the CPPLL.
The inputs include three essential elements: the process design kit of the target technology, the source design, and the design constraints. The source design provides the original circuit netlist, the specs, and their testbenches. The circuit netlist is a fine-tuned circuit schematic in the source technology. The design specs and their testbenches describe what data should be collected and how it should be analyzed, as well as what input should be applied to stimulate the circuit. The constraints include design specs, functional conditions (e.g., a transistor is working in the saturation region), as well as design variables and their variational ranges. The proposed migration method generates the target design, which complies with all the desired performance requirements while accounting for both process variation and postlayout parasitics.
The flowchart is briefly described as follows and then discussed in detail later.
Step 1 (Primary Schematic Migration): The MGO-based resizing procedure focuses on one block represented in TL, while other blocks are replaced with BL models. It primarily resizes the original design block by block in target technology.
Step 2 (Schematic Migration Refinement): In this step, all the blocks are represented in TL. Starting from the primary design determined in Step 1, another round of the optimization-based sizing procedure is conducted to further improve the solution.
Step 3 (PVT Variation Evaluation): In this step, the generated solution is evaluated accounting for PVT variations. If it fails, it goes to Step 1.
Step 4 (Layout Parasitic Evaluation): This step completes the layout and explores the parasitic effect. If it fails, it goes to Step 1.
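The control flow of the four steps can be sketched as a simple loop. This is only an illustrative outline: the callables `primary`, `refine`, `pvt_ok`, and `parasitic_ok`, as well as the `max_rounds` limit, are hypothetical placeholders and do not correspond to any tool interface described in this paper.

```python
def migrate(primary, refine, pvt_ok, parasitic_ok, max_rounds=5):
    """Run Steps 1-4, returning to Step 1 whenever an evaluation fails.

    All four callables are hypothetical stand-ins for the real steps.
    Returns (design, round_number) or (None, max_rounds) if no round passes.
    """
    for round_no in range(1, max_rounds + 1):
        design = primary(round_no)    # Step 1: block-by-block MGO resizing
        design = refine(design)       # Step 2: system-level refinement
        if not pvt_ok(design):        # Step 3: PVT variation evaluation
            continue                  # failure: go back to Step 1
        if not parasitic_ok(design):  # Step 4: layout parasitic evaluation
            continue                  # failure: go back to Step 1
        return design, round_no
    return None, max_rounds


# Toy usage: a "design" is just a number that improves with each round.
result = migrate(
    primary=lambda r: r,
    refine=lambda d: d * 10,
    pvt_ok=lambda d: d >= 20,
    parasitic_ok=lambda d: d >= 30,
)
```

In the toy usage, the first round fails the PVT check and the second fails the parasitic check, so the flow needs three rounds.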
A. Hierarchical Optimization-Based Schematic Migration
1) Primary Schematic Migration:
Simulating a TL mixed-signal system poses multiple challenges and is time-consuming. A complex mixed-signal design can take days or even weeks to simulate. Moreover, the simulation-optimization loop-based method requires evaluating the circuit characteristics over many iterations. Therefore, it is important to shorten the simulation time.
The primary schematic migration is performed block by block. At each step, the $i$th analog block is represented in TL using the BSIM4 model and set as the optimization object; the preceding $i-1$ resized blocks are represented by updated BL models whose parameters are calibrated against the retargeted TL designs; and the following $n-i$ blocks are represented by the original BL models with parameters extracted from the source design. Using behavioral models, the resultant system can be simulated efficiently in a shorter time without compromising the system-level functionality.
The n blocks can be either analog or digital blocks. For digital blocks, the migration is either automatically synthesized from RTL description or manually created using standard cells. As a result, the migration effort mainly focuses on complex and technology dependent analog blocks.
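The block-by-block schedule described above can be illustrated with a short sketch. The function name, the label strings, and the block ordering in the usage example are assumptions made for illustration only.

```python
def representation_schedule(block_names):
    """For each migration step i, list how every block is represented.

    Blocks before i: BL models recalibrated from the retargeted TL design.
    Block i:         TL, the current optimization object.
    Blocks after i:  BL models with parameters from the source design.
    """
    schedule = []
    for i in range(len(block_names)):
        plan = {}
        for j, name in enumerate(block_names):
            if j < i:
                plan[name] = "BL (recalibrated)"
            elif j == i:
                plan[name] = "TL (under optimization)"
            else:
                plan[name] = "BL (source)"
        schedule.append(plan)
    return schedule


# Hypothetical ordering of three analog blocks.
steps = representation_schedule(["CP", "LP", "VCO"])
```

At every step exactly one block is simulated at the transistor level, which is what keeps each system-level simulation short.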
2) Behavioral Model Creation and Replacement: Behavioral models are created to capture the circuit functionality and input-output relationship. The block characteristics are parameterized and parameters are extracted from TL simulations. The created behavioral model is more flexible than TL model. It is easier to maintain, reuse, configure, and migrate between different technologies.
The behavioral model is implemented using the Verilog-AMS language [31] in this context. Verilog-AMS, as an extension of the traditional Verilog HDL, is intended to support BL modeling of mixed-signal systems. Using Verilog-AMS, BL blocks are written as event-driven models in terms of ports and external parameters. The modules are connected with TL blocks to ensure a continuous and smooth simulation of the whole mixed-signal system. The behavioral model of the PFD using Verilog-AMS is presented in Fig. 5 . It is based on the well-known ideal state machine diagram depicted in [32] . The PFD compares $f_i$ with $f_b$ to produce two signals: up and dn. up and dn are preset to be high and low, respectively. When $f_i$ leads $f_b$ , up generates a pulse; when $f_i$ lags $f_b$ , dn produces a pulse. The pulsewidth is proportional to the phase difference between $f_i$ and $f_b$ , so it becomes larger as the phase difference increases.
The parameter state in Fig. 5 is used to record the difference between $f_i$ and $f_b$ . When a rising edge of $f_i$ or $f_b$ occurs, an event is triggered and conditionally changes the state value. Four model parameters are extracted to complete the behavioral model. $t_{d\_up}$ and $t_{d\_dn}$ capture the delay from the input to the up and dn outputs, while $t_{rise}$ and $t_{fall}$ denote the rise and fall times of signal transitions. These values are obtained initially through transient analysis of the original design. After the primary resizing procedure, the model parameters are adaptively modified.
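The event-driven state update can be illustrated with a minimal sketch of the textbook three-state PFD. It is not the Verilog-AMS model of Fig. 5: the function name is hypothetical, the outputs are modeled as active-high flags (an assumption about polarity), and the delay and rise/fall-time parameters are omitted.

```python
def pfd_states(edges):
    """Ideal three-state PFD driven by a time-ordered list of rising edges.

    edges: sequence of "fi" (reference edge) or "fb" (feedback edge).
    Returns the (up, dn) flags after each edge. state is +1 while fi
    leads, -1 while fb leads, and 0 when the edges have cancelled out.
    """
    state = 0
    outputs = []
    for edge in edges:
        if edge == "fi":
            state = min(state + 1, 1)    # fi edge pushes toward "up"
        elif edge == "fb":
            state = max(state - 1, -1)   # fb edge pushes toward "dn"
        else:
            raise ValueError("unknown edge: %r" % (edge,))
        outputs.append((state == 1, state == -1))
    return outputs


trace = pfd_states(["fi", "fb", "fb", "fi"])
```

In this idealization the up flag stays asserted from an $f_i$ edge until the next $f_b$ edge, so its width tracks the phase difference.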
To validate the BL model of PFD, we compare the performance of two PFDs with and without BL models. In Fig. 7 , when f i leads f b , the BL up signal (pink dotted line) well captures the transition behaviors of its TL counterpart (blue dotted line). In Fig. 8 , we further validate the PFD BL model through the settling behaviors of the whole CPPLL. The one with BL PFD model (black dashed line) exhibits great consistency against its TL counterpart (blue solid line).
The behavioral model of the VCO in Verilog-AMS is shown in Fig. 6 . The VCO generates a square wave whose frequency depends on the value of $V_{ctrl}$ . The gain of the VCO is a critical concern. The $f_b$ - $V_{ctrl}$ transfer curve is modeled by curve fitting with a third-degree polynomial. $k_1$ , $k_2$ , $k_3$ , and $k_4$ are the fitting coefficients, which are originally extracted from the source design and then adjusted after the VCO is sized. Fig. 9 compares the VCO gains of the BL (blue line with 'o' mark) and TL (black dotted line) models. The two curves show great consistency. The variables $V_{amp}$ and $V_{os}$ denote the amplitude and offset voltage of the VCO output, respectively. The noise behavior is modeled with the built-in noise_table function, which takes a series of frequency-power pairs. The frequency-power pairs are obtained through noise analysis, originally from the source design, and are then updated after the VCO is primarily sized in stage 1. In Fig. 10 , we validate that the BL noise model is consistent with the TL phase noise simulation.
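A third-degree polynomial transfer curve and the VCO gain derived from it can be sketched as follows. The coefficient ordering (constant term first) and the function names are assumptions; the actual ordering used in Fig. 6 is not stated in the text.

```python
def vco_frequency(v_ctrl, k):
    """Evaluate f = k1 + k2*V + k3*V^2 + k4*V^3 with Horner's rule.

    k is the tuple (k1, k2, k3, k4); this ordering is an assumption.
    """
    k1, k2, k3, k4 = k
    return k1 + v_ctrl * (k2 + v_ctrl * (k3 + v_ctrl * k4))


def vco_gain(v_ctrl, k):
    """VCO gain df/dV = k2 + 2*k3*V + 3*k4*V^2 of the fitted curve."""
    _, k2, k3, k4 = k
    return k2 + v_ctrl * (2 * k3 + v_ctrl * 3 * k4)


# Purely illustrative coefficients, not values from the design.
demo_k = (1.0, 2.0, 3.0, 4.0)
demo_f = vco_frequency(2.0, demo_k)
demo_gain = vco_gain(2.0, demo_k)
```

Because the gain is the analytic derivative of the fitted polynomial, recalibrating the four coefficients after resizing updates both the frequency and the gain of the behavioral model.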
Using behavioral modeling techniques, a complex mixed-signal system can be simulated with good accuracy and within a reasonable runtime. In Table I , the runtimes of the CPPLLs with different behavioral models are compared with that of their TL CPPLL counterpart. The data are the mean values from 100 simulation runs of each CPPLL. The CPPLLs employing the PFD, CP, and VCO behavioral models achieve average speedups of 1.91×, 3.1×, and 48× over the TL CPPLL, respectively. This speedup in simulation time makes it feasible and promising to use BL models in the optimization iterations.
3) Constrained NLP Problem Formulation: The primary schematic migration of the $i$th block is carried out by applying MGO. It aims to generate a design point that satisfies all the design requirements. To solve the problem, a nonlinear programming (NLP) problem is formulated, where $p$ indicates an $n$-dimensional parameter space, which maps into a response space $F$; $f(\cdot)$ formulates the objective function in terms of $p$.
dimension is predefined by an IC foundry or user-defined; biasing current is limited by power consumption requirement.
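A generic constrained NLP problem consistent with the definitions above can be written as follows. This is a sketch in assumed notation: the constraint functions $g_j$, their count $J$, and the bounds $p_L$ and $p_U$ are symbols introduced here for illustration.

```latex
\begin{aligned}
\max_{p} \quad & F = f(p) \\
\text{s.t.} \quad & g_j(p) \le 0, \qquad j = 1, \dots, J \\
& p_L \le p \le p_U
\end{aligned}
```

Here each $g_j$ would encode one design spec or functional condition, and the box bounds encode the allowed ranges of device dimensions and biasing currents.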
A feasible solution to the NLP problem is a parameter set that makes all the constraints hold (meets all the design specs). An optimal solution is the feasible solution with the largest $F$ value. $F$ can be either a single performance response or a combination of responses. In this paper, we use an auxiliary equation that combines the design specs together. The absolute difference between each measured characteristic value and its corresponding design spec is normalized with respect to the design spec and summed together
where $M$ indicates the number of design specs; $y_m^{measure}$ and $y_m^{spec}$ are the $m$th measured characteristic and spec. When the measured value is better than the spec, the term is added to $F_{eval}$ . Otherwise, it is subtracted from $F_{eval}$ . The larger $F_{eval}$ is, the better the overall performance is.
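The evaluation value described above can be sketched as a signed sum of normalized deviations, $F_{eval} = \sum_{m=1}^{M} \pm |y_m^{measure} - y_m^{spec}| / |y_m^{spec}|$. The per-spec flag that states whether a larger or a smaller value counts as "better" is an assumption added here, since the text does not say how that direction is encoded.

```python
def f_eval(measured, specs, higher_is_better):
    """Signed sum of spec-normalized deviations.

    measured, specs:   sequences of M characteristic values and M specs.
    higher_is_better:  sequence of M booleans (assumed encoding of which
                       direction beats the spec for each characteristic).
    """
    total = 0.0
    for y, spec, hib in zip(measured, specs, higher_is_better):
        deviation = abs(y - spec) / abs(spec)
        better = (y >= spec) if hib else (y <= spec)
        total += deviation if better else -deviation
    return total


# Illustrative numbers only: a gain-like spec (higher is better) and a
# power-like spec (lower is better).
good = f_eval([110.0, 8.0], [100.0, 10.0], [True, False])
bad = f_eval([90.0, 12.0], [100.0, 10.0], [True, False])
```

A design that beats every spec yields a positive value, while one that misses every spec yields a negative value of the same form.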
The aim of solving (1) is to find the optimal solution. However, this is not trivial, and several challenges exist. First, for most real engineering problems, an explicit expression that precisely relates a circuit characteristic to the device parameters is almost never available; the relationship is generally implicit. Consider the phase noise of the CPPLL: although a few simplified mathematical models have been proposed for expressing the phase noise in terms of circuit parameters [33] , they are inferior to SPICE-like simulations in terms of accuracy. Thus, we run SPICE simulations to evaluate the circuit characteristics. Second, the parameter space is high-dimensional and the response surface is complicated. For illustration, we first vary two parameters, the biasing current $I_{bias}$ of the CP (Fig. 2) and the width $W_{Mc2}$ , attempting to achieve comparatively symmetrical rising and falling transitions. In Fig. 12(a) , a 2-D parameter space constituted by $I_{bias}$ and $W_{Mc2}$ is shown. We randomly generate a few sampling points in the parameter space, marked as blue dots. Fig. 12(b) visualizes how the CPPLL features vary with $I_{bias}$ and $W_{Mc2}$ . Each point in the parameter space corresponds to a different CPPLL design, yielding a different $F$ response. Fig. 13 shows the contour graph of Fig. 12(b) .
From Figs. 12 and 13, we can see that many feasible solutions are available, which are marked in dark red. Among all the feasible solutions, there exists a globally optimal solution, which is the most desired one. We randomly sample start point 1 in the parameter space shown in Fig. 13 and trigger a local search routine from it. The local search discovers a local optimum rather than the global optimum. A good start point helps the numerical optimization routine find the globally optimal solution faster. Otherwise, the search is easily confined to a local solution.
The simplest way to overcome this limitation is to restart the search routine in different regions. Therefore, MGO is a suitable strategy. MGO is a well-known global optimization strategy. Its procedure is briefly shown in Fig. 11 . In general, it includes two phases: 1) a global phase and 2) a local phase. In the global phase, a number of start points are generated that are uniformly distributed in the parameter space. In the local phase, a local search routine starts from one of the start points and follows a path to arrive at a local optimum. The start points are produced to discover as many local optima as possible. The best one among them is then taken as the global optimum. In theory, as the number of visited points approaches infinity, the probability of finding a globally optimal solution approaches one [34] .
The easiest way to generate start points is in a purely random manner. However, purely random sequences are prone to form clusters, which causes the same local optimum to be obtained repeatedly. This phenomenon is shown in Fig. 13 using start points 1 and 3 . The diversity of start points plays an indispensable role in seeking a globally optimal solution. We therefore generate a quasi-random Sobol sequence [35] as the initial points. The Sobol sequence belongs to the quasi-Monte Carlo family of methods. It is more uniformly distributed over the parameter space than a sequence generated in a purely random way. Moreover, when more start points are needed, the successively generated start points account for the positions of their predecessors and fill the gaps left previously. This prevents the same region from being searched repeatedly and the same local optimum from being discovered many times. It thereby increases the chances of finding the global solution.
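The two-phase multistart idea can be sketched in a few lines. To keep the example self-contained, two stand-ins are used and should not be mistaken for the implementation in this paper: a Halton sequence replaces the Sobol sequence as the quasi-random generator, and a derivative-free coordinate search replaces the gradient-based local search. All function names and the two-peak demo objective are hypothetical.

```python
import math


def halton(index, base):
    """Radical inverse of a positive integer index in the given base."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        result += fraction * (index % base)
        index //= base
        fraction /= base
    return result


def start_points(count, lower, upper):
    """Global phase: quasi-random start points (Halton, up to 6 dims)."""
    bases = (2, 3, 5, 7, 11, 13)[:len(lower)]
    return [[lo + halton(i, b) * (hi - lo)
             for lo, hi, b in zip(lower, upper, bases)]
            for i in range(1, count + 1)]


def local_search(f, x, lower, upper, step=0.1, tol=1e-6):
    """Local phase: coordinate search that maximizes f inside the bounds."""
    x, best = list(x), f(x)
    while step > tol:
        improved = False
        for d in range(len(x)):
            for delta in (step, -step):
                trial = list(x)
                trial[d] = min(max(trial[d] + delta, lower[d]), upper[d])
                value = f(trial)
                if value > best:
                    x, best, improved = trial, value, True
        if not improved:
            step /= 2.0  # refine once no neighbor improves
    return x, best


def multistart(f, lower, upper, n_starts=8):
    """Run a local search from every start point and keep the best optimum."""
    results = [local_search(f, p, lower, upper)
               for p in start_points(n_starts, lower, upper)]
    return max(results, key=lambda r: r[1])


def two_peak_demo(p):
    """Toy response surface: local peak (height 1) and global peak (height 2)."""
    local = math.exp(-((p[0] - 0.2) ** 2 + (p[1] - 0.2) ** 2) / 0.01)
    glob = 2.0 * math.exp(-((p[0] - 0.8) ** 2 + (p[1] - 0.7) ** 2) / 0.01)
    return local + glob


best_x, best_f = multistart(two_peak_demo, [0.0, 0.0], [1.0, 1.0])
```

On the toy surface, searches started near the lower-left corner stop at the local peak, while start points that land in the upper-right region reach the global peak, which is then selected.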
After initial points are generated, local searches are invoked from each of them. In general, the constrained NLP problem (1) is first converted into an unconstrained NLP problem using Lagrange function [36] 
where $\lambda_j$ is the Lagrangian multiplier. Therefore, (3) rather than (2) is solved iteratively. The successive solutions eventually converge to the solution of the original constrained NLP problem. During optimization, both $p$ and $\lambda$ are combined into one variable vector $X$, which is refined iteratively to obtain the optimal value. For the local search, one can choose evolutionary-based optimization algorithms, such as genetic algorithms and simulated annealing, or gradient-based optimization algorithms, e.g., the steepest descent, conjugate gradient, and quasi-Newton algorithms. In our methodology, we use the classic limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) algorithm [37] . In each iteration, 
where X k and ∇ L k are the ordered set of parameters and the gradient of the objective function L at the kth iteration, respectively. Then, the approximated inverse Hessian matrix H k is updated by
and the new point for the next iteration is given by
The first-order partial derivative of L with respect to each X k is calculated numerically through backward difference approximation
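For reference, the textbook BFGS relations that match the description above are sketched below. The symbols $s_k$, $y_k$, $\rho_k$, the step length $\alpha_k$, the perturbation $h$, and the unit vector $e_i$ are assumed notation, and the sign of the step assumes that $L$ is being minimized; the limited-memory variant applies the same update using only the most recent $(s_k, y_k)$ pairs instead of storing $H_k$ explicitly.

```latex
\begin{aligned}
s_k &= X_{k+1} - X_k, \qquad y_k = \nabla L_{k+1} - \nabla L_k, \qquad \rho_k = \frac{1}{y_k^{T} s_k} \\
H_{k+1} &= \left(I - \rho_k s_k y_k^{T}\right) H_k \left(I - \rho_k y_k s_k^{T}\right) + \rho_k s_k s_k^{T} \\
X_{k+1} &= X_k - \alpha_k H_k \nabla L_k \\
\frac{\partial L}{\partial x_i}\bigg|_{X_k} &\approx \frac{L(X_k) - L(X_k - h\, e_i)}{h}
\end{aligned}
```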
The gradient information is important because it describes the variation trend in the neighborhood of $X_k$ . Two main features of MGO can be identified. First, it provides diversification, which overcomes local optimality. The evenly distributed start points trigger design explorations in different regions. Compared with searching solely from the scaled source design, MGO increases the chance of obtaining better designs. Second, it converges quickly. A gradient-based optimization algorithm is applied, which guides the searches along the steepest increasing/decreasing direction toward the optimum.
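A backward-difference gradient of a black-box objective can be sketched as follows. The function name and the default perturbation `h` are assumptions; in the migration flow each call to `func` would correspond to one circuit simulation, so one gradient costs one evaluation per variable plus one at the current point.

```python
def backward_difference_gradient(func, x, h=1e-6):
    """Approximate dL/dx_i as (L(x) - L(x - h*e_i)) / h for every i."""
    base = func(x)  # one evaluation at the current point
    gradient = []
    for i in range(len(x)):
        shifted = list(x)
        shifted[i] -= h  # step backward along coordinate i only
        gradient.append((base - func(shifted)) / h)
    return gradient


# Toy objective with known gradient (2*x0, 3).
demo_grad = backward_difference_gradient(
    lambda v: v[0] ** 2 + 3.0 * v[1], [1.0, 2.0]
)
```

The approximation error is of first order in `h`, which is why the toy result is close to, but not exactly, the analytic gradient.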
B. Schematic Migration Refinement
Note that Step 1 trades accuracy for efficiency by using behavioral models. In Step 2, the accuracy is recovered by running another optimization round to further improve the result obtained from Step 1.
As a result, another NLP problem is formulated. Table II compares two NLP problems in Steps 1 and 2. The major difference is that Step 1 performs a block-level optimization in system-level connection, while Step 2 is a system-level optimization. In Step 1, the optimization objective and variables are from the block under optimization. Both block and system-level specs are used as constraints for both steps. For example, if the VCO block is under migration, the NLP problem in Step 1 can be configured as: 1) optimization objective: phase noise of VCO; 2) variables: device dimensions of VCO; 3) constraints: block-level specs, such as VCO gain, transistor operation range; system-level specs, e.g., the overall phase noise, power consumption, and locking time. On the other hand, in Step 2, system-level spec is optimized subject to the constraints defined by both local and global design requirements. The variables are chosen over the whole system rather than a single block. Similarly, the NLP can be created in Step 2 as: 1) optimization objective: phase noise of CPPLL; 2) variables: device dimensions of VCO, CP, and LP; 3) constraints: block-level specs, e.g., bandwidth of LP, VCO gain, transistor operation range; and system-level specs, like the overall phase noise, power consumption, and locking time. For the optimization algorithm, Step 1 employs MGO and Step 2 can apply any suitable optimization algorithm. We choose a local optimization algorithm, like L-BFGS algorithm, to search from the initial point obtained from
Step 1. The stopping criteria are the same for both Steps 1 and 2 NLP problems.
The initial point plays a critical role in optimization. A good initial point greatly assists quick convergence and helps locate the global optimum. Because MGO is first applied to every block in Step 1, it generates a good initial point for the system-level optimization of Step 2.
C. PVT Aware Evaluation
After the hierarchical optimization-based schematic migration, an optimal parameter set is obtained. However, several aspects still have to be considered. Its robustness with respect to PVT variations has to be verified. The circuit characteristics are usually monotonic with respect to voltage and temperature (VT) variations. Their worst-case performance under VT variations can therefore be evaluated at the extreme VT conditions. For process variations, the Monte Carlo (MC) method remains the gold standard, although many non-MC methods are available in the literature.
In this evaluation step, MC method using Latin hypercube sampling is applied to evaluate the circuit performance robustness accounting for process variations. Latin hypercube sampling requires one-fourth of the number of simulations needed by traditional MC method [44] . The required number of traditional MC samplings can be obtained [45] 
where $Y$ is the actual yield value, $\bar{Y}$ is the estimated yield value, and $C_\sigma$ is the confidence level. For a 99.7% confidence level ( $C_\sigma = 3$ ), an error $|Y - \bar{Y}| = 1\%$ , and a yield of 90%, $N_{MC}$ is 8100. To achieve the same accuracy, the number of Latin hypercube samples is 2025. After PVT evaluation, if the circuit characteristics fail to meet the specs, the performance deviation information is fed back to Step 1 to modify the design constraints, and a new migration iteration starts.
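The quoted sample counts can be reproduced with the common binomial-confidence estimate $N_{MC} = C_\sigma^2\, Y (1 - Y) / (Y - \bar{Y})^2$. That this is the exact expression of [45] is an assumption; it is used here because it yields the numbers stated above. The function names are hypothetical.

```python
import math


def mc_sample_count(yield_value, error, c_sigma):
    """Samples needed so the yield estimate is within `error` at c_sigma.

    Uses N = c_sigma^2 * Y * (1 - Y) / error^2 (assumed form). The result
    is rounded before ceiling to suppress floating-point noise.
    """
    n = c_sigma ** 2 * yield_value * (1.0 - yield_value) / error ** 2
    return math.ceil(round(n, 6))


def lhs_sample_count(n_mc):
    """Latin hypercube sampling at one-fourth of the MC count, as stated."""
    return math.ceil(n_mc / 4)


n_mc = mc_sample_count(yield_value=0.90, error=0.01, c_sigma=3)
n_lhs = lhs_sample_count(n_mc)
```

Because the count scales with the inverse square of the error, halving the allowed error to 0.5% would quadruple both numbers.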
To avoid too many iterations between Steps 1 and 3, Step 2 is modified into a process variation-aware optimization process. The condition that all the specs are met at the process corners is used to further constrain the optimization procedure in Step 2. Note that the process corners correspond to the ±3σ values of the joint probability density function of the process parameters. Corner analysis is therefore prone to be pessimistic, which leads to overdesign. A more accurate but time-consuming alternative is to include the MC method in the optimization process.
D. Parasitic Aware Evaluation
In addition to process variations, layout parasitics also have significant impact on circuit performance. Parasitics arise from transistor source/drain capacitance, resistance, and capacitance in the interconnect wires [33] . Therefore, in our proposed migration flow, we also evaluate the parasitic effect.
In this step, the layout is developed and the parasitic effect is explored using a commercial layout parasitic extraction tool [46] . Then, postlayout evaluations are performed to verify whether the circuit performance meets the specs. If the parasitic evaluation fails, the extracted parasitic resistors and capacitors are reused to constrain the next iteration in Step 2. In the next iteration, the computed parasitics are added to the netlist and carried through the whole optimization procedure.
IV. EXPERIMENTAL RESULTS
A. Experiment Settings
The proposed methodology for design migration is validated using the CPPLL shown in Fig. 2 . The CPPLL is used in an NFC system [30] , which was originally implemented in UMC 130-nm CMOS technology with a 1.2-V $V_{dd}$ . This CPPLL provides a 32-MHz output frequency. It is first migrated to UMC 65-nm and then to IBM 65-nm CMOS technology with a 1.1-V $V_{dd}$ .
The circuit characteristics of source design are listed in the second column of Table IV. They are set as the specs for target design. The migration for PFD and FD is based on digital library cells in target technology. The main migration task focuses on CP, LP, and VCO blocks.
The MGO algorithm is implemented in C/C++ based on the SOBOL library [38] for generating start points and the ALGLIB library for L-BFGS [39] . All the circuits are simulated using HSPICERF. All experiments were run on a PC with 12-GB RAM and the Linux operating system.
B. Efficiency of MGO
In this experiment, the CP, LP, and VCO are represented in TL. The MGO algorithm is performed to retarget the CPPLL from UMC 130-nm to 65-nm CMOS technology. The phase noise of the CPPLL is set as the optimization objective. It is computed through the built-in harmonic balance and phase noise analyses of HSPICERF. The considered design parameters include every transistor dimension (length, width, number of fingers, and multiplier) of the CP and VCO, the biasing current, and the dimensions (width, length, and multiplier) of $C_1$ , $C_2$ , and $R$ of the LP. The variables are bounded by the minimum and maximum values allowed in the target technology. Both $C_1$ and $C_2$ are bounded to less than 15 pF to save area.
We have compared MGO algorithm with genetic algorithm (GA) [40] , pattern search (PS) [41] , difference evolution (DE) [42] , and MC in terms of computation cost. GA, PS, and DE are all famous optimization methods. They do not require gradient information of the objective function. The MATLAB built-in GA and PS algorithms are used, while the implementation of DE in [43] is utilized. GA randomly chooses a set of initial points from the parameter space and updates them through many iterations. In sake of fairness, we start all the algorithms from the same start point.
The compared results are shown in Fig. 14 in terms of evaluation value F eval with respect to simulation runs. Each visited parameter vector is used to calculate the evaluation value F eval . The result obtained from MC space is used as the ground truth, which uniformly samples 2000 points in parameter space. The pink dash line shows the result of MC that does not converge. Three stopping criteria are utilized for other approaches: the design specs and constraints are satisfied, the maximum number of function evaluation runs (2000 runs in this case) is met and the norm change of two consecutive searched parameter sets or change of objective function is less than the predefined tolerance (e.g., 10 −6 ). GA evolves 40 generations with population size of 50. Other settings are using default values. GA converges to a local optimum, as shown in red line with o mark. MGO (green dotted line), DE (blue line with * mark), and PS (black line) converge to local optima. From both Fig. 14 of GA, and it shows very similar performance as GA. PS has inferior performance compared with other considered optimization methods. Note that due to the heuristic property of all the methods, the recorded numbers are the mean value from 10 runs of each algorithm. The memory usage of all considered optimization algorithms are compared and tabulated in the last column of Table III .
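The three stopping criteria listed above can be expressed compactly as a single check. The following is a minimal sketch under stated assumptions: the function name `stop_reason`, its argument names, and the use of a Euclidean norm for the parameter change are illustrative choices, not details given in the text.

```python
import math
from typing import Optional, Sequence


def stop_reason(
    specs_met: bool,
    n_evals: int,
    x_prev: Sequence[float],
    x_curr: Sequence[float],
    f_prev: float,
    f_curr: float,
    max_evals: int = 2000,   # evaluation budget used in this experiment
    tol: float = 1e-6,       # example tolerance quoted in the text
) -> Optional[str]:
    """Return which criterion fired, or None if the search should continue."""
    if specs_met:                                  # 1) specs and constraints satisfied
        return "specs_met"
    if n_evals >= max_evals:                       # 2) evaluation budget exhausted
        return "max_evals"
    step_norm = math.sqrt(sum((a - b) ** 2 for a, b in zip(x_curr, x_prev)))
    if step_norm < tol or abs(f_curr - f_prev) < tol:  # 3) search has stalled
        return "tolerance"
    return None


# The search continues while the step and the objective are still changing.
print(stop_reason(False, 120, [1.0, 2.0], [1.1, 2.0], -95.0, -96.5))  # None
```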
C. Validation of the Proposed Migration Methodology
The resulting design has performance comparable or even superior to that of the source design. Table IV tabulates the overall performance of the CPPLL in the source technology and in the two target technologies.
We have compared the runtime with and without Step 1 in Fig. 15. Note that the recorded runtime includes only Step 1 and Step 2, since we assume that the PVT and parasitic evaluations take the same amount of time in both cases. For porting to UMC65, only one round of the migration process is conducted.
Step 1 takes 1.2 h and Step 2 takes 3.2 h, for a total of 4.4 h. However, if MGO is applied directly to migrate the TL CPPLL rather than performing Step 1 first, 6.2 h are required to achieve the same result.
Porting to IBM65 takes almost the same time (1.2 h in Step 1 and 3.1 h in Step 2) as porting to UMC65. However, after the parasitic evaluation of Step 4, a 17% degradation is observed in the highest frequency that the VCO can achieve. We therefore proceed to iteration 2. In Step 1, we tighten the spec on the highest VCO frequency by 17%. In Step 2, the parasitic information is added to the original netlist and evaluated during the whole optimization process. Since the parasitic-aware netlist contains many more components, it takes more SPICE time to solve. As a result, a total of 13.2 h and 18.8 h are needed with and without Step 1, respectively.
Step 1 reduces the complexity of design exploration for Step 2 and generates a good initial point for it.
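The relative saving implied by the runtimes reported above can be checked with a few lines of arithmetic. The helper name `saving_percent` is an illustrative assumption; the hour figures are those quoted in the text.

```python
def saving_percent(with_step1_h: float, without_step1_h: float) -> float:
    """Runtime saved by running Step 1 first, as a percentage of the direct run."""
    return 100.0 * (without_step1_h - with_step1_h) / without_step1_h


umc65_total_h = 1.2 + 3.2          # Step 1 + Step 2, single round
umc65_saving = saving_percent(umc65_total_h, 6.2)
ibm65_saving = saving_percent(13.2, 18.8)   # two rounds, parasitic-aware netlist

print(round(umc65_saving, 1))  # 29.0
print(round(ibm65_saving, 1))  # 29.8
```

In both ports, running Step 1 first saves roughly 29% to 30% of the Step 1 plus Step 2 runtime.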
For robustness analysis accounting for PVT variation, we draw 2025 random samples in the parameter space using the Latin hypercube technique. The statistical distribution of the phase noise is shown in Fig. 16(a) and (b). In Fig. 16(a), Step 2 does not consider the process corner information, which results in a 9.72-dBc/Hz deviation in phase noise at 600-kHz offset frequency. On the other hand, when the process corners are taken into consideration in the Step 2 optimization flow, the deviation is significantly reduced.
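For readers unfamiliar with the Latin hypercube technique used above, the following minimal sketch shows the idea: each axis is split into as many equal strata as there are samples, and every stratum is hit exactly once per axis. The function name `latin_hypercube` and the PVT ranges in `pvt_bounds` are hypothetical placeholders; the text does not state the actual sampled variables or their ranges.

```python
import random
from typing import List, Sequence, Tuple


def latin_hypercube(
    n_samples: int,
    bounds: Sequence[Tuple[float, float]],
    seed: int = 0,
) -> List[List[float]]:
    """Draw n_samples points so that each axis stratum contains one sample."""
    rng = random.Random(seed)
    columns = []
    for lo, hi in bounds:
        strata = list(range(n_samples))
        rng.shuffle(strata)                      # independent permutation per axis
        columns.append([
            lo + (k + rng.random()) / n_samples * (hi - lo)  # jitter inside stratum k
            for k in strata
        ])
    return [list(point) for point in zip(*columns)]


# Assumed, illustrative PVT ranges: process skew (sigma), supply (V), temperature (C).
pvt_bounds = [(-3.0, 3.0), (0.99, 1.21), (-40.0, 125.0)]
samples = latin_hypercube(2025, pvt_bounds)
print(len(samples))  # 2025
```

Each sample would then be passed to the simulator to obtain one phase-noise value for the histogram.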
V. CONCLUSION
A hierarchical optimization-simulation loop-based methodology is proposed for automating the process migration of mixed-signal systems. The methodology can retarget a mixed-signal design in a reasonable number of iterations, while giving the same or better performance. The migration time cost is dramatically reduced by employing behavioral models and MGO in the first round of optimization. A second round of local optimization follows for better performance. To obtain a robust result, the process corners are considered in the optimization procedure, and the parasitic effects are also considered to obtain a parasitic closure design. The proposed methodology has been shown to be useful and efficient for technology migration and design reuse through a concrete CPPLL system example. This general methodology can be extended to other mixed-signal systems.
By making good use of modern multiprocessor computer architectures or distributed workstation networks, the inherent parallelism of MGO can be exploited to further speed up the proposed technology migration approach for mixed-signal circuits.
