3-D CMOS Circuits Based on Low-Loss Vertical Interconnects on Parylene-N
Several techniques have been used to overcome the high losses of passive components, including the use of a high-resistivity silicon substrate instead of a CMOS-grade substrate, 3-D out-of-plane inductors and transformers [4], [5], and elevation of inductors and transmission lines over the CMOS-grade substrate through substrate removal or through a low-loss, low-dielectric-constant layer [6]-[11]. Utilizing a high-resistivity silicon substrate, removing the substrate, or adopting out-of-plane passive structures reduces or eliminates the eddy currents in the substrate, and thus the dielectric loss; however, these approaches are not compatible with standard CMOS or BiCMOS processes. Applying a dielectric layer to elevate transmission lines away from the low-resistivity Si substrate reduces the substrate interactions with the transmission lines when a relatively thick dielectric layer is used (typically 10 μm) [8]. Polyimide, benzocyclobutene (BCB)-based polymers, and SU-8 are used as dielectric layers [10]-[12]. Processes based on these dielectrics are expensive, require a high processing/curing temperature, or are characterized by relatively high dielectric loss. In this study, we have used a thick (15 μm) parylene-N layer with a frequency-independent dielectric constant of 2.35-2.4 and a very low loss tangent of 6 × 10⁻⁴ up to 60 GHz, which deposits conformally using a simple process at room temperature. The main drawback of parylene-N is its large thermal mismatch to Si (thermal expansion coefficient of 69 ppm/°C versus 3.2 ppm/°C for Si), which complicates its application to large-area circuits operating at high temperatures, such as high-power electronics.
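To give a feel for the thermal mismatch quoted above, the following minimal sketch estimates the unconstrained differential strain between parylene-N and Si. The expansion coefficients are taken from the text; the temperature rise of 100 °C and the 12-mm length are illustrative assumptions, and the function names are hypothetical.

```python
# Thermal expansion coefficients quoted in the text (per degree C).
ALPHA_PARYLENE_N = 69e-6
ALPHA_SI = 3.2e-6


def mismatch_strain(delta_t_c):
    """Unconstrained differential strain for a temperature rise delta_t_c."""
    return (ALPHA_PARYLENE_N - ALPHA_SI) * delta_t_c


def differential_expansion_um(length_mm, delta_t_c):
    """Differential free expansion, in micrometers, over a given length."""
    return mismatch_strain(delta_t_c) * length_mm * 1e3


if __name__ == "__main__":
    # Assumed operating point: 100 C rise over a 12-mm-long structure.
    print(f"strain: {mismatch_strain(100.0):.4%}")
    print(f"expansion: {differential_expansion_um(12.0, 100.0):.1f} um")
```

In a real film the parylene is constrained by the substrate, so this figure is better read as a source of stress than as an actual displacement.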
A 3-D narrowband amplifier using parylene-N was previously implemented and experimentally characterized [13] . It was shown that it is necessary to account for various parasitic effects in a 3-D design environment to accurately simulate a true 3-D circuit. In this paper, we have demonstrated how the building blocks of this amplifier are implemented in such a 3-D design space. This is done through design, fabrication, and characterization of coplanar waveguide (CPW) transmission lines, 3-D vertical interconnects, and CPW-based discontinuities such as L-, T-, and U-shaped structures on a thick parylene-N layer using a CMOS compatible fabrication process.
II. FABRICATION

A. Vertical Interconnects and Transitions
A low-resistivity silicon wafer with a bulk resistivity of 10-20 Ω·cm is coated with a 5-μm-thick thermal silicon dioxide. The bottom metal layer (Metal 1) is a 3.5-μm-thick aluminum layer formed by evaporation and lift-off processes. This step forms the bottom-layer CPW lines, the thru-reflect-line (TRL) calibration standards (open, thru, and delay lines), and the underpasses required for ground equalization of the top-metal transmission lines. A set of measurements is performed at this stage to characterize the CPW lines fabricated using Metal 1 [see Fig. 1(a)]. Before parylene-N deposition, the samples are soaked in an adhesion promoter solution (2-propanol : DI water : silane (A-174) = 100 mL : 100 mL : 1 mL) and air dried. A 15-μm-thick parylene-N layer is then deposited using a chemical vapor deposition (CVD) process at room temperature. Details of the parylene-N deposition process are discussed in [14]. Vias are etched in a reactive ion etching (RIE) chamber with O2 plasma at 150 mTorr for about 45 min to etch completely through the parylene-N layer [15]. These vias are used to make contact to the phase-equalizing underpasses or to form vertical transitions. For best step coverage of the top metallization (Metal 2), a seed layer of titanium/gold is sputtered and then 3.5 μm of gold is electroplated in an Orotemp gold electroplating solution. Finally, the photoresist is removed and the seed layer is etched away [see Fig. 1(b) and (c)].
B. 3-D Low-Noise Amplifier (LNA)
This design is post-fabricated on a prefabricated CMOS chip with dimensions of 2 mm × 12 mm. Details of the fabrication process are provided in [13]. Since the patterns to be post-fabricated on this chip extend all the way to the edges, and the chip is relatively small in one dimension, a carrier wafer is used to embed the chip using the self-aligned wafer-level integration technology discussed in [16]. With this technique, handling of the chip becomes easier, while the accuracy of the lithography is preserved due to a uniform photoresist thickness across the entire chip. With wafer-level processing, this step may be omitted. The CMOS chip is embedded in a low-resistivity Si substrate using polydimethylsiloxane (PDMS). To form decoupling metal-insulator-metal (MIM) capacitors, the parylene-N is partially etched (about 14 μm) in RIE to leave a 1-μm dielectric layer [17]. Through another lithography step, the parylene-N is etched all the way through to form vias for interconnection to the bottom metal layer. Sputtering a thin layer of Ti/Au, followed by a lift-off process and electroplating of 3 μm of gold to form the top metallization (Metal 2), completes the process.
III. CPW TRANSMISSION LINE COMPONENTS
A. Design
Among the various transmission lines that can be implemented on a silicon substrate, CPW lines have the advantage of simple one-metal-layer fabrication, since the signal and ground metallizations are implemented in the same metal layer. Additionally, the impedance of CPW lines is not sensitive to variations in the substrate and dielectric layer thicknesses, but rather depends on the dimensions of the CPW line metallization. Moreover, CPW lines provide a good short circuit with much lower parasitic inductance than microstrip lines or slot-lines, which require vias to create a short circuit. For these reasons, a CPW architecture is employed to achieve ultra-low-loss lines, vertical transitions, and 3-D circuits based on the low-loss, low-dielectric-constant parylene-N material on top of a CMOS-grade silicon substrate.
To obtain the various designs investigated in this study, two metal layers are used. The first metal layer, Metal 1, resembles the top Al metallization in a typical RF CMOS technology. Metal 1 has a thickness of 3.5 μm and is fabricated on a low-resistivity silicon substrate (10 Ω·cm) coated with a 5-μm-thick SiO2 layer. The second metal layer, Metal 2, is made of Au with a thickness of 3.5 μm and is fabricated on top of the 15-μm-thick parylene-N layer that is deposited on the Si/SiO2 substrate. Based on simulations with Ansoft Technologies' High Frequency Structure Simulator (HFSS), a 50-Ω CPW line on the Si/SiO2 substrate using the bottom metal (Metal 1) has signal line-gap-ground line dimensions of 45, 54, and 150 μm, respectively. A 50-Ω CPW line on parylene-N using the top metal (Metal 2) has signal line-gap-ground line dimensions of 90, 30, and 300 μm. Fig. 1(a) and (b) demonstrates these two architectures. For a fair comparison, of both lines are chosen to be the same. In order to measure these lines, two different sets of TRL calibration standards are designed and fabricated separately, one in each of the respective metal layers.
A vertical interconnect between the bottom-layer 50-Ω line (Metal 1) and the top-layer 50-Ω line (Metal 2) is also designed and simulated using HFSS. To access the CPW ports for measurement purposes, back-to-back transitions are designed so that the CPW contacts are available on the top metal (Metal 2), as depicted in Fig. 1(c), where the substrates are not shown for clarity. Metal 1 and Metal 2 partially overlap around the transition and introduce a local parasitic capacitance that cancels the inductive effect of the vias connecting the signal and ground traces. By choosing proper dimensions and tapering of the two metals so that the parasitic capacitance and inductance cancel, as shown in Fig. 1(c), one can reduce reflections and maintain a smooth transition for wideband applications [17], [18].
Routing CPW transmission lines for circuit implementation on a planar substrate requires asymmetric CPW structures such as L- and U-shaped turns and T-junctions. Bending a transmission line results in extra parasitic components due to the introduced discontinuities. A simple L-shaped CPW bend is depicted in Fig. 2(a). The dashed lines represent the propagation paths of the slot-line mode of the CPW line. As seen from the figure, the signal on the inner ground conductor has a shorter path to travel than the signal on the outer conductor. This imbalance degrades the insertion loss and return loss by distorting the phase of the wavefront. In order to minimize this imbalance and the radiating slot mode caused by the different path lengths traversed by the magnetic current wave, bond-wires [19], dielectric slabs [22], or continuous top and bottom shielding [23] are traditionally used to achieve ground equalization. These techniques add local parasitic capacitance, inductance, and resistance, which should be included to obtain an accurate design model. A study performed by Dib et al. revealed that the air-bridge approach yields a better electrical performance prediction than bond-wires because of its relatively small parasitic components, even though the air-bridge process is more complex and has lower yield [19]. Although all of these approaches minimize the discontinuity and help to balance the traveling wave on the inner and outer slots, they have disadvantages in terms of compatibility with the CMOS process, fabrication cost, process complexity, and fabrication yield. In this study, underpasses are chosen to eliminate the slot-line modes of the CPW discontinuities because of their ease of fabrication and high yield. Underpasses are easily realized using Metal 1 to balance the phases of the transmission lines fabricated using Metal 2.
Optimization for minimum insertion loss and return loss over a maximum bandwidth is performed using HFSS to design L-, T-, and U-shaped CPW components. In the designed bent CPW lines, underpasses are placed before and after each bend. The underpass width is 10 μm, and it connects to via posts with dimensions of 90 μm × 90 μm that enclose vias with dimensions of 70 μm × 70 μm. The width of the underpass is relatively small, and given the 15-μm height of the dielectric layer (parylene-N), it shows a negligible parasitic capacitance between the underpass and the conductors of the CPW line (less than 0.1 fF). Therefore, no local narrowing of the signal line is needed. Simulation results are verified through comparison with measurements of these components. In order to perform the measurements, a back-to-back bend architecture with four underpasses is used, which avoids the need to reposition the probes at a right angle with respect to each other. This design uses the same CPW signal-gap-ground dimensions as the original CPW on Metal 2 (90, 30, and 300 μm).
B. Measurement and Analysis
Measurements are performed using an on-wafer probing technique with 150-μm-pitch ground-signal-ground (G-S-G) (CPW-type) probes. An Agilent 8722 vector network analyzer (VNA) is calibrated from 1 to 40 GHz with two sets of TRL calibration kits, one for each metal layer, using three different delay lines. Measurements are first performed on the CPW lines on each metal layer to verify the accuracy of the designs and their agreement with the simulation results [17]. The insertion loss has three contributions: substrate loss (low-resistivity substrate), dielectric loss (parylene-N in this case), and conductor loss. In order to show the significance of each component in the insertion loss, multiple simulations are performed, where in each case all, a few, or only one of the loss contributors is present. In all these analyses, a fixed geometry is used. The simulations show that there is a 0.25-dB reduction in insertion loss at 40 GHz when the substrate resistivity is increased from 10 to 20 Ω·cm, and an additional 0.35-dB reduction when the substrate is not doped at all. Comparing this result with measurement shows some discrepancy. Hence, in simulations, a modified substrate is defined to match the measured response: the substrate has a bulk resistivity of 20 Ω·cm everywhere except in a 5-μm-thick region (equal to the thickness of the oxide) directly underneath the oxide, where the resistivity is set to 40 Ω·cm. This accounts for the fact that, while a 5-μm-thick oxide is grown on top of a low-resistivity Si substrate (doped with boron), a thin segregated region forms just beneath the oxide due to the high segregation coefficient of boron at the oxide/silicon interface [24].
Measurement results of the back-to-back 3-D vertical interconnects [architecture shown in Fig. 1(c)] show an extremely low insertion loss of less than 0.013 dB per vertical transition for frequencies up to 40 GHz [17]. Fig. 4(b) depicts the measured and simulated response of the two back-to-back L-shaped bends shown in Fig. 4(a). The total length of the back-to-back bent design is 2440 μm (2.44 mm). The distance between the two bent sections is 1.25 mm, while each underpass is located 120 μm away from the bend, as depicted in Fig. 4(a). The measured total insertion loss of better than 1.3 dB up to 40 GHz is an indication of proper balancing of the signal paths. Fig. 4(b) also shows the simulated response of an identical structure without ground-balancing underpasses, which exhibits higher insertion loss and degraded return loss.
Several U-shaped designs that use back-to-back L-shaped bends are also designed and fabricated, as depicted in Fig. 5(a). All these designs include four back-to-back 90° bends and use eight equalizing underpasses, but they have slightly different geometrical parameters. Each design is based on the same 50-Ω CPW architecture with signal line-gap-ground line dimensions of 90, 30, and 300 μm, respectively, on the top metal (Metal 2). Table II summarizes the geometrical characteristics and performance of these designs. As the total length of the CPW section is varied, the insertion loss scales accordingly. Fig. 5(b) and (c) shows the measured insertion phase, insertion loss, and reflection coefficient of all four designs.
Comparison of the measured phase of these designs with that of a 1-mm straight CPW line shows that a relatively small phase lag is generated when the lines are bent, which is expected due to the parasitics that the bend introduces on the signal path. Comparing the reflection coefficients of Designs 3 and 4 shows an additional valley around 21 GHz in Design 3. In Design 4, the ground is expanded, and hence, it creates an additional path for the traveling waves between points "A" and "B," as denoted in Fig. 6(b). In Design 3, the only path between points "A" and "B" is the coupling through the substrate, which is negligible in the presence of the parylene-N layer. This reduces the bandwidth of Design 4 compared to that of Design 3. The different loss components of these two designs are compared in Fig. 6(a) and prove to be very similar. The calculated "Loss," as in (1), represents the radiation and resistive losses in the system.
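The "Loss" quantity described above can be illustrated with a short sketch. It assumes the usual two-port power balance, Loss = 1 - |S11|^2 - |S21|^2, for a matched measurement; this definition is an assumption made for illustration, and the function names and sample values are hypothetical.

```python
def db_to_power_ratio(value_db):
    """Convert a magnitude in dB (for example |S21| in dB) to a power ratio."""
    return 10.0 ** (value_db / 10.0)


def dissipated_fraction(s11_db, s21_db):
    """Fraction of incident power neither reflected nor transmitted.

    Assumed definition: 1 - |S11|^2 - |S21|^2 (radiation plus resistive loss).
    """
    return 1.0 - db_to_power_ratio(s11_db) - db_to_power_ratio(s21_db)


if __name__ == "__main__":
    # Illustrative values only: -20 dB reflection, -1.3 dB transmission.
    loss = dissipated_fraction(-20.0, -1.3)
    print(f"lost power fraction: {loss:.3f}")
```

Because the quantity is a power fraction, two designs with similar curves dissipate or radiate a similar share of the incident power, regardless of how that power splits between reflection and transmission.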
Electromagnetic simulations in HFSS are carried out to calculate the magnetic current densities on both of these designs, as shown in Fig. 6(b). Even though the current density fades slightly in the midsection of the ground plane of Design 4, its magnitude remains comparable to that at the edges of the truncated ground plane (Design 3), and hence, is not completely negligible. This phenomenon is important when bent CPW lines are laid out as part of distributed amplifiers.
Furthermore, T-junctions using underpasses are designed, fabricated, and measured. Fig. 7(a) shows a photomicrograph and the dimensions of this design with three underpasses. As this configuration has three ports, for measurement purposes Port 3 is left open and the performance is measured by connecting Ports 1 and 2 to the two ports of the VNA. Leaving Port 3 open creates an open stub, and hence, presents a frequency-dependent impedance at the junction. For a lossless transmission line, the impedance measured at a distance $l$ away from the open circuit is given by

$$Z_{in}(l) = -jZ_0\cot\left(\frac{2\pi f\sqrt{\varepsilon_r}}{c}\,l\right) \qquad (2)$$

where $Z_0$ is the characteristic impedance of the line, $f$ is the frequency, $c$ represents the speed of light, and $\varepsilon_r$ is the relative dielectric constant of the substrate. Fig. 7(b) illustrates a simplified model for the three-port network. At the frequency where the length of the stub (1.05 mm) is equal to a quarter of the wavelength (38.4 GHz, in this case), the impedance seen at the junction will be a short circuit [see Fig. 7(c)]. Hence, the load seen from Port 1 is an open circuit and all the power is reflected. Fig. 7(d) shows good agreement between the measured and simulated S-parameter response of the T-junction.
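The open-stub behavior described above can be checked numerically. The sketch below assumes the standard lossless open-circuited line relation Z_in = -j Z0 cot(beta l) with beta = 2 pi f sqrt(eps) / c, and it infers an effective permittivity from the quoted stub length (1.05 mm) and quarter-wave frequency (38.4 GHz). The inferred permittivity and the function names are assumptions for illustration, not values reported in the text.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s


def open_stub_impedance(freq_hz, length_m, eps_r, z0=50.0):
    """Input impedance of a lossless open-circuited stub: -j*Z0*cot(beta*l)."""
    beta = 2.0 * math.pi * freq_hz * math.sqrt(eps_r) / C0
    return -1j * z0 / math.tan(beta * length_m)


def quarter_wave_frequency(length_m, eps_r):
    """Frequency at which the stub is a quarter wavelength long."""
    return C0 / (4.0 * length_m * math.sqrt(eps_r))


def inferred_permittivity(length_m, f_quarter_hz):
    """Permittivity implied by a known quarter-wave frequency and length."""
    return (C0 / (4.0 * length_m * f_quarter_hz)) ** 2


if __name__ == "__main__":
    eps = inferred_permittivity(1.05e-3, 38.4e9)
    print(f"inferred permittivity: {eps:.2f}")
    print(f"|Z_in| at 38.4 GHz: {abs(open_stub_impedance(38.4e9, 1.05e-3, eps)):.2e} ohm")
    print(f"Z_in at 1 GHz: {open_stub_impedance(1e9, 1.05e-3, eps):.1f} ohm")
```

Well below the quarter-wave frequency the stub looks like a small shunt capacitance (large negative reactance), and the reactance falls toward zero as the frequency approaches 38.4 GHz.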
IV. 3-D CIRCUIT DESIGN
Vertical transitions and CPW discontinuities can be utilized in the design of 3-D distributed circuits. Simulations of undesired coupling and crosstalk effects between the two U-shapes shown in Fig. 8(a) were performed to investigate the importance of layout design in a 3-D circuit. Originally, the two U-shapes are located exactly on top of each other. The dimensions of the two structures are slightly different because each CPW structure corresponds to a 50-Ω characteristic impedance on its respective substrate. The top U-shape can be displaced laterally from its original position, as indicated in Fig. 8(a). Fig. 8(b) and (c) shows the insertion loss, return loss, and coupling between ports obtained by HFSS simulations, where Ports 1-3 are designated in Fig. 8(a). The following three different cases are investigated.
Case 1) There is no bottom metallization underneath the top U-shape design.
Case 2) Both U-shapes are stacked on top of each other.
Case 3) The top U-shaped design is shifted laterally by 500 μm with respect to the bottom U-shaped design.
For simplicity of analysis, the underpasses are not present. Case 1) represents the nominal response, with an insertion loss of about 4.8 dB and a return loss of better than 17.5 dB up to 40 GHz. Adding the bottom metallization in Case 2) creates extra parasitic capacitances unique to this configuration, which to some extent improves the insertion loss compared to Case 1), by 2.8 dB at 40 GHz. Since part of the energy is stored in this mode and not transferred to the output, the reflection loss increases. A more detailed analysis would be needed to fully explain this phenomenon, which is not the focus of this study. On the other hand, in Case 3), the response gets even worse due to the existence of parallel-plate modes, which increase the insertion loss significantly. Fig. 8(b) also depicts the simulated coupling between the top and bottom metal layers (Ports 1 and 3) for Cases 2) and 3). Due to the thick low-dielectric-constant layer (15 μm of parylene-N) between the layers, the coupling is relatively small, below -21.5 dB up to 40 GHz for Case 3). For Case 2), however, the coupling is higher by about 10 dB. It is, therefore, important to design the layout of multilayer CPW lines such that the lines are not exactly on top of each other.
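To put the quoted coupling levels in perspective, the sketch below converts them into the fraction of incident power that leaks to the other layer. It assumes the coupling values are negative dB quantities (about -21.5 dB for the shifted layout and roughly 10 dB higher for the stacked one); the sign and the helper name are assumptions.

```python
def coupled_power_fraction(coupling_db):
    """Fraction of incident power coupled to the other port for a level in dB."""
    return 10.0 ** (coupling_db / 10.0)


if __name__ == "__main__":
    shifted_db = -21.5               # Case 3), sign assumed
    stacked_db = shifted_db + 10.0   # Case 2), about 10 dB higher
    for name, level in (("shifted", shifted_db), ("stacked", stacked_db)):
        print(f"{name}: {level:.1f} dB -> {coupled_power_fraction(level):.2%}")
```

A 10-dB difference corresponds to a tenfold change in coupled power, which is why offsetting the stacked lines matters for crosstalk.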
A. LNA Design
An LNA with CMOS transistors is implemented based on the aforementioned 3-D design techniques [13]. For this demonstration, a chip with prefabricated cascode cells, along with other circuitry, built in 0.13-μm CMOS is utilized. Two cascode cells, each using two transistors with 60 fingers and an overall width of 120 μm, are used. Note that the circuit layout has to be adapted to the shape of, and distance between, the prefabricated cascode transistors.
A two-stage amplifier is designed based on a 50-Ω characteristic impedance for the CPW lines. Agilent Technologies' Advanced Design System (ADS) is used to optimize the response to obtain a stable amplifier with optimum noise, gain, and input matching performance. The pads connected to the cascode devices were initially designed for characterization of the devices with 150-μm-pitch probes and measure 107 μm × 135 μm. These pads introduce an extra parasitic capacitance of about 0.1 pF between the pad and the substrate (with a 5-μm-thick BEOL oxide).
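The pad capacitance quoted above is consistent with a simple parallel-plate estimate, sketched below. The pad dimensions and oxide thickness come from the text; the oxide relative permittivity of 3.9 (a typical value for SiO2) is an assumption, fringing fields are ignored, and the function name is hypothetical.

```python
EPS0 = 8.854e-12  # vacuum permittivity, F/m


def parallel_plate_capacitance(width_m, length_m, thickness_m, eps_r):
    """Parallel-plate capacitance, ignoring fringing fields."""
    return EPS0 * eps_r * width_m * length_m / thickness_m


if __name__ == "__main__":
    # 107 um x 135 um pad over a 5-um oxide; eps_r = 3.9 is assumed.
    c_pad = parallel_plate_capacitance(107e-6, 135e-6, 5e-6, 3.9)
    print(f"pad capacitance: {c_pad * 1e12:.3f} pF")
```

Under these assumptions the estimate comes out at roughly 0.1 pF, in line with the value used in the design.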
B. Parasitic Consideration, Measurement, and Analysis
In order to better characterize the effect of the layout on the overall performance of the circuit, the parasitic effects of 3-D components that do not exist in a traditional 2-D CMOS design must be considered. The 3-D parasitic effects that are considered here are as follows.
1) Parasitic capacitance to substrate due to the connection pads required for each active component (Fig. 9).
2) Parasitic inductance of the vertical interconnects through vias, as shown in Fig. 9(b).
3)-5) [...] (Fig. 9(a)).
6) Capacitive parasitic loading due to the parylene-N coating and continuous metallization over the active area of the chip, as shown in Fig. 9(b).
7) Capacitive and inductive effects of the underpasses, seen in Fig. 9(a), that are used for balancing the bends and T-junctions of the CPW transmission lines.
These parasitic effects are simulated using HFSS and then added to the circuit design software (ADS). For the HFSS simulation, the layout is broken into smaller pieces and each section is simulated individually. The effect of the top metallization of the CMOS chip is case dependent and unique to the design. In this case, due to the complexity of these simulations, the results could not be integrated with the other effects [mentioned above in 1)-7)]. However, based on these results, the ultimate measured response is predictable [25]. It is very important to perform this type of simulation to account for any unforeseen effects of such parasitic components. Fig. 10 summarizes these simulations and shows the partial role of each parasitic component in the overall performance of the circuit. Plot "A" shows the near-ideal response of the circuit if none of the 3-D parasitic effects mentioned above existed. The effects of the connection pads are included in this simulation, while ideal transmission lines are used in ADS for simulating this case. Next, the layout of the circuit is broken into smaller sections, including the underpasses and the vertical vias through parylene-N. These sections are simulated individually using HFSS. By putting all the components together in ADS, the overall effect can be seen from curve "B." These effects cause a 750-MHz downshift in frequency as well as 4.5 dB of additional loss in the overall gain of the amplifier relative to the ideal case.
Next, the effect of the parasitic capacitance due to the large metal plates is integrated into the response, which shifts the peak of the response down by another 250 MHz, while the gain is about 11.3 dB. On the other hand, the inductive dc-bias line and the parasitic capacitance between the top metal on parylene-N and the active area of the CMOS chip [shown in Fig. 9(b)] degrade the shape of the gain, as shown by curve "D" of Fig. 10. Coupling between the top metal of the CMOS chip and the post-fabricated metallization also causes an additional shift in the frequency response of the amplifier and brings the response closer to what has been measured, a peak gain of 13 dB at 1.96 GHz [25]. These sets of simulations provide an excellent prediction of the response of the 3-D circuit. Under the applied bias conditions, a forward gain of 13 dB at 1.96 GHz is measured. The total power dissipation is 37.4 mW. The 3-dB bandwidth of the amplifier extends from 1.7 to 2.2 GHz (500 MHz). A minimum noise figure (NF) of 3.3 dB at 2 GHz is measured, along with an input-referred 1-dB compression point of -7.3 dBm, which corresponds to an output 1-dB compression point of 4.6 dBm [13].
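The relation between the input- and output-referred compression points can be sanity-checked with the usual definition, in which the gain has dropped by 1 dB at compression. The sketch below assumes an input-referred value of -7.3 dBm (negative sign assumed) and the 13-dB peak gain; the function names are hypothetical.

```python
def output_p1db_dbm(input_p1db_dbm, small_signal_gain_db):
    """Output 1-dB compression point: input P1dB plus the gain, compressed by 1 dB."""
    return input_p1db_dbm + small_signal_gain_db - 1.0


def fractional_bandwidth(f_low_hz, f_high_hz):
    """3-dB bandwidth divided by the band center frequency."""
    center = 0.5 * (f_low_hz + f_high_hz)
    return (f_high_hz - f_low_hz) / center


if __name__ == "__main__":
    print(f"output P1dB: {output_p1db_dbm(-7.3, 13.0):.1f} dBm")
    print(f"fractional bandwidth: {fractional_bandwidth(1.7e9, 2.2e9):.1%}")
```

With the 13-dB peak gain this estimate gives about 4.7 dBm, within 0.1 dB of the quoted 4.6 dBm; the small difference would be consistent with a slightly lower gain at the 2-GHz test frequency.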
V. CONCLUSION
In this study, parylene-N is used as a dielectric material to elevate the coplanar transmission lines away from the lossy CMOS substrate. An improvement of about 70% in the insertion-loss performance of the CPW line is achieved. This technology is fully compatible with CMOS processes and the dielectric deposition occurs at room temperature. It is shown that by using a simple air-bridgeless technique the performance of the bent transmission lines used in distributed MMICs is improved. A novel 3-D circuit is demonstrated in this study. 3-D design challenges, measurements, and fabrication techniques are discussed through implementing a 3-D LNA with a measured gain of 13 dB at 2 GHz and a 3-dB bandwidth of 500 MHz. The implemented 3-D LNA with distributed two-stage cascode architecture shows an NF of 3.3 dB and output-referred 1-dB compression point of 4.6 dBm at 2 GHz. Applying 3-D design techniques shown here fully utilizes the high performance of CMOS transistors without degradation due to substrate losses and crosstalk. This also yields a smaller footprint for the active devices, and hence, reduces the overall size and even the cost of the system despite an additional post-fabrication technology.
