In this article, five two-stage ∼6-mW and four three-stage ∼9-mW matched amplifier architectures are proposed to establish an optimization procedure and to quantify the relative merits of cascode (CC), common-gate (CG), and common-source (CS) building blocks for low-voltage, low-power multi-stage front-ends. The circuits are simulated in a 90-nm CMOS technology, including models of layout parasitics. Integrated bias trees and passive port-matching networks are incorporated in the K-band designs. Under process mismatch, the noise and gain figures stay within 0.39 dB and 7.1 dB of their design values, respectively. The five combinations of building blocks in two-stage low-power (6.1-6.6 mW) amplifiers achieve linearity (IIP3) in the range of -13.5 to -5.2 dBm, good reverse isolation (better than -26 dB), noise figures of 2.89-3.82 dB, and 17.2-25.5 dB peak forward gain. For the three-stage front-ends built with CS, CC, and CG blocks (power rating 9.2-9.6 mW), the forward gain is >33 dB and the optimized noise figure is <3.26 dB. They achieve an IIP3 of -2.5 to 18.3 dBm, <-39 dB reverse isolation, and <-17 dB minimum input return-loss (IRL). The results are compared with reported simulated findings on CMOS multi-stage amplifiers to highlight their relative advantages in terms of power requirement and decibel(gain)-per-watt.
Introduction
The operating frequency of wireless on-chip receiver front-ends has been consistently pushing upward over the microwave range (≥ 3 GHz) during the last decade [1, 2] .
Initial literature on this subject has been dominated by technologies built with compound semiconductors and heterojunction structures such as PHEMT (high electron mobility transistors, GaAs/InSb/AlSb) and HBT (heterojunction bipolar transistors, SiGe/GaAs) [3-7]. These processes provide better gain and noise performance at RF frequencies, but at the expense of higher cost and fabrication difficulties [8, 9]. Alternatively, scaling of deep-submicron CMOS processes below the 0.35 µm channel length has led to reports of monolithic CMOS wireless transceiver architectures [10-23]. For a CMOS transceiver front-end, the low-noise low-power amplifier (LNA), which follows the antenna-filter section of an on-chip receiver, is a crucial performance-determining component [12, 15, 17]. To complete the design of this amplifier, a number of trade-offs between factors such as power limit, noise level, and forward gain have to be considered to optimize its microwave performance [13, 18, 22, 23]. A multi-stage amplifier topology can be employed when a high-gain low-NF front-end has to be supported [9, 10, 13, 24], and its stages may adopt a combination of common-gate (CG), cascode (CC), and common-source (CS) blocks. Investigating the relative merits of these building blocks is necessary to facilitate a judicious selection of unit CMOS stages for high-gain high-frequency front-end amplifiers. Ultimately, the overall performance of a multi-stage low-NF linearized amplifier will depend on careful process optimization, design trade-offs, and the selection of identical or diverse building blocks for the cascaded circuit. In this article, five two-stage amplifiers (6.1-6.6 mW, NF of 2.89-3.82 dB) and four three-stage amplifiers (9.2-9.6 mW, IIP3 of -2.5 to 18.3 dBm) are presented using a 90-nm CMOS process.

* E-mail: apratimroy45@gmail.com
Simulated outputs from the reported amplifiers are compared to quantify the role of individual CG, CS, and CC stages in the microwave performance of a multi-stage amplifier. Among the two-stage architectures, the CC-CC amplifier achieves the best peak-gain and port-isolation performance (25.5 dB and <-57 dB, respectively), and the CG-CS LNA offers the widest range of linearity (up to -5.2 dBm of input power, determined by IIP3). The noise contribution (minimum NF) of the three-stage topologies is similar to that of their two-stage counterparts within the bandwidth (2.92-3.44 dB), but the range of linearity is improved (reaching up to 18.3 dBm). Both sets of amplifiers operate in the K-band (19.6-25.5 GHz) with two domains of gain coverage (14-26 and 34-44 dB for two- and three-stage architectures, respectively). The results are used to establish selection criteria for cascode, common-gate, and common-source blocks in a multi-stage CMOS amplifier, depending on the design requirements of gain, reverse isolation, noise penalty, and port matching.
Proposed Architectures
The position of a low-NF low-power amplifier block in a typical microwave receiver front-end is shown in Figure 1 .
In this architecture, a post-antenna band-selection filter and a post-mixer low-pass filter contribute to the rejection of interference that accumulates as the signal passes through an on-chip channel. In contrast, the overall gain of the front-end is improved by a low-power amplifier and a radio-frequency correlator (mixer). The carrier recovery loop driving the VCO (voltage-controlled oscillator) may include divider, multiplexer, and prescaler blocks, and any filter integrated with the post-amplifier usually employs a number of cascaded stages (to enhance output signal swing) [1]. Despite the inclusion of an RF mixer and a band-pass filter, the effective front-end noise figure is dominated by the high-frequency CMOS amplifier, which may adopt a single- or multi-stage topology depending on gain requirements. If the receiver can maximize the LNA gain without compromising its noise contribution, the noise accumulated from the remaining transceiver components is reduced. Moreover, a multi-stage amplifier circuit increases the front-end's power handling capability and widens the scope for port-impedance and inter-stage matching. To exploit these benefits, extending the number of stages in a front-end is easier if the design includes a significant number of active devices with matched parameters. The next section addresses these issues with a number of multi-stage circuits and assesses the relative merits of their building blocks.
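The observation that a high-gain, low-noise first stage suppresses the noise of later blocks follows from the Friis cascade formula. The sketch below is an illustration only: the function names and the example stage values (a 3 dB NF amplifier with 20 dB or 10 dB of gain, followed by a 10 dB NF mixer) are assumptions chosen for the example and are not figures from this work.

```python
import math

def db_to_linear(value_db):
    """Convert a power ratio in dB to a linear ratio."""
    return 10.0 ** (value_db / 10.0)

def cascaded_noise_figure_db(stages):
    """Friis formula for a chain of (noise_figure_db, gain_db) stages.

    F_total = F1 + (F2 - 1)/G1 + (F3 - 1)/(G1*G2) + ...
    """
    total_factor = 0.0
    gain_so_far = 1.0  # linear gain preceding the current stage
    for index, (nf_db, gain_db) in enumerate(stages):
        factor = db_to_linear(nf_db)
        if index == 0:
            total_factor = factor
        else:
            total_factor += (factor - 1.0) / gain_so_far
        gain_so_far *= db_to_linear(gain_db)
    return 10.0 * math.log10(total_factor)

# Hypothetical chain: LNA (NF 3 dB, gain 20 dB) followed by a mixer (NF 10 dB).
high_gain = cascaded_noise_figure_db([(3.0, 20.0), (10.0, 0.0)])
# Same chain with only 10 dB of LNA gain: the mixer noise weighs more.
low_gain = cascaded_noise_figure_db([(3.0, 10.0), (10.0, 0.0)])
print(f"NF with 20 dB LNA gain: {high_gain:.2f} dB")
print(f"NF with 10 dB LNA gain: {low_gain:.2f} dB")
```

With more first-stage gain, the cascaded noise figure approaches that of the amplifier alone, which is why the LNA dominates the front-end noise budget.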
Performance of Matched CS, CG, and CC Building Blocks
The topologies of the three basic building blocks of a multi-stage CMOS amplifier, namely a common-source (CS), a cascode (CC), and a common-gate (CG) stage, are presented in Figure 2(a), 2(b), and 2(c), respectively. The circuits include linking elements at the output port that facilitate their extension to multi-stage architectures. They also house resonating load inductors and LC matching networks at the input port to minimize reflection loss for incoming signals. The return-loss of the amplifiers is regulated at the load port by device parasitic elements and a coupled capacitive circuit. The single-stage common-source (CS) 1.2-V amplifier in Figure 2(a) is built with a standalone active device (T 1 ), and its center frequency is controlled by load inductor L 1 , the resistive and capacitive parasitics of the coil, and the junction parasitics associated with the drain terminal of T 1 .
The dc current I 1 is determined by a gate bias voltage (V 1 ) and dimensions (aspect ratio) of the input device. The isolating resistor R 1 has a value greater than 2 kΩ so the small-signal operation is not influenced by any bias supply. The placement of a source inductor L 1 reduces forward gain and manipulates input impedance when seen from the point of view of port RF . Consequently, it controls the magnitude of input return-loss (S 11 ) while the frequency for minimum port reflection is determined by gate element L 1 . The dc blocking capacitors (C 1 and C 1 ) also work as a part of the port matching networks. The parasitic contributions of driving transistor and degenerating inductor take part in the noise matching process by reducing the effect of channel noise current from active devices. The cascaded blocks of a multi-stage amplifier chain may have to deal with a possible trade-off between achievable gain and accumulation of noise. In that regard, the CS stage incurs a relatively low noise penalty which is accompanied by a limited forward gain.
The isolation between the load port and the driving signal is degraded by the gate-drain parasitic capacitor of the CS transistor. This situation can be improved by the low-power cascode amplifier presented in Figure 2(b). It has a common-gate (CG) block (formed by T 3 ) built on top of an input common-source stage (realized by T 2 ). T 3 also works as an isolating transistor, which leads to insignificant reverse leakage between the input and output RF ports. The two active devices boost the forward gain achieved by the amplifier, at the cost of an increased noise penalty. Output matching for the cascode stage is provided by the parasitics of T 3 and loading capacitor C 2 , as the latter becomes part of the resonance tank formed by L 2 . An extra parallel capacitor (not shown in the figure) may also be included to improve the flexibility of the load-port matching circuit. Both amplifier stages in Figure 2

To compare the performance of the CS, CC, and CG single-stage low-NF 1.2-V amplifier blocks, they should ideally be powered with similar bias ratings. With that in mind, they are biased with low gate voltages, leading to nearly equal dc currents for the amplifier tree (I 1 ≈I 2 ≈I 3 ) and comparable power requirements (3.32, 3.33, and 3.25 mW). The amplifiers operate in the K-band with their center frequencies located within the 22.2-23.1 GHz range, as shown in Figure 3. This figure plots the no-load voltage gain and the small-signal forward gain provided by the matched 1.2-V CS, CC, and CG amplifiers. The CC block reaches its peak forward gain near 23 GHz, which is 2.45 dB higher than the CS stage peak and 8.17 dB higher than the CG amplifier peak. The voltage gain curves follow the same trend: the difference between the CC and CS peaks is 1.6 dB, and that between the CS and CG peaks is 8.64 dB. Matching performance and range of linearity for the three single-stage building blocks are portrayed in Figure 4 and Figure 5, and -49.2 dB.
The minimum output return-loss (ORL, S22) of the CC LNA (-16.8 dB) is 1.3 dB lower than the minimum of the CS stage and 4.8 dB lower than the minimum of the CG amplifier. Consequently, signal loss due to reflection at the amplifier ports remains low and well centered for the single-stage building blocks. .75 dBm) and 2.94-3.36 dB noise-figure, whereas the CS LNA achieves 8.69 dB peak forward gain and 7.3 GHz 3-dB bandwidth. The 90-nm CC block, on the other hand, offers higher gain (11.14 dB), increased NF (3.19-3.7 dB), and an input power limit (IIP3) of -1.25 dBm.
Two-Stage 1.2-V CMOS Amplifiers
A multi-stage topology has to be adopted to boost the gain of a receiver front-end when a single-stage low-NF amplifier cannot provide sufficiently high gain in the vicinity of the design frequency. The choice between CS, CC, and CG stages as individual blocks of a multi-stage amplifier depends on the design requirements of noise limit, port isolation, range of linearity, and forward gain for the overall circuit. Five two-stage amplifier topologies are proposed in Figure 6(a)-6(e) to assess the relative merits of using CG, CS, and CC building blocks in a 1.2-V multi-stage design. As with the single-stage amplifier blocks, these circuits include resonating load inductors (L 1−10 ) and LC matching networks at the driving port to minimize the return-loss of input signals. (6.2 kΩ) resistors R 2 and R 3 separate any small-signal amplifier operation from the bias circuit. The branch currents of the amplifier (I 1 and I 2 ) are matched to similar levels (2.45 and 2.47 mA, respectively). R 1/2 and C 1/2 represent parasitic elements contributed by the bank inductors (L 1 and L 2 , ∼0.45 nH), which influence the amplifier's design frequency. The matching necessary to reduce output return-loss is provided by L 2 and capacitors C 1 , C 1 (∼90 fF). To lower the reflection loss at the input port, a larger input coupling capacitor C 1 (220 fF) and a 0.6 nH gate inductor (L 1 ) are employed. A small degenerating reactance (L 1 =0.1 nH) is included in order to maximize the gain. Notably, the second CS stage avoids a source inductor, and inter-stage matching is provided by the gate parasitics of T 2 and linking capacitor C 12 .
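The way a load inductor and its surrounding capacitance set the design frequency can be sketched with the ideal parallel-LC resonance relation. This is a simplified illustration: it assumes a lossless tank, the function names are hypothetical, and the total tank capacitance (device parasitics plus matching capacitors) is treated as one lumped value, which the text does not state explicitly.

```python
import math

def resonant_frequency_hz(inductance_h, capacitance_f):
    """Resonant frequency of an ideal LC tank: f0 = 1 / (2*pi*sqrt(L*C))."""
    return 1.0 / (2.0 * math.pi * math.sqrt(inductance_h * capacitance_f))

def capacitance_for_resonance_f(inductance_h, frequency_hz):
    """Total capacitance needed to resonate a given inductor at frequency_hz."""
    return 1.0 / ((2.0 * math.pi * frequency_hz) ** 2 * inductance_h)

def inductive_reactance_ohm(inductance_h, frequency_hz):
    """Reactance of an ideal inductor: X_L = 2*pi*f*L."""
    return 2.0 * math.pi * frequency_hz * inductance_h

load_inductance = 0.45e-9  # ~0.45 nH bank inductor quoted in the text
target_frequency = 22e9    # assumed K-band design frequency for the example

needed_c = capacitance_for_resonance_f(load_inductance, target_frequency)
reactance = inductive_reactance_ohm(load_inductance, 23e9)
print(f"Lumped tank capacitance for 22 GHz: {needed_c * 1e15:.1f} fF")
print(f"Reactance of 0.45 nH at 23 GHz: {reactance:.1f} ohm")
```

Under these assumptions, a 0.45 nH inductor needs a total tank capacitance on the order of 100 fF to resonate in the K-band, which is the same order as the ∼90 fF capacitors and the tens of femtofarads of device parasitics discussed in the text.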
To improve isolation between the antenna that precedes the front-end amplifier and the mixer that generally follows the LNA, a CS-CC topology is proposed as a two-stage architecture in Figure 6

If a CG block is used as the input stage of the two-stage amplifier, the forward gain drops and the range of linearity improves correspondingly. This proposition is tested with the two low-NF amplifiers presented in Figure 6(d) and Figure 6(e), which employ common-gate stages for input interfacing. The CG-CS front-end in Figure 6(d) is biased with dc currents of 2.71 and 2.47 mA, with separate gate voltages for the two driving transistors (V 10 and V 11 ). The input matching network now consists of a single inductor (L 4 ), which improves the amplifier's IIP3 to -5.2 dBm compared with the CS-CS LNA (-7 dBm). On the downside, the front-end faces a drop of 0.7 dB in peak forward gain. Similarly, the CG-CC amplifier proposed in Figure 6(e) delivers a wider range of linearity (-10.8 dBm) than the CS-CC architecture, but its bandwidth-limited small-signal gain is lower. The two biasing voltages for the CG-CC amplifier are set to 0.65 V (V 12 ) and 0.73 V (V 13 ). Hence, the two-stage amplifiers with an input CG stage tolerate a higher input power while remaining in the linear region.
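For readers less familiar with how an IIP3 value such as those above is obtained, the following sketch shows the standard two-tone extrapolation. It is not the simulation setup used in this work: the function names and the power levels in the example are hypothetical, and the relation only holds in the weakly nonlinear region where the third-order product rises 3 dB per 1 dB of input power.

```python
def iip3_dbm(p_in_dbm, p_fund_out_dbm, p_im3_out_dbm):
    """Input-referred third-order intercept point from one two-tone data point.

    IIP3 = Pin + (P_fund - P_IM3) / 2, with all powers per tone in dBm.
    """
    return p_in_dbm + (p_fund_out_dbm - p_im3_out_dbm) / 2.0

def oip3_dbm(iip3_value_dbm, gain_db):
    """Output-referred intercept point, assuming small-signal gain gain_db."""
    return iip3_value_dbm + gain_db

# Hypothetical two-tone result: -30 dBm per tone at the input, 19 dB gain,
# and a third-order product 50 dB below the fundamental at the output.
p_in = -30.0
p_fund = p_in + 19.0
p_im3 = p_fund - 50.0

example_iip3 = iip3_dbm(p_in, p_fund, p_im3)
print(f"IIP3 = {example_iip3:.1f} dBm, OIP3 = {oip3_dbm(example_iip3, 19.0):.1f} dBm")
```

A higher IIP3 means the third-order products stay further below the fundamental at a given input power, which is the sense in which the CG-input amplifiers offer a wider linear range.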
Three-Stage 90-nm Front-End Architectures
With an objective of further extending the cascaded multistage front-end architecture, four three-stage amplifier topologies are presented in Figure 7 
Figure 7. Proposed three-stage matched (a) CS-CS-CS, (b) CS-CC-CS, (c) CG-CC-CC, and (d) CC-CC-CC amplifiers.
stage is proposed in Figure 7(c). This amplifier is expected to have higher power gain and lower signal leakage in the reverse direction. The input driving transistor T 8 now has a slightly lower gate voltage (V 8 =0.65 V) than the biasing voltage of the drivers in the other two blocks (V 9 11 ). All five transistors (T 8 -T 12 ) and the three resonating inductors in this three-stage front-end have similar dimensions (50 µm/0.1 µm) and identical reactance (∼65 Ω), respectively. However, the tree current powering the input CG stage (I 7 =3.34 mA) is significantly higher than the bias currents of the two CC stages (I 8/9 ). The two CC blocks provide greater reverse isolation, and the CG stage ensures a reasonable linear range for the LNA. Finally, to maximize the bandwidth-limited forward gain, an all-cascode (CC-CC-CC) amplifier is presented in Figure 7
Results and Discussion

Effect of Layout Parasitics
The proposed multi-stage amplifier circuits are analyzed with a 90-nm CMOS process, including the effect of layout parasitics. Figure 8 shows the parasitic components used in the models of the active and passive devices employed in the presented architectures. The 90-nm IBM CMOS process [25] supplies spiral and symmetric inductors in an octagonal shape. The inductor model in Figure 8(a) has series reactive elements to account for skin/proximity effects (L / ) and for magnetic coupling between the two halves of a spiral. For each inductor, four resistors (R / ) and four capacitors (C / ) are included in the circuit to account for conductive substrate loss. The parasitic elements between a spiral and the substrate are represented by four oxide capacitors (C / ). Each spiral of an inductor has two halves, which are designated with inductances L , L and linked by a coupling coefficient K . The option of including a center tap for the scalable inductor is excluded from the design. In this discussion, inductor values reside within a design range of 100-800 pH, so the number of turns of a spiral varies between 1 and 2. The supported turn-to-turn spacing for a spiral is 3-5 µm, and the overall outer dimension remains within 90-150 µm.
The proposed architectures use thirty-two core transistors (T 1 -T 14 for two-stage topologies and T 1 -T 18 for three-stage amplifiers) and nine transistors in gate biasing circuits.
These transistors are modeled with three junction capacitors (C , C and C ) when their source terminal is shorted to substrate, as shown in Figure 8(b). Parasitic models for metal-insulator-metal capacitors and doped polysilicon resistors (including top-bottom-layer/substrate elements) are presented in Figure 8(c) and Figure 8(d).
A number of port-matching and inter-stage linking capacitors are employed by the proposed amplifiers, and each of them includes resistive elements for conductive loss in the metal layers and a capacitive component formed between insulator and substrate. Parasitic junction capacitances associated with the core active transistors amount to about 30-50 fF and 10-25 fF, respectively, whereas biasing transistors generate smaller parasitics (10-40 fF). The biasing resistors used by the amplifiers (R 1 -R 13 for two-stage LNAs and R 1 -R 15 for three-stage amplifiers), on the other hand, have values covering a range of 1.8-6.2 kΩ. These passive components are built with doped polysilicon layers in the 90-nm CMOS process and modeled by three-pi distributed networks, as shown in Figure 8(d). In this figure, C 1∼4 are parasitic components formed between the substrate and the design layers, and R 1∼4 model substrate loss in the resistor. Including these parasitic elements makes layout-sensitive analysis possible for the proposed multi-stage architectures.
Two-Stage 1.2-V ∼6 mW Amplifiers
To compare the noise figure and forward gain of the five two-stage 1.2-V amplifiers, the relevant curves are plotted in groups in Figure 9. The plots suggest that the peak gains of the CG-CS, CS-CS, and CS-CC amplifiers are comparable (19.0-19.71 dB), whereas the CC-CC LNA provides an additional ∼7 dB of gain with a peak of 25.5 dB near 22 GHz. The CG-CC amplifier has a lower peak gain of 17.2 dB at a higher center frequency. In terms of NF, the CC-CC and CG-CC topologies have higher in-band noise penalties (3.3-3.6 and 3.5-3.8 dB), which may reach 5.0-5.8 dB near the lower or upper edges of the K-band. The noise figure of the CS-CS and CG-CS amplifiers covers 2.98-3.37 dB and 3.19-3.41 dB within the bandwidth. There is a significant difference in the level of reverse port isolation provided by the five topologies, as suggested by Figure 10. The peak S12 of the CS-CS architecture is 4 dB lower than that of the CG-CS amplifier, and the peak for the CS-CC stage is 15 dB below the minimum limit set by the CS-CS amplifier. The CS-CC and CG-CC amplifiers have similarly shaped isolation curves, and the CC-CC LNA is able to achieve the lowest degree of reverse leakage (better than -57 dB). According to the results, the minimum input and load port reflection losses of the two-stage topologies are always lower than -14.3 dB and -10.9 dB, respectively, for the frequencies concerned.
The addition of a CG input stage to these structures is expected to improve the front-end's intermodulation point (linear coverage). Consequently, the CG-CS and CG-CC blocks achieve a wider range of linear behavior (-5.2 and -10.8 dBm IIP3) than the CS-CS and CS-CC amplifiers (-7 and -12.3 dBm IIP3) in Figure 11. The CC-CC design, on the other hand, has a degraded IIP3 of -13.5 dBm because of the inclusion of two CC blocks. The chip area occupied by the proposed two-stage architectures (excluding probe

topologies are able to provide port isolation better than -38.5 dB within the K-band, and the amount of isolation increases with the number of CC blocks present in the multi-stage front-end. To estimate the reflection loss at the input port, the input return-loss (IRL) is plotted for K-band frequencies in Figure 13. The CS-CC-CS LNA achieves the best input matching figures, as its S11 is lower than -14 dB over its entire bandwidth (22.4-24 GHz) with a minimum of -44 dB. For the CG-CC-CC and the triple CC amplifiers, the input return-loss is <-9 dB over their bandwidths, and the reading is limited to <-5 dB within the triple CS frequencies (20.5-21.7 GHz). The lowest IRL minimum (-48.1 dB) is achieved by the CS-CS-CS architecture, but it is centered at 22.5 GHz (above the amplifier's bandwidth). The NF curves of the three-stage architectures, presented in the same figure, indicate that the inclusion of an input CG block lowers noise in the 17-21.5 GHz sub-K-band, leading to an NF range of 2.92-3.12 dB for the CG-CC-CC amplifier. The triple CS and the CS-CC-CS amplifiers register bandwidth-limited NF readings of 3.1-3.3 and 3.08-3.11 dB, respectively. The input-referred third-order intermodulation results for the multi-stage front-ends in Figure 14 show the CS-CS-CS amplifier achieving the widest range of linear behavior, which may extend up to 18.3 dBm of input power.
With the addition of cascode block(s) in each of the three remaining topologies, the IIP3 decreases, reaching -2.5 dBm for the triple CC amplifier. At the maximum input levels corresponding to the third-order intermodulation limits, the three-stage low-NF amplifiers are able to deliver 2.3-4.6 dBm of output power to the connected load. For the three-stage front-end architectures, the core amplifier chip area, including four to five inductors (without probe pads), lies within the range of 0.28-0.39 mm².

Table 3. Performance summary for proposed three-stage 1.2-V amplifiers.
Performance of Three-Stage Front-Ends
Architectures (Table 3 columns): CS-CS-CS, CS-CC-CS, CG-CC-CC, CC-CC-CC
Effect of Process Variation
Monte Carlo analysis reveals the effect of process variation on the multi-stage front-end amplifiers and illustrates the outcome of deviations in their design parameters. It estimates the fluctuation in the amplifiers' gain, noise, and matching parameters when they are subjected to process variation. The results for the two-stage architectures predict that process mismatch may lead to a fluctuation of 0.6-2.3 dB in peak forward gain (a change of 3.04-12.1% with respect to the centered peak). The optimum NF of the amplifiers may change by 0.12-0.34 dB (a change of 3.8-10.3%), and the minimum input and output reflection losses remain lower than -12.9 and -10 dB, respectively. For the three-stage front-ends, the maximum variation in NF is estimated as 0.
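The percentages quoted for the two-stage gain fluctuation can be reproduced by dividing the deviation in dB by the nominal peak gain in dB. The sketch below makes two assumptions that the text does not confirm: that the percentages are computed as a ratio of dB values, and that the 0.6 dB and 2.3 dB deviations pair with the 19.71 dB and 19.0 dB peak gains reported elsewhere in the article. The function name is hypothetical.

```python
def percent_change(deviation_db, nominal_db):
    """Relative change of a dB-valued figure, expressed in percent."""
    return 100.0 * deviation_db / nominal_db

# Assumed pairings of deviation and nominal peak gain (not stated in the text).
smallest = percent_change(0.6, 19.71)
largest = percent_change(2.3, 19.0)
print(f"Gain fluctuation: {smallest:.2f}% to {largest:.1f}%")
```

Note that a ratio of dB values is not the same as the percentage change in linear power gain; it is used here only because it appears to match the figures in the text.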
Comparison of Simulated Results
The performance of the proposed two- and three-stage 1.2-V amplifier architectures, obtained on a 90-nm CMOS platform, is summarized in Table 2 and Table 3, respectively. The five two-stage amplifiers (Table 2) consume similar amounts of dc power (6.05-6.59 mW) and operate in the 19.6-25.5 GHz band. CG-CS, CS-CS, and CS-CC amplifiers achieve

Table 3 , the CC-CC-CC design trails with a smaller IIP3 of -2.5 dBm but offers the strongest resistance against back leakage (minimum isolation registers -80.4 dB). It achieves a wider bandwidth than the other three amplifiers (2.3 GHz) and a >40 dB power gain. The CG-CC-CC topology manages good K-band noise figures (minimum 2.92 dB), and the triple CS LNA delivers an input-referred IP3 of 18.3 dBm. By drawing currents of 7.69-8.01 mA from V (=1.2 V), the three-stage circuits consume 9.23-9.61 mW and achieve input-referred third-order intercept points of -2.5 dBm or higher.
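The quoted power range follows directly from the supply voltage and the total current drawn, and the same numbers feed the gain-per-watt comparison. The sketch below is a minimal check; the function names are hypothetical, and the gain-per-watt definition (peak gain in dB divided by dc power in watts) is an assumption about how the metric is formed. The 25.5 dB and 6.1 mW values are the rounded CC-CC figures quoted elsewhere in the article.

```python
SUPPLY_VOLTAGE_V = 1.2  # supply rail used by all proposed amplifiers

def dc_power_mw(current_ma, supply_v=SUPPLY_VOLTAGE_V):
    """DC power in mW for a total supply current given in mA."""
    return supply_v * current_ma

def gain_per_watt(gain_db, power_mw):
    """Assumed metric: peak gain in dB divided by dc power in watts."""
    return gain_db / (power_mw * 1e-3)

low_power = dc_power_mw(7.69)
high_power = dc_power_mw(8.01)
print(f"Three-stage dc power: {low_power:.2f}-{high_power:.2f} mW")

# Two-stage CC-CC example with rounded values from the text.
cc_cc_metric = gain_per_watt(25.5, 6.1)
print(f"CC-CC gain-per-watt: {cc_cc_metric:.0f} dB/W")
```

Because the supply is fixed at 1.2 V, differences in power between the architectures come entirely from their bias currents.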
The proposed 1.2-V multi-stage amplifiers are compared with simulated results of reported CMOS millimeter-wave amplifiers [22-24, 26-32] in Table 4 and Table 5. These results suggest that the 6.1-mW CC-CC amplifier offers better area efficiency and higher gain than its counterparts. The 1.2-V CG-CS architecture provides a wider region of linear behavior, while the CS-CS amplifier achieves better port matching. According to Table 5, the 9.2-9.6 mW three-stage topologies manage wide load-port matching performance. They also achieve low noise limits and improved gain-per-watt ratings. In Table 2 and Table 3
Conclusions
This paper presents four three-stage 18-24 GHz and five two-stage 19.6-25.5 GHz CMOS 1.2-V amplifier architectures to establish selection criteria for CC, CG, and CS stages as building blocks of low-power multi-stage front-ends. Including the parasitic contributions of circuit components and integrated bias circuits, the amplifiers maintain a 6.1-9.6 mW power rating and 17-44 dB peak gain when simulated with a 90-nm CMOS process. Among the two-stage architectures, the CG-CS amplifier supports a 19 dB peak S21 (at 21.2 GHz) and the best linear coverage (-5.2 dBm IIP3) with a 1.2-V supply rail. Its gain figures are comparable with those of the CS-CS LNA, which is less linear (-7 dBm) but offers an in-band NF of 2.98-3.37 dB. The CC-CC topology manages higher forward gain (>22.5 dB) and greater port isolation (at least -44 dB) over a bandwidth of 2.8 GHz. The proposed three-stage K-band amplifiers achieve gain figures of 33.8-43.7 dB with a <10 mW power demand. Among them, the triple CC topology achieves strong reverse isolation (better than -80 dB), and the CS-CC-CS amplifier manages the lowest input return-loss (<-14 dB) over its bandwidth. The in-band IRL of the triple CS LNA is limited to <-5 dB, while it establishes the widest domain of linear behavior (IIP3 = 18.3 dBm). The proposed 1.2-V front-end amplifiers fare better than the simulated examples from published reports and should facilitate the selection of unit stages in multi-stage CMOS amplifiers.
