Introduction
Device scaling (the reduction of device layer thicknesses and lithographic feature dimensions) is essential to extending the operating frequency of transistor-based integrated circuits. The benefits of aggressive device scaling are illustrated in silicon CMOS technology, where progressive reduction in transistor gate length has been essential to the rapid increases in microprocessor speeds. III-V compound semiconductors offer inherent material advantages over silicon. These advantages include higher electron mobilities, higher electron saturation drift velocities, and stronger heterojunctions than Si/SiGe. Extending transistor technology towards THz frequencies will require combining these material advantages with deep submicron device scaling.
The gate lengths of III-V-based high electron mobility transistors (HEMTs) have been scaled to submicron dimensions. InP-based InGaAs/InAlAs HEMTs have exhibited impressive high frequency performance. Devices in this technology with 45 nm gate lengths have been reported with maximum current gain cut-off frequencies (f_τ) of over 400 GHz [1]. Separately, transistors with 100 nm gate lengths and maximum frequencies of oscillation (f_max) of 600 GHz have been reported [2]. HEMT-based multi-stage amplifiers with large power gains in the 140-220 GHz band have also been reported [3, 4, 5, 6, 7]. State-of-the-art HEMT amplifier results include: a 3-stage amplifier with 30 dB gain at 140 GHz [3], a 3-stage amplifier with 12-15 dB gain from 160-190 GHz [4], and a 6-stage amplifier with 20 ± 6 dB gain from 150-215 GHz [5].
In contrast to HEMTs, aggressive scaling of III-V heterojunction bipolar transistors (HBTs) has not been prevalent. InP- and GaAs-based HBTs are typically fabricated with emitter widths of 1-2 μm, and collector junction widths of 3-5 μm. By comparison, state-of-the-art Si bipolar and Si/SiGe HBTs are fabricated with < 0.2 μm emitter-base junction widths. SiGe devices with 0.14 μm emitter-base junction widths have been reported with 92 GHz f_τ and 108 GHz f_max [8]. Despite disadvantages in material properties, highly-scaled SiGe technologies will challenge III-V integrated circuits for market share in next generation >40 Gb/sec optical fiber communication systems.
The full benefits of scaling III-V HBTs are only realized if all transistor parasitics are simultaneously reduced. Devices with highly scaled emitter-base junctions have been fabricated for low power applications [9]; however, reduced emitter dimensions have not necessarily correlated with improvements in device bandwidth. The parasitic capacitance of the base-collector junction lying under the base Ohmic contacts presents the most severe limit to HBT scaling. The geometry of the mesa HBT used throughout the III-V community is such that the minimum size for base Ohmic contacts places a lower limit on the size of the collector-base junction, preventing submicron scaling. Approaches to facilitate scaling of the collector-base junction include: removal of excess collector semiconductor using a lateral-etch undercut [10, 11], definition of extremely narrow base contacts using > 10^20 /cm^3 base layer doping [12], and substrate transfer to allow lithographic pattern definition on both sides of the base epitaxial layer.
We have developed a transferred-substrate HBT technology in an InP-based material system. The process allows the emitter-base and collector-base junctions to be simultaneously scaled to submicron dimensions, resulting in dramatic increases in f_max. A record unilateral power gain of 21 dB at 100 GHz has been measured in the technology [13]. Recently fabricated submicron devices have exhibited a small negative output conductance from 40-110 GHz, resulting in unbounded unilateral power gain in the 75-110 GHz band [14]. As a result, f_max cannot be extrapolated from these measurements. Other device results in the transferred-substrate technology include transistors with simultaneous 295 GHz f_τ and f_max [15], and double heterojunction transistors with 425 GHz extrapolated f_max and 8 V common-emitter breakdown voltage [16].
In this paper, general scaling laws for HBTs are reviewed. The transferred-substrate process is subsequently described as a means of realizing the potential of a highly scaled III-V HBT for mm-wave applications. We then present an overview of our measurement and calibration methods for on-wafer device measurements, as these factors are critical for accurate characterization of submicron devices. Measured transistor results are then presented, and difficulties in extending low frequency device models to high frequencies are described. Finally, ultra-high frequency HBT amplifier design is discussed and results from the transferred-substrate technology are presented.
HBT Scaling
In general, transistor bandwidths are determined by carrier transit times and RC charging time constants. HBT transit times are reduced by decreasing the thicknesses of the base and collector epitaxial layers. Reduction of the HBT's epitaxial thicknesses will lead to an increase in base resistance and collector capacitance, unless accompanied by lateral scaling of the base and collector junction widths.
The simplified cross-section of a mesa HBT shown in fig. 1 illustrates the difficulty in scaling the transistor's collector-base junction. The patterned etches and metal depositions that form the HBT junctions result in a device structure where the collector-base junction must lie beneath the full area of the base Ohmic contacts. To obtain low base contact resistance, the base Ohmic contact must be at least one contact transfer length L_contact wide at the sides of the emitter stripe. In an InGaAs-base HBT with 400 Å base thickness and 5 × 10^19 /cm^3 doping, L_contact ≈ 0.4 μm. Processing tolerances for lithographic alignment may further limit the minimum collector-base junction dimensions.
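As a rough numerical check of the transfer-length figure quoted above, the following sketch estimates the base sheet resistance from the stated doping and thickness and then applies the standard transmission-line relation L_contact = sqrt(ρ_v/ρ_s). The hole mobility (50 cm²/V·s) and the specific contact resistance (1e-6 Ω·cm²) are illustrative assumptions, not values given in the text.

```python
import math

Q = 1.602e-19  # electron charge, C

def sheet_resistance(doping_cm3, thickness_cm, mobility_cm2_vs):
    """Sheet resistance (ohm/square) of a uniformly doped layer."""
    return 1.0 / (Q * mobility_cm2_vs * doping_cm3 * thickness_cm)

def transfer_length_um(rho_v_ohm_cm2, rho_s_ohm_sq):
    """Contact transfer length sqrt(rho_v / rho_s), in micrometers."""
    return math.sqrt(rho_v_ohm_cm2 / rho_s_ohm_sq) * 1e4

# 400 angstrom base at 5e19 /cm^3 (from the text).
# Mobility and specific contact resistance are ASSUMED values.
rho_s = sheet_resistance(5e19, 400e-8, mobility_cm2_vs=50.0)
l_contact = transfer_length_um(1e-6, rho_s)
print(f"rho_s = {rho_s:.0f} ohm/sq, L_contact = {l_contact:.2f} um")
```

With these assumed inputs the estimate lands close to the 0.4 μm quoted in the text; a different mobility or contact resistance would shift it accordingly.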
In contrast to the mesa HBT, a simplified cross-section of an idealized HBT structure is shown in fig. 2. Here, the width of the collector-base junction has been effectively de-coupled from the width of the base Ohmic contacts. Scaling of the collector-base junction width in this device is limited only by alignment tolerances between the emitter and collector stripes. Through substrate transfer, we are able to realize this idealized geometry by lithographically patterning both sides of the base epitaxy. The transferred-substrate process will be discussed further in Section 3. Next, we will consider the factors that determine HBT bandwidth.
In the literature, transistor bandwidth is commonly described by two figures-of-merit: the current gain cutoff frequency f_τ, and the power gain cutoff frequency f_max. Independent of f_τ, transistors cannot provide power gain at frequencies above f_max. Thus, f_max defines the maximum usable frequency of a transistor in narrowband reactively-tuned circuits. In more general analog and digital circuits, the transistor figures-of-merit may not accurately predict circuit performance. For instance, f_τ is commonly used to evaluate a transistor's potential in digital logic applications. However, a detailed charge-control analysis of switching times reveals that device current density, collector-base junction capacitance and emitter resistance make much larger fractional contributions to logic gate delay than they contribute to the emitter-collector forward delay τ_ec = 1/(2πf_τ) [17]. In analog and digital circuits, f_τ and f_max are used to provide a first-order estimate of device transit delays and of the magnitude of the dominant transistor parasitics.
We estimate below the cutoff frequencies from HBT parameters calculated from physical device properties and fit to a lumped-element device model. Experience has shown that a simple hybrid-π small-signal circuit model (fig. 3) is sufficient to describe all but the most highly scaled devices up to a frequency of 110 GHz. Concerns regarding the accuracy of the model at higher frequencies and in describing highly scaled devices are discussed in section 5. Those concerns notwithstanding, analysis using this first-order model proves excellent for determining those terms that limit transistor bandwidth.
The scaling analysis that follows has been presented in greater detail elsewhere [17, 18]. It is repeated here for completeness as the benefits of device scaling motivate our approach towards developing THz frequency electronics.
Factors determining f_τ
Our approach to HBT scaling is determined from the parameters that limit device bandwidth. The current-gain cutoff frequency is given by

$$\frac{1}{2\pi f_\tau} = \tau_b + \tau_c + \frac{kT}{qI_c}\left(C_{je} + C_{cb}\right) + \left(R_{ex} + R_c\right)C_{cb}, \qquad (1)$$

where R_ex and R_c are the parasitic emitter and collector resistances, τ_b and τ_c are the base and collector transit times, C_je and C_cb are the emitter-base and base-collector junction capacitances, and I_c is the collector current.
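To make the relative weight of the delay terms concrete, the sketch below tallies them assuming the standard sum-of-delays form 1/(2πf_τ) = τ_b + τ_c + (kT/qI_c)(C_je + C_cb) + (R_ex + R_c)C_cb. All component values are hypothetical and chosen only for illustration; they are not measured data from this work.

```python
import math

KT_Q = 0.02585  # thermal voltage kT/q at 300 K, volts

def forward_delay_terms(tau_b, tau_c, c_je, c_cb, r_ex, r_c, i_c):
    """Return each term of 1/(2*pi*f_tau) in seconds, keyed by name."""
    r_e = KT_Q / i_c  # dynamic emitter resistance kT/(q*I_c), ohms
    return {
        "tau_b": tau_b,
        "tau_c": tau_c,
        "(kT/qIc)*Cje": r_e * c_je,
        "(kT/qIc)*Ccb": r_e * c_cb,
        "Rex*Ccb": r_ex * c_cb,
        "Rc*Ccb": r_c * c_cb,
    }

def f_tau(terms):
    """Current-gain cutoff frequency (Hz) from the summed delay terms."""
    return 1.0 / (2.0 * math.pi * sum(terms.values()))

# HYPOTHETICAL device values, for illustration only.
terms = forward_delay_terms(tau_b=0.15e-12, tau_c=0.25e-12, c_je=20e-15,
                            c_cb=5e-15, r_ex=10.0, r_c=2.0, i_c=5e-3)
total = sum(terms.values())
for name, value in terms.items():
    print(f"{name:14s} {value * 1e12:6.3f} ps  ({100 * value / total:4.1f}%)")
print(f"f_tau = {f_tau(terms) / 1e9:.0f} GHz")
```

Listing the terms separately, rather than only their sum, shows at a glance which delay should be attacked first when scaling a given device.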
Examining each term separately, we begin with the base transit time τ_b. If a linear grading of the base semiconductor bandgap energy with position is used to reduce τ_b, then

$$\tau_b = \frac{T_b^2}{D_n}\frac{kT}{\Delta E} - \frac{T_b^2}{D_n}\left(\frac{kT}{\Delta E}\right)^2\left(1 - e^{-\Delta E/kT}\right) + \frac{T_b}{v_{exit}}\frac{kT}{\Delta E}\left(1 - e^{-\Delta E/kT}\right), \qquad (2)$$

where ΔE is the grading in the base bandgap energy, T_b the base thickness, D_n the base minority carrier diffusivity, and v_exit the electron exit velocity at the collector edge of the base. For a typical InGaAs base at 5 × 10^19 /cm^3 doping, 52 meV bandgap grading is sufficient to reduce τ_b by ~2:1. For a thick base layer or a large v_exit, τ_b ∝ T_b^2; with InGaAs base layers below ~400 Å thickness, the exit velocity term in eqn. 2 adds a significant correction.
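The effect of base grading can be checked numerically. The sketch below evaluates the drift-diffusion result for a linearly graded base, τ_b = (T_b²/D_n)/a − (T_b²/D_n)(1 − e^(−a))/a² + (T_b/v_exit)(1 − e^(−a))/a with a = ΔE/kT, which reduces to T_b²/2D_n + T_b/v_exit for an ungraded base. The diffusivity (40 cm²/s) and exit velocity (3 × 10^7 cm/s) are assumed values for illustration; the text does not state them.

```python
import math

KT_EV = 0.02585  # kT at 300 K, eV

def base_transit_time(t_b_cm, d_n, v_exit, delta_e_ev):
    """Base transit time (s) for a linearly graded base.

    t_b_cm: base thickness (cm); d_n: diffusivity (cm^2/s);
    v_exit: exit velocity (cm/s); delta_e_ev: bandgap grading (eV).
    """
    a = delta_e_ev / KT_EV
    diffusion = t_b_cm ** 2 / d_n
    exit_term = t_b_cm / v_exit
    if a == 0.0:  # ungraded limit of the expression below
        return diffusion / 2.0 + exit_term
    g = 1.0 - math.exp(-a)
    return diffusion / a - diffusion * g / a ** 2 + exit_term * g / a

# 400 angstrom base and 52 meV grading (from the text);
# D_n and v_exit are ASSUMED values.
T_B, D_N, V_EXIT = 400e-8, 40.0, 3e7
tau_flat = base_transit_time(T_B, D_N, V_EXIT, 0.0)
tau_graded = base_transit_time(T_B, D_N, V_EXIT, 0.052)
ratio = tau_flat / tau_graded
print(f"ungraded {tau_flat * 1e12:.3f} ps, graded {tau_graded * 1e12:.3f} ps, "
      f"ratio {ratio:.2f}")
```

Under these assumptions the grading shortens τ_b by close to a factor of two, and the exit-velocity term is a sizeable share of the total at this base thickness.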
The collector transit time τ_c is the mean delay of the collector displacement current, and in first-order analysis is given by [20]

$$\tau_c = \int_0^{T_c} \frac{1 - x/T_c}{v(x)}\,dx \equiv \frac{T_c}{2 v_{\mathrm{eff}}}, \qquad (3)$$

where T_c is the collector thickness, v(x) is the position-dependent electron velocity in the collector drift region and v_eff an effective electron velocity. τ_c is most strongly dependent upon the electron velocity in the proximity of the base, and becomes progressively less sensitive to the electron velocity as the electron passes through the collector [21]. In HBTs with thin epitaxial layers, nonequilibrium electron transport is observed in the collector drift region [22]. At low collector-base bias voltages, electrons may travel through a significant fraction of the collector drift region in the high velocity Γ-valley before acquiring sufficient kinetic energy (0.55 eV for InGaAs [23], 0.6 eV for InP [24]) to scatter to the lower velocity satellite L-valley. As a result, v(x) is fortuitously highest near the base. This velocity overshoot effect significantly reduces collector transit times, and in thin InGaAs or InP layers (< 3000 Å), v_eff = 3-5 × 10^7 cm/s. By contrast, measured saturation velocities in thick InGaAs drift layers are in the range of v_sat = 6-9 × 10^6 cm/s [23].
Equation 3 assumes that v(x) is not modulated by changes in carrier density. However, under high injection conditions, electrons screen the bound charge in the collector region and influence the electric field profile. Changes in carrier density alter the field profile and modulate v(x). In the presence of velocity modulation the collector transit time is given by
where τ_c is defined in eqn. 3, and J_c is the collector current density. Equation 4 predicts a reduction in collector transit time with velocity modulation correlating a decrease of velocity with increasing current density. Despite these predictions, experimentally we have not measured an anomalous reduction in transit time at high current densities for InP-based HBTs.
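The weighting of the first-order collector transit time integral, τ_c = ∫ (1 − x/T_c)/v(x) dx taken from the base (x = 0) to the subcollector (x = T_c), can be illustrated numerically. The sketch below uses a simple midpoint rule; the two-level velocity profiles are hypothetical and serve only to show that high velocity near the base matters most.

```python
def collector_transit_time(velocity, t_c, n=2000):
    """Midpoint-rule estimate of integral_0^Tc (1 - x/Tc) / v(x) dx.

    velocity: function of position x (cm) returning cm/s; t_c in cm.
    """
    dx = t_c / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * dx
        total += (1.0 - x / t_c) / velocity(x) * dx
    return total

T_C = 2000e-8  # 2000 angstrom collector, in cm

# Uniform velocity: the integral reduces to T_c / (2 v).
tau_uniform = collector_transit_time(lambda x: 3e7, T_C)

# HYPOTHETICAL two-level profiles: fast half near the base vs. near the subcollector.
tau_fast_near_base = collector_transit_time(
    lambda x: 4e7 if x < T_C / 2 else 1e7, T_C)
tau_fast_near_sub = collector_transit_time(
    lambda x: 1e7 if x < T_C / 2 else 4e7, T_C)

print(f"uniform:            {tau_uniform * 1e12:.3f} ps")
print(f"fast near base:     {tau_fast_near_base * 1e12:.3f} ps")
print(f"fast near subcoll.: {tau_fast_near_sub * 1e12:.3f} ps")
```

The same two velocities give a markedly shorter transit time when the fast region adjoins the base, which is why velocity overshoot in the first part of the collector is so beneficial.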
The RC charging terms in eqn. 1 comprise a significant fraction of the total forward delay of submicron HBTs, and these terms must be considered in detail. Consider first the term (kT/qI_c)C_cb. The limits of collector current density are set by the onset of base pushout (the Kirk effect [26]). At high collector current densities, electron space charge screening at the edge of the base-collector junction eventually leads to a collapse of the electric field. Holes may then diffuse into the collector, effectively extending the base region and leading to an increase in base transit time and collector-base capacitance. It has been shown by Ishibashi that GaAs HBTs show improved f_τ when biased close to the Kirk threshold [22], as the reduced electric field at the collector-base junction edge increases velocity overshoot and reduces the collector transit time. We will ignore these considerations in considering the contribution of (kT/qI_c)C_cb to f_τ.
From electrostatic considerations, the maximum collector current before base pushout is
where v_sat is an (assumed) uniform electron velocity within the collector, and the collector doping N_d is chosen to obtain a fully-depleted collector at zero bias current and the applied V_cb. The collector capacitance is C_cb = εA_c/T_c. With the HBT biased at this maximum current, the delay (kT/qI_c)C_cb varies in proportion to T_c. This delay term is thus minimized by scaling (reducing T_c), but bias current densities must increase in proportion to the square of the desired fractional improvement in f_τ.
The emitter charging time (C_je(kT/qI_c) in eqn. 1) plays a significant role in determining f_τ. If we were to assume that C_je were simply a depletion capacitance, it would be reasonable to expect that this charging time could be minimized simply by making the emitter-base depletion region very thick, by use of very low emitter doping, combined with a thick bandgap grading region in the base-emitter heterojunction. The tradeoffs between the depletion capacitance and excessive charge storage in the depletion layer were considered in detail elsewhere and the results are repeated here [17]. Using methods similar to those used to derive the collector transit time [20],
where T_eb is the depletion layer thickness and n(x) is the electron density in the depletion region. The term (kT/qI_c)C_je in eqn. 1 can then be written as
where Γ = kT/ΔE − (kT/ΔE − D_n/(v_exit T_b)) e^(−ΔE/kT) is a factor involving the base bandgap grading (Γ ≈ 1 for an ungraded base) and ζ = x/T_eb is a normalized position variable. The first term in eqn. 7 results from the depletion-layer capacitance, and is minimized using high bias current densities J_e = I_e/A_e; the second term reflects storage of mobile electron charge within the depletion layer, and is minimized by reducing the product T_eb T_b. This analysis clearly shows that the depletion region thickness cannot be indefinitely extended to reduce base-emitter junction capacitance, as charge storage in the region also contributes to the transistor's forward delay.
Figure 4: Cross-section of the emitter layers within a typical HBT, comprising a heavily-doped semiconductor contact ("cap") layer, a low-resistance N++ emitter layer, and the N+ emitter. Lateral depletion of the N+ emitter can be significant in submicron devices.
The delay term R_ex C_cb is a major limit to HBT scaling for high f_τ. Because of the relative sizes of the emitter and collector Ohmic contacts, in a well-designed submicron HBT, R_c is 4:1 to 10:1 smaller than R_ex, and R_c C_cb can be neglected in a first analysis. To calculate R_ex, we must consider the geometry of the emitter layer structure.
The emitter layer structure of a typical HBT (fig. 4) contains a heavily-doped and narrow-bandgap contact ("cap") layer, and a heavily-doped N++ wide-bandgap emitter layer. A portion of the emitter layer may be more lightly (N+) doped for reduced junction capacitance, and may be of several hundred Å thickness to avoid dopant diffusion from the N++ layer into the emitter-base junction. Accurately calculating the emitter resistance requires the consideration of the resistivity and dimensions of each of the emitter layers. For submicron emitters, the junction width W_e,junct is significantly smaller than the contact width W_e,contact due to lateral undercutting of the emitter during etching of the emitter-base junction. For simplicity in scaling analysis, we will approximate

$$R_{ex} \approx \frac{\rho_e}{A_e}, \qquad (8)$$

where A_e is the emitter junction area and ρ_e is a fitted parameter, approximately 50 Ω-μm^2 for submicron InAlAs/InGaAs HBTs fabricated to date at UCSB.
The R_ex C_cb charging time can now be examined. Since C_cb = εA_c/T_c,

$$R_{ex} C_{cb} = \frac{\rho_e\,\varepsilon}{T_c}\,\frac{A_c}{A_e}. \qquad (9)$$

This term can constitute a significant delay. In HBTs we have fabricated with 275 GHz peak f_τ, the substrate transfer process allows A_c/A_e to be kept small at 2.3:1, yet R_ex C_cb still constitutes 11% of the total 1/(2πf_τ) forward delay. In mesa HBTs (fig. 1) A_c/A_e is often larger than 2.3:1 and hence R_ex C_cb will contribute a larger delay. Because R_ex C_cb ∝ 1/T_c, thinning the collector to reduce τ_c also increases R_ex C_cb.
To increase HBT current gain cutoff frequencies, the base and collector layers must be thinned and the bias current density increased. Thinning the collector increases R_ex C_cb, imposing a limit to scaling. Limits to bias current density imposed by device reliability, and loss in breakdown voltage with reduced collector thickness, are two further potential limits to scaling. Finally, unless the device structure of fig. 1 is laterally scaled, vertical HBT scaling for increased f_τ will result in reduced power-gain cutoff frequencies f_max.
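The size of the R_ex C_cb delay quoted above can be estimated from R_ex ≈ ρ_e/A_e and C_cb = εA_c/T_c, which together give R_ex C_cb = ρ_e ε (A_c/A_e)/T_c. In the sketch below, ρ_e = 50 Ω-μm², A_c/A_e = 2.3 and the 275 GHz f_τ are taken from the text, while the collector thickness (2000 Å) and relative permittivity (13.9, a typical InGaAs value) are assumptions.

```python
import math

EPS0 = 8.854e-12  # vacuum permittivity, F/m

def rex_ccb_delay(rho_e_ohm_um2, area_ratio, t_c_m, eps_r):
    """R_ex*C_cb delay (s) = rho_e * eps * (A_c/A_e) / T_c."""
    rho_e = rho_e_ohm_um2 * 1e-12  # convert ohm*um^2 to ohm*m^2
    return rho_e * eps_r * EPS0 * area_ratio / t_c_m

# T_c = 2000 angstrom and eps_r = 13.9 are ASSUMED values.
delay = rex_ccb_delay(50.0, 2.3, 2000e-10, eps_r=13.9)
total_delay = 1.0 / (2.0 * math.pi * 275e9)
fraction = delay / total_delay
print(f"Rex*Ccb = {delay * 1e15:.1f} fs, "
      f"{100 * fraction:.0f}% of the {total_delay * 1e12:.2f} ps forward delay")
```

With these assumptions the term comes out at roughly a tenth of the total forward delay, in line with the 11% figure, and halving T_c would double it.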
Factors determining f_max
In an HBT with base resistance R_bb and collector capacitance C_cb, the power-gain cutoff frequency is approximately f_max ≈ (f_τ/(8πτ_cb))^(1/2). The base-collector junction is a distributed network, and τ_cb represents an effective, weighted time constant. Because the base-collector junction parasitics are distributed, calculation of τ_cb is complex. To simplify analysis, we will first roughly approximate f_max ≈ (f_τ/(8πR_bb C_cb))^(1/2), where R_bb C_cb is the product of the base resistance and the full capacitance C_cb = εA_c/T_c of the collector-base junction.
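The base resistance R_bb is composed of the sum of contact resistance R_cont, base-emitter gap resistance R_gap, and spreading resistance under the emitter R_spread. With base sheet resistance ρ_s, and specific (vertical) contact access resistance ρ_v, we have

$$R_{bb} = R_{cont} + R_{gap} + R_{spread} = \frac{\sqrt{\rho_s \rho_v}}{2L_e} + \frac{\rho_s W_{eb}}{2L_e} + \frac{\rho_s W_e}{12 L_e}, \qquad (10)$$

where L_e is the emitter stripe length, W_e the emitter junction width, and W_eb the width of the gap between the emitter and the base contact. With C_cb = εW_c L_e/T_c, the base-collector time constant is then

$$R_{bb} C_{cb} = \frac{\varepsilon W_c}{2 T_c}\sqrt{\rho_s \rho_v} + \frac{\varepsilon \rho_s W_e W_c}{12\, T_c} + \frac{\varepsilon \rho_s W_{eb} W_c}{2 T_c}. \qquad (11)$$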
Consider the influence of device scaling on the time constant R_bb C_cb. Decreasing the base thickness to reduce τ_b increases the base sheet resistivity ρ_s, increasing R_bb C_cb. Decreasing the collector thickness T_c to reduce τ_c directly increases R_bb C_cb, as is shown explicitly in eqn. 11.
Low R_bb C_cb, and consequently high f_max, is obtained by scaling the emitter and collector junction widths W_e and W_c to submicron dimensions. Reducing the emitter width W_e alone reduces towards zero the component of R_bb C_cb associated with the base spreading resistance (the second term in eqn. 11). In the normal triple-mesa HBT (fig. 1), the minimum collector junction width W_c is set by the base Ohmic contacts, which must be at least one contact transfer length L_contact wide. As a result, the component of R_bb C_cb associated with the base contact resistance (the first term in eqn. 11) has a minimum value, independent of lithographic limits, and consequently, f_max does not increase rapidly with scaling. Given this minimum R_bb C_cb, attempts to obtain high f_τ by thinning the collector have resulted in decreased f_max, frustrating efforts to improve HBT bandwidths.
If the parasitic collector-base junction is eliminated, f_max will instead increase rapidly with scaling. The collector-base junction need only be present where current flows, e.g. under the emitter. We have fabricated such a device (fig. 2) using a substrate transfer process. If we neglect processing alignment tolerances, the emitter and collector junctions can be of equal width, hence W_c = W_e. With submicron scaling of the emitter and collector junction widths, the first term in eqn. 11 dominates and scales as W_e. f_max then increases as the inverse square root of the process minimum feature size.
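The contrast between the two geometries can be sketched with the lumped approximation f_max ≈ (f_τ/(8πR_bb C_cb))^(1/2), taking R_bb as the sum of contact, gap and spreading resistances, √(ρ_s ρ_v)/2L_e + ρ_s W_eb/2L_e + ρ_s W_e/12L_e, and C_cb = εW_c L_e/T_c. The mesa collector width is assumed here to be W_e + 2(W_eb + L_contact), and every numerical value (sheet resistance, contact resistance, gap, collector thickness, f_τ) is hypothetical.

```python
import math

EPS_F_PER_UM = 13.9 * 8.854e-18  # assumed permittivity, F/um

def rbb_ccb(w_e, w_c, w_eb, t_c, rho_s, rho_v):
    """R_bb*C_cb (s); lengths in um, rho_s in ohm/sq, rho_v in ohm*um^2.

    The emitter length L_e cancels between R_bb and C_cb.
    """
    contact = math.sqrt(rho_s * rho_v) / 2.0
    gap = rho_s * w_eb / 2.0
    spread = rho_s * w_e / 12.0
    return EPS_F_PER_UM * (w_c / t_c) * (contact + gap + spread)

def f_max(f_tau, tau_cb):
    return math.sqrt(f_tau / (8.0 * math.pi * tau_cb))

# HYPOTHETICAL process parameters, for illustration only.
RHO_S, RHO_V, W_EB, T_C, F_TAU = 600.0, 100.0, 0.1, 0.2, 200e9
L_CONTACT = math.sqrt(RHO_V / RHO_S)  # contact transfer length, um

def f_max_transferred(w_e):
    # Collector stripe as narrow as the emitter (alignment tolerance neglected).
    return f_max(F_TAU, rbb_ccb(w_e, w_e, W_EB, T_C, RHO_S, RHO_V))

def f_max_mesa(w_e):
    # ASSUMED mesa geometry: collector spans emitter, gaps and base contacts.
    w_c = w_e + 2.0 * (W_EB + L_CONTACT)
    return f_max(F_TAU, rbb_ccb(w_e, w_c, W_EB, T_C, RHO_S, RHO_V))

for w in (1.0, 0.5, 0.25):
    print(f"W_e = {w:4.2f} um: mesa {f_max_mesa(w) / 1e9:4.0f} GHz, "
          f"transferred-substrate {f_max_transferred(w) / 1e9:4.0f} GHz")

ts_gain = f_max_transferred(0.5) / f_max_transferred(1.0)
mesa_gain = f_max_mesa(0.5) / f_max_mesa(1.0)
```

In this sketch, halving the emitter width raises the transferred-substrate f_max by somewhat more than √2 at fixed f_τ, while the mesa device improves far less because its collector width is dominated by the fixed contact and gap widths.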
To more accurately predict f_max, the distributed nature of the base-collector junction parasitics must be considered. Figure 5 shows a distributed model of a transferred-substrate HBT. Using a small grid spacing, we have entered the resulting network into a microwave circuit simulator (HP-EESOF [27]) to calculate, without approximation, the HBT f_max. Alternatively, analytic expressions for f_max can be developed from hand analysis of the distributed network of fig. 5. Among these is the model of Vaidyanathan and Pulfrey [28], which provides good physical insight.
We now consider the Vaidyanathan/Pulfrey model applied to a transferred-substrate HBT [29]. Figure 6 shows an equivalent circuit representing the Vaidyanathan/Pulfrey model of a transferred-substrate HBT. We define three capacitances. C_cb,e = εL_e W_e/T_c is the capacitance of the collector junction lying under the emitter. C_cb,gap = 2εL_e W_eb/T_c is the capacitance of the collector junction lying under the gap between the emitter and the base contact. C_cb,ext = 2εL_e W_cb/T_c is the capacitance of the collector lying under the base Ohmic contacts. Components of the base resistance are as defined in eqn. 10, with the exception of two additional resistances R_vert = ρ_v/(2W_cb L_e) and R_horiz = ρ_s W_cb/(2L_e). R_vert represents the vertical access resistance through the base Ohmic contact over the path W_cb, and R_horiz represents the lateral sheet resistance over that same path.
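The element definitions above translate directly into a small calculation. The sketch below evaluates the three capacitances and the two additional resistances for one hypothetical geometry; the dimensions, sheet resistance, contact resistance and permittivity are all assumed values chosen for illustration.

```python
EPS_F_PER_UM = 13.9 * 8.854e-18  # assumed permittivity, F/um

def vp_elements(l_e, w_e, w_eb, w_cb, t_c, rho_s, rho_v):
    """Elements of the lumped model; lengths in um,
    rho_s in ohm/sq, rho_v in ohm*um^2."""
    return {
        "C_cb_e": EPS_F_PER_UM * l_e * w_e / t_c,         # under the emitter
        "C_cb_gap": 2 * EPS_F_PER_UM * l_e * w_eb / t_c,  # under both gaps
        "C_cb_ext": 2 * EPS_F_PER_UM * l_e * w_cb / t_c,  # under both contacts
        "R_vert": rho_v / (2 * w_cb * l_e),               # vertical access
        "R_horiz": rho_s * w_cb / (2 * l_e),              # lateral sheet path
    }

# HYPOTHETICAL geometry and material values.
elements = vp_elements(l_e=8.0, w_e=0.5, w_eb=0.1, w_cb=0.2,
                       t_c=0.2, rho_s=600.0, rho_v=100.0)
c_total = elements["C_cb_e"] + elements["C_cb_gap"] + elements["C_cb_ext"]
for name, value in elements.items():
    unit, scale = ("fF", 1e15) if name.startswith("C") else ("ohm", 1.0)
    print(f"{name:9s} = {value * scale:7.2f} {unit}")
print(f"C_cb total = {c_total * 1e15:.2f} fF")
```

The three capacitances sum to the dielectric capacitance of the full collector stripe, εL_e(W_e + 2W_eb + 2W_cb)/T_c, which is a convenient consistency check on any chosen geometry.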
The R_vert/R_horiz network approximates the distributed network charging C_cb,ext in the mesh model of fig. 5. This approximation is valid under the condition that W_cb ≤ L_contact. If this condition does not hold, R_vert and R_horiz must be replaced by a finite element ladder network with a larger number of discrete elements, as in fig. 5. The model of fig. 6 further assumes that the total base width W_b ≫ L_contact. Typical geometries of transferred-substrate HBTs meet both of the aforementioned assumptions.
From fig. 6, we note that the charging resistance seen by C_cb,e and C_cb,gap contains the component R_x = (R_cont ∥ R_vert) + R_horiz = R_cont. While the simplified lumped element model of fig. 6 approximates fig. 5 only if W_cb ≤ L_contact, the relationship R_x = R_cont is in general true even for W_cb > L_contact, as are the expressions for f_max presented below.
In the limit of zero collector series resistance, Vaidyanathan and Pulfrey's model [28, 29] reduces to
where
and
The model of fig. 6 accurately represents the distributed nature of the collector-base junction. We can approximate this network with the simple hybrid-π model of fig. 3 if we select C_cbi such that the correct transistor f_max is obtained. The components of fig. 3 are then given by C_cbi = τ_cb/R_bb and C_cbx = C_cb − C_cbi, where C_cb = C_cb,e + C_cb,gap + C_cb,ext.
Figure 7 compares the f_max of mesa and transferred-substrate HBTs, computed using the finite-element model. For the transferred-substrate device, f_max increases rapidly with deep submicron scaling. Experimentally, we observe a more rapid variation of predicted f_max with collector width than is shown, and fig. 7 predicts a higher f_max than is experimentally observed for mesa HBTs. Series resistance in the base metallization and collector series resistance [28] (not modeled above, and not present in Schottky-collector transferred-substrate HBTs) are possible explanations for the discrepancy.
At high collector current densities, differential space-charge effects in the collector space-charge region result in C_cb,e smaller than εA_e/T_c, and increase the HBT f_max [30, 31]. In III-V materials at high fields, electron velocity v(E) decreases with increasing electric field. Modulating the collector voltage V_cb modulates the collector transit time τ_c (eqn. 3), and partially modulates the space-charge in the collector drift region. This modulated space-charge partially screens the base from modulations in the collector applied field, and C_cb,e is reduced to [31]

$$C_{cb,e} = \frac{\varepsilon A_e}{T_c} - I_c \frac{d\tau_c}{dV_{cb}}, \qquad (15)$$

where εA_e/T_c is the normal dielectric capacitance of the junction. Experimental data confirming C_cb cancellation will be shown in section 5.
The derivation of eqn. 15 is limited by the charge control assumption that changes in carrier concentration occur instantaneously. Clearly, this is not valid at frequencies approaching the inverse of the collector transit time. We now describe the dynamics of capacitance cancellation to first order in frequency, assuming a simplified velocity dependence 1/v(E) ≈ κ_0 + κ_1 E. We further assume that the electric field induced by mobile collector charge is small compared to both the DC and AC applied field across the collector-base junction. Using a formalism similar to [20, 21, 31] it can then be shown that
where dτ_c/dV_cb = κ_1/2. The term −I_c dτ_c/dV_cb is the capacitance cancellation of eqn. 15, while the terms in jω represent the dynamics of this effect. The differential equation of eqn. 16 can be represented by an equivalent circuit model consisting of the dielectric junction capacitance C_cb,e in parallel with the negative capacitance cancellation term, C_cb,canc = −I_c dτ_c/dV_cb, in series with a negative resistance, R_cb,canc = (3/2)τ_c/C_cb,canc. These terms are included in the Vaidyanathan/Pulfrey transistor model of fig. 6. We note that C_cb,canc and R_cb,canc are charged through the total base resistance (R_b,cont + R_gap + R_spread). The elements C_cb,canc and R_cb,canc can therefore appear in the approximate hybrid-π model in shunt across C_cbi. A revised hybrid-π model including the capacitance cancellation terms is shown in fig. 8.
Figure 8: Modified hybrid-π small-signal HBT equivalent circuit with additional negative capacitance C_cb,canc and negative resistance R_cb,canc elements to account for the dynamics of capacitance cancellation.
We caution that this derivation models only to first order in frequency the dynamics of the space-charge redistribution in the collector region. However, the method, though approximate, is sufficient to predict that negative resistance effects should be observed, and may explain device results presented in section 5.1.
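The magnitudes involved in capacitance cancellation can be illustrated with the relations stated above: C_cb,canc = −I_c dτ_c/dV_cb, the net capacitance under the emitter εA_e/T_c + C_cb,canc, and R_cb,canc = (3/2)τ_c/C_cb,canc. The numerical inputs below (dielectric capacitance, bias current, dτ_c/dV_cb and τ_c) are hypothetical and are not measured values from this work.

```python
def cancellation_elements(c_dielectric, i_c, dtau_dvcb, tau_c):
    """Return (net C_cb,e, C_cb,canc, R_cb,canc) in SI units.

    c_dielectric: eps*A_e/T_c (F); i_c: collector current (A);
    dtau_dvcb: d(tau_c)/d(V_cb) (s/V); tau_c: collector transit time (s).
    """
    c_canc = -i_c * dtau_dvcb          # negative cancellation capacitance
    c_net = c_dielectric + c_canc      # reduced capacitance under the emitter
    r_canc = 1.5 * tau_c / c_canc      # negative series resistance
    return c_net, c_canc, r_canc

# HYPOTHETICAL operating point, for illustration only.
c_net, c_canc, r_canc = cancellation_elements(
    c_dielectric=3e-15, i_c=5e-3, dtau_dvcb=0.1e-12, tau_c=0.3e-12)
print(f"C_cb,canc = {c_canc * 1e15:.2f} fF, net C_cb,e = {c_net * 1e15:.2f} fF, "
      f"R_cb,canc = {r_canc:.0f} ohm")
```

Because dτ_c/dV_cb is positive when velocity falls with increasing field, both added elements are negative, and the cancellation grows in direct proportion to the bias current.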
Transferred-substrate HBTs
We now consider the transferred-substrate process as a means of realizing a highly scalable HBT. Substrate transfer provides access to both sides of the device epitaxial material, which allows for the simultaneous definition of narrow emitter and collector stripes. With the extrinsic collector-base capacitance greatly reduced, aggressive lithographic scaling without epitaxial scaling greatly increases f_max at constant f_τ. If high values of both f_τ and f_max are sought, simultaneous lithographic and epitaxial scaling is required. Further improvements in device bandwidth will require operation at higher current densities and reduced emitter parasitic resistance.
Growth and fabrication
The MBE epitaxial structure is grown on Fe-doped semi-insulating InP substrates. Both single and double heterojunction transistors have been fabricated in the transferred-substrate technology. The single heterojunction transistors have an InAlAs/InGaAs emitter-base junction. The double heterojunction devices have an InP collector for increased breakdown, and may have an InP emitter for improved heat flow in the device. A chirped superlattice grade is used to smooth conduction band discontinuities at the heterojunctions. The InGaAs base is typically 300-400 Å thick, has 2kT bandgap grading, and is Be-doped at 5 × 10^19 /cm^3. The transistor collector thickness is typically 2000-3000 Å, and an N+ pulse-doped layer placed 400 Å from the base delays the onset of base push-out at high collector current densities. High f_max devices are typically fabricated with Schottky collector contacts, which provide zero collector series resistance [33].
Figure 9 shows the process flow. Standard fabrication processes [34] define the emitter-base junction, the base mesa, polyimide planarization, and the emitter contacts. The IC wiring environment consists of thin-film NiCr resistors, two levels of metal interconnects, and a PECVD Si3N4 insulator layer for MIM capacitors. The substrate transfer process commences with deposition of the benzocyclobutene (BCB) transmission-line dielectric (5 μm thickness). Thermal and electrical vias are etched in the BCB. The wafer is electroplated to metallize the vias and to form the ground plane. Fig. 10 shows a detailed device cross section.
Minimum device feature sizes are determined by the lithography system used in the process. The projection lithography system at UCSB can define emitter widths down to 0.5 μm. The relative sizes of the emitter and collector junctions are determined by lithographic alignment tolerances, and the collector stripe width must exceed the emitter stripe width by twice the lithographic alignment tolerance. Our projection lithography system aligns to 0.1-0.3 μm registration, depending on the time since maintenance. For deep submicron devices, a JEOL JBX electron-beam lithography system at UCSB is used. With this system, 0.2 μm emitter and 0.3 μm collector stripe widths have been realized. Collector alignment of better than 0.1 μm is achieved using local registration marks for each device.
For the emitter-base junction, deep submicron scaling requires tight control of lateral undercutting during the base contact recess etch. The undercut both narrows the emitter and defines the liftoff edge in the self-aligned base contact deposition. For InAlAs emitters, a combination dry and wet etch is used. A CH4/H2/Ar dry etch removes the N+ InGaAs emitter contact layer and etches into the InAlAs emitter. An HCl/HBr/acetic selective wet etch then removes the remaining InAlAs.
The collector junction is defined by the stripe width of the deposited metal. Subsequent to collector deposition, a self-aligned wet etch of ~1000 Å depth removes the collector junction sidewalls (eliminating fringing fields) and reduces the collector junction width by ~2000 Å.
High Frequency Device Measurements
Prior to discussing transferred-substrate device results, we will consider issues related to high frequency device measurements. Submicron transistors have extremely small reverse transmission characteristics and low shunt output conductances. These features make device measurement and model extraction challenging even in the DC-50 GHz band covered by typical commercial vector network analyzers (VNAs). As state-of-the-art transistor bandwidths far exceed this frequency range, we would like to measure device scattering parameters (S-parameters) at as high a frequency as possible. Presently, VNA test set extensions are available covering frequencies up to 220 GHz, and 325 GHz systems will soon be available.
Accurate device measurements at these frequencies require that careful attention be paid to measurement and calibration methodology.
Ultra-high frequency measurement systems
The 140-220 GHz VNA measurement system used at UCSB consists of an Agilent HP8510C network analyzer interfaced with Oleson Microwave Lab millimeter wave VNA extensions. Frequency synthesizers in the VNA test system generate 17.5-27.5 GHz RF and 11.6-18.4 GHz LO signals that are sent to the Oleson extenders through microwave coaxial cables. A harmonic multiplier chain in the extenders upconverts the RF signal to the measurement frequency, and harmonic mixers downconvert stimulus and response signals obtained from a dual directional coupler at the extender's test ports. The IF signals (< 300 MHz) are then sent back to the VNA for processing. Full two-port transmission and reflection measurements can be obtained with this system, and the dynamic range is greater than 50 dB.
The test ports of the extenders are connected to on-wafer probes through short lengths of WR-5 rectangular waveguide. Due to the relatively high loss of the waveguide (12 dB/m), it is important to keep the connection length as short as possible to preserve the measurement system's dynamic range. At UCSB, ~50 cm of waveguide with two right angle bends is used. This arrangement is found to provide acceptably low loss and sufficient range of motion for probe manipulation. GGB Industries ground-signal-ground wafer probes are mounted onto probe station micromanipulators. A waveguide-to-microcoax transition is internal to the probes, and the insertion loss of the probes is better than 3 dB across the band. Internal bias-tees in the wafer probes allow for biasing of active devices through the center probe conductor.
On-wafer calibration
To underscore the importance of measurement calibration for submicron device measurements, consider that a state-of-the-art InP HBT with a 300 GHz f_τ has a total forward delay of 0.53 psec, the same delay as a ~100 μm length of transmission line in our on-wafer transmission line environment (ε_eff = 2.2). One sees the importance of removing from transistor measurements the effects of all extraneous propagation delays and losses incurred in the measurement system up to the device under test.
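The comparison above can be reproduced with a short calculation. The sketch below assumes the usual relation between total forward delay and current-gain cutoff frequency, τ = 1/(2π f_τ), and a quasi-TEM line whose phase velocity is c/√ε_eff; the function names are illustrative.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s


def forward_delay_s(f_tau_hz):
    """Total forward delay implied by f_tau, using tau = 1 / (2*pi*f_tau)."""
    return 1.0 / (2.0 * math.pi * f_tau_hz)


def equivalent_line_length_m(delay_s, eps_eff):
    """Length of line with the same delay, for phase velocity c / sqrt(eps_eff)."""
    return delay_s * C0 / math.sqrt(eps_eff)


if __name__ == "__main__":
    tau = forward_delay_s(300e9)
    length = equivalent_line_length_m(tau, 2.2)
    print(f"delay = {tau * 1e12:.2f} ps, "
          f"equivalent line length = {length * 1e6:.0f} um")
```

The result is close to the ~100 μm quoted in the text, so an uncorrected line of that length would roughly double the apparent delay of the transistor.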
An accurate VNA calibration will place the measurement reference planes precisely at the input and output of the device under test. However, standard 12-term VNA error corrections do not account for leakage and coupling between on-wafer probes. Highly scaled transistors have extremely small reverse transmission characteristics (S12) and excessive probe-to-probe coupling can easily corrupt device measurements. Probe-to-probe leakage can be accounted for using more complicated 15- or 16-term VNA error corrections 35, 36. However, these corrections require precise characterization of calibration standards, and such characterizations are difficult to achieve for on-wafer elements, particularly at mm-wave frequencies.
The use of 15- or 16-term error corrections can be avoided if the wafer probes are spaced far enough apart to provide sufficient isolation. Probe-to-probe coupling that is at least 20 dB below the S12 of the transistor is sufficient for accurate device characterization 37. Separation between wafer probes is achieved by adding lengths of 50 Ω on-wafer transmission line to the input and output of the device. At UCSB, a 230 μm length of transmission line at each port has been found to provide sufficient isolation.
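To give a feel for what a 20 dB margin implies, the sketch below bounds the error that an uncorrected leakage path adds to a measured S12. It assumes the leakage simply adds as a phasor to the true S12 with unknown relative phase, which is a simplification rather than the error model of the cited work; the function name is illustrative.

```python
import math


def leakage_error_bounds(margin_db):
    """Worst-case S12 error for leakage lying margin_db below the true S12.

    Returns (min magnitude error in dB, max magnitude error in dB,
    max phase error in degrees), assuming a single additive leakage phasor.
    """
    ratio = 10.0 ** (-margin_db / 20.0)   # leakage amplitude relative to S12
    if ratio >= 1.0:
        raise ValueError("leakage must be smaller than the true S12")
    mag_high_db = 20.0 * math.log10(1.0 + ratio)   # leakage adds in phase
    mag_low_db = 20.0 * math.log10(1.0 - ratio)    # leakage adds out of phase
    phase_deg = math.degrees(math.asin(ratio))     # leakage in quadrature
    return mag_low_db, mag_high_db, phase_deg


if __name__ == "__main__":
    low, high, phase = leakage_error_bounds(20.0)
    print(f"magnitude error: {low:.2f} dB to +{high:.2f} dB, "
          f"phase error: +/-{phase:.1f} deg")
```

Under this assumption, a 20 dB margin keeps the S12 magnitude error below about 1 dB and the phase error below about 6 degrees.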
Calibrating the network analyzer involves the measurement of various known calibration standards. A standard approach for device measurements is to utilize a separate calibration substrate that has on it an array of characterized calibration standards. These substrates are available commercially and cover various frequency ranges up to 110 GHz. The goal of these calibrations is to place the measurement reference planes at the wafer probe tips. There are several drawbacks to using this approach for precise device measurements.
As previously mentioned, the transistor is embedded on-wafer between lengths of transmission line. If we use a probe-tip calibration, the effect of the embedding structures must be eliminated from the transistor measurements. An ad hoc approach often used is to measure the capacitance of an open circuit test structure and subtract this capacitance from the measured results. This approach can lead to considerable error, as the pad capacitance may be of the same order as the input capacitance of a submicron device. This approach also ignores the series resistance of the embedding structure, and the series inductance, which will have a considerable effect at mm-wave frequencies. A more precise determination of the electrical characteristics of the embedding structures may be made by modeling the structures electromagnetically, or by measuring the test structures without devices and fitting the results to a lumped element model. In either case, this adds a level of complexity and the opportunity for further error in extracting device parameters.
The second drawback of the probe-tip calibration approach is that calibration substrates generally have a different transmission line environment than the device under test 38. A standard VNA calibration assumes that only a single propagation mode exists at the calibration reference plane for both measurement and calibration. The discontinuity at the probe/wafer interface does not meet these conditions, and the field distribution at the discontinuity will depend on the transmission line environment that is being coupled into. As such, the probe-tip calibration on the calibration substrate need not apply to the substrate of the device under test. We expect discrepancies to increase at higher frequencies as the wavelength approaches the size of the probe tips.
The alternative to probe-tip calibration is to calibrate to the ends of the on-wafer transmission lines. This places the measurement reference planes at the input and output of the device under test, but requires the realization of custom on-wafer calibration standards. Depending on the calibration used, different types of standards are required. Certain VNA calibrations, such as the commonly used Short-Open-Load-Through (SOLT) calibration, require precise characterization of the electrical properties of the standards. Fringing fields and the distributed nature of the elements at mm-wave frequencies require that the elements be modeled electromagnetically or measured using an accurate on-wafer calibration. Again, this adds a level of complexity and the opportunity for further error in extracting device parameters.
The Through-Reflect-Line (TRL) calibration is well-suited for an on-wafer measurement environment 39. The calibration uses two transmission line standards, one of which is designated "through"; the other is designated "line" and differs from the through line by some electrical length ΔL. The final "reflect" standard may be an open or short termination. An advantage of the TRL calibration is that the solution for the calibration error terms is overdetermined, and the reflection coefficient of the reflect standard and propagation constant of the line standard can also be calculated. The only physical property that must be known is the characteristic impedance of the line standard. The characteristic impedance can be determined analytically from transmission line models or electromagnetic simulations, or alternatively, it can be calculated from measurements of the line's capacitance and propagation constant 40, 41. It is important to note that line loss will lead to a characteristic impedance that has frequency-dependent real and imaginary parts. The imaginary part can be large at low frequencies and should be accounted for in the measurement calibration.
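The calculation of characteristic impedance from capacitance and propagation constant can be illustrated with standard transmission-line theory. For a line with per-unit-length parameters R, L, G and C, the ratio γ/Z0 equals G + jωC, so Z0 = γ/(jωC) when the dielectric conductance G is negligible. The sketch below is a minimal illustration under that assumption; the per-unit-length values are hypothetical (chosen to give a 50 Ω lossless impedance), and the series resistance is held constant with frequency, which ignores skin effect.

```python
import cmath
import math


def gamma_from_rlgc(freq_hz, r, l, g, c):
    """Propagation constant (1/m) from per-unit-length R, L, G, C."""
    omega = 2.0 * math.pi * freq_hz
    return cmath.sqrt((r + 1j * omega * l) * (g + 1j * omega * c))


def z0_from_gamma_and_c(gamma, freq_hz, c_per_m, g_per_m=0.0):
    """Characteristic impedance from gamma and line capacitance.

    Uses gamma / Z0 = G + j*omega*C; G defaults to zero (low-loss dielectric).
    """
    omega = 2.0 * math.pi * freq_hz
    return gamma / (g_per_m + 1j * omega * c_per_m)


# Hypothetical line parameters, for illustration only.
R_OHM_PER_M = 5.0e3     # series resistance (assumed frequency independent)
L_H_PER_M = 2.5e-7      # series inductance
C_F_PER_M = 1.0e-10     # shunt capacitance; sqrt(L/C) = 50 ohm

if __name__ == "__main__":
    for f in (1e9, 10e9, 100e9):
        g = gamma_from_rlgc(f, R_OHM_PER_M, L_H_PER_M, 0.0, C_F_PER_M)
        z0 = z0_from_gamma_and_c(g, f, C_F_PER_M)
        print(f"{f / 1e9:6.1f} GHz: Z0 = {z0.real:.2f} {z0.imag:+.2f}j ohm")
```

For these assumed values the imaginary part of Z0 is large at the low end of the sweep and shrinks towards the high end, consistent with the remark that the imaginary part matters most at low frequencies.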
An often-cited disadvantage of the TRL calibration is that one line standard can only cover an 8:1 frequency span, with the ideal ΔL being a quarter-wavelength at the center of the span. As such, multiple line standards are required to cover large frequency ranges, and low frequency line standards can take up a large amount of valuable wafer area. Multiple line standards may also be used to provide measurement redundancy in a band, and reduce the error due to probe placement repeatability 42.
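The 8:1 span follows from a commonly used rule of thumb that the electrical length of ΔL should stay between 20 and 160 degrees. The sketch below applies that rule; the phase limits are an assumption (they are not given in the text), the line is treated as dispersion-free, and ε_eff = 2.2 is the on-wafer value quoted earlier. The 180 GHz center frequency is simply the midpoint of the 140-220 GHz band.

```python
import math

C0 = 299_792_458.0       # speed of light in vacuum, m/s
MIN_PHASE_DEG = 20.0     # assumed lower limit on the electrical length of delta L
MAX_PHASE_DEG = 160.0    # assumed upper limit on the electrical length of delta L


def quarter_wave_delta_l_m(center_freq_hz, eps_eff):
    """Line-length difference that is a quarter wavelength at center_freq_hz."""
    return C0 / (4.0 * center_freq_hz * math.sqrt(eps_eff))


def usable_band_hz(delta_l_m, eps_eff):
    """Frequencies at which delta L is MIN_PHASE_DEG and MAX_PHASE_DEG long."""
    velocity = C0 / math.sqrt(eps_eff)
    f_low = (MIN_PHASE_DEG / 360.0) * velocity / delta_l_m
    f_high = (MAX_PHASE_DEG / 360.0) * velocity / delta_l_m
    return f_low, f_high


if __name__ == "__main__":
    delta_l = quarter_wave_delta_l_m(180e9, 2.2)
    f_low, f_high = usable_band_hz(delta_l, 2.2)
    print(f"delta L = {delta_l * 1e6:.0f} um, usable from "
          f"{f_low / 1e9:.0f} to {f_high / 1e9:.0f} GHz")
```

A single line standard sized this way covers the whole 140-220 GHz band with margin, whereas a standard centered at a few GHz would need a ΔL of several millimetres, which is the wafer-area cost noted above.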
Quantitatively assessing the accuracy of a microwave calibration is difficult. To partially verify the calibration accuracy, we have re-measured calibration standards after calibration in the 75-110 GHz and 140-220 GHz bands. Measurement of a through standard after calibration gives an indication of probe-placement repeatability, as the calibration defines the measurement reference planes to be at the center of the through line. In the 75-110 GHz band, the measurement of a through line showed better than 35 dB return loss, and S21 had < 0.2 dB amplitude variation and < 0.3 degrees of phase variation. In the 140-220 GHz band, measurement of a through line showed better than 30 dB return loss, and S21 had < 0.1 dB amplitude variation and < 1 degree of phase variation. As discussed previously, the TRL calibration does not assume a known reflection coefficient of the reflect standard. Measurement of the short or open reflection standard after calibration, therefore, provides a good indication of the quality of the calibration. In the 75-110 GHz band, measurement of the open standard showed < 0.25 dB amplitude variation and < 1.5 degrees of phase variation. In the 140-220 GHz band, the calibration appeared slightly poorer. Measurement of the open standard showed < 0.4 dB amplitude variation and < 3 degrees of phase variation. Figure 12 shows measurements on a Smith chart of the calibration standards in both frequency bands.
Independent of re-measuring the calibration standards, a quantitative estimate of calibration accuracy would require measurement of known on-wafer elements. Measurements, to be presented in Section 6, show excellent agreement between electromagnetic simulations of passive matching network elements and measured results. Additionally, device measurements presented in the next section show smooth variation across all frequency bands.
Device Results
Depending on the circuit application, transferred-substrate devices can be aggressively laterally scaled for ultra-high f_max, or both laterally and vertically scaled for simultaneously high values of f_max and f_τ. As we are concerned here with high frequency tuned circuit applications, we will consider only those devices scaled for high f_max. These devices are typically fabricated with 400 Å base thicknesses for low base resistance, and 3000 Å thick collectors for low collector-base capacitance. This results in moderate, not high, f_τ. Figure 13 shows microwave gains for a deep submicron single heterojunction transistor fabricated using electron-beam lithography, reported by Lee et al. 43. The emitter and collector junction dimensions are 0.4 μm × 6 μm and 0.7 μm × 10 μm, respectively. The measurements were made in the 10-50 GHz and 75-110 GHz frequency bands using the TRL calibration described in the previous section. At the time these measurements were made, the 140-220 GHz measurement set-up had not yet been obtained. With the device biased at V_ce = 1.2 V and I_c = 6 mA (J_e = 2.5 × 10^5 A/cm^2), the transistor exhibits an extrapolated f_τ of 204 GHz. Mason's invariant (unilateral) power gain is measured to be greater than 20 dB at 100 GHz. If we extrapolate at -20 dB/decade, an f_max of > 1 THz is predicted. However, recent device measurements have indicated that highly scaled devices do not show a well-behaved roll-off with frequency in the unilateral gain. Prior to considering these device measurements, we discuss the use of Mason's gain to predict transistor f_max.
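Both numerical statements in the paragraph above can be checked directly. The sketch below extrapolates a measured gain to 0 dB along a fixed slope and computes the emitter current density from the bias current and junction dimensions; the function names are illustrative, and the extrapolation assumes a single-pole roll-off, which the following discussion shows is not guaranteed for highly scaled devices.

```python
def extrapolated_fmax_hz(gain_db, freq_hz, slope_db_per_decade=-20.0):
    """Frequency at which the gain reaches 0 dB along a fixed roll-off slope."""
    decades_to_unity = gain_db / -slope_db_per_decade
    return freq_hz * 10.0 ** decades_to_unity


def current_density_a_per_cm2(current_a, width_um, length_um):
    """Current density through a rectangular junction, in A/cm^2."""
    area_cm2 = (width_um * 1e-4) * (length_um * 1e-4)   # 1 um = 1e-4 cm
    return current_a / area_cm2


if __name__ == "__main__":
    f_max = extrapolated_fmax_hz(20.0, 100e9)
    j_e = current_density_a_per_cm2(6e-3, 0.4, 6.0)
    print(f"extrapolated f_max = {f_max / 1e12:.2f} THz")
    print(f"J_e = {j_e:.2e} A/cm^2")
```

A gain of exactly 20 dB at 100 GHz extrapolates to 1 THz, so a measured gain greater than 20 dB gives the "> 1 THz" figure quoted in the text.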
For a general two-port network, Mason's invariant (unilateral) power gain is given by

$$U = \frac{|Y_{21} - Y_{12}|^2}{4\,(G_{11}G_{22} - G_{21}G_{12})} \qquad (17)$$

where Y_21 and Y_12 are network admittance parameters, and G_11, G_22, G_12, and G_21 are the real parts of the corresponding admittance parameters. Mason's invariant, U, is the maximum available power gain of the device after it has been unilateralized (Y_12 = 0) using lossless reactive feedback. The gain is invariant with respect to embedding the device in a lossless reciprocal network, and consequently is independent of pad inductive or capacitive parasitics and independent of the transistor configuration (common-emitter vs. common-base). For HBTs well-modeled by a hybrid-π equivalent circuit (fig. 3), Mason's gain conforms closely to a -20 dB/decade variation with frequency (fig. 14). In marked contrast, the maximum available / maximum stable gain is a function of the transistor configuration, and shows no fixed variation with frequency. f_max is unique; at f = f_max the MAG/MSG and U are both 0 dB.
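The invariance property can be demonstrated numerically. The sketch below evaluates the standard expression for Mason's gain, U = |Y21 - Y12|^2 / (4 (G11 G22 - G12 G21)) with G_ij = Re[Y_ij], and then embeds the two-port with a lossless susceptance connected from input to output. The admittance values are hypothetical and chosen only for illustration; they are not measured data.

```python
import math


def mason_u(y11, y12, y21, y22):
    """Mason's unilateral gain (linear) from complex Y-parameters in siemens."""
    numerator = abs(y21 - y12) ** 2
    denominator = 4.0 * (y11.real * y22.real - y12.real * y21.real)
    return numerator / denominator


def add_lossless_feedback(y11, y12, y21, y22, b_siemens):
    """Add a lossless susceptance jB bridging port 1 and port 2.

    A bridging admittance Y adds [[Y, -Y], [-Y, Y]] to the Y-matrix.
    """
    jb = 1j * b_siemens
    return (y11 + jb, y12 - jb, y21 - jb, y22 + jb)


# Hypothetical Y-parameters (siemens), for illustration only.
EXAMPLE_Y = (2e-3 + 5e-3j, -1e-5 - 4e-4j, 40e-3 - 10e-3j, 1e-4 + 6e-4j)

if __name__ == "__main__":
    u_bare = mason_u(*EXAMPLE_Y)
    u_embedded = mason_u(*add_lossless_feedback(*EXAMPLE_Y, 3e-3))
    print(f"U (bare)     = {10.0 * math.log10(u_bare):.2f} dB")
    print(f"U (embedded) = {10.0 * math.log10(u_embedded):.2f} dB")
```

The two values agree because the lossless embedding changes neither the real parts of the Y-parameters nor the difference Y21 - Y12. The function does not guard against a zero denominator, which is the condition under which U becomes infinite.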
... A/cm^2). The transistor measurements show a negative unilateral power gain across the 75-110 GHz band and over parts of the 140-220 GHz band. Above ≈45 GHz, the unilateral power gain increases to infinity and then becomes negative, a condition under which the addition of an appropriate small resistive attenuation results in infinite U. U can be negative if a device has a negative real output conductance G_22 or a positive real feedback term G_12. The transistor of fig. 15 exhibits a very small negative output conductance that peaks at approximately -1 mS, leading to the observed negative unilateral gain. An HBT modeled by the hybrid-π transistor model cannot show a negative output conductance. We speculate that the effect arises from small secondary HBT transport effects in the collector region, either through the dynamics of base-collector capacitance cancellation as described in Section 2.2, or through weak IMPATT effects in the collector depletion region. These effects would not be seen in a typical III-V HBT because of the large positive output conductance arising from high-frequency feedback through R_bb and C_cb. In contrast, submicron transferred-substrate HBTs have an extremely small R_bb C_cbi time constant, and such effects can be observed.
The dynamics of capacitance cancellation may well be the cause of the negative unilateral gain. A dramatic decrease in measured base-collector capacitance is observed with increased bias current. The total collector-base capacitance C_cb is determined from the measured variation with frequency of the imaginary part of the admittance parameter Y_12, Im[Y_12] = -ωC_cb. The total C_cb determined from Y_12 (fig. 17) shows a 2.4 fF decrease between 0.5 mA and 5 mA I_c. The decrease in C_cb results in greatly increased power gains at higher bias currents. Adding a series ...

Another possible explanation for the negative output conductance is systematic errors in the microwave measurements. As discussed in sec. 4, a great deal of work in our group has been put towards developing an accurate calibration methodology. However, measurements of U are inherently difficult, as both products in the denominator of eqn. 17 approach zero for an HBT with small collector-base parasitics. Note that the transistor measurements showed little variation over a number of measurements with different calibrations. Further, the negative output conductance was observed for numerous devices of the same dimensions on the wafer. The transistor S-parameters also show relatively smooth variation across all of the measured frequency bands (fig. 16), showing no evidence of resonances or calibration artifacts. Ultimately, we hope to find further evidence of negative unilateral power gain in next-generation transistor designs.
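The capacitance extraction described above amounts to fitting the slope of Im[Y12] against angular frequency. The sketch below is one minimal way to do this, assuming the low-frequency hybrid-π approximation Y12 ≈ -jωC_cb and a least-squares line through the origin; it is not necessarily the fitting procedure used by the authors. The synthetic data use a hypothetical 6 fF capacitance, not a measured value.

```python
import math


def extract_ccb_farads(freqs_hz, y12_values):
    """Least-squares fit of Im[Y12] = -omega * C_cb through the origin."""
    numerator = 0.0
    denominator = 0.0
    for freq, y12 in zip(freqs_hz, y12_values):
        omega = 2.0 * math.pi * freq
        numerator += omega * (-y12.imag)
        denominator += omega * omega
    return numerator / denominator


def synthetic_y12(freqs_hz, c_cb_farads, g12_siemens=-1e-6):
    """Ideal Y12 = G12 - j*omega*C_cb, used only to exercise the fit."""
    return [complex(g12_siemens, -2.0 * math.pi * f * c_cb_farads)
            for f in freqs_hz]


if __name__ == "__main__":
    freqs = [6e9 + 3e9 * k for k in range(14)]     # 6 to 45 GHz
    y12 = synthetic_y12(freqs, 6e-15)              # hypothetical 6 fF
    print(f"extracted C_cb = {extract_ccb_farads(freqs, y12) * 1e15:.2f} fF")
```

Repeating the fit at each bias point would give the variation of C_cb with collector current that the text describes.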
A consequence of the observation of negative U is that we cannot predict f_max of these highly scaled devices from a -20 dB/decade extrapolation. Nevertheless, the maximum stable / maximum available gain of these devices is very high even at ...

Using the double heterojunction process, large-area power transistors have recently been fabricated. A multi-finger common-base device with a total emitter area of 128 μm^2 has been measured with an extrapolated f_max of 330 GHz 45. The transistor has a breakdown voltage of 7 V and a maximum collector current of I_c = 100 mA. These power transistors are being used for W-band power amplifiers.

... To develop the model, the transistor S-parameters are measured at various DC bias conditions, and the measured Y-parameters are analyzed to extract the bias-dependent parameters, such as the transconductance and emitter-base diffusion capacitance, and the bias-independent terms, such as the extrinsic emitter resistance.
In general, we observe that the hybrid-π model of a submicron HBT shows good correlation with measured S-parameters, h21, and U in the DC-50 and 75-110 GHz bands 43. The model parameters are also consistent with measured bulk and sheet resistivities and junction capacitances. Base-width modulation in HBTs is negligible, hence R_ce is very large. C_be,poly is a metal-polyimide-metal overlap capacitance between the emitter and base contacts (sec. 10).
As previously discussed, the negative output conductance observed in recently fabricated submicron devices cannot be modeled with a standard hybrid-π model. Additionally, we have found that device models developed in the 6-45 GHz band show poor agreement with measured device parameters in the 140-220 GHz band. Figure 20 shows measured and modeled S11 and S22 in the 6-45 GHz and 140-220 GHz bands for the transistor of fig. 19. S12 and S21 are omitted from the graph for clarity, but show similar discrepancy in the 140-220 GHz band. The poor agreement between model and measurements in the higher frequency band points to a weakness in extending the simple hybrid-π model to these frequencies.
The results of fig. 20 are for a highly scaled device, and the discrepancy may be due to collector-transport effects not included in the model. Comparisons of model and measurements for less highly scaled devices in the 140-220 GHz band have not been made at the time of this writing, and to the best of our knowledge no work has been done to characterize mesa HBTs in the 140-220 GHz band. It should be noted that the hybrid-π model approximates to first order in frequency the base and collector transit time effects. In contrast, the T-model (common-base) does not require such approximations. We note, however, that a T-model of the device, developed from 6-45 GHz measurements, also could not fit the measured S-parameters in the 140-220 GHz band.
At the time of this writing, a physically justifiable small-signal device model has not been developed for the 140-220 GHz band. This further complicates estimations of transistor bandwidth, as we cannot predict the power gain roll-off versus frequency. Higher frequency amplifier and oscillator designs will also be limited by the lack of a predictive model.

Wideband analog amplifiers and high frequency tuned amplifiers have been fabricated in the transferred-substrate HBT technology. Prior to presenting amplifier results, we consider some of the design issues faced specifically in ultra-high frequency tuned amplifiers. Because of process complexity and yield issues, first generation tuned amplifier designs have emphasized simple design strategies and low transistor counts. Circuit design is performed using Agilent's Advanced Design System software 46. Small-signal transistor models are developed in-house using the procedures described in Section 5.1. Large-signal model development is more complex, and physically based models have been developed for power amplifier designs. As described in the previous section, poor correlation has been found between the small-signal hybrid-π model and measurements in the 140-220 GHz band. First generation designs in this frequency band were shifted from their design frequencies due to model discrepancies 47, 48. Improved designs are now being developed based on measured transistor S-parameters.
Tuned-amplifier designs have typically utilized transmission line matching networks, with lumped passive elements for stabilization and biasing. At mm-wave design frequencies, electromagnetic simulation of passive elements is essential. A planar method-of-moments electromagnetic simulator is used to model critical passive elements and transmission line discontinuities. The transferred-substrate technology provides a low-dielectric-constant (ε_eff = 2.7) microstrip wiring environment with a thin substrate (5 μm). Standard microstrip CAD models have been found adequate to describe straight sections of transmission line to the frequency limits of our measurement system. The design approach of utilizing electromagnetic simulation of unique passive elements with standard microstrip models has shown excellent agreement with measurements. Figure 21 shows modeled and measured S-parameters for the matching network of the single-stage amplifier described in the next section. A matching network test structure without an active device was realized on-wafer, and the model and measurements show good agreement across the 140-220 GHz band.
The 5 μm BCB microstrip dielectric used in the transferred-substrate process was selected to provide a low inductance, low cross-talk wiring environment for densely packed mixed-signal IC applications. The thin dielectric also improves thermal heatsinking, and provides low inductance access to the backside ground plane. In tuned amplifier design, these advantages are offset by the high resistive losses incurred in the transmission line matching networks. For the amplifier using the matching network of fig. 21, simulation of the circuit with lossless matching networks resulted in a 2.0 dB increase in the gain. Given that resistive losses vary inversely with substrate thickness for a line of constant characteristic impedance, increasing the dielectric thickness may be beneficial for circuits utilizing transmission line matching networks.

Figure 22 shows a chip photo and measured S-parameters of a single-stage G-band tuned amplifier 47, 48. The transistor used in the design had an emitter junction area of 0.4 μm × 6 μm, and a collector stripe of 0.7 μm × 6.4 μm. The amplifier employed a simple common-emitter topology. Shunt-stub tuning at the input and output of the device was used to conjugately match the transistor at the intended design frequency. A shunt resistor at the output was used to ensure low frequency stability, and a quarter-wave line to a radial stub capacitor bypassed the resistor at the design frequency.
Amplifier results
The amplifier exhibited a peak gain of 6.3 dB at 175 GHz, with a gain of better than 3 dB from 140 to 190 GHz. Both the input and output return loss were better than 10 dB at 175 GHz. The gain-per-stage of the amplifier is amongst the highest reported from any transistor technology in this frequency band. Multi-stage amplifier designs based on this first generation design are currently being fabricated. Simulations of three-stage amplifier designs predict a peak gain of 20 dB at 175 GHz.
W-band HBT amplifiers are being developed in the transferred-substrate technology for phased-array antenna applications. First-generation designs utilizing the ... A cascode amplifier exhibited 8.5 dB gain at 75 GHz with a 1 dB gain compression output power of 9.4 dBm. A balanced amplifier employing two cascode stages had a gain of 7.9 dB at 78 GHz and a 1 dB compression output power of 10.7 dBm. Second generation power amplifier designs using the higher breakdown InP double heterojunction technology have recently been fabricated 50. A common-base power amplifier exhibited 8 dB gain and a saturated output power of greater than 16 dBm.
In addition to tuned amplifiers, broadband analog amplifiers for optical fiber receivers have also been fabricated. Results from this effort include: 80 GHz distributed amplifiers 51 (fig. 24), 50 GHz differential amplifiers 52, and Darlington and f_τ-doubler resistive feedback amplifiers (fig. 25) 53. Greater than 400 GHz gain-bandwidth product has been obtained from a single Darlington stage 53.
Conclusions
Extending transistor bandwidths towards terahertz frequencies requires aggressive device scaling. High bandwidths are obtained with heterojunction bipolar transistors by thinning the base and collector layers, increasing emitter current density, decreasing emitter contact resistivity, and reducing the emitter and collector junction widths. HBT amplifiers have been demonstrated in the 140-220 GHz band, and transistors show high levels of available gain at the frequency limits of state-of-the-art measurement systems. The physical characteristics of submicron HBTs make accurate measurement and modeling difficult. Next generation amplifier designs will require further scaling of minimum device feature sizes, and accurate modeling of active and passive circuit components.
