Abstract. The scaling of complementary metal oxide semiconductor (CMOS) transistors has led to the silicon dioxide layer used as a gate dielectric becoming so thin (1.4 nm) that its leakage current is too large. It is necessary to replace the SiO2 with a physically thicker layer of oxides of higher dielectric constant (κ) or 'high K' gate oxides such as hafnium oxide and hafnium silicate. Little was known about such oxides, and it was soon found that in many respects they have inferior electronic properties to SiO2, such as a tendency to crystallise and a high concentration of electronic defects. Intensive research is underway to develop these oxides into new high quality electronic materials. This review covers the choice of oxides, their structural and metallurgical behaviour, atomic diffusion, their deposition, interface structure and reactions, their electronic structure, bonding, band offsets, mobility degradation, flat band voltage shifts and electronic defects. The use of high K oxides in capacitors of dynamic random access memories is also covered. 
Introduction

Scaling and gate capacitance
The most important electronic device is the complementary metal oxide semiconductor (CMOS) field effect transistor (FET) made from silicon. This has arisen because the performance of CMOS devices has continued to improve over a forty-year time span according to Moore's law of scaling. This law notes that the number of devices on an integrated circuit increases exponentially, doubling over a 2 to 3 year period. To allow this, the minimum feature size in a transistor has decreased exponentially with year. The semiconductor Roadmap defines how each design parameter will scale in future years to continue this, as shown in Table 1 and Figure 1.
The scaling cannot go on forever, and the limits to Moore's law are often believed to be in lithography and the availability of sufficiently small wavelengths of light to pattern the minimum feature size. It turns out that materials are now also an important constraint. First, the maximum current density in interconnects between transistors recently led to copper replacing aluminium as the conductor used in interconnects. Then, the problem of RC time delays around the integrated circuit led to an effort to replace the silicon dioxide used as the inter-circuit passivant by a material of lower dielectric constant such as SiO2Fx or SiOCH alloys. But the most serious problem in logic circuits is now in the FET "gate stack", that is the gate electrode and the dielectric layer between the gate and the silicon channel.
e-mail: jr@eng.cam.ac.uk
The SiO2 layer presently used as the gate dielectric is becoming so thin (under 2 nm) that the gate leakage current due to direct tunnelling of electrons through the SiO2 will be so high, exceeding 1 A/cm² at 1 V (Fig. 2), that the circuit power dissipation will increase to unacceptable values [1-4]. In addition, it becomes increasingly difficult to produce and accurately measure films of such small thickness. Finally, the reliability of SiO2 films against electrical breakdown declines in thin films. Thus for these three reasons, but principally due to leakage, it is desired to replace SiO2 as a gate oxide.
Tunnelling currents decrease exponentially with increasing distance. An FET is a capacitance-operated device, where the source-drain current of the FET depends on the gate capacitance,
$$C = \frac{\varepsilon_0 K A}{t} \qquad (1)$$
where ε0 is the permittivity of free space, K is the relative permittivity, A is the area and t is the SiO2 thickness. Hence, the solution to the tunnelling problem is to replace SiO2 with a physically thicker layer of a new material of higher dielectric constant (permittivity) K, Figure 3. This will keep the same capacitance, but will decrease the tunnelling current. These new gate oxides are called 'high K oxides'.
The European Physical Journal Applied Physics
[Figure labels: gate metal; poly-Si; metal gate, e.g. TaSiNx]
For device design, all FET dimensions scale proportionately and the precise material does not affect electrical designs, so it is convenient to define an 'electrical thickness' of the new gate oxide in terms of its equivalent silicon dioxide thickness or 'equivalent oxide thickness' (EOT) as
$$t_{ox} = \mathrm{EOT} = \frac{3.9}{K}\, t_{HiK} \qquad (2)$$
Here 3.9 is the static dielectric constant of SiO2. The objective is to develop high K oxides which allow scaling to continue to ever lower values of EOT. The gate leakage problem has been apparent since the late 1990s [4], but at that time the criteria for the choice of oxide were not known. In about 2001, the choice of oxide narrowed to HfO2, but the problems of making HfO2 into a successful electronic material appeared extremely severe. It was not widely believed that high K oxides would be used; instead, it was expected that device engineers would use a novel device design to circumvent the problem. However, the increasing importance of the low-power sector of electronics, where power dissipation is a key issue, in mobile phones, laptops etc., meant that the problem had to be confronted [1]. Low standby power CMOS requires a leakage current of below 1.5 × 10⁻² A/cm² rather than just 1 A/cm². The initial problems of manufacturing high K oxide layers of sufficiently low EOT have been overcome. Recent announcements by key firms such as Intel [5] indicate that enough of the problems are now solved that high K oxides will be implemented in 2007 at the 65 nm node.
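As a rough numerical illustration of the EOT definition, the following Python sketch converts between the physical thickness of a high K layer and its EOT. The function names and the example K values are illustrative assumptions, not values taken from the tables of this review; only the constant 3.9 comes from the text.

```python
K_SIO2 = 3.9  # static dielectric constant of SiO2, as quoted in the text


def eot(t_hik_nm: float, k: float) -> float:
    """EOT (nm) of a high K layer with physical thickness t_hik_nm and permittivity k."""
    if k <= 0 or t_hik_nm < 0:
        raise ValueError("k must be positive and thickness non-negative")
    return (K_SIO2 / k) * t_hik_nm


def physical_thickness(eot_nm: float, k: float) -> float:
    """Physical thickness (nm) of a layer of permittivity k that gives the requested EOT."""
    if k <= 0 or eot_nm < 0:
        raise ValueError("k must be positive and EOT non-negative")
    return (k / K_SIO2) * eot_nm


if __name__ == "__main__":
    # Hypothetical K values chosen only to show the trend.
    for k in (3.9, 10.0, 25.0):
        t = physical_thickness(1.0, k)
        print(f"K = {k:5.1f}: a 1.0 nm EOT corresponds to {t:.2f} nm of oxide")
```

The point of the sketch is that, at fixed EOT, the physical thickness available to suppress tunnelling grows in proportion to K.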
Four key problems have been identified by the industry [6]. These are (1) the ability to continue scaling to lower EOTs, (2) the loss of carrier mobility in the Si when using high K oxides, (3) the shifts of the gate threshold voltage, and finally (4) the instabilities caused by the high concentration of electronic defects in the oxides. Thus, this paper reviews the choice of oxides, their deposition, thermal stability, stability in device structures, electronic structure, interface properties, band offsets, electronic defects and carrier mobilities, in order to understand what has been achieved so far and how to solve these four problems.
At the same time, the scaling of the main form of memory, dynamic random access memory (DRAM), also requires a change of dielectric [7]. In DRAM, information is stored as charge in a capacitor which is periodically refreshed. The capacitor must retain charge during this time, so the leakage current density through the capacitor must be below 10⁻⁷ A/cm², lower than for gate dielectrics in logic circuits. The capacitor dielectric is presently Si oxy-nitride. This will have to be replaced in the same way by a material of higher K to continue the scaling. DRAMs can continue scaling by using more complex capacitor shapes with larger surface area to delay the transition, but again it will occur. Here, although the leakage current limit is lower, there are fewer constraints on the high K oxide, because the oxide is not in direct contact with any Si and it must only act as an insulator. The review will also cover this aspect.
EOT
In CMOS FETs, the gate capacitance is actually the series combination of three terms, the oxide capacitance, the depletion capacitance of the gate electrode, and the capacitance to the carriers in the Si channel [1], as shown in Figure 4. These three capacitances add as
$$\frac{1}{C} = \frac{1}{C_{ox}} + \frac{1}{C_{gate}} + \frac{1}{C_{Si}} \qquad (3)$$
As C varies as 1/t, capacitances in series can be represented by a sum of effective distances. Thus we can define an 'effective capacitance thickness' (of SiO2) as
$$\mathrm{ECT} = \mathrm{EOT} + t_{gate} + t_{Si} \qquad (4)$$
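The equivalence between adding capacitances in series and adding SiO2-equivalent thicknesses can be checked numerically. The sketch below is only an illustration: the helper names and the three thickness values are hypothetical, and all thicknesses are assumed to be already expressed as SiO2-equivalent distances.

```python
import math

EPS0 = 8.854e-12  # permittivity of free space, F/m
K_SIO2 = 3.9      # static dielectric constant of SiO2 (from the text)


def capacitance_per_area(t_nm: float, k: float = K_SIO2) -> float:
    """Parallel-plate capacitance per unit area (F/m^2) for thickness t_nm."""
    return EPS0 * k / (t_nm * 1e-9)


def series(*caps: float) -> float:
    """Total capacitance of capacitors connected in series."""
    return 1.0 / sum(1.0 / c for c in caps)


def ect(eot_nm: float, t_gate_nm: float, t_si_nm: float) -> float:
    """Effective capacitance thickness as the sum of SiO2-equivalent distances."""
    return eot_nm + t_gate_nm + t_si_nm


if __name__ == "__main__":
    # Hypothetical SiO2-equivalent contributions, in nm.
    eot_nm, t_gate_nm, t_si_nm = 1.0, 0.3, 0.4
    c_series = series(*(capacitance_per_area(t) for t in (eot_nm, t_gate_nm, t_si_nm)))
    c_from_ect = capacitance_per_area(ect(eot_nm, t_gate_nm, t_si_nm))
    print(math.isclose(c_series, c_from_ect, rel_tol=1e-9))
```

Because each term scales as 1/t with the same prefactor, the two routes give the same total capacitance for any choice of the three distances.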
The channel capacitance arises because quantum delocalisation of the two-dimensional electron gas means that these electrons cannot lie infinitely close to the channel surface, but must delocalise a few ångströms into the channel. This capacitance contribution is intrinsic and cannot easily be removed.
On the other hand, the gate electrode is presently made out of degenerately doped polycrystalline silicon, for engineering convenience. Poly-Si is a reasonable metal, but it is not the best metal. Its low carrier density gives a depletion depth of a few Å, whereas a good metal has a higher carrier density and a depletion depth of only 0.5 Å. This depletion effect can be removed by replacing the poly-Si with a normal metal. Typical metals for this use could be TiN, TaSiN and Ru.
The metal is chosen primarily for its work function. The work function of the gate electrode determines the gate threshold voltage needed to turn the device into inversion. There are three choices [1]. In CMOS there are NMOS and PMOS devices. The first choice is to use the same metal for both NMOS and PMOS devices, in which case its work function should correspond to the mid gap energy of Si, about 4.6 eV. This is the simplest, most easily manufactured choice, but also the worst in terms of turn-on voltage. The harder choice is to use a different metal for NMOS and PMOS gates. This requires an NMOS gate metal with a work function close to the Si conduction band energy, 4.0 eV below the vacuum level. Such a metal will be quite reactive. For PMOS, this requires a metal with a work function close to the Si valence band, or 5.1 eV. This metal would be very noble, like Au, but such metals are difficult to etch. Thus, 'metal gates' is a separate topic, which turns out to be intimately linked to gate oxides and also requires considerable development.
Choice of high K oxide
Silicon dioxide is the key reason that microelectronics technology uses Si and not some other semiconductor. Si is an average semiconductor in performance, but in all other aspects SiO2 is an excellent insulator. SiO2 has the key advantage that it can be made from Si by thermal oxidation, whereas every other semiconductor (Ge, GaAs, GaN, SiC, ...) has a poor native oxide. SiO2 is amorphous, has very few electronic defects and forms an excellent interface with Si. It can be etched and patterned to a nanometer scale. Its only problem is that when very thin it is possible to tunnel across it. Hence, we must lose these advantages of SiO2 and start to use a new high K oxide. We can in principle choose from a large part of the Periodic table.
The requirements of a new oxide are six-fold:
1. It must have a high enough K that it will be used for a reasonable number of years of scaling.
2. The oxide is in direct contact with the Si channel, so it must be thermodynamically stable with it.
3. It must be kinetically stable, and be compatible with processing to 1000 °C for 5 seconds.
4. It must act as an insulator, by having band offsets with Si of over 1 eV to minimise carrier injection into its bands.
5. It must form a good electrical interface with Si.
6. It must have few bulk electrically active defects.
K value
The first requirement means that the oxide's K should be over 10, preferably 25 to 30. There is a trade-off with the band offset condition, which requires a reasonably large band gap. Table 2 and Figure 5 show that the K of candidate oxides tends to vary inversely with the band gap, so we must accept a relatively low K value [8]. There are of course oxides with extremely large K, such as ferroelectrics like BaTiO3, but these have too small a band gap. In fact, a huge K is undesirable in CMOS design because it causes undesirably strong fringing fields at source and drain electrodes [9].
Thermodynamic stability
The second requirement arises from the condition that the oxide must not react with Si to form either SiO2 or a silicide according to the unbalanced reactions,
$$\mathrm{MO_2 + Si \rightarrow M + SiO_2} \qquad (5)$$
$$\mathrm{MO_2 + 2Si \rightarrow MSi_2 + SiO_2} \qquad (6)$$
This is because the resulting SiO2 layer would increase the EOT and negate the effect of using the new oxide. In addition, any silicide formed by (6) would generally be metallic and would short out the field effect.
This condition requires that the oxide has a higher heat of formation than SiO2. Hubbard and Schlom [10, 11] found that this restricts the possible oxides to very few, from columns II, III and IV of the Periodic table. Zr and Hf are both from column IV and are generally believed to be the two most similar elements in the main Periodic table. However, it also turns out that the thermodynamic data for many oxides were not very accurate. It was subsequently found that ZrO2 is actually slightly unstable [12-18].
One way to represent the stability or not of an oxide in contact with Si is on a ternary phase diagram with tie-lines [1]. Figure 6 shows the ternary phase diagrams for the Ta-Si-O and Zr-Si-O systems. A given point in the diagram represents a composition, and the temperature must be specified. [...] and indeed any composition in (ZrO2)1−x(SiO2)x are connected by tie-lines and are in equilibrium in contact.
Kinetic stability
The third condition is to be compatible with existing process conditions. Assuming we choose an amorphous oxide, this requires that the oxide remain amorphous when annealed at up to 1000 °C for 5 seconds. This is a strenuous condition in that SiO2 is an excellent glass-former but most other high K oxides are not. Al2O3 is a reasonably good glass-former and is the next best in this respect. Ta2O5 is a moderately good glass-former, but was eliminated because it is reactive. All the other oxides crystallise well below 1000 °C. This problem can be circumvented by alloying the desired oxide with a glass-former, SiO2 or Al2O3, giving either a silicate or an aluminate [19]. This then retains the stability against crystallisation to close to 1000 °C. However, this comes with the significant disadvantage of a lower K value. If this were the main condition, aluminates would be preferable to silicates, because they have a higher K. The K value roughly follows a linear rule of mixtures with composition, although there has been discussion of this aspect in a few cases. The addition of some nitrogen is found to raise the crystallisation temperature further, and so Hf silicates can just pass this criterion [20].
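The linear rule of mixtures mentioned above can be written as a one-line estimate. The sketch below is an illustration only: it assumes that composition is expressed as a mole fraction of the glass-former, takes K = 3.9 for SiO2 from the text, and uses a hypothetical K of 25 for the pure metal oxide. The text itself notes that the linear rule is only approximate.

```python
K_SIO2 = 3.9  # static dielectric constant of SiO2 (from the text)


def k_rule_of_mixtures(x_glass: float, k_oxide: float, k_glass: float = K_SIO2) -> float:
    """Linear estimate of K for an alloy (oxide)_(1-x) (glass-former)_x.

    x_glass is the fraction of glass-former (assumed here to be a mole fraction).
    """
    if not 0.0 <= x_glass <= 1.0:
        raise ValueError("x_glass must lie between 0 and 1")
    return (1.0 - x_glass) * k_oxide + x_glass * k_glass


if __name__ == "__main__":
    k_oxide = 25.0  # hypothetical K for the pure metal oxide
    for x in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(f"x(SiO2) = {x:.2f}: K is roughly {k_rule_of_mixtures(x, k_oxide):.1f}")
```

For an aluminate, the same function can be called with `k_glass` set to the K of Al2O3, which is not quoted in this section and would have to be supplied by the reader.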
The other alternative is to use nano-crystalline oxides. This was originally thought to be a poor choice, because the grain boundaries would act as paths of higher leakage current. However, in practice, Lee et al. [21] found crystallised HfO2 to have a similar leakage to amorphous HfO2.
Band offset
The high K oxide must act as an insulator. This requires that the potential barrier at each band must be over 1 eV in order to inhibit conduction by the Schottky emission of electrons or holes into the oxide bands [8, 22], as shown schematically in Figure 7. SiO2 has a wide gap of 9 eV, so it has high barriers for both electrons and holes. However, if the oxide has a narrower band gap like SrTiO3, which is only 3.3 eV, its bands must be aligned almost symmetrically with respect to those of Si for both barriers to be over 1 eV. In practice, the conduction band offset is usually smaller than the valence band offset. This limits the choice of oxide to those with band gaps over 5 eV. The oxides that satisfy this criterion are Al2O3, ZrO2, HfO2, Y2O3, La2O3 and various lanthanides, and their silicates and aluminates [8]. It is interesting that these are the same oxides as pass the thermal stability criterion. This is because a high heat of formation correlates with a wide band gap in ionic compounds.
[Figure 8 caption (fragment): ... [23], ZrO2 [24], Al2O3 [23, 25] and La2O3 [15]. (b) Leakage current density vs. EOT for HfO2 with poly-Si gates and TiN gates, after [26].]
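The 1 eV barrier criterion can be illustrated with simple arithmetic. The sketch below assumes that the Si gap lies entirely inside the oxide gap, so that the two offsets and the Si gap add up to the oxide gap, and it assumes a Si band gap of about 1.1 eV, a value not quoted in this section. The function names and the conduction band offsets used in the example are hypothetical.

```python
SI_GAP_EV = 1.1  # approximate band gap of Si (assumed, not quoted in the text)


def valence_band_offset(oxide_gap_ev: float, cb_offset_ev: float) -> float:
    """Valence band offset implied by the oxide gap and the conduction band offset."""
    return oxide_gap_ev - SI_GAP_EV - cb_offset_ev


def passes_barrier_criterion(oxide_gap_ev: float, cb_offset_ev: float,
                             min_barrier_ev: float = 1.0) -> bool:
    """True if both the electron and the hole barrier exceed min_barrier_ev."""
    vb_offset_ev = valence_band_offset(oxide_gap_ev, cb_offset_ev)
    return cb_offset_ev > min_barrier_ev and vb_offset_ev > min_barrier_ev


def symmetric_offset(oxide_gap_ev: float) -> float:
    """Offset on each side if the oxide bands were aligned exactly symmetrically."""
    return (oxide_gap_ev - SI_GAP_EV) / 2.0


if __name__ == "__main__":
    # Gaps of 9 eV (SiO2) and 3.3 eV (SrTiO3) are the values quoted in the text.
    for name, gap in (("SiO2", 9.0), ("SrTiO3", 3.3)):
        print(f"{name}: symmetric alignment would give {symmetric_offset(gap):.2f} eV per band")
    # Hypothetical asymmetric alignment for a 3.3 eV oxide.
    print(passes_barrier_criterion(3.3, cb_offset_ev=0.5))
```

With a 3.3 eV gap, even perfectly symmetric alignment leaves only about 1.1 eV per band, which is why any asymmetry pushes one barrier below the 1 eV limit.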
The leakage current for various high K oxides as a function of EOT is plotted in Figure 8. Figure 8(a) shows data for HfO2 from Gusev [23], for ZrO2 from Gusev [24], for Al2O3 from Guha [23, 25], and for La2O3 from Iwai [15]. Figure 8(b) compares data for HfO2 films with poly-Si electrodes and HfO2 with TiN electrodes, from Tsai et al. [26].
Yeo et al. [27] have defined a scaling figure of merit to compare leakage currents by combining the barrier height, tunnelling mass and K. Lanthanides have the lowest leakage in Figure 8.
Interface quality
The oxide is in direct contact with the Si channel. The carriers induced by the gate lie within ångströms of the Si-oxide interface. Hence, this interface must be of the highest electrical quality, in terms of roughness and the absence of interface defects. Extra defects are associated with oxide grain boundaries. Therefore, there are two ways to ensure a high quality interface: either use a crystalline oxide grown epitaxially on the Si, or use an amorphous oxide.
Using an amorphous oxide has many advantages over a polycrystalline oxide. First, it is like the existing Si:SiO2 situation: it is the lowest cost solution, most compatible with the existing process. Second, an amorphous oxide might be able to configure its interface bonding to minimise the number of interface defects. Third, it is possible to gradually vary the composition of an amorphous oxide without creating a new phase, for example as in silicate alloys, or interfacial layers, or when adding nitrogen. Fourth, an amorphous oxide is isotropic, as is its dielectric constant, so that fluctuations in polarisation from differently oriented oxide grains will not scatter carriers. Finally, amorphous phases have no grain boundaries. Grain boundaries in a polycrystalline oxide act as easy diffusion paths for dopants, such as B or P from a poly-Si gate electrode lying above.
The advantages of epitaxial oxides may come in the future, where their ability to create more abrupt interfaces allows us to reach lower EOTs.
Defects
Electrically active defects are defined as atomic configurations which give rise to electronic states in the band gap of the oxide. Typically these are sites of excess or deficit of oxygen or impurities. Defects are undesirable for four reasons. Firstly, charge trapped in defects causes a shift in the gate threshold voltage of the transistor, the voltage at which it turns on. Secondly, the trapped charge will change with time so the threshold voltage will shift with time, leading to instability of operating characteristics. Thirdly, trapped charge scatters carriers in the channel and lowers the carrier mobility. Fourthly, defects cause unreliability; they are the starting point for electrical failure and breakdown of the oxide.
SiO2 is an almost ideal insulating oxide, in that it has a low concentration of defects which give rise to states in the gap. This is fundamentally because it has a low coordination number, so that its bonding can relax and rebond any broken bonds at possible defect sites. Any remaining defects are passivated by hydrogen. The high K oxides are not materials with a low intrinsic defect concentration because their bonding cannot relax as easily. Much of the present-day engineering of these oxides consists of pragmatic strategies of trying to reduce defect densities by processing control and annealing.
Materials chemistry of high K oxides
Deposition
The great advantage of SiO2 is that it can be grown by thermal oxidation. In contrast, high K oxides must be deposited. Deposited oxides are never as good. The advantages and disadvantages of various deposition methods are summarised in Table 3. Sputtering is one of a number of physical vapour deposition (PVD) methods. Its advantage is that it is broadly available and can produce pure oxides. Its disadvantage is that, because oxides are insulators, sputtered oxides tend to have plasma-induced damage. Also, PVD methods deposit in line of sight, so they do not give good coverage.
A method for producing highly pure, thin oxides is to evaporate the metal by electron beam, which is highly controllable down to small thicknesses, and to oxidise the deposited metal by ozone or UV assisted oxidation. The advantage is that this produces less damage than oxide sputtering and should produce the purest oxide. But it is not a production method. One could also ion beam sputter the metal, with the ion beam directed on the sputter target, not on the substrate. This does not produce damage.
The preferred industrial scale methods are chemical vapour deposition (CVD) and atomic layer deposition (ALD). CVD uses a volatile metal compound as a precursor which is introduced into the chamber and oxidised during deposition onto the substrate. The advantages of CVD are that it is already widely used in the electronics industry for insulator deposition, that it gives conformal coverage over complex shapes because it is not just line of sight, and that the growth rate is controllable over a wide range from very slow to high. The CVD precursors can be metal chlorides such as ZrCl4 and HfCl4 or metal organics such as tetrabutoxyl Zr, in which case it is called metal-organic CVD (MOCVD).
Atomic layer deposition is a method of cyclic deposition and oxidation [28, 29]. As shown schematically in Figure 9, the surface is exposed to the precursor, which is adsorbed as a saturating monolayer. The excess precursor is then purged from the chamber by an Ar pulse. A pulse of oxidant such as H2O, H2O2 or ozone is then introduced, which must fully oxidise the adsorbed layer to the oxide and a volatile by-product such as HCl. The excess oxidant is then purged by a pulse of Ar, and the cycle is repeated.
The effective chemical reactions for ZrCl4 and water proceed in two stages. The existing ZrO2 surface is assumed to be terminated by OH groups at about 300 °C. In the first stage, the ZrCl4 chemisorbs onto the OH sites by the exothermic elimination of HCl. In the second stage, water oxidises the Cl atoms, again with the elimination of HCl.
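The two stages described here can be written schematically per surface site as follows. This is a sketch under stated assumptions rather than the stoichiometry of the original equations: an asterisk marks a surface species, one HCl is assumed to be eliminated per chemisorption step, and the second line is applied once for each remaining Cl ligand. Only the overall reaction in the last line is fixed by mass balance.

```latex
\begin{align*}
\mathrm{Zr{-}OH^{*}} + \mathrm{ZrCl_4} &\rightarrow \mathrm{Zr{-}O{-}ZrCl_3^{*}} + \mathrm{HCl}
  && \text{(precursor pulse, assumed single-site form)} \\
\mathrm{Zr{-}Cl^{*}} + \mathrm{H_2O} &\rightarrow \mathrm{Zr{-}OH^{*}} + \mathrm{HCl}
  && \text{(oxidant pulse, per Cl ligand)} \\
\mathrm{ZrCl_4} + 2\,\mathrm{H_2O} &\rightarrow \mathrm{ZrO_2} + 4\,\mathrm{HCl}
  && \text{(overall)}
\end{align*}
```

The second step regenerates the OH termination, which is what allows the cycle to repeat.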
The precursor is designed so that both steps of adsorption and oxidation are exothermic. The precursor must undergo self-limiting adsorption, be volatile, high purity and non-toxic, and have no gas phase reactions, no self-decomposition, and no etching of the existing oxide. The first precursors for ZrO2 and HfO2 were the chlorides. However, these have low volatility. A wide range of new precursors is being developed [28, 30]. ALD was developed to produce highly conformal, pinhole-free insulating films, as seen in Figure 10. The advantages of ALD are that it is able to grow the thinnest films of all methods, and the most conformal films even into deep trenches. A disadvantage is its slow growth rate. A disadvantage of ALD and MOCVD is that they generally introduce impurities into the oxides, such as C, H or Cl, depending on precursor, whose electrical activity needs careful study. Careful annealing strategies are needed to densify the CVD and ALD oxides and remove impurities. ALD is an excellent method for producing Al2O3, using trimethyl-aluminium as precursor [28]. This and other reasons led to the adoption of ALD for many high K oxides.
Each cycle of ALD adds a layer of oxide which is usually much less than an atomic layer thick, despite its name. The precursor adsorption saturates below one monolayer because of steric hindrance. This is not a significant disadvantage; it just takes more cycles to grow a certain thickness.
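The consequence of sub-monolayer growth is simply a larger cycle count, which can be estimated as below. This is a minimal sketch: the function name is hypothetical, the growth per cycle is treated as constant (ignoring any slow initial nucleation), and the numbers in the example are illustrative rather than measured values.

```python
import math


def ald_cycles_needed(target_thickness_nm: float, growth_per_cycle_nm: float) -> int:
    """Whole number of ALD cycles needed to reach at least the target thickness.

    Assumes a constant growth per cycle, which is an idealisation.
    """
    if target_thickness_nm < 0:
        raise ValueError("target thickness must be non-negative")
    if growth_per_cycle_nm <= 0:
        raise ValueError("growth per cycle must be positive")
    return math.ceil(target_thickness_nm / growth_per_cycle_nm)


if __name__ == "__main__":
    # Hypothetical numbers: a 3 nm film grown at a quarter or half of a
    # nominal 0.5 nm layer per cycle.
    for fraction in (0.25, 0.5):
        gpc = fraction * 0.5
        print(f"coverage {fraction:.2f} per cycle: {ald_cycles_needed(3.0, gpc)} cycles")
```

Halving the coverage per cycle doubles the number of cycles, but does not change the final film thickness that can be reached.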
The most inert surface of Si is regarded as the H-terminated surface obtained by the HF-last cleaning procedure. In the development of ALD, it was found that ALD of ZrO2 and HfO2 from chlorides or many organic precursors did not nucleate easily on HF-last Si surfaces and had a slow initial growth rate [31, 32], as in Figure 11. This meant that oxide films even 3 ML thick were not continuous or 'closed' but islanded [31]. It was found that nucleation occurred much more readily on a slightly pre-oxidised Si surface [31]. Thus, ALD is usually carried out on a 'chemical oxide' (SiO2) surface formed by ozone cleaning of Si. This limits the ultimate lowest EOT that ALD can presently achieve. However, the development of ALD precursors which do nucleate on H-terminated Si and different processing strategies will overcome this obstacle when needed [33, 34].
Alloy crystallisation
Silicate and aluminate alloys of Zr, Hf and La oxides are often used instead of the pure metal oxides in order to have a higher resistance to crystallisation [19, 20, 35]. Zr silicate has been the most widely studied. Crystallisation directly to the crystalline silicate ZrSiO4 is inhibited by kinetics. Instead, Maria et al. [36] showed that crystallisation occurred by the phase separation of the ZrO2 and SiO2 phases followed by the crystallisation of the ZrO2 component. This can be seen for HfO2-SiO2 alloys in the high-resolution transmission electron microscope images in Figure 12 for two different compositions [38].
The phase diagram of the ZrO2-SiO2 system is known reasonably accurately [36-38], as shown in Figure 13. That of HfO2-SiO2 is not known as well, but it is assumed to be similar to ZrO2-SiO2 because of the chemical similarity of Zr and Hf. The key factor is that ZrO2 and SiO2 liquids are immiscible over a small range of composition. This is attributed to the high ionic charge of Zr. This 'miscibility gap' can be continued to lower temperatures to define a miscibility gap in the solids. This also defines a spinodal region in which the alloy can spontaneously phase separate to lower its free energy [37]. The glass transition temperature is also marked in Figure 13; it is lower in ZrO2-rich alloys. Thus, crystallisation occurs by two mechanisms. For Zr contents between 20 and 60 mol%, crystallisation proceeds by spinodal decomposition followed by crystallisation of the ZrO2 component. This tends to lead to small grain sizes. Films with over 60% Zr will crystallise by the kinetically limited nucleation and growth of crystalline ZrO2. This was confirmed by extensive TEM and x-ray scattering studies on Hf silicate alloys by Stemmer et al. [38, 39]. The La silicate phase diagram [36] is qualitatively similar to that of the ZrO2-SiO2 system except that the two-phase region is further towards SiO2 (Fig. 14).
In contrast, the phase diagrams of aluminates such as ZrO2-Al2O3 do not show a miscibility gap [40], as seen in Figure 15, so they are more resistant to crystallisation [41]. However, it turns out that aluminates have higher densities of electronically active defects, so that silicates are preferred to aluminates for gate oxide applications.
Even silicates cannot fully meet the 1000 °C requirement. The final improvement in performance comes with adding a fraction of nitrogen to the alloy [15]. The N reduces the diffusion coefficient of oxygen in the alloys, and this reduces the crystallisation rate sufficiently that the alloy can now withstand 1000 °C.
Lee et al. [42] have studied the effect of adding nitrogen at either interface or in the bulk.
Atomic diffusion
We noted that a gate oxide must withstand processing to temperatures of order 1000 °C without changing its state. It must also not mix with either the Si channel or the poly-Si (or metal) gate electrode, or allow components of the gate electrode through to the Si. All these aspects require the gate oxide to have low atomic diffusion coefficients. Interestingly, the proposed oxides HfO2 and ZrO2 belong to the class of fluorite oxides like CeO2 which are fast oxygen ion conductors, of interest in solid state fuel cells or high temperature sensors. Clearly, for our application oxide diffusion must be inhibited.
A great advantage of alloying with SiO2 is that the Si sites in silicates are covalently bonded to oxygen. This greatly reduces the oxygen diffusion rate. The diffusion rates of Hf, O, B and P in HfO2 and Hf silicate have been measured after implantation by secondary ion mass spectroscopy (SIMS) and nuclear reaction profiling [43-47] to confirm these observations. The mixing of oxide and Si layers has also been studied by medium energy ion scattering (MEIS), which measures the element profile.
The basic silicate is found to perform adequately in most respects. However, alloying with nitrogen is used to lower the diffusion rates still further, which further raises the crystallisation temperature. This is a general role of N. Si3N4 is a much better diffusion barrier than SiO2, because it has no open channels for molecular or ionic diffusion, and the N site has a higher coordination and thus resists network diffusion. Of course, HfO2 does not have an open lattice like SiO2, but N still seems to lower network diffusion in HfSiO4 [21].
Another key role of the oxide is to block dopant diffusion from any poly-Si gate electrode [48]. N is found to be very useful in blocking B diffusion through SiO2, presumably because it forms bound pairs with B. In high K oxides, N is also efficient at blocking boron diffusion. A grain boundary would be a short circuit diffusion path, so here N acts to block diffusion by stopping crystallisation and the formation of any grain boundaries [21].
The interfacial layer
An interfacial layer of SiO2 often exists between the Si channel and the high K oxide layer. Figure 16 shows a cross-sectional image of an example [49]. There are advantages and disadvantages to this, as long as its presence and thickness can be controlled. The overall EOT of a layer 1 of SiO2 and a layer 2 of high K oxide is given by the series capacitance formula,
$$\frac{1}{C} = \frac{1}{C_1} + \frac{1}{C_2}$$
which becomes
$$\mathrm{EOT} = t_{SiO_2} + \frac{3.9}{K}\, t_{HiK}$$
Thus, an extra SiO2 layer is undesirable as it adds to the overall EOT. In fact, the K of SiO2 (3.9) is so small that a SiO2 layer can rapidly use up the EOT allocation. It is a severe impediment to scaling. The SiO2 layer often arises not because of reaction of the HfO2 with the Si, as the HfO2 was chosen to avoid this. It arises because O diffuses through the HfO2 layer to oxidise the Si underneath. Indeed ZrO2 and HfO2 are catalysts for this oxidation process [50]. The SiO2 layer usually grows during the post-deposition annealing stage, not during growth. Narayanan [51] proved this for the case of Y2O3. This can be avoided by adding silicate or N to the HfO2 layer to reduce atomic diffusion. However, scaling requirements will reduce the ability to use silicates in the future because they lower K.
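The way an interfacial SiO2 layer uses up the EOT allocation can be made concrete with the two-layer series formula. The sketch below is illustrative only: the function names, the K value of 25 and the thicknesses are hypothetical, and the interfacial layer is assumed to have the K of ordinary SiO2.

```python
K_SIO2 = 3.9  # static dielectric constant of SiO2 (from the text)


def stack_eot(t_sio2_nm: float, t_hik_nm: float, k_hik: float) -> float:
    """EOT (nm) of an SiO2 interfacial layer in series with a high K layer."""
    if k_hik <= 0:
        raise ValueError("k_hik must be positive")
    return t_sio2_nm + (K_SIO2 / k_hik) * t_hik_nm


def max_hik_thickness(target_eot_nm: float, t_sio2_nm: float, k_hik: float) -> float:
    """Largest high K thickness (nm) that still meets the target EOT.

    Returns 0.0 if the interfacial layer alone already exceeds the target.
    """
    remaining_nm = target_eot_nm - t_sio2_nm
    if remaining_nm <= 0:
        return 0.0
    return (k_hik / K_SIO2) * remaining_nm


if __name__ == "__main__":
    k_hik = 25.0  # hypothetical K of the high K layer
    for t_sio2 in (0.0, 0.5, 0.8):
        t_max = max_hik_thickness(1.0, t_sio2, k_hik)
        print(f"{t_sio2:.1f} nm SiO2 leaves room for {t_max:.2f} nm of high K oxide at 1.0 nm EOT")
```

Every ångström of interfacial SiO2 removes K/3.9 ångströms of permitted high K thickness, which is the sense in which the layer is an impediment to scaling.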
The second reason an SiO2 layer exists is that it is beneficial and was deliberately put there. Firstly, a 'chemical oxide' is presently used to act as a nucleation layer for ALD growth of HfO2 and other oxides [31, 32]. With experience or the development of better ALD precursors, this need should decline.
The SiO2 layer may also be introduced because it improves the electrical quality of the Si-oxide interface, as described later. The Si-SiO2 interface is well understood and can be of high quality. In principle, it can be made with a very low defect concentration, and the defects can be passivated by forming gas annealing. The presence of a SiO2 layer also spaces the Si channel from the high K oxide, which can stop the reduction in carrier mobility that high K oxides can cause, see later.
A disadvantage of this interfacial oxide is that it may not have the same quality as SiO2 produced by thermal oxidation of Si [52, 53]. It may be defective. Copel [54] has used a number of techniques such as MEIS to study the profile and composition of interfacial oxides under HfO2, and found that they are SiO2 despite sometimes appearing to have higher K values than thermal oxide. EELS found a similar result [55, 56].
It is an advantage if we can control the thickness of the interfacial SiO2 layer, and if necessary remove it entirely. This can be done in two ways. Firstly, Si and SiO2 react to form volatile SiO within a range of temperatures around 900 to 1000 °C. The initial surface can be annealed to desorb its native oxide as SiO [57]. The SiO will also desorb from a buried layer through a high K oxide covering. The second way is to react a metal such as Hf with the SiO2 to displace Si [58, 59].
Bonding and electronic structure
Bonding
The oxides of interest are transition metal oxides except for Al2O3. Figure 17 shows the density of states (DOS) of Al2O3. The top of the valence band lies at 0 eV and the band gap extends from 0 to 8.8 eV. The bonding in Al2O3 is more ionic than in SiO2, and its atoms have ionic coordinations. However, its electronic DOS does resemble that of [...]
A more typical example is ZrO2. ZrO2 films are amorphous at lower temperatures, but crystallise relatively easily. ZrO2 is stable in the monoclinic structure at room temperature, transforms to the tetragonal structure above 1170 °C, and can be stabilised in the cubic fluorite structure by addition of Y [60]. HfO2 is similar. In cubic and tetragonal ZrO2, Zr has 8 oxygen neighbours and each oxygen has four Zr neighbours, while in monoclinic ZrO2 each Zr atom has 7 oxygen neighbours. Tetragonal ZrO2 is related to cubic ZrO2 by displacing oxygens along the z axis towards 4 of the Zr atoms. Figure 18 shows the density of states of cubic ZrO2. It has an indirect gap of 5.8 eV, the experimental value [60]. French [60] found that the gap is narrower in the lower symmetry forms of ZrO2. However, recent calculations find that the tetragonal phases have the widest gaps (Tab. 4) [61, 62]. Our calculated band structures are similar to those found by others. The valence band is 6 eV wide, and it has a maximum at X formed from O p states. The conduction band minimum is a Γ12 state of Zr 4d orbitals. The Zr d states are split by the crystal field into a lower band of e states and an upper band of t2 states 5 eV higher (at Γ). The partial DOS shows considerable charge transfer, with the valence band being strongly O p states, and the conduction band on Zr d states, with 30% admixture. The band structure of HfO2 is very similar to that of ZrO2 except that the crystal field splitting of the Hf 5d states in the conduction band is larger than that of Zr (Fig. 19).
Crystalline La2O3 has the La2O3 structure in which La is 7-fold coordinated, with 4 short bonds and 3 longer bonds. The DOS of La2O3 in Figure 20 shows that the valence band is strongly localised on O p states and the conduction band is on La d with some La s, p states starting at 8 eV [63]. The band gap is indirect and 6 eV. The valence band is now only 3.5 eV wide, narrower than in ZrO2. The ionicity is higher than in ZrO2.
Of the group IIIA metal oxides, Y 2 O 3 has the cubic bixbyite (defect spinel) structure. This has a large 56 atom unit cell in which there are two types of Y sites, both 7-fold coordinated. This structure occurs because Y has a smaller ionic radius than La. The band gap of Y 2 O 3 is direct and is about 6 eV [63]. The valence band is again only 3 eV wide. The partial DOS shows the valence band is largely O p states. The conduction band minimum has mixed Y d, s character.
In each of these cases, the band gap is between O 2p valence states and metal d states. Thus the band gap is proportional to the metal atomic d orbital energy, as noted by Lucovsky [64].
ZrSiO4 is typical of the transition metal silicates. Crystalline ZrSiO4 has the body-centred tetragonal structure. The Zr and Si atoms are organised in chains. Each Zr atom has eight O neighbours. Each Si has four O neighbours in a tetrahedral arrangement. These coordinations are expected to carry over to the amorphous phases and the amorphous alloys, although there has been debate about this. Its partial DOS is shown in Figure 21. The band gap is about 6.5 eV [63]. The valence band is 7 eV wide [65]. The conduction bands form two blocks. The lower conduction band is due to Zr d states and lies between 6.5 eV and 8 eV, and a second conduction band due to Si-O antibonding states lies above 9 eV. It is an important general rule that the conduction band of Zr silicates forms two non-mixing ZrO2-like and SiO2-like bands. The states do not mix because the Si s, p states and metal d states have different local symmetry. Thus, the CB edge of the silicates retains its Zr d character as long as Zr is present, and the band gap increases only slowly, with very strong bowing below the virtual crystal model. Experiments confirm this [66].
Another large class of possible gate oxides are the perovskites such as SrTiO3. In the ABO3 structure, the smaller transition metal ion occupies the B site, which is octahedrally coordinated by six oxygens. Each oxygen is bound to two B ions, while the A ion is surrounded by twelve oxygen ions. Figure 22 shows the partial DOS of SrTiO3. The band gap is direct and 3.3 eV wide. The lowest conduction bands are Ti d_xy (t2) states followed by the Ti d_z2 states. The next states above 7 eV are Ti p states followed by Sr s states. Thus, the A ion states (Ba or Sr) are well away from the band gap, and the ion can be considered to be essentially fully ionised and passive. On the other hand, the Ti-O bond is polar but only about 60% ionic.
LaAlO3 is another perovskite oxide, which is of importance as an epitaxial gate oxide because it has a large dielectric constant and a close lattice match to Si. It is unusual in that the transition metal La occupies the A site and Al occupies the octahedral B site. The partial DOS of LaAlO3 is shown in Figure 23. The band gap is taken as 5.6 eV from recent ellipsometry work [67].
Dielectric constants
The static dielectric constant of the oxides is the sum of the electronic and lattice contributions, κ = κ_e + κ_l. The electronic component κ_e is also the optical dielectric constant ε∞ and it is given by the refractive index squared, κ_e = ε∞ = n². ε∞ values are typically 4−5 for the wide gap oxides of interest. Thus the electronic component is not the main source of the high K in Table 2. The large static dielectric constant arises from a large lattice contribution,
$$\kappa = n^2 + \frac{N e^2 Z_T^{*2}}{m\,\omega_{TO}^2}$$
Here, n is the refractive index, N is the number of ions per unit volume, e is the electronic charge, Z*_T is the transverse effective charge, m is the reduced ion mass and ω_TO is the frequency of the transverse optical phonon. Large values of κ_l occur when Z*_T is large and/or the frequency of a polar optical mode ω_TO is small. A low ω_TO means that such oxides are incipient ferroelectrics.
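The scaling of the lattice term with Z*_T and ω_TO can be checked numerically. The following Python sketch assumes the form κ = n² + N e² Z*_T² / (ε0 m ω_TO²), where the explicit ε0 is added here only so that the expression can be evaluated in SI units. The function names and all input values are illustrative assumptions, not data from this review.

```python
E_CHARGE = 1.602176634e-19   # elementary charge, C
EPS0 = 8.8541878128e-12      # permittivity of free space, F/m


def lattice_kappa(n_ions, z_eff, reduced_mass, omega_to):
    """Lattice contribution N e^2 Z*^2 / (eps0 m omega_TO^2) in SI units.

    n_ions       : ions per unit volume (m^-3)
    z_eff        : transverse effective charge (units of e)
    reduced_mass : reduced ion mass (kg)
    omega_to     : transverse optical phonon angular frequency (rad/s)
    """
    return n_ions * (E_CHARGE * z_eff) ** 2 / (EPS0 * reduced_mass * omega_to ** 2)


def static_kappa(n_refr, n_ions, z_eff, reduced_mass, omega_to):
    """Static dielectric constant: electronic part n^2 plus the lattice part."""
    return n_refr ** 2 + lattice_kappa(n_ions, z_eff, reduced_mass, omega_to)


# Arbitrary illustrative inputs (assumptions, not values for any real oxide).
base = lattice_kappa(3e28, 5.0, 2e-26, 5e13)
soft = lattice_kappa(3e28, 5.0, 2e-26, 2.5e13)   # phonon frequency halved
print(f"lattice term grows by a factor {soft / base:.1f} when omega_TO is halved")
```

Halving ω_TO quadruples the lattice term, which is why soft polar modes dominate the static dielectric constant of these oxides.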
The dielectric constants of the various phases of HfO2 and ZrO2 have been calculated in the local density formalism [68]. This is a good means to understand the differences and the anisotropies. Rignanese [69] found that the tetragonal phase has the largest and most anisotropic K, but not by as much as found earlier by Vanderbilt [68].
Band offsets
The band offset between oxide and Si defines the barrier for injection of electrons or holes into the oxide bands. The electron barrier or conduction band (CB) offset tends to be the smaller of the two. The CB offset is one of the key criteria in the selection of a gate oxide. It must be over 1 eV to give adequately low leakage current [8, 18].
The CB offset has previously been calculated for most candidate high K oxides and it can be measured by methods such as photoemission. The band line up at an interface is controlled by a dipole formed by charge transfer across the bonds at the interface [72]. The band offset consists of two components, a component intrinsic to the bulk oxide and Si and a component which depends specifically on the interface bonding configuration [70, 71]. The intrinsic component is of interest because the specific bonding at the interface is generally not known. Usually, the intrinsic component is the main component. However, the interface specific component can be important. It means that there is no unique offset value for a given oxide on Si. This can be an advantage as it allows offsets to be controlled by varying the interface chemistry.
In the case of two non-interacting surfaces, the conduction band line up is given by the difference between the electron affinities (the energy of the conduction band edge below the vacuum level) (Fig. 24). This is known as the Schottky limit. If the surfaces interact, an interface dipole due to charge transfer across the interface modifies this offset. The charge transfer acts to align an energy level in each surface. In the limit of strong coupling, known as the Bardeen limit, these levels are fully aligned. The band offset is then given by the difference of this energy level below the two conduction bands, and is independent of the vacuum levels. Most high K oxides are intermediate between the two limits.
A particular model is the model of metal induced gap states (MIGS) [73-76]. This model says that the reference level is the so-called charge neutrality level (CNL) of the intrinsic surface states. A semiconductor surface has gap states due to the broken surface bonds. These are spread across the energy gap. The CNL is the highest occupied surface state on a neutral surface of a semiconductor. It is like a Fermi level of the intrinsic gap states.
The MIGS model says that for a metal on the semiconductor, the MIGS are like the plane waves of the metal decaying into the semiconductor gap. The interface dipole now tries to align the semiconductor's CNL to the metal Fermi level. The Schottky barrier height φ_n, the energy of the semiconductor conduction band above the metal Fermi level, is given by
$$\phi_n = S(\Phi_M - \Phi_S) + (\Phi_S - \chi_S)$$
where Φ_M is the metal work function, Φ_S is the charge neutrality level of the semiconductor, and χ_S is the electron affinity (EA) of the semiconductor. S is a dimensionless pinning factor given by dφ_n/dΦ_M. S is given in the linear approximation by [77]
$$S = \frac{1}{1 + e^2 N \delta / (\varepsilon \varepsilon_0)}$$
where e is the electronic charge, ε is the relative permittivity, ε0 is the permittivity of free space, N is the density of the interface states per unit area and δ is their extent into the semiconductor.
In fact, this model is not strictly correct, as all the occupied valence band states, not just those at the Fermi level, contribute to S [72]. Nevertheless the MIGS model appears to give reasonably good predictions. The model is extended to the band offsets between semiconductors. Charge transfer tends to align the charge neutrality level (CNL) of the bulk oxide with the CNL of the bulk Si. The CB offset is given by [8]
$$\phi_n = (\chi_a - \Phi_{S,a}) - (\chi_b - \Phi_{S,b}) + S(\Phi_{S,a} - \Phi_{S,b}) \qquad (14)$$
Here, χ_a is the electron affinity (EA) of the oxide, χ_b is the electron affinity of the semiconductor, and Φ_S,a and Φ_S,b are the charge neutrality levels of the oxide and semiconductor respectively. All the energies in (14) are measured from the vacuum level, except φ_n which is measured from the conduction band edge. S is a constant, the Schottky barrier pinning factor, which is found by Mönch [73] to vary empirically with the electronic component of the dielectric constant of the wider gap material (the oxide) as
$$S = \frac{1}{1 + 0.1(\varepsilon_\infty - 1)^2}$$
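The two limits of the CNL model can be illustrated with a short Python sketch. It assumes the offset expression φ_n = (χ_a − Φ_S,a) − (χ_b − Φ_S,b) + S(Φ_S,a − Φ_S,b) and the commonly quoted empirical form S = 1/(1 + 0.1(ε∞ − 1)²). Energies are entered as signed positions relative to the vacuum level (negative below it), which is a convention chosen here so that a positive result means the oxide conduction band lies above that of Si. The function names and all numerical inputs are hypothetical and are not taken from Table 5.

```python
def pinning_factor(eps_inf):
    """Empirical pinning factor S as a function of the optical dielectric constant."""
    return 1.0 / (1.0 + 0.1 * (eps_inf - 1.0) ** 2)


def cb_offset(chi_a, cnl_a, chi_b, cnl_b, s):
    """CB offset of material a (oxide) on material b (semiconductor).

    All inputs are signed energies in eV relative to the vacuum level
    (negative below vacuum): chi_* is the conduction band edge position,
    cnl_* the charge neutrality level position.
    """
    return (chi_a - cnl_a) - (chi_b - cnl_b) + s * (cnl_a - cnl_b)


# Hypothetical inputs (assumptions for illustration only).
oxide_cb, oxide_cnl = -2.5, -4.8
si_cb, si_cnl = -4.0, -5.0

s = pinning_factor(4.0)
print(f"S = {s:.2f}")
print(f"Schottky limit (S = 1): {cb_offset(oxide_cb, oxide_cnl, si_cb, si_cnl, 1.0):.2f} eV")
print(f"Bardeen limit  (S = 0): {cb_offset(oxide_cb, oxide_cnl, si_cb, si_cnl, 0.0):.2f} eV")
print(f"Intermediate   (S = {s:.2f}): {cb_offset(oxide_cb, oxide_cnl, si_cb, si_cnl, s):.2f} eV")
```

With S = 1 the expression reduces to the electron affinity difference, and with S = 0 it reduces to the difference of the CNL depths below the two conduction bands, as described for the Schottky and Bardeen limits.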
The CNL model is a zeroth order but fully determined model of the band offsets, in which the CNL energy is determined by the bulk electronic structure of oxide and of Si. The local bonding at the interface does not enter in this model. The predicted CB offsets in this model [18, 63] are given in Table 5 and Figure 25 for the various oxides. Table 5 compares these to the experimental values measured by photoemission, internal photoemission or barrier tunneling [78-89]. Photoemission measures the VB offset, and this is converted into the CB offset using the difference between the oxide and Si band gaps. Internal photoemission measures the energy from the Si valence band to the oxide conduction band, or the Si conduction band to the oxide valence band, depending on the polarity of the Si and of the applied voltage. It is seen that the predicted and experimental offsets generally agree well. Those for HfO2 and ZrO2 from photoemission agree well [79, 84]. SrTiO3 indeed has a small CB offset [78]. Recent data [88] for La2O3 agree well with the prediction of 2.3 eV. La2O3 and LaAlO3 have particularly large CB offsets [87, 88], which means they could be the second generation high K oxides with lowest leakage. The largest exception is the internal photoemission result of Afanasev [83] for Al2O3. This is because these authors used Al2O3 films grown by atomic layer deposition whose band gap is much less (6.8 eV) than that of the pure bulk oxide (8.8 eV).
It is seen that only Al, Y, La, Zr and Hf based oxides have CB offsets over 1 eV, which is the minimum needed to limit electron injection. The CB offsets decrease in the order of group III, IV, to V metal oxides. This is because the CNL of the oxide rises in the gap along the sequence group III to V.
Lucovsky et al. [64, 85] have observed that the x-ray absorption thresholds of the metal d states of the various oxides track the changes in CB offset. This is because the lowest conduction band of the oxide is pure metal d, and so its energy tends to follow the band offset.
Interfacial bonding
The simple MIGS model of the oxide interface has been surprisingly successful. Nevertheless, future developments will need a more detailed description of the Si-oxide interface. It is important to know the detailed bonding at the Si-oxide interfaces for two reasons. Firstly, the band offset does depend on the interface bonding. Secondly, imperfect interfaces will have defects which can give rise to states in the gap which trap charge.
It is useful to consider epitaxial oxide systems in order to understand the bonding principles in more detail [90-95]. We choose the Si:ZrO2 system because it is a reasonably well lattice-matched interface and it has (when Y doped) the high symmetry cubic lattice. The lattice constants of Si and ZrO2 are 5.43 Å and 5.07 Å respectively. This allows ZrO2 to be grown epitaxially on the Si(100) cube face [96, 97], with the ZrO2 cube face lying directly on top of the Si cube face. This is expressed as ZrO2(100)//Si(100) [98-101].
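The degree of lattice matching implied by these two lattice constants can be worked out directly. The short Python sketch below uses the common convention of expressing the mismatch relative to the substrate lattice constant. That convention and the function name are assumptions made here for illustration.

```python
def lattice_mismatch(a_film, a_substrate):
    """Fractional lattice mismatch of a film relative to its substrate."""
    return (a_film - a_substrate) / a_substrate


A_SI = 5.43     # Si lattice constant in angstrom, as quoted in the text
A_ZRO2 = 5.07   # cubic ZrO2 lattice constant in angstrom, as quoted in the text

mismatch = lattice_mismatch(A_ZRO2, A_SI)
print(f"ZrO2 on Si mismatch: {100 * mismatch:.1f} %")   # prints -6.6 %
```

The negative sign indicates that the ZrO2 cell is smaller than that of Si, so an epitaxial ZrO2 layer on Si(100) would be under tensile strain in the plane of the interface.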
Our understanding of the Si:ZrO 2 interfaces can be guided by those of other fluorite compounds such as metal silicides NiSi 2 and CoSi 2 , and CaF 2 . They each form epitaxial interfaces with Si which have been intensively studied previously. It is possible to construct an Si:NiSi 2 (111) interface in which the last Ni atom is 5, 7, or 8-fold coordinated [101, 102] . The most stable interface of these metallic silicides can be understood in terms of the occupation of its bonding states.
The CaF2 interfaces are more complex than NiSi2 interfaces because CaF2 has no common element with Si. The ideal (100) and (111) faces can be described in terms of FCaF units stacked along the [100] or [111] directions, in which alternate F ions are assigned to the Ca above or below. These (100) or (111) faces now contain half the number of F ions, and are now non-polar (Fig. 26).
On the Si(111) surface, each Si atom has one broken or 'dangling' bond (DB), Figure 26. This state is half occupied, and it will give a metallic interface if it is left like this. We could consider making a Si:CaF2(111) interface by joining CaF2 using one of these non-polar FCaF units, to give a SiFCaF layer structure. Counter-intuitively, it turns out that this would be bad. It would leave the Si DBs all half occupied and give a metallic interface [103], so it is not good for a device.
What is needed is to join a polar FFCaF unit to the Si(111), as in Figure 26(b). The extra F of the FFCaF unit will form a strong Si-F bond with the Si DB, and this bond sweeps the DB state out of the gap. This can be considered as a ≡Si⁺ F⁻ F⁻ Ca²⁺ F⁻ unit (each dash denotes a Si-Si back-bond). An alternative is to use a polar CaF unit. This CaF unit is Ca⁺F⁻ and the Ca⁺ therefore has a spare electron. The CaF unit on the Si(111) would donate the spare electron to the Si DB to make a Si⁻ dangling bond. As important, the Ca s orbital energy lies above the Si gap and it will repel the Si⁻ DB level into the valence band, so removing all DB states from the gap. This would give a Si⁻ Ca²⁺ F⁻ interface with no gap states and filled valence states, that is, an insulating interface. In practice, experiment shows that the CaF terminated interface is formed [104].
Now extend this idea to the Si:ZrO2 interfaces [90], as in Figure 27 (see also Chang [105]). If, instead of a non-polar OZrO unit, we put a polar OOZrO unit on the Si(100), the first O forms two strong Si-O bonds with each silicon. This O, being divalent, saturates the two DBs of the surface Si to form a structure like a Si-O-Si bridge. Then, the non-polar OZrO unit is added on top of this. The whole ZrO2 lattice can then be built up on top of this interface by adding further non-polar OZrO layers.
This also works with a ZrO terminating unit. In this case, the ZrO is formally Zr²⁺O²⁻ and the Zr has two unsatisfied valences. These can be used to make two polar Zr-Si bonds to the Si DBs. This gives an insulating interface with all valences satisfied and a chemical formula =Si=ZrO. The two examples show that epitaxial growth of ZrO2 on (100)Si is possible, with valence satisfaction and insulating interfaces, provided that the polar faces of ZrO2 are used.
We have carried out detailed total energy pseudopotential, local density approximation (LDA) calculations of various atomic models of (100) interfaces to test these ideas [90, 92]. Some of the interfaces are shown in Figures 28 and 29. Figure 28(a) shows the ideal Si:OZrO interface, which has only one layer of 4-fold coordinated oxygen sites at the interface. We find this interface to be metallic, as expected from the above discussion. Figure 28(b) shows the ideal Si:OOZrO interface, with a double oxygen layer at the interface. Here the interfacial oxygens are 6-fold coordinated initially, bonded to two Si's and four Zr's. It is found that the interfacial oxygens relax to form the structure shown from two directions in Figure 28(c, d). Those oxygens lying in the Si-O-Si bridges relax downwards towards the silicon layer. The other two oxygens relax upwards towards the ZrO2 layer. Hence, this replicates the discussion above. This interface is denoted the O4 interface.
Another interface can be constructed with the oxygens being initially 3-fold coordinated, to one Si atom and two Zr atoms. This is denoted the O 3 interface. The oxygen bonding is then similar to that in ZrSiO 4 . This interface structure relaxes to the configuration shown in Figure 29(a, b) . Here, half of the oxygens are bonded to 2 Si's and 1 Zr, and the other half are bonded to 2 Zr's and one Si. The top layer Si's are each 5-fold coordinated. This interface is also insulating.
A third O-terminated interface with 3-fold coordinated oxygens is possible as shown in Figure 29 (c). The ZrO 2 lattice is displaced 1/2a along [100] . It has a lower symmetry than the O 3 . The interfacial O is bonded to one Si atom and two Zr atoms as in ZrSiO 4 but the O 3 sites are now no longer planar and this allows it to gain stability.
A fourth O-terminated structure is shown in Figure 29(d) . Here, one DB of each Si is used in a lateral Si-O-Si bridge [90] . This leaves one DB to bond to the ZrO 2 layer. However, this needs an extra half monolayer of oxygen to saturate its bonding, to give overall a Si + (O 2− ) 0.5 OZrO configuration. This is denoted the O 3B interface (B for bridge).
Finally, there is a partly covalent interface which has been studied by Fonseca et al. [106]. They created an interface where the ZrO2 is ionic above the first Zr layer, but resembles the Si:SiO2 interface on the Si side. We denote this as the O2B interface (Figure 29). Here, one DB of each interfacial Si is bonded to its neighbour in a Si-O-Si bridge. This also occurs at the (100)Si:SiO2 interface [107]. The other Si DB then forms a Si-O-Zr bridge to the first Zr layer. The Si-O-Zr bridge is a covalent unit. Above this Zr, the rest of the ZrO2 bonding is ionic, as in bulk ZrO2. This interface has 2 × 1 symmetry. The interesting point is that this interface could be formed by ALD, according to molecular dynamics simulations. The precursor ZrCl4 is a covalently bonded molecule, and ALD is carried out on a partly preoxidised Si surface. The two-step process of ALD is likely to retain the initial covalent bonding of the Si-O-Zr bridge units, and then the greater stability of ionic bulk ZrO2 will assert itself and enforce the denser ionic structure after the first monolayer.
Overall, these interfaces have the same number of oxygen atoms at the interface. The O 3 interface is found to be the most stable structure. The O 4 interface is marginally less stable than O 3 . Extensive testing finds that the O 2B is as stable as the O 3 . This is surprising, because ZrO 2 is 2 eV less stable in the covalent quartz structure. It must arise because this interface configuration allows more structural relaxation at the interface, as the two lattices Si and ZrO 2 are not so well lattice matched.
Experimentally, Wang and Ong [97] studied the interface configuration at (100)Si:ZrO2 by high-resolution transmission electron microscopy.
Zr-terminated interfaces are also possible. The simplest has a 6-fold coordinated Zr, denoted Zr6, as in Figure 30(a, b). This structure relaxes so that the terminal Zr-Si bond lengthens. Figure 30(c, d) shows another interface in which Zr is 10-fold coordinated, with the Zr bonded to four oxygens, four Si's in the top layer and to two more Si's in the layer under that. This bonding is similar to that in ZrSi2. Our calculation finds that Zr10 is the slightly more stable of these two Zr-terminated interfaces.
The calculations find that the O4, O3, O3B and Zr6 interfaces are insulating. They have no states in the Si band gap. However, the Zr10 interface is metallic. Thus, only O-terminated interfaces are useful in devices. Chang et al. [105] calculated the surface electronic structure of some Si:ZrO2 interface configurations. However, they chose some configurations which were metallic. Similarly, Fiorentini [108] calculated the stabilities of some interfaces of Si:HfO2, but we find their interface denoted M/O-vac to be metallic.
The band offsets have been derived from the calculations of the various interface structures of Si:ZrO 2 from the calculated alignment of the bands. This gives the offset of the valence bands. The offset of the conduction bands is not given well by the LDA calculations, as LDA underestimates the band gap. The offset of the conduction bands must be found instead by adding the experimental band gaps to the VB offset.
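The conversion from a VB offset to a CB offset is simple arithmetic: the CB offset is the oxide band gap minus the Si band gap minus the VB offset. The Python sketch below applies this to the ZrO2 values quoted in this review (an experimental gap of 5.8 eV and a bulk CNL value of the VB offset of 3.3 eV). The Si band gap of 1.12 eV is a standard value assumed here, and the function name is illustrative.

```python
SI_GAP_EV = 1.12   # Si band gap in eV (standard value, assumed here)


def cb_offset_from_vb(vb_offset, oxide_gap, si_gap=SI_GAP_EV):
    """CB offset (eV) obtained from the VB offset and the two band gaps."""
    return oxide_gap - si_gap - vb_offset


# ZrO2 on Si: 5.8 eV gap and 3.3 eV VB offset, as quoted in the text.
zro2_cb = cb_offset_from_vb(vb_offset=3.3, oxide_gap=5.8)
print(f"ZrO2 CB offset: {zro2_cb:.1f} eV")   # prints 1.4 eV
```

The result lies above the 1 eV minimum required for adequately low leakage.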
It is found that the VB offset is quite similar for the various O-terminated interfaces of ZrO 2 . It is also similar to the bulk CNL value of VB offset of 3.3 eV. The offset for Zr-terminated interfaces is different, for Zr 10 it is less, for Zr 6 it is more. An interface dipole has been formed which causes these differences. Thus, there is no interface-specific interface dipole for the O-terminated interfaces, but there is for the Zr-terminated interfaces.
The constancy of the band offset for O-terminated interfaces is valuable technologically. It means that the band offset of a ZrO 2 gate oxide does not depend on the surface orientation. It is therefore constant for the polycrystalline or amorphous oxide interfaces. This is very convenient, as it means there will be a larger process window for oxide formation. It is also similar to the established case of Si:SiO 2 where the band offset is constant between Si faces [109] . On the other hand, the band offsets at the two Zr terminated interfaces differ.
Electronic structure of defects
One problem with high K oxides is that they contain much higher defect concentrations than SiO2. SiO2 has such a low concentration of defects for three reasons. First, its high heat of formation means that off-stoichiometry defects such as O vacancies are costly and so are rare. Second, SiO2 has covalent bonding, which means that the main defects are dangling bonds. Third, its low coordination allows the SiO2 network to relax and remove any dangling bonds by rebonding the network. This occurs in particular for defects at the Si:SiO2 interface.
The high K oxides differ in that their bonding is ionic, and they have higher coordination number [91, 110] . The greater ionic character of the bonding and the higher coordination numbers mean that the high K oxides are poorer glass formers [110] . The effect of poor glass forming ability and high coordination is that the oxides have higher defect concentrations. The oxides have very high heats of formation, so the equilibrium concentration of non-stoichiometric defects should be low (except where mixed valence is possible, such as TiO 2 ). However, the non-equilibrium concentration of defects is high, because the oxide network is less able to relax, to rebond and remove defects.
The structure and electronic structure of the oxygen vacancy and oxygen interstitial in ZrO2 and HfO2 have been calculated by Foster et al. [111, 112] and by Xiong [113]. Recall that the valence band of ZrO2 consists mainly of O p states and the conduction band mainly of Zr d states. Also in the conduction band, the d_z2 and d_x2−y2 (e) states are the lowest conduction band and the d_xy (t2) states are higher, due to crystal field splitting. This is the simple model of ZrO2 as O²⁻ and Zr⁴⁺ ions. Surrounding an oxygen vacancy are the 4 metal atoms (or 3 for some sites in monoclinic). In a semiconductor like GaAs, an anion (As) vacancy is surrounded by four dangling bond orbitals on the neighbouring metal (Ga) sites. Hence, an As vacancy gives rise to states localised in the four Ga DBs and these would lie near the conduction band as the Ga forms the conduction band. In an insulator like MgO, an O vacancy again leaves metal states pointing into the vacancy. However, MgO is an insulator and the screening is poor, so the stronger vacancy potential now causes the vacancy state to lie deeper in the gap, near midgap, not close to the conduction band edge [114]. The case of ZrO2 is closer to the MgO than the GaAs case. The vacancy leaves Zr d states on surrounding atoms, Figure 31.
Fig. 31. The relaxed structure of (a) the neutral oxygen vacancy and (b) the neutral oxygen interstitial in ZrO2.
The energy levels of this state have been calculated by Foster et al. [111]. They calculated the band structure by LDA and they found the energy levels by calculating the ionisation energies and electron affinities of electrons in that state, rather than calculating an energy level as an eigenvalue. It is well known that LDA under-estimates the energies of unoccupied states. It also has difficulty with the localisation of defect states even when filled.
Thus it is necessary to correct the value for the large underestimate of the band gap by LDA, and they did this by the scissors operator (moving the gap to fit the experimental value) and interpolation of the defect level. They found the neutral vacancy level to lie at 2.2 eV above the VB edge in ZrO 2 . Aligning the bands of ZrO 2 and Si using band offsets, this sets the neutral V O energy level as being below the Si VB edge.
Kralik et al. [61] also calculated the energy level of the neutral O vacancy by the GW approximation, which is generally regarded as the most accurate but most expensive method to calculate empty energy states. They found the energy level of the unrelaxed vacancy to be at 3.4 eV above the VB edge in a gap of 5.4 eV, corresponding to about 3.7 eV in a 5.8 eV gap.
Xiong [113] instead used the screened exchange (sX) method [115] and the weighted density approximation (WDA) [116] to calculate the defect excitation energies more correctly than by LDA. No scissors correction is needed. A supercell of 24 atoms was used. The sX method gives the gap of ZrO2 as 5.2 eV, compared to the experimental value of 5.8 eV, and the neutral vacancy level as 3.5 eV above the VB edge. If a small correction to the gap is applied proportionately, the level then lies at 3.9 eV above the VB edge. This is 0.6 eV above the Si VB edge, or about midgap in Si.
The WDA method is more efficient than sX and a cell of 48 atoms is used. The level is found to lie at 4.0 eV above the ZrO2 VB edge. The structure was relaxed, the neighbouring Zr atoms were found to relax outwards by 0.1 Å, and the energy level moved up to 4.1 eV. This is 0.8 eV above the VB edge of Si. The latter values are closer to the recent experimental result of Takeuchi [117].
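The proportional gap correction and the alignment to the Si valence band quoted above can be reproduced with a few lines of Python. The sketch assumes that the correction simply scales the level by the ratio of the experimental gap (5.8 eV) to the calculated gap, and that the alignment uses the 3.3 eV VB offset quoted in this review. The function names are illustrative.

```python
ZRO2_GAP_EXP_EV = 5.8   # experimental ZrO2 gap quoted in the text
VB_OFFSET_EV = 3.3      # Si:ZrO2 VB offset quoted in the text


def rescale_level(level, gap_calc, gap_exp=ZRO2_GAP_EXP_EV):
    """Scale a defect level (eV above the oxide VB edge) to the experimental gap."""
    return level * gap_exp / gap_calc


def relative_to_si_vb(level_above_oxide_vb, vb_offset=VB_OFFSET_EV):
    """Position of the level relative to the Si VB edge (negative means below it)."""
    return level_above_oxide_vb - vb_offset


gw = rescale_level(3.4, 5.4)    # GW value of Kralik et al.
sx = rescale_level(3.5, 5.2)    # sX value
print(f"GW level rescaled: {gw:.1f} eV above the ZrO2 VB edge")
print(f"sX level rescaled: {sx:.1f} eV, or {relative_to_si_vb(sx):.1f} eV above the Si VB edge")
print(f"WDA relaxed level: {relative_to_si_vb(4.1):.1f} eV above the Si VB edge")
print(f"LDA level of Foster et al.: {relative_to_si_vb(2.2):.1f} eV relative to the Si VB edge")
```

The rescaled values round to the 3.7 eV and 3.9 eV figures given in the text, and the LDA level of 2.2 eV falls below the Si VB edge, while the sX and WDA levels fall within the Si gap.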
The oxygen interstitial can have a number of charge states, Figure 32. The simplest is the closed shell species O²⁻. In this state, it lies away from other oxygen anions and it adds filled O 2p states just to the valence band. Removing 1 electron to give O⁻ leaves a hole at the VB edge. Foster [111] notes that this ion moves slightly closer to another O²⁻, so that the O-O distance is 2.0 Å. The neutral O interstitial has 2 holes. This now forms the peroxy anion O₂²⁻, which has an internal O-O bond. The O-O bond length is now 1.49 Å. This bond creates a filled bonding σ orbital at −6.0 eV just below the main valence band and an empty antibonding σ* orbital at 4.1 eV in the upper gap region. It also has filled doubly degenerate π and π* orbitals, at −3.0 eV in the valence band and at +0.3 eV just above the VB edge (Fig. 32).
The σ * state could trap an electron, in which case this would partly break the O-O bond and the σ * state would fall towards the VB edge. Alternatively, the π * state could trap a further hole to give the O + interstitial, or superoxy radical. The hole resides in one of the π * states, breaking their degeneracy. This radical has a characteristic g factor and has been seen by electron spin resonance in HfO 2 thin films [118] .
Electrical quality
We have so far described the production, characterisation and bonding of high K oxides. We now continue with their use as electronic materials. It was noted that high K oxides presently perform less well than SiO 2 . There are three aspects to this, mobility, gate threshold shifts and charge trapping. 
Mobility degradation
The objective of device scaling is to create smaller, faster devices. Speed follows source-drain drive current, which in turn depends on the carrier mobility. Carriers in the FET behave like a two-dimensional electron gas. The carrier density is determined by the vertical gate field which induces them, by Poisson's equation. The carrier mobility in 2D gases is found to depend in a universal way on this gate field, according to the so-called universal mobility model. This idea developed from observations by Sah, Plummer [119] and others. The most recent version is by Takagi et al. [120], in which the mobility of electrons and holes depends only on the effective gate field and the Si face, (100), (110) or (111). The individual components of mobility add according to Matthiessen's rule,
$$\frac{1}{\mu} = \frac{1}{\mu_C} + \frac{1}{\mu_{PH}} + \frac{1}{\mu_{SR}}$$
The mobility is limited by different mechanisms at different fields, as each obeys a different power law with field, see Figure 33 . At low fields, mobility is limited by Coulombic scattering (C) by trapped charges in the oxide and/or channel and/or the gate electrode interface; at moderate field it is limited by phonon scattering (PH), and at high fields by scattering by surface roughness (SR). CMOS devices with a SiO 2 gate oxide have a mobility close to the universal limit. The mobility is limited mainly by interface roughness over the range of interest. The mobilities in devices with high K gate oxides presently lie well below the universal curve [6, 23, 32, [121] [122] [123] [124] [125] . This is particularly true of NMOS devices. The reduction in mobility for PMOS devices is fractionally less. Figure 34 shows typical examples. A major objective of present research is to understand the cause of this lowered mobility and to try to correct it.
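Matthiessen's rule, in its usual form 1/μ = 1/μ_C + 1/μ_PH + 1/μ_SR, can be evaluated with a minimal Python sketch. The component mobilities below are hypothetical values chosen only to show how the smallest component dominates the total. They are not taken from Figure 33.

```python
def matthiessen(*mobilities):
    """Combine independent mobility components using 1/mu = sum(1/mu_i)."""
    if not mobilities:
        raise ValueError("at least one mobility component is required")
    return 1.0 / sum(1.0 / mu for mu in mobilities)


# Hypothetical component mobilities in cm^2/Vs (assumptions for illustration).
mu_coulomb = 900.0
mu_phonon = 450.0
mu_roughness = 600.0

mu_total = matthiessen(mu_coulomb, mu_phonon, mu_roughness)
print(f"total mobility = {mu_total:.0f} cm^2/Vs")
```

The total is always lower than the smallest component, so any extra scattering mechanism introduced by a high K oxide can only pull the mobility below the universal curve.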
The cause is presently not well understood. There are two likely causes. First, there could be scattering by excessive amounts of trapped charge and interface states [6]. This is clearly true as other measurements show that high K oxides have much more trapped charge than SiO2. Secondly, there is the possibility of remote scattering by low lying polar phonon modes, as noted by Fischetti et al. [126]. The two contributions can be distinguished by their temperature and by their thickness dependence.
Fig. 34. Carrier mobility of n-type Si, for various gate oxides, after Gusev et al. [23].
It is also possible that the reduced mobility is due to a reduced induced channel carrier density in inversion, due to the filling of interface traps. This effect has been analysed in detail by Ma et al. [127] . It can be excluded by direct measurements of Hall effect mobility which also shows a reduction [128] .
Fischetti [126] noted that in most high K oxides of interest, the high K arises from low-lying polar lattice vibration modes, see Section 4.2. These polar modes can be effective scatterers of carriers in the Si channel, hence 'remote scattering'. The oxides are incipient ferroelectrics, and these soft modes would drive the ferroelectric instability if their frequency fell to zero. In SiO2, on the other hand, such polar modes have a much higher frequency and do not couple strongly. Fischetti [126] modelled the effect for various oxides and SiO2. It was found to be pronounced in ZrO2 and HfO2. The effect is smaller in ZrSiO4 or HfSiO4, which are covalently bonded and have no soft modes. It is also small in Al2O3, which has no soft modes. The importance of the effect is that it is intrinsic to the higher K oxides such as HfO2 and can only be moderated by using HfSiO4, or by including a SiO2 interfacial layer to separate the HfO2 from the channel. Both methods are undesirable as they increase EOT [6, 32].
The two mechanisms can be distinguished by their temperature and their thickness dependence. Phonon scattering is the only mechanism whose mobility decreases as the temperature is raised, because the phonon numbers increase with T. Surface roughness scattering is independent of T, and mobility limited by Coulombic scattering increases at higher temperatures (see Fig. 33). Ren et al. [129] and Chau et al. [5, 130] have measured the T dependence. They found that there is indeed a T dependence of 1/mobility in the mid-field range where it is expected, as seen in Figure 35. Thus, the remote phonon scattering mechanism is important. Ren et al. [129] used HfO2 gate oxide. Chau [5] did not specify the oxide, but it is likely to be HfO2. Ren's analysis is more complex in that it distinguishes scattering by phonons in the oxide from scattering by phonons in the Si. The second method is to plot mobility against oxide thickness, and also against the thickness of any SiO2 interlayer, as in the work of Murto [6] and Ragnarsson [32]. The reduction is seen to be greatest in thin high K oxide [6], see Figure 36. Defect scattering would be dominant at lower fields and would increase with thicker oxide layers. These groups interpret their results as showing the importance of Coulombic scattering. Hence, the T-dependence and thickness data indicate that both mechanisms are operative.
Devices from some groups show only small reductions in mobility. This is after considerable processing. Generally, these devices show small mobility reductions because the processing has grown an extra SiO2 interlayer, which moves the HfO2 away from the channel and reduces the remote scattering. Thus, the evidence does point to some remote scattering.
Devices using Al2O3 gate oxide prove the importance of the charge scattering contribution. Al2O3 does not have soft modes, but it does have a high defect concentration. Thus, the reduced mobility seen in those devices [122] can only arise from charge scattering. Saito et al. of Hitachi [123, 131] introduced a general model including the above effects. Most of the scattering arises from charge defects in the oxide and from fluctuations in the dielectric constant from anisotropic oxide crystallites.
Chau et al. [5] suggest that metal gate electrodes would help to screen the dipole coupling of remote phonon scattering. Hence they suggest that this is a further reason for using metal gates with high K oxides.
V T stability
The third major problem for high K oxides is the shift of flat band voltages. The flat band (V FB ) voltage is derived from the capacitance-voltage curve of a CMOS capacitor. By Poisson's equation, the FB voltage measured for a range of film thickness t obeys
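In the standard MOS treatment this relation, presumably the one cited later as equation (17), takes the form below. It is a sketch that assumes a sheet of fixed charge of areal density Q at the oxide-Si interface and the sign convention in which positive Q lowers V FB; the opposite convention flips the sign of the second term.

```latex
V_{FB} = \Phi_{ms} - \frac{Q\,t}{K\,\varepsilon_0}
```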
Here, Φ ms is the difference in work functions of the Si and the gate electrode, Q is the interface fixed charge (or trapped charge) density in the film, and K is its dielectric constant. High K oxides have a large defect density, but if we assume that the density is independent of thickness, a plot of V FB against t will be a straight line. Extrapolating to zero t for HfO2 gate oxide MOS capacitors gives a large V FB value. This compares with small values for SiO2 gate oxides. The value is of order 0.5 to 1 V for high K oxides on p-type Si and less on n-type Si. Given the rather small operating voltages now used for CMOS, these values are large enough to make high K oxide devices inoperable, so the cause must be found. A series of experiments was carried out, particularly by Hobbs et al. [132-135], varying the polarity of the Si substrate, the polarity of the poly-Si gate and the thickness of the HfO2 gate oxide, and depositing HfO2 layers on top of SiO2 layers. They indicated that the problem arises from an interaction between the HfO2 and the poly-Si gate material. In principle, the data could be accounted for by fixed charges, dopant diffusion or interface traps [136]. However, the range of tests [132, 134, 137] suggests that the origin is the interaction of the gate and the HfO2 gate oxide.
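The extrapolation to zero thickness can be sketched in a few lines of Python. The example assumes the linear form V FB = Φ ms − Q t/(K ε0), with the sign convention that positive Q lowers V FB, and it uses synthetic data generated from made-up values of Φ ms, Q and K. None of the numbers are measurements from the cited work, and the function names are hypothetical.

```python
EPS0 = 8.854e-12  # vacuum permittivity, F/m


def linear_fit(xs, ys):
    """Ordinary least-squares straight line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def extract_phi_ms_and_charge(thickness_m, vfb_v, k):
    """Intercept at t = 0 gives Phi_ms (V); the slope gives Q (C/m^2).

    Assumes V_FB = Phi_ms - Q*t/(K*eps0) with Q independent of thickness.
    """
    slope, intercept = linear_fit(thickness_m, vfb_v)
    charge = -slope * k * EPS0
    return intercept, charge


if __name__ == "__main__":
    # Synthetic, illustrative inputs (assumed values, not data).
    k, phi_ms, q = 20.0, -0.3, 1.6e-3
    t = [2e-9, 3e-9, 4e-9, 5e-9]
    vfb = [phi_ms - q * ti / (k * EPS0) for ti in t]
    print(extract_phi_ms_and_charge(t, vfb, k))
```

The fit recovers Φ ms only if Q really is independent of thickness, which is the same constancy assumption that the straight-line plot itself relies on.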
The purpose of the gate electrode in CMOS is to swing the Fermi level of the Si channel to the appropriate band edge to invert it. In the Schottky limit, a change in the gate electrode's work function of 1.1 eV would be needed to swing E F across the 1.1 eV gap of the underlying Si channel. If we have CMOS with metal gate electrodes in the Schottky limit, then for PMOS with an n-Si channel, a metal electrode with work function 5.1 eV would invert the channel and make it strongly p-type. On the other hand, for NMOS with an initially p-type Si channel, a metal electrode with work function 4.0 eV would invert the channel and make it strongly n-type. In each case, the metal electrodes can be replaced by highly p-type and n-type poly-Si respectively. SiO2 is a wide gap oxide, and CMOS with SiO2 or SiO2Nx does in fact operate close to the Schottky limit, so this is what happens. Now consider what happens if the gate oxide is a thin layer of HfO2. We can deposit metals of different work functions onto HfO2 on Si. The barrier height of the metals to the HfO2 valence band edge can be measured by photoemission, the barrier height to the conduction band edge can be measured by tunnelling or by internal photoemission, or the band alignment can be deduced from CV measurements. These results indicate that the barrier heights change with metal by much less than the change in the work function.
As for band offsets, we can define a pinning factor as the change of VB offset divided by the change in the metal's vacuum work function,
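Written out, with φ_VB denoting the valence band offset (barrier height) and Φ_M the metal's vacuum work function, both symbols being notation assumed here, this definition reads:

```latex
S = \frac{\partial \phi_{VB}}{\partial \Phi_{M}}
```

S = 1 corresponds to the unpinned Schottky limit and S = 0 to complete pinning.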
Sayan [138], and found a similar S value. However, the actual sizes of the offsets differ from Sayan's, which are also included in Figure 37(a). The barrier heights for ZrO2 are shown in Figure 38(b), and these also give a value of S ∼ 0.5. Yeo [139] derived the effective work function of various metals on HfO2 from literature data on CV measurements and tunnel barrier heights, as shown in Figure 38(a). The effective work function is defined as the barrier height to the Si CB plus the real electron affinity of Si (4.05 eV). They found an S value of about 0.5.
On the other hand, Schaeffer et al. [140] derived the flat band voltage of various metal electrodes on HfO2/Si MOS capacitors by CV measurements. They found that V FB changed by less than 0.5 of the change in metal work function. An extreme case is LaB6, which has a very low work function of 2.6 eV. Schaeffer [140] found a pinning factor closer to 0.2 than 1. Thus their data show a much weaker dependence than that collected by Yeo et al. [139].
The experimental value of S is found to lie in the range 0.1 to 0.5, depending on experimental method used. One could argue that the photoemission measurements are direct and more reliable, while the CV measurements rely on an unproven constancy of Q in equation (17) to extract a value of Φ ms . Given the disagreement between the more direct internal photoemission method and the CV method, this would argue that there is a flaw in effective work functions extracted from CV at present. On the other hand, CV does correspond to the situation in a real device.
This means that metals with a larger range of work function should be needed to drive NMOS and PMOS using HfO2 gate dielectrics than for SiO2. Engineers call this 'V T shifts' when referenced to the SiO2 case. Engineers always think in the 'Schottky limit'.
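The size of the penalty can be estimated with a short Python sketch. It assumes the usual linear pinning picture, in which the effective work function moves away from the oxide's charge neutrality level by only a fraction S of the metal's vacuum work function. The CNL value used below is a hypothetical placeholder, and the function names are illustrative.

```python
def effective_work_function(phi_m, s, phi_cnl):
    """Linear pinning model (assumed): S = 1 is the Schottky limit, S -> 0 full pinning."""
    return phi_cnl + s * (phi_m - phi_cnl)


def required_metal_range(target_swing_ev, s):
    """Vacuum work function range needed to swing the effective value by target_swing_ev."""
    if not 0.0 < s <= 1.0:
        raise ValueError("S must lie in (0, 1]")
    return target_swing_ev / s


if __name__ == "__main__":
    PHI_CNL = 4.9  # eV, hypothetical charge neutrality level on the vacuum scale
    for s in (1.0, 0.5, 0.2):
        low = effective_work_function(4.0, s, PHI_CNL)
        high = effective_work_function(5.1, s, PHI_CNL)
        print(s, round(high - low, 2), round(required_metal_range(1.1, s), 2))
```

With S near 0.5, metals spanning the 4.0 to 5.1 eV range move the effective work function by only about half of the 1.1 eV Si gap, so a vacuum work function range of roughly 2.2 eV would be needed in this simple model.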
To an extent, the observed pinning behaviour is expected from the MIGS model of Schottky barriers. However, this is not quite what is observed in the Hobbs experiments. Figure 39 shows how the flat band shift varies for the case of a 20 Å SiO2 layer plus a variable thickness of HfO2 on top, for n-poly and p-poly gate electrodes [134]. The flat band shift is seen to be larger for p-poly than n-poly. It is converging towards the upper Si gap region. On the other hand, the band alignment of HfO2 on the Si channel is such that their charge neutrality levels tend to align. The Si CNL is about 0.2 eV above its valence band edge, and thus the CNL of HfO2 is also close to this energy, when referred to the Si gap. In contrast, the data are being 'pinned' towards an energy in the upper gap, about 0.3 eV below the CB edge.
A possible explanation was provided by Hobbs et al. [132, 135]. The SiO2-Si interface is chemically rather simple, as it consists of only two elements. The HfO2-Si interface is more complicated, as it contains three elements. It is assumed that an ideal, abrupt HfO2-Si interface consists of O-terminated HfO2 in contact with Si. It would have only Si-O bonds at the physical interface. Of course, this abrupt situation does not yet happen at the channel-oxide interface because there is usually an interlayer of SiO2 present. In contrast, the abrupt interface is possible at the gate electrode interface, because the gate is deposited after the oxide, and there is no need for a graded layer for nucleation purposes.
If the ideal abrupt interface consists of O-terminated HfO2 on Si, with only Si-O interface bonds, then non-ideal interfaces are those with Hf-terminated HfO2 or with mixed O and Hf termination next to Si. Both cases would place some Hf atoms next to Si and create Hf-Si bonds. Poly-Si is grown from silane, and its reducing atmosphere is likely to give an O-poor top interface and hence Hf-Si bonds. Thus, Hobbs [135] and also Chau [5] suggested that the Hf-Si bonds at the gate electrode interface lead to pinning of the Fermi level of the gate electrode.
This was supported by Fonseca's calculations reported in Hobbs et al. [135]. These calculations were extended to a much wider range of interface configurations by Xiong et al. [141]. Figure 40 compares model (100) HfO2:Si interfaces without and with Hf-Si bonds. It was noted that the most symmetric O4 interface could be continuously transformed into the Hf10 interface by removal of interface O atoms. The O4 interface when relaxed has 2 Si-O bonds, while the Hf10 interface has no Si-O bonds and 6 Hf-Si bonds, and is metallic. An intermediate case is shown below with 4 Hf-Si bonds and 2 Hf-O bonds. This interface structure was relaxed to minimise its total energy. The local density of states was calculated, and it was found that an interface state causes E F to lie at about 0.3 eV below the Si CB edge. This causes a very short band bending in the poly-Si, depleting the poly-Si, so that its bulk E F lines up with the interfacial E F which is pinned by this interface state.
A number of other interface configurations were tried. Figure 40(e) shows the 2 × 1 symmetry, 2-fold coordinated O-terminated interface studied by Fonseca [135], but with a better picture. An O vacancy is created, and the Hf and Si atoms are rebonded. This case also gives an interface where E F is pinned in the upper gap. Thus, the calculations support the proposal that Fermi level pinning by Hf-Si bonds at the gate electrode-oxide interface is the cause of the large V T shifts which appear when poly-Si gates are used with HfO2 gate oxide. The result does not depend on the specific interface configuration.
This V T shift only arises because poly-Si is not a 'real' metal. It can have dangling bond states which do not lie at its Fermi level. The problem can be removed by using real metals, which can be elemental metals, metal nitrides, silicides or metal nitride silicides. These have only a Fermi level. The metals must be chosen for their desired work function, high for PMOS and low for NMOS. On HfO2 there still remains the problem that the work function range is reduced by the intrinsic E F pinning described by S. However, this problem appears to have been circumvented by some form of interface design or 'work function engineering' by Intel, as the recent announcement shows FETs with n- and p-metal gates with low V T offsets [5].
Charge trapping
We have already noted that high K oxides possess a larger bulk density of defects and trapped charge than SiO2 [142]. Charge trapping leads to instability in the flat band voltage and gate threshold voltage. It is seen as hysteresis on a drive current vs. gate voltage plot. The effect can be demonstrated by charge pumping experiments. It is notable that HfSiOx gate oxides have less hysteresis than HfO2 and also that nitrogen addition reduces it below 70 meV. The amount of trapped charge can be reduced by various annealing cycles and by design of the oxide. It would also be helped by a clearer understanding of its origin.
The origin of this trapped charge is becoming clearer. The first source is intrinsic defects in the oxide and interface traps. Zafar et al. [143] showed that trapping in HfO2 and Al2O3 occurs by the filling of existing defect levels rather than the creation of new defects. This indicates that bulk defects in high K oxides are a serious problem. Kumar [144] showed that hot carriers can create additional defects, but this is an additional effect. Figure 41 shows the effect that transient charge trapping in the gate oxide has on device characteristics, from Bersuker et al. [145]. The gate voltage was cycled and plotted against the resulting FET drain current. The hysteresis between up and down ramps shows that the oxide traps electrons (going positively) and releases electrons (going back). The curves follow the same cycle, showing that no new defect traps are formed. Kerber et al. [146] interpret this as fast trapping and detrapping in the oxide. Similar results are found by Shanware et al. [147].
The nature of intrinsic defects in ionic oxides differs from those in SiO2. They are oxygen vacancies, oxygen interstitials, or oxygen deficiency defects. The chemical nature of the defects can be detected in their paramagnetic configuration by electron spin resonance (ESR). So far, most of the defects found by ESR have been those related to the Si dangling bond at the interface, called the P b centre [148]. Recently, Lenahan et al. [118] identified three paramagnetic defects by ESR in bulk HfO2 produced by ALD and subjected to corona discharging: the O vacancy, the Hf3+ ion (an electron trapped at Hf4+) and the superoxy radical (or oxygen interstitial). These are the same centres which were previously identified in ZrO2 powder used in catalysis [149].
The energetics and energy levels of oxygen vacancies and oxygen interstitials in ZrO2 and HfO2 were calculated by Foster et al. [111, 112] and Xiong [113], as described in Section 4.5. Experimentally, Takeuchi et al. [117] recently used spectroscopic ellipsometry on HfO2 films oxidised to different levels to identify an absorption band in the gap at 4.5 eV. They attribute this to transitions from the valence band to the oxygen vacancy, and so place the V O level at 4.5 eV in the gap. Charge pumping experiments would place a defect level close to the Si conduction band, say 4.4-4.5 eV above the HfO2 VB maximum, which is close to that found by Takeuchi. This is higher than the WDA calculation [113] and a lot higher than in Foster [111]. Kerber et al. [146] noted that the instability data were consistent with an electrical level lying just above the Si conduction band edge. Kumar's [144] data are not consistent with a level lying below the Si valence band edge. The oxygen interstitial configuration is shown in Figure 31(b). The extra oxygen lies next to a bulk oxygen, and the two form a superoxy radical, with a bond of length 1.49 Å for the neutral case. The resulting covalent O-O bond gives rise to two π and π* states at −3 eV and 0.5 eV with respect to the HfO2 VB edge, and single σ and σ* states at −8 eV below the main VB and at 5 eV close to the CB edge, Figure 32. The π* states are filled and the σ* state is empty for the neutral interstitial. The positively charged interstitial, I_O^+, has a hole in one of the π* orbitals. This orbital rises further above the VB edge. It has a unique ESR signature which has been detected in HfO2 films by Lenahan [118].
The trapped charge can be reduced by annealing. This can be carried out in forming gas (N2/H2 mixture), or other nitrogen containing gases such as ammonia. The objective is to reduce the hysteresis in Figure 41 to 7 mV. This is only so far possible in the silicates. Annealing is useful for ALD films because it compacts them and removes possible impurities such as Cl, C and H. The understanding of this process is presently low. Figure 42 shows the variation of trapped charge and interface state density in ALD ZrO2 with annealing temperature [150]. It is interesting that the trapped charge changes sign when annealed at 500 °C. Houssa [150] speculates that the positive charge can be due to protons in the oxide (that is, OH− ions). Figure 43 shows that a fixed charge of 10^11 cm^-3 has been achieved with HfO2 gate oxide by Datta [130].
The European Physical Journal Applied Physics
DRAM oxides
It was noted in the introduction that replacing silicon oxynitride as the dielectric in the storage capacitor of DRAMs is an equally pressing problem. Some years ago, the roadmap was to use first Ta2O5 with a K of 22 and then develop (Ba,Sr)TiO3 or BST with a very high K of order 2000 [7]. In practice, the ability to use capacitor geometries with high surface area delayed the introduction of high K oxides in DRAM until recently. The work on gate oxides has allowed ALD to mature as a process. ALD is particularly good at coverage of complex shapes without pin-holes, a key requirement for DRAM. The ALD of Al2O3 is the most well developed. In addition, it is realised that retaining an amorphous dielectric is very useful in DRAM, as it helps coverage and reduces possible electrical leakage paths. This also favours the use of Al2O3. Thus the favoured dielectrics for DRAM appear to be Ta aluminate followed by Hf aluminate. The presence of more electronic defects in aluminates is less of a problem in DRAM.
It is important to realise that the requirements for a capacitor dielectric in DRAM are significantly different from those for gate dielectrics. First, they are not in contact with Si. Second, the capacitor electrodes are metals, so the band offset requirement is easier. Third, the capacitor is a back end component, so it only needs to withstand lower temperature processing (600 °C). Finally, it should be resistant to hydrogen-induced degradation, and usually this requires forming a hydrogen diffusion barrier around it.
Achieving lower EOT
Scaling beyond 2009 requires EOT values below 0.8 nm. As well as metal gates, this will require more abrupt interfaces at the Si channel and higher K values for the bulk oxides. The group III oxides such as La2O3 or LaAlO3 have higher K than HfO2 itself. It will also be necessary to use oxides rather than silicates. The 2003 roadmap suggests that epitaxial oxides such as LaAlO3 could be used. The most well developed epitaxial oxide which is lattice matched to Si is SrTiO3. The lattice matching of (001) faces involves a 45° rotation of the SrTiO3 lattice on Si, so that (110)SrTiO3//(100)Si. However, the problem with this interface is that SrTiO3 is not thermodynamically stable next to Si, nor does it have a large enough band offset. It has been possible to create an interface of SrTiO3 on Si by passing through intermediate SrO and silicide layers [151, 152]. It has even been possible to control the band offset somewhat [152]. But it is unclear whether this interface will pass the requirement of processing up to 1000 °C. LaAlO3 passes most of the requirements [87]: it is closely lattice matched to Si, La and Al oxides are both stable next to Si [153], they have low oxygen diffusion coefficients, and it has a conduction band offset of about 1.8 eV. Unfortunately, it has so far not been possible to grow LaAlO3 crystals directly on Si; it grows amorphous. Thus, the future is not clear.
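The pressure on interface abruptness and on K can be made concrete with a small Python sketch of the EOT budget. It uses the usual definition EOT = t × 3.9/K for each layer, with 3.9 the dielectric constant of SiO2, and treats the stack as capacitors in series. The K value of 25 and the 0.5 nm interlayer are illustrative assumptions, and the function names are hypothetical.

```python
K_SIO2 = 3.9  # dielectric constant of SiO2


def eot_nm(layers):
    """Equivalent oxide thickness of a stack of (thickness_nm, k) layers in series."""
    return sum(thickness * K_SIO2 / k for thickness, k in layers)


def max_highk_thickness_nm(target_eot_nm, k, interlayer_nm=0.0):
    """Physical high K thickness allowed once a SiO2 interlayer has used part of the budget."""
    remaining = target_eot_nm - interlayer_nm
    if remaining <= 0.0:
        raise ValueError("the SiO2 interlayer alone exceeds the EOT target")
    return remaining * k / K_SIO2


if __name__ == "__main__":
    k_highk = 25.0  # illustrative assumption for a high K oxide
    print(round(max_highk_thickness_nm(0.8, k_highk), 2))       # no interlayer
    print(round(max_highk_thickness_nm(0.8, k_highk, 0.5), 2))  # 0.5 nm SiO2 interlayer
```

In this simple picture a 0.5 nm SiO2 interlayer consumes most of a 0.8 nm EOT budget and cuts the allowed high K thickness from about 5 nm to about 2 nm, which is why abrupt interfaces and higher K oxides are both needed.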
Summary
This paper has reviewed the materials chemistry, bonding and electrical behaviour of oxides needed to replace SiO2 as the gate oxide in CMOS devices. The new oxides must satisfy six conditions to be acceptable as gate dielectrics: a high enough K value, thermal stability, kinetic stability, band offsets, good interface quality with Si, and low bulk defect density. HfO2 and Hf silicate have emerged as the preferred oxides. The necessary deposition and processing to produce working devices have been achieved. However, the oxides need to be optimised substantially further in order to achieve high performance devices. This requires improvement of the flat band voltage and lower defect densities. The flat band voltage shift may be due to interface defects and interface behaviour at the gate oxide/gate electrode interface. The main defects in the oxides are oxygen vacancies and interstitials. The oxygen vacancies are the most problematic, as they give rise to defect levels close to the Si conduction band.
