A single-transistor silicon synapse by Diorio, Chris et al.
1972 IEEE TRANSACTIONS ON ELECTRON DEVICES, VOL. 43, NO. 11, NOVEMBER 1996 
A Single-Transistor Silicon Synapse 
Chris Diorio, Member, IEEE, Paul Hasler, Student Member, IEEE, Bradley 
A. Minch, Student Member, IEEE, and Carver A. Mead, Fellow, IEEE 
Abstract-We have developed a new floating-gate silicon MOS 
transistor for analog learning applications. The memory storage 
is nonvolatile; hot-electron injection and electron tunneling per- 
mit bidirectional memory updates. Because these updates depend 
on both the stored memory value and the transistor terminal 
voltages, the synapse can implement a learning function. We have 
derived a memory-update rule from the physics of the tunneling 
and injection processes, and have investigated synapse learning 
in a prototype array. Unlike conventional EEPROM devices, 
the synapse allows simultaneous memory reading and writing. 
Synapse transistor arrays can therefore compute both the array 
output, and local memory updates, in parallel. The synapse is 
small, and typically is operated at subthreshold current levels; it 
will permit the development of dense, low-power silicon learning 
systems. 
I. INTRODUCTION 
We have fabricated single-transistor learning devices that integrate nonvolatile analog memory storage with bidirectional memory modification; we call these devices silicon synapses. Like a neural synapse [1], our silicon synapses compute the product of the stored analog memory value and the applied input. Also like a neural synapse, they can learn from the input signal without interrupting the ongoing computation.
Although we do not believe that a single transistor can model 
completely the complex behavior of a neural synapse, our 
transistors do implement a local learning function. With them, 
we intend to construct autonomous silicon learning systems. 
We have described previously a four-terminal nFET synapse 
[2]-[4]; it is a high-threshold floating-gate MOSFET with an 
associated tunneling junction. It shows promise for a range 
of applications, including high-resolution analog memories 
[5] and analog learning arrays [3]. The synapse described 
here integrates the tunneling function within the transistor 
drain, yielding a three-terminal device. Like the four-terminal 
device, the three-terminal device possesses five attributes 
that we believe are essential in a silicon synapse. First, 
when the synapse is not learning, the analog memory is 
nonvolatile; when the synapse is learning, memory updates can 
be bidirectional. Second, the synapse output is the product of 
the input signal and the stored memory value. Third, memory 
reading and writing can occur simultaneously. Fourth, the memory updates vary with both the input signal and the stored memory value. Fifth, the synapse is compact, and operates off a single-polarity supply with low power consumption.
Manuscript received December 5, 1995; revised April 1, 1996. The review of this paper was arranged by Editor C.-Y. Lu. This work was supported by the Office of Naval Research, by the Advanced Research Projects Agency, by the Beckman Hearing Institute, by the Center for Neuromorphic Systems Engineering as a part of the National Science Foundation Engineering Research Center Program, and by the California Trade and Commerce Agency, Office of Strategic Technology.
The authors are with the Physics of Computation Laboratory, California Institute of Technology, Pasadena, CA 91125 USA.
Publisher Item Identifier S 0018-9383(96)07734-9.
Our synapse differs from conventional EEPROM transistors 
both in its function and in its potential applications. Not 
only does it provide nonvolatile analog memory storage, and 
compute locally the product of its stored memory value and 
the applied input, but it also permits simultaneous memory 
reading and writing, and can even compute locally its own 
memory updates. We anticipate building synapse-based learn- 
ing systems in which both the system outputs, and the memory 
updates, are computed both locally and in parallel. By contrast, 
because conventional EEPROM transistors are optimized for 
digital programming and binary-valued data storage [6], they 
typically possess few if any of these features, and therefore 
have seen only limited use in silicon learning systems. 
II. THE SILICON SYNAPSE 
The silicon synapse is an n-type MOSFET with a poly1 floating gate, a poly2 control gate, a moderately doped channel, and a lightly doped drain (LDD). It uses channel hot-electron injection (CHEI) to add electrons to its floating gate, and Fowler-Nordheim (FN) tunneling [7] to remove them. It has been fabricated in the 2 μm n-well Orbit BiCMOS process available from MOSIS. Top and side views of the device are 
shown in Figs. 1 and 2, respectively. Its principal features are
• Electrons tunnel from floating gate to drain through 350 Å gate oxide. High drain voltages provide the oxide E-field required for tunneling. The lightly doped (~5 × 10^15 cm^-3) well-drain prevents pn-junction breakdown.
• Electron tunneling is enhanced where the poly1 floating gate overlaps the heavily doped (~1 × 10^1? cm^-3) well-drain contact, for two reasons. First, the gate cannot deplete the n+ contact, whereas it does deplete the n- well. Thus, the oxide E-field is higher over the n+. Second, enhancement at the gate edge further augments the oxide field.
• Electrons inject from the channel-to-drain space-charge layer to the floating gate. To facilitate injection, we apply a bipolar-transistor base implant (~1 × 10^17 cm^-3) to the MOS channel region. As a result, the channel-to-drain depletion layer approximates a one-sided step junction, increasing the injection likelihood. The channel implant also raises the transistor's threshold voltage V_t, favoring the collection of the injected electrons by the floating gate.
• The channel-to-drain space-charge layer appears primarily on the drain side of the junction. We extend the MOS gate oxide 2 μm beyond the channel-drain edge, over the space-charge layer. Because the injected channel electrons encounter gate oxide, rather than a field-oxide channel stop, CHEI is greatly facilitated by this gate-oxide extension.
• Oxide uniformity and purity determine the initial matching between synapses, as well as the learning-rate degradations due to oxide trapping. We therefore use the thermally grown gate oxide for all SiO2 carrier transport.
Fig. 1. Synapse transistor, top view. The poly2 control gate is not shown. In the Orbit 2 μm process, the channel width is 8 μm, and the channel length is 11 μm.
The stored memory value is represented by the floating-gate 
charge. Either channel current or channel conductance can be 
selected as the synapse output. Inputs are typically applied to 
the poly2 control gate, which capacitively couples to the poly1 floating gate. From the control gate's perspective, altering the floating-gate charge shifts the transistor's threshold voltage V_t. CHEI adds electrons to the floating gate, reducing the charge and raising the threshold; tunneling removes electrons, increasing the charge and lowering the threshold.
We typically operate the synapse in its subthreshold regime [8], to limit the power consumption, and typically select either drain current or source current to be the synapse output. When operated in this fashion, the synapse output is the product of a stored memory value and the applied input as follows:
\[ I_s = I_0\, e^{\kappa Q_{fg}/(C_T U_t)}\, e^{\kappa C_{in} V_{in}/(C_T U_t)} = I_0\, e^{Q_{fg}/Q_T}\, e^{\kappa' V_{in}/U_t} \tag{1} \]
\[ \phantom{I_s} = I_m\, e^{\kappa' V_{in}/U_t} \tag{2} \]
where I_s is the source current, I_0 is the pre-exponential current, κ is the floating-gate to channel-surface coupling coefficient, Q_fg is the floating-gate charge, C_T is the total capacitance seen by the floating gate, U_t is the thermal voltage kT/q, C_in is the input (poly1 to poly2) coupling capacitance, V_in is the signal voltage applied to the control gate, Q_T ≡ C_T U_t/κ, I_m ≡ I_0 e^{Q_fg/Q_T}, and κ' ≡ κ C_in/C_T.
The current I_m is a learned quantity; its value changes with synapse use. The synapse output is the product of I_m and the exponentiated gate input. Because the CHEI and tunneling gate currents vary with the synapse terminal voltages and channel current, I_m varies with the terminal voltages, which are imposed on the device, and with the channel current, which is the synapse output. Consequently, the synapse exhibits a type of learning by which its future output depends on both the applied input and the present output.
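The multiplicative form of (1) and (2) can be illustrated with a short numerical sketch. The function below simply evaluates (2); the values chosen for I_0 and C_T are hypothetical placeholders (the text does not state them), whereas κ = 0.2 and the 1 pF input capacitor are taken from elsewhere in this paper.

```python
import math

BOLTZMANN = 1.380649e-23           # J/K
ELECTRON_CHARGE = 1.602176634e-19  # C


def thermal_voltage(temp_k=300.0):
    """Thermal voltage U_t = kT/q, in volts."""
    return BOLTZMANN * temp_k / ELECTRON_CHARGE


def synapse_output(q_fg, v_in, i_0, kappa, c_in, c_t, temp_k=300.0):
    """Source current from (1)-(2): I_s = I_m * exp(kappa' * V_in / U_t)."""
    u_t = thermal_voltage(temp_k)
    q_t = c_t * u_t / kappa           # Q_T = C_T * U_t / kappa
    kappa_prime = kappa * c_in / c_t  # kappa' = kappa * C_in / C_T
    i_m = i_0 * math.exp(q_fg / q_t)  # learned quantity I_m
    return i_m * math.exp(kappa_prime * v_in / u_t)


if __name__ == "__main__":
    # i_0 and c_t are assumed values, for illustration only.
    params = dict(i_0=1e-18, kappa=0.2, c_in=1.0e-12, c_t=1.1e-12)
    for v_in in (0.0, 0.1, 0.2):
        print(v_in, synapse_output(0.0, v_in, **params))
```

Because the stored charge and the input enter through separate exponentials, changing Q_fg rescales the output for every input by the same factor, which is the sense in which the synapse multiplies a stored weight by its input.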
Fig. 2. Synapse transistor, side view, showing the tunneling and injection locations. As a consequence of the p-type substrate implant, the floating-gate to channel-surface coupling coefficient, κ, is 0.2.
III. THE CHANNEL ENERGY PROFILE 
To be injected onto the floating gate, channel electrons must (1) acquire the 3.2 eV required to surmount the Si-SiO2 work-function barrier, (2) scatter upward into the gate oxide, and (3) be transported across the oxide to the floating gate. CHEI in conventional MOSFET's is well known [9]. It occurs in short-channel devices with continuous channel currents, when a high gate voltage is combined with a 3.2 V drop across the short channel. It also occurs in switching transistors, when both the drain and gate voltages are transiently high. In neither case is the CHEI suitable for use in a learning system. The short-channel CHEI requires large channel currents, consuming too much power; the switching-induced CHEI is a poorly controlled transient phenomenon.
We impart 3.2 eV to the channel electrons by accelerating them in the synapse transistor's channel-to-drain E-field. However, merely generating a 3.2 eV electron population is not, by itself, sufficient for CHEI. As shown in Fig. 3, a conventional well-drain MOSFET can experience a channel-to-drain E-field exceeding 10 V/μm, thereby inducing a large 3.2 eV carrier population. Still, when operating in the subthreshold regime, this device experiences little or no CHEI. Under similar conditions, the synapse transistor's injection efficiency 
(gate current divided by source current) can exceed 1 x lops.  
This efficiency improvement is a consequence of the synapse 
transistor's higher p-type substrate doping, for two reasons. 
First, the synapse transistor's channel-to-drain depletion region is one-sided, with 95% of the space-charge layer appearing on the drain side of the junction. When V_dc = 30 V, peak field occurs a mere 0.14 μm into this space-charge layer. At peak field, the conduction-band potential rises 3.2 V in 25λ (where λ ≈ 7 nm is the electron mean-free-path length [10]). A hot-electron population is therefore available near the channel edge of the space-charge layer. By contrast, in the conventionally doped well-drain transistor, the channel-to-drain depletion region is symmetric rather than one-sided; peak field is not reached until 2 μm into the space-charge layer.
Second, the higher surface-acceptor concentration raises the synapse transistor's threshold voltage V_t from 0.8 V to 6.2 V. It is evident from Fig. 3 that electron transport within the SiO2 depends on the direction of the oxide E-field. Where the gate voltage exceeds the surface potential, the oxide field sweeps injected electrons across the SiO2 to the floating gate. Where the surface potential exceeds the gate voltage, injected electrons tend to return to the silicon surface. When V_dc = 30 V, the synapse's conduction-band potential is 3.2 V at z = 0.22 μm, whereas the surface potential does not exceed the gate voltage until z = 0.37 μm. The gate current arises primarily in the intervening region (0.22 < z < 0.37 μm). By contrast, in the conventional well-drain transistor with V_dc = 30 V, the conduction-band potential does not reach 3.2 V until 0.9 μm into the space-charge layer. Here the surface potential exceeds the gate voltage by 6.5 V, preventing a gate current.
[Fig. 3 panels: SYNAPSE TRANSISTOR and CONVENTIONAL WELL-DRAIN MOSFET. Each panel plots conduction-band potential (V) and electric field (V/μm) versus position z (μm).]
Fig. 3. Drain-to-channel space-charge conduction-band potential and electric field, calculated from implant impurity concentrations [11]. The transistors are identical except for their channel impurity doping, which in the synapse is 1 × 10^17 cm^-3 and in the conventional well-drain MOSFET is 5 × 10^15 cm^-3. Both a step doping profile and subthreshold operation (I_s < 100 nA) are assumed. The synapse impact-ionization data of Fig. 8, and an observed 70 V drain-avalanche onset, are both consistent with the step-junction approximation. All voltages are referenced to the channel potential; all positions are measured from the channel edge of the channel-to-drain space-charge layer. Although the gate-oxide band diagrams actually project into the plane of the page, for convenience they have been rotated 90° and drawn in the channel direction. Because, for both devices, the conduction-band edge provides the reference potential for the oxide barrier's leading edge, the barrier shape varies with position z along the channel. For clarity, oxide barriers are drawn for only a single channel position, z_0. At z = z', the oxide voltage is zero; for z > z', the oxide field opposes the transport of injected electrons to the floating gate.
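The numbers quoted above can be checked against the textbook depletion approximation for an abrupt (step) pn junction. The sketch below is not the authors' calculation; it assumes a 30 V total potential drop across the space-charge layer, channel (acceptor) dopings of 1 × 10^17 and 5 × 10^15 cm^-3, and a well-drain (donor) doping of 5 × 10^15 cm^-3. The well-drain value is an assumption, chosen because it reproduces the stated 95% one-sidedness.

```python
import math

Q = 1.602e-19              # electron charge, C
EPS_SI = 11.7 * 8.854e-14  # silicon permittivity, F/cm


def _junction_cm(n_a, n_d, v_total):
    """Depletion approximation; returns (x_p, w, e_max) in cm, cm, V/cm."""
    w = math.sqrt(2.0 * EPS_SI * v_total * (n_a + n_d) / (Q * n_a * n_d))
    x_p = w * n_d / (n_a + n_d)   # depletion depth on the p (channel) side
    e_max = Q * n_a * x_p / EPS_SI  # peak field at the metallurgical junction
    return x_p, w, e_max


def step_junction(n_a, n_d, v_total):
    """Return (p-side depth in um, total width in um, peak field in V/um)."""
    x_p, w, e_max = _junction_cm(n_a, n_d, v_total)
    return x_p * 1e4, w * 1e4, e_max * 1e-4


def depth_for_potential(n_a, n_d, v_total, v_target):
    """Distance (um) from the channel edge at which the potential is v_target."""
    if not 0.0 <= v_target <= v_total:
        raise ValueError("v_target must lie between 0 and v_total")
    x_p, _, e_max = _junction_cm(n_a, n_d, v_total)
    v_junction = 0.5 * e_max * x_p  # potential reached at the junction itself
    if v_target <= v_junction:
        x = math.sqrt(2.0 * EPS_SI * v_target / (Q * n_a))
    else:
        a = Q * n_d / (2.0 * EPS_SI)
        dv = v_target - v_junction
        x = x_p + (e_max - math.sqrt(e_max ** 2 - 4.0 * a * dv)) / (2.0 * a)
    return x * 1e4


if __name__ == "__main__":
    for name, n_a in (("synapse", 1e17), ("conventional", 5e15)):
        print(name, step_junction(n_a, 5e15, 30.0),
              depth_for_potential(n_a, 5e15, 30.0, 3.2))
```

Under these assumptions the sketch gives a peak field near 21 V/μm at about 0.14 μm for the synapse and near 15 V/μm at about 2 μm for the conventional device, with the 3.2 V point near 0.22 μm and 0.9 μm respectively, in line with the values in the text and in Fig. 3.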
IV. THE GATE-CURRENT EQUATION 
We would like to use our synapse to build a silicon learning 
system. Because the learning behavior of any such system is 
determined in part by the CHEI and tunneling processes that 
alter the stored memory, we have investigated these processes 
over the anticipated synapse operating range. Based on a 
preliminary analysis, and on data taken from four-terminal 
synapses fabricated in a 1.2 μm process, we believe that the 
gate-current equation derived here, with modified fit constants, 
will describe generally the learning behavior of three-terminal 
well-drain synapses fabricated in more modern processes. 
A. Hot-Electron Injection 
To measure CHEI, we fabricated the device of Fig. 1 without a tunneling junction. We fit the measured injection data empirically; we are currently analyzing the relevant transport physics to derive equivalent analytic results. Because the CHEI probability varies with channel potential, we reference all terminal voltages to the channel. We can re-reference our results to the source terminal using the relationship between source and channel potential in an MOS transistor [12], [13].
Fig. 4. Synapse-transistor CHEI efficiency versus gate-to-channel voltage, for V_dc = 20 V and I_s = 2 μA. (Horizontal axis: -1/gate-to-channel voltage, in 1/V.)
In Fig. 3, we define z' to be that location where the oxide E-field is zero. Because z' increases with gate voltage, the gate current also increases with gate voltage. Fig. 4 shows CHEI efficiency versus gate-to-channel potential. The data are fit by
\[ I_g = \alpha I_s\, e^{-V_\alpha / V_{gc}} \tag{3} \]
Fig. 5. Synapse-transistor CHEI efficiency versus drain-to-channel voltage, for V_gc = 6.7 V and I_s = 2 μA.
Fig. 6. Synapse-transistor gate current versus source current. The drain-to-bulk and gate-to-bulk voltages were held fixed at V_db = 35 V and V_gb = 7 V during the experiment.
Fig. 7. Synapse-transistor injection efficiency versus drain-to-channel and gate-to-channel voltages. The RMS deviation between these data and (5) is 1.2 × 10^-9.
Fig. 8. Synapse-transistor impact ionization versus drain-to-channel voltage, for V_gc = 6.15 V and I_s = 100 nA. The fit function is independent of source current for 2 nA < I_s < 2 μA.
where I_g is the gate current, I_s is the source current, V_gc is the gate-to-channel potential, and α and V_α are constants.
Because the channel-to-drain E-field increases with drain voltage, the gate current also increases with drain voltage. Fig. 5 shows the injection efficiency versus drain-to-channel potential. These data are fit by 
where Vdc is the drain-to-channel potential and /3, V,, and V, 
are constants. 
Fig. 6 shows the gate current to be directly proportional to 
the source current. Fig. 7 shows the CHEI efficiency versus 
both drain-to-channel and gate-to-channel potential. We fit 
these data by combining (3) and (4) 
(5) 
where T/ = 3.63 is a fit constant, and V,, Vo, and V ,  remain 
unchanged from (3) and (4). 
We equate channel current with source current. Because 
the activation energy for impact ionization in silicon is less 
than 3.2 eV, a channel-to-drain E-field that generates 3.2 
eV carriers must create additional electron-hole pairs [14] at 
the drain. We show synapse impact-ionization data in Fig. 8. 
The drain current is determined from the source current and 
drain-to-channel potential [15] by 
where I d  is the drain current and y, V,, and V, are constants. 
B. Tunneling 
The FN-tunneling process is illustrated in Fig. 3. The drain-to-gate potential reduces the effective oxide thickness, facilitating electron tunneling from the floating gate, through the SiO2 barrier, into the oxide conduction band. The electrons are then swept by the oxide E-field over to the synapse drain.
To measure the tunneling current, we fabricated a separate tunneling junction. Fig. 9 shows gate tunneling current versus oxide voltage. We fit these data with a modified FN fit, which employs a built-in potential, V_bi, to account for oxide traps
\[ I_g = \xi\, (V_{dg} + V_{bi})^2\, e^{-V_0/(V_{dg} + V_{bi})} \tag{7} \]
where V_dg is the drain-to-gate potential and ξ, V_bi, and V_0 are constants. For comparison, we also show the conventional FN 
fit [7], [16]
\[ I_g = \varphi\, V_{dg}^2\, e^{-V_f / V_{dg}} \tag{8} \]
where V_f = 928 V is consistent with a recent survey [17] of SiO2 tunneling, given the synapse transistor's 350 Å gate oxide, and φ is a fit parameter.
The data of Fig. 9 are normalized to the gate-to-n+ edge length, in lineal microns. The reason is that the floating gate induces a depletion region in the lightly doped n- well-drain, reducing the effective oxide voltage, and therefore also the tunneling current. Because the gate cannot appreciably deplete the n+ drain contact, the oxide field is higher where the self-aligned floating gate overlaps the n+. Because the tunneling current is exponential in the oxide voltage, gate-oxide tunneling in the synapse transistor is primarily an edge phenomenon.
Fig. 9. Synapse-transistor tunneling gate current versus drain-to-gate voltage, normalized to the tunneling-junction edge length in lineal microns. The modified Fowler-Nordheim fit (solid line) employs a built-in voltage to account for oxide traps; the conventional Fowler-Nordheim fit (dashed line) is shown for comparison. (Horizontal axis: -1/drain-to-gate voltage, in 1/V.)
C. The Gate-Current Equation 
Because the tunneling and injection gate currents are in opposite directions, we obtain the complete synapse gate-current equation by subtracting (5) from (7) 
The synapse exhibits four operating regimes: 
1) V_dc < 10 V: The tunneling and injection gate currents are both exceedingly small; the floating-gate charge is retained in a nonvolatile state. 
2) 10 V < V_dc < 30 V: The tunneling current is small, but the injection current is not small; electrons are added to the floating gate, increasing the threshold voltage. 
3) 30 V < V_dc < 40 V: Neither the tunneling nor the injection current is small; the floating-gate asymptotes to a voltage where the gate current of (9) is zero. 
4) V_dc > 40 V: The tunneling current is larger than the injection current; electrons are removed from the floating gate, decreasing the threshold voltage. Although drain voltages that transiently exceed 40 V are useful for learning, drain voltages that continuously exceed 40 V can lead to excessive power dissipation, damaging the synapse. 
V. FUTURE DEVELOPMENT 
Although the synapse already possesses those attributes that we believe are essential for building a silicon learning system, further development will improve the device substantially. We identify three areas for improvement: (A) drain voltage, (B) drain-to-gate capacitance, and (C) drain leakage current. More modern processing will readily allow these improvements. 
A. Drain Voltage 
The present synapse requires drain voltages up to 45 V. Although such high voltages limit potential applications, the 45 V requirement is a consequence of the 350 Å gate oxides found in the Orbit 2 μm process, rather than an inherent limitation in the synapse itself. If the synapse were fabricated in a modern EEPROM process with 80 Å oxides, it would operate from a 12 V supply. In addition, at lower voltages, the well-drain structure that we use to prevent drain breakdown could be replaced with a graded junction, reducing the synapse size. 
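The 12 V figure is consistent with simple constant-field scaling. The sketch below assumes that tunneling is set by the oxide field (voltage divided by oxide thickness), so that both the required voltage and the FN constant V_f of (8) scale linearly with thickness. That proportionality is the standard FN picture, not a result stated in this paper, and treating the full drain voltage as the oxide voltage is a simplification.

```python
def scale_with_oxide(value_ref, t_ref_angstrom, t_new_angstrom):
    """Scale a voltage-like quantity linearly with oxide thickness."""
    return value_ref * t_new_angstrom / t_ref_angstrom


def fn_exponent(v_f, v_oxide):
    """Exponent of the conventional FN form (8): -V_f / V."""
    return -v_f / v_oxide


if __name__ == "__main__":
    v_drain_new = scale_with_oxide(45.0, 350.0, 80.0)  # volts
    v_f_new = scale_with_oxide(928.0, 350.0, 80.0)     # volts
    print("scaled drain voltage:", v_drain_new)
    print("scaled V_f:", v_f_new)
    print("FN exponents:", fn_exponent(928.0, 45.0),
          fn_exponent(v_f_new, v_drain_new))
```

With these assumptions, 45 V across 350 Å corresponds to roughly 10.3 V across 80 Å, which fits within a 12 V supply, and the FN exponent is unchanged by the scaling.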
B. Drain-to-Gate Capacitance 
The synapse transistor's parasitic drain-to-gate capacitance C_dg is approximately 5 fF. Because the gate is floating, and the drain-voltage range is 0 < V_d < 45 V, drain-to-gate coupling significantly affects the channel current. We reduce this effect by using a large (1 pF) gate capacitor. In a more modern process, however, two improvements are possible. First, the drain-voltage range can be smaller. Second, replacing the well-drain with a graded drain can reduce C_dg. These changes will permit us to use a substantially smaller gate capacitor. 
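The size of this coupling can be estimated with a capacitive-divider sketch. It assumes that the floating gate sees only C_dg and the gate capacitor (all other capacitances are neglected, which is a simplification), and it uses κ = 0.2 from Fig. 2 with the subthreshold relation of (1). The 50 fF comparison value is hypothetical.

```python
import math

THERMAL_VOLTAGE = 0.02585  # U_t in volts near 300 K


def floating_gate_shift(c_dg, c_gate, delta_v_drain, c_other=0.0):
    """Floating-gate voltage change caused by a drain-voltage change."""
    return delta_v_drain * c_dg / (c_dg + c_gate + c_other)


def current_factor(delta_v_fg, kappa=0.2, u_t=THERMAL_VOLTAGE):
    """Multiplicative change in subthreshold channel current, from (1)."""
    return math.exp(kappa * delta_v_fg / u_t)


if __name__ == "__main__":
    for c_gate in (1.0e-12, 50.0e-15):  # 1 pF as used; 50 fF is hypothetical
        dv = floating_gate_shift(5.0e-15, c_gate, 45.0)
        print(c_gate, dv, current_factor(dv))
```

Even with the 1 pF capacitor, a full 45 V drain swing moves the floating gate by a few tenths of a volt under these assumptions, which is enough to change a subthreshold current several-fold; a much smaller gate capacitor would make the shift several volts.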
C. Drain Leakage Current 
As shown in Fig. 2, the poly1 gate extension that forms the tunneling junction crosses a region of field oxide (FOX). This FOX was intended to form a channel stop, preventing the channel-surface depletion layer from reaching the n+ well-drain contact. Unfortunately, in the Orbit process, the FOX transistor threshold voltage V_t ≈ 20 V. For V_dg > 20 V, a parasitic p-type MOS channel forms in the n- well, beneath the channel stop. For V_dg > 35 V, pn-junction breakdown occurs where the FOX-induced, p-type channel meets the n+ well contact [11]. 
The drain leakage current is shown in Fig. 10. Because 
the FOX-transistor channel conductance restricts the leakage 
current, the breakdown process is self-limiting. Unfortunately, 
junction breakdown induces a hot-electron gate current not 
included in (9). Although we could model this effect, we prefer 
to eliminate it in future synapses by using lower drain voltages 
or an improved channel stop. 
VI. A SYNAPTIC ARRAY 
A synaptic array, with a synapse transistor at each node, can form the basis of a silicon learning system. We fabricated the simplified 2 × 2 array of Fig. 11 to investigate synapse isolation during tunneling and injection. Because a 2 × 2 array uses the same row-column addressing employed by larger arrays, it allows us to characterize completely the synapse isolation.

TABLE I. Terminal voltages (in volts) applied to the array of Fig. 11 to read, tunnel, or inject the {1,1} synapse.

            col 1   col 2   row 1   row 2   row 1    row 2
            gate    gate    drain   drain   source   source
  read       +5      0       +5      0       0        0
  tunnel      0     +4.5    +35      0      +2        0
  inject     +5      0      +25      0       0        0

Fig. 10. Synapse-transistor drain current versus drain voltage, with the source and bulk terminals grounded.
Fig. 11. A 2 × 2 synaptic array. The row synapses share a common drain wire, so tunneling at one synapse can cause undesired tunneling and injection at other row synapses.
We chose, from among the many possible ways of using the 
array, to select the source current as the synapse output, and to 
turn off the synapses while tunneling. We applied the voltages 
in Table I to read, tunnel, or inject synapse {1,1} selectively, 
while ideally leaving the other synapses unchanged. 
Because the synapse drain terminals are connected within a row, but not within a column, crosstalk between column synapses is small. Crosstalk between row synapses depends on the operation being performed. When a row synapse is read or injected, crosstalk to the other row synapses is small. When a row synapse is tunneled, the high drain voltage can cause both parasitic tunneling and FOX injection at other row synapses.
Fig. 12. Isolation in a 2 × 2 synaptic array. Source current is the synapse output. The {1,1} synapse first is injected down to 30 nA, then is tunneled back up to 2 μA. Crosstalk to the {1,2} synapse, defined as the fractional change in the {1,2} synapse divided by the fractional change in the {1,1} synapse, is 0.43%.
To obtain the data in Fig. 12, we initially set all synapses to I_s = 2 μA. We injected the {1,1} synapse down to 30 nA, and then tunneled it back up to 2 μA, while measuring the source currents of the other three synapses. As expected, the row 2 synapses were unaffected by either the tunneling or the injection. The {1,2} synapse was similarly unaffected by the injection, but during tunneling experienced both FOX injection and parasitic tunneling. A 4.7 V signal on the column 2 gate input exactly balanced these parasitic effects; unfortunately, this optimum gate voltage varied with the {1,2} synapse memory value. We chose a 4.5 V gate signal, so FOX injection slightly exceeded parasitic tunneling at the {1,2} synapse. 
To obtain the data in Fig. 13, we first set all four synapses to I_s = 30 nA. We tunneled the {1,1} synapse up to 2 μA, and then injected it back down to 30 nA. Like the experiment of Fig. 12, when the {1,1} synapse tunneled, the {1,2} synapse experienced both FOX injection and parasitic tunneling. A 4.3 V gate input exactly balanced these parasitic effects. With the chosen 4.5 V gate signal, parasitic tunneling slightly exceeded FOX injection at the {1,2} synapse. 
The measured crosstalk between row synapses was approximately 0.5% during tunneling, and less than 0.1% for all other operations. We anticipate that, with an improved channel stop and thinner gate oxide, we can achieve <0.1% crosstalk for all operations. 
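The crosstalk figure defined in the caption of Fig. 12 (fractional change in the neighbor divided by fractional change in the target) is easy to compute from before-and-after source currents. In the sketch below, the target currents come from the experiment description; the neighbor's final current is a hypothetical value chosen only to illustrate the arithmetic.

```python
def fractional_change(before, after):
    """Relative change of a source current."""
    return (after - before) / before


def crosstalk(neighbor_before, neighbor_after, target_before, target_after):
    """Crosstalk as defined for Fig. 12: ratio of the fractional changes."""
    return (fractional_change(neighbor_before, neighbor_after)
            / fractional_change(target_before, target_after))


if __name__ == "__main__":
    # Target {1,1}: 2 uA down to 30 nA. Neighbor {1,2}: hypothetical drift.
    ratio = crosstalk(2.0e-6, 1.9915e-6, 2.0e-6, 30.0e-9)
    print("crosstalk = {:.2%}".format(ratio))
```

Because the target current changes by nearly its full value in these experiments, a crosstalk of about 0.5% corresponds to a neighbor drift of roughly the same percentage of its own current.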
In the experiments of Figs. 12 and 13, the synapse tunneling and injection rates were small, for two reasons. First, the 1 pF gate capacitors that we employed to reduce the drain-to-gate coupling slowed the learning rate. Second, we limited the synapse drain voltage to 35 V, to avoid FOX injection. We anticipate much faster learning rates in an improved synapse.
Fig. 13. The same experiment as in Fig. 12, but here the {1,1} synapse first is tunneled up to 2 μA, then is injected back down to 30 nA. Crosstalk to the {1,2} synapse is 0.52%.
VII. A SYNAPSE LEARNING RULE 
We repeated the experiment of Fig. 13, for several tunneling and injection voltages; in Fig. 14, we plot the temporal derivative of the source current as a function of the source current. If we equate a weight w with the source current I_s, these data show the synapse weight-update rate. Starting from the gate-current equation, (9), we now derive a synapse learning rule that fits these data.
Fig. 14. Synapse-transistor delta-weight versus weight, with source current chosen as the synapse weight. We tunneled and injected synapse {1,1}, as in Fig. 13, with the source at ground and the drain at the indicated tunneling and injection voltages. We here plot the magnitude of the temporal derivative of the weight value as a function of the weight value. The mean injection slope is 2.01.
A. Injection 
We begin by taking the temporal derivative of (1). 
As shown in Fig. 6, the gate current I_g is proportional to the source current I_s; the proportionality factor is the CHEI efficiency of (5). We add a (-) sign to the gate current, because CHEI decreases the synapse weight
\[ I_g = -f(V_{dc}, V_{gc})\, I_s \tag{11} \]
where f(V_dc, V_gc) denotes the CHEI efficiency of (5).
We substitute (11) into (10), replacing I_s with w 
For fixed drain and source voltages, V_gc increases with w, whereas V_dc decreases with w; f, which depends on both, typically increases with w. As a result, the subthreshold weight-decrement rate varies as w^(2+x), where x represents a positive-valued correction term. However, we often operate the synapse near threshold, to increase the learning rate. For source currents near threshold, the ∂I_s/∂Q_fg slope declines relative to its subthreshold value. For 1 nA < w < 1 μA, the decreasing ∂I_s/∂Q_fg slope counteracts the effects of the increasing f. If we assume a perfect cancellation, the weight-decrement rule, with f(V_gc, V_dc) = μ, models accurately the data of Fig. 14. 
B. Tunneling 
We begin by taking the temporal derivative of (1), substituting for the gate current using (7) 
We approximate V_db + V_bi >> V_g (where V_db is the drain-to-bulk voltage, V_g is the gate-to-bulk voltage, and V_dg = V_db - V_g), expand the tunneling exponent by (1 + x)^-1 ≈ 1 - x, and solve for the weight-increment rule 
where σ ≡ V_0 U_t / κ(V_db + V_bi)^2. Because, for subthreshold source currents, the floating-gate voltage changes slowly, we approximate (V_dg + V_bi)^2 to be constant. We combine the constant terms into a single parameter ε, and replace I_s with w 
Equation (16) models accurately the weight-increment data for subthreshold source currents. For source currents near threshold, however, the fit is poor. As the weight w increases, the floating-gate voltage increases, causing (1) the tunneling current to decrease, and (2) the ∂I_s/∂Q_fg slope to decrease. Whereas this first effect is included in (16), the second is not. In addition, our approximation that (V_dg + V_bi)^2 is a constant becomes less valid for above-threshold source currents. We therefore extend (16) with the following approximation, which models accurately the weight-increment data for channel currents up to 1 μA 
We find the maximum weight change (Δw)_max and the saturation weight value w_corner by empirical measurement; the values vary with the tunneling voltage. 
C. The Learning Rule 
We obtain the synapse learning rule by adding (13) and (17) 
Whereas the data of Figs. 12-14 were taken using terminal voltages chosen to prevent simultaneous injection and tunneling, we have also investigated the synapse learning for terminal voltages that permit simultaneous injection and tunneling. Equation (18) describes adequately the synapse learning for both modes of operation. 
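The injection half of the learning rule can be explored numerically. The sketch below is a reconstruction, not the paper's own equation: differentiating (1) in time gives dI_s/dt = I_s I_g / Q_T, and substituting I_g = -μ I_s (the constant-efficiency assumption of Section VII-A) gives dw/dt = -(μ/Q_T) w^2. The rate constant k = μ/Q_T used here is hypothetical, picked so that a 2 μA weight decays to about 30 nA in 500 s.

```python
import math


def decrement_rate(w, k):
    """Assumed weight-decrement rule dw/dt = -k * w**2, with k = mu / Q_T."""
    return -k * w * w


def integrate_rk4(w0, k, t_end, steps):
    """Integrate the decrement rule with a fixed-step fourth-order Runge-Kutta."""
    dt = t_end / steps
    w = w0
    for _ in range(steps):
        k1 = decrement_rate(w, k)
        k2 = decrement_rate(w + 0.5 * dt * k1, k)
        k3 = decrement_rate(w + 0.5 * dt * k2, k)
        k4 = decrement_rate(w + dt * k3, k)
        w += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return w


def closed_form(w0, k, t):
    """Exact solution of dw/dt = -k * w**2: w(t) = w0 / (1 + k * w0 * t)."""
    return w0 / (1.0 + k * w0 * t)


def loglog_slope(w_lo, w_hi, k):
    """Slope of log|dw/dt| versus log w, the quantity read off Fig. 14."""
    r_lo = abs(decrement_rate(w_lo, k))
    r_hi = abs(decrement_rate(w_hi, k))
    return (math.log(r_hi) - math.log(r_lo)) / (math.log(w_hi) - math.log(w_lo))


if __name__ == "__main__":
    k = 6.6e4  # 1/(A*s), hypothetical
    print(integrate_rk4(2.0e-6, k, 500.0, 5000), closed_form(2.0e-6, k, 500.0))
    print(loglog_slope(1.0e-9, 1.0e-6, k))
```

A pure w^2 rule has a log-log slope of exactly 2, which is close to the mean injection slope of 2.01 reported in the caption of Fig. 14.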
D. Learning-Rate Degradation 
SiO$_2$ trapping is a well-known issue in floating-gate transistor
reliability [18]. In digital EEPROM memories, it ultimately
limits the transistor life. In the synapse, trapping decreases 
the learning rate. However, unlike the transistors in a digi- 
tal memory, the synapses in a typical learning system will 
transport only a small quantity of total oxide charge over the 
system lifetime. We tunneled and injected 1 nC of gate charge, 
and measured a drop of approximately 20% in both the weight-increment and
weight-decrement learning rates. Because 1 nC of gate charge 
represents an enormous change in gate voltage, we believe that 
oxide trapping in the synapse can be ignored safely. 
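A rough calculation indicates the scale involved. The floating-gate capacitance used below is an assumed, hypothetical value picked only to give an order of magnitude; it is not a measured parameter of the device.

```python
def gate_voltage_shift(charge_coulombs, capacitance_farads):
    """Voltage change on a capacitor for a given transported charge: dV = Q / C."""
    return charge_coulombs / capacitance_farads


Q_TOTAL = 1e-9           # 1 nC of tunneled and injected gate charge (from the text)
C_FG_ASSUMED = 100e-15   # hypothetical floating-gate capacitance, in farads

delta_v = gate_voltage_shift(Q_TOTAL, C_FG_ASSUMED)
print(f"Equivalent cumulative gate-voltage change: {delta_v:.3g} V")
```

Under this assumption the transported charge corresponds to a cumulative floating-gate voltage excursion on the order of $10^4$ V, that is, thousands of traversals of any practical operating range.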
VIII. CONCLUSION 
We have described a single-transistor silicon synapse with 
nonvolatile analog memory, simultaneous memory reading and 
writing, and bidirectional memory updates that are a function 
of both the applied terminal voltages and the present output. 
We have demonstrated that a learning system can be realized 
as a two-dimensional synaptic array, and have shown that we 
can address individual array nodes with good selectivity. We 
have characterized a synapse learning rule, and believe that we 
can build an autonomous learning system, combining single-transistor
analog computation with memory updates computed
both locally and in parallel, with this synapse. 
We have discussed the limitations of the present device, and 
anticipate that these limitations can be reduced or eliminated 
with more modern processing. We claim that we can halve the 
present device size in the current 2 µm process; further size
reductions are possible in a more modern process. Finally, 
we anticipate that our single-transistor synapse will allow the
development of dense, low-power, silicon learning systems. 
REFERENCES 
[1] P. Churchland and T. Sejnowski, The Computational Brain.
Cambridge, MA: MIT Press, 1993.
[2] P. Hasler, C. Diorio, B. A. Minch, and C. Mead, “Single transistor
learning synapses,” in Advances in Neural Information Processing Systems
7. Cambridge, MA: MIT Press, 1995, pp. 817-824.
[3] P. Hasler, C. Diorio, B. A. Minch, and C. Mead, “Single transistor
learning synapses with long term storage,” IEEE Int. Symp. on Circuits
and Systems, 1995, vol. 3, pp. 1660-1663.
[4] C. Diorio, P. Hasler, B. A. Minch, and C. Mead, “A single transistor
silicon MOS device for long term learning,” U.S. Patent Office Serial
no. 08/399966, Mar. 7, 1995.
[5] C. Diorio, P. Hasler, B. A. Minch, and C. Mead, “A high-resolution
nonvolatile analog memory cell,” IEEE Int. Symp. on Circuits and
Systems, 1995, vol. 3, pp. 2233-2236.
[6] F. Masuoka, R. Shirota, and K. Sakui, “Reviews and prospects of
nonvolatile semiconductor memories,” IEICE Trans., vol. E74, pp.
868-874, Apr. 1991.
[7] M. Lenzlinger and E. H. Snow, “Fowler-Nordheim tunneling into
thermally grown SiO2,” J. Appl. Phys., vol. 40, no. 6, pp. 278-283,
Jan. 1969.
[8] C. Mead, Analog VLSI and Neural Systems. Reading, MA:
Addison-Wesley, 1989.
[9] J. J. Sanchez and T. A. DeMassa, “Review of carrier injection in the
silicon/silicon-dioxide system,” IEE Proc.-G, June 1991, vol. 138, no.
3, pp. 377-389.
[10] C. R. Crowell and S. M. Sze, “Temperature dependence of avalanche
multiplication in semiconductors,” Appl. Phys. Lett., vol. 9, no. 6, pp.
242-244, Sept. 1966.
[11] A. S. Grove, Physics and Technology of Semiconductor Devices. New
York: Wiley, 1967.
[12] C. C. Enz, F. Krummenacher, and E. A. Vittoz, “An analytical MOS
transistor model valid in all regions of operation and dedicated to
low-voltage and low-current applications,” Analog Integ. Circ. Sig. Proc.,
vol. 8, pp. 83-114, 1995.
[13] A. G. Andreou and K. A. Boahen, “Neural information processing II,”
in Analog VLSI Signal and Information Processing, M. Ismail and T.
Fiez, Eds. New York: McGraw-Hill, 1994, pp. 358-413.
[14] W. Shockley, “Problems related to p-n junctions in silicon,” Solid-State
Electron., vol. 2, no. 1, pp. 35-67, Pergamon Press, 1961.
[15] S. Tam, P. Ko, and C. Hu, “Lucky-electron model of channel
hot-electron injection in MOSFET’s,” IEEE Trans. Electron Devices, vol.
ED-31, no. 9, pp. 1116-1125, Sept. 1984.
[16] S. M. Sze, Physics of Semiconductor Devices. New York: Wiley, 1981.
[17] C. Mead, “Scaling of MOS technology to submicrometer feature sizes,”
J. VLSI Sig. Proc., vol. 8, pp. 9-25, 1994.
[18] S. Aritome, R. Shirota, G. Hemink, T. Endoh, and F. Masuoka,
“Reliability issues of flash memory cells,” in Proc. IEEE, vol. 81, no.
5, pp. 776-787, May 1993.
Chris Diorio (M’88) received the B.A. in physics
from Occidental College, Los Angeles, CA, in 1983,
and the M.S. degree in electrical engineering from
the California Institute of Technology, Pasadena, in
1984. Since September 1992, he has been a doctoral
candidate in electrical engineering at the California
Institute of Technology.
He is employed as a Staff Engineer at TRW,
Inc., Redondo Beach, CA, and has worked as a
Senior Staff Scientist, American Systems Corporation,
Chantilly, VA, and as a Technical Consultant
at The Analytic Sciences Corporation, Reston, VA. His interests include
analog integrated circuit design, ultra-high-speed digital circuit design, and
semiconductor device physics. His current research involves using
floating-gate MOS transistors to build adaptive systems in silicon.
Mr. Diorio is a member of Sigma Pi Sigma.
Paul Hasler (S’87) received the B.S.E. and M.S.
degrees in electrical engineering from Arizona State
University, Tempe, in August 1991. Since September
1992, he has been a doctoral candidate in
computation and neural systems at the California
Institute of Technology, Pasadena.
His research interests include using floating-gate
MOS transistors to build adaptive systems in silicon,
the solid-state physics of floating-gate
devices, and modeling high-field carrier transport in
Si and SiO2.
Mr. Hasler is a member of Tau Beta Pi and Eta Kappa Nu.
Bradley A. Minch (S’90) received the B.S. in
electrical engineering, with distinction, from Cornell
University, Ithaca, NY, in 1991. Since September
1991, he has been a doctoral candidate in computation
and neural systems at the California Institute
of Technology, Pasadena.
His research interests include current-mode circuits
and signal processing, the use of floating-gate
MOS transistors to build adaptive systems in silicon,
and silicon models of dendritic computation.
Mr. Minch is a member of Tau Beta Pi, Eta Kappa
Nu, and Phi Kappa Phi.
Carver A. Mead (S’53-M’60-SM’92-F’95) has
taught at the California Institute of Technology, 
Pasadena, for more than 30 years, and is the Gordon
and Betty Moore Professor of Engineering and 
Applied Science. He has contributed in the fields of 
solid-state electronics and the management of com- 
plexity in the design of very large-scale integrated 
circuits, and has been active in the development
of innovative design methodologies for VLSI. He 
wrote, with Lynn Conway, the standard text for 
VLSI design, Introduction to VLSI Systems. His 
more recent work Analog VLSI and Neural Systems (Addison-Wesley, 1989) 
is concerned with modeling neuronal structures, such as the retina and the 
cochlea, using analog VLSI systems. 
Professor Mead is a member of the National Academy of Sciences, the
National Academy of Engineering, the American Academy of Arts and
Sciences, a foreign member of the Royal Swedish Academy of Engineering 
Sciences, a Fellow of the American Physical Society, and a Life Fellow of 
the Franklin Institute. He is also the recipient of numerous awards, including 
the Centennial Medal of the IEEE. 
