Abstract-Spike timing dependent plasticity (STDP) forms the basis of learning within neural networks. STDP allows for the modification of synaptic weights based upon the relative timing of pre- and post-synaptic spikes. A compact circuit is presented which can implement STDP, including the critical plasticity window, to determine synaptic modification. A physical model to predict the time window for plasticity to occur is formulated and the effects of process variations on the window are analysed. The STDP circuit is implemented using two dedicated circuit blocks, one for potentiation and one for depression, where each block consists of 4 transistors and a polysilicon capacitor. SpectreS simulations of the back-annotated layout of the circuit and experimental results indicate that STDP with biologically plausible critical timing windows over the range 10µs to 100ms can be implemented. A floating gate weight storage capability, with drive circuits, is also presented and a detailed analysis correlating weight changes with charging time is given.
I. INTRODUCTION
Significant research over the last two decades has been undertaken on studying biological neural networks. Specifically, this research has focused on how neural networks learn and adapt to their ever changing environment, together with the translation of this into biologically inspired hardware neural networks [1, 2]. A neural network (NN) consists of interconnecting neurons, with each neuron connecting to another via a synapse. Within the human brain there are in excess of 10^11 neurons, with each one having up to 10^3 synaptic connections [3].
In a NN, the effect that one neuron has upon another will vary depending upon input stimuli and synaptic weight. The synapse is responsible for adaptation and learning within a NN [4], through long term potentiation (LTP) or long term depression (LTD), depending on the temporal ordering of the pre- and post-synaptic spikes. Additionally, weight modification can also be a short term potentiation (STP) or a short term depression (STD).
Hebb's theory [5] describes how the synaptic weight is allowed to change based upon the inputs and outputs of each neuron within the NN. A further development of the Hebbian learning concept was the introduction of spike timing dependent plasticity (STDP) in 1983 [6]. STDP is concerned with increasing or decreasing the weight of a synapse based upon the relative timings of pre- and post-synaptic spikes. In biology two STDP functions are commonly reported and referred to as symmetric and asymmetric [4, 6-12]. In this paper we focus on asymmetric STDP as this type of plasticity is known to occur more frequently in biological NNs [4, 7, 11, 12]. It is also worth noting that the exponential functions commonly depicted are not a pre-requisite for STDP but rather a mathematical convenience. What is important, however, is the relative timing between pre- and post-synaptic spikes, as this temporal ordering dictates whether potentiation or depression occurs [46, 47]. In asymmetric STDP, weight potentiation occurs if a pre-synaptic spike precedes the post-synaptic spike (a pre-post spiking event, ∆t s is positive) and this leads to LTP. Likewise, the weight is decreased if a post-synaptic spike occurs prior to a pre-synaptic spike, giving rise to LTD (a post-pre spiking event, ∆t s is negative). The critical timing window [7, 14-18] typically occurs over the range 10-100ms and outside of this window no potentiation or depression will occur [7, 14-20]. The critical timing window is implemented in this work and is programmable.
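The asymmetric rule described above can be summarised in a few lines. The following is a minimal sketch only: the helper name `stdp_outcome`, its arguments and the treatment of coincident spikes (∆t s = 0) as "no change" are illustrative assumptions and do not describe the circuit itself.

```python
def stdp_outcome(t_pre, t_post, window):
    """Classify one spike pair under asymmetric STDP (hypothetical helper).

    dt_s = t_post - t_pre, so a pre-post pair gives dt_s > 0.
    All three arguments must share the same time unit.
    """
    dt_s = t_post - t_pre
    if dt_s == 0 or abs(dt_s) > window:
        # Coincident spikes (assumed) or a pair outside the timing window.
        return "none"
    return "potentiation" if dt_s > 0 else "depression"


print(stdp_outcome(0.0, 5.0, 20.0))    # pre before post, inside the window
print(stdp_outcome(5.0, 0.0, 20.0))    # post before pre, inside the window
print(stdp_outcome(0.0, 50.0, 20.0))   # pair outside the window
```

Only the sign of ∆t s and its size relative to the window enter the decision, which is why the exact shape of the STDP function is secondary.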
It has been shown that STDP can be implemented in hardware, and while the majority of these circuits are biologically plausible, their footprints are large [21-30], requiring up to and, in some cases, exceeding thirty MOSFETs. Other solutions require dedicated microprocessors. A key requirement of hardware neural networks (HNN) is that they are scalable, and therefore the designs for neurons, synapses and synaptic modification circuits must be compact and low-powered while at the same time maintaining biological plausibility.
It is proposed here that an STDP circuit with a critical time window can be implemented using two dedicated circuit blocks, each consisting of 4 MOS transistors and a polysilicon capacitor. The paper is organized as follows: in section II an overview of the theoretical operation of the compact STDP circuit is presented. Section III presents experimental and simulation results, undertaken in an AMS 0.35µm CMOS process and in SpectreS in the Cadence environment respectively. All simulations are conducted on back-annotated layouts, thus incorporating all parasitic elements. A discussion of results relating to the circuit properties is presented in section IV and conclusions are drawn in section V.
II. CIRCUIT OPERATION
This section provides an overview of the operation of the proposed STDP weight potentiation and depression circuits. Also a model for the critical timing window is given together with its dependency on process variations. 
II.A WP and WD Circuits
The WP circuit is presented in Fig. 1(a). The circuit will cause an increase of the synaptic weight by increasing the amount of negative charge stored on the floating gate (FG) of a non-volatile memory device. This device is represented by its equivalent capacitance C FG . The weight increase occurs during a pre-post spiking event. The WD circuit is identical to that of the WP block except that the pre and post spike input terminals are swapped. The WD circuit decreases the synaptic weight by removing charge on the FG during a post-pre spiking event.

Fig. 1. [...] circuit block with FG device and driver buffer circuit. Voltages indicated are relative to ground.

The WP and WD circuits each consist of 3 NMOSTs, M pre , M post and M leak , a PMOST, M reset , and a MOS capacitor, C. Transistor M reset is used to ensure that, [...] V [...] and V Pre respectively. When V post and V Pre are high, M reset is off and will not [...]. The operation of the WP circuit is now outlined. The initial conditions when no pre- or post[...] are low, node V C is pulled low by M leak and C is discharged.

Consider a pre-post spiking event where a pre-synaptic spike (V Pre ) increases V C to its maximum value (= 3.3V-V TMpre ), where V TMpre is the threshold voltage of M pre . When the pre-synaptic pulse ends, C starts to discharge via M leak , and V C decreases at a rate determined by voltage V leak . Voltage V leak thus controls the timing window in which a post-synaptic spike must occur in order to cause the synaptic weight to be increased. When the post-synaptic spike (V Post ) occurs, the nodes with voltages V C and V wi are connected and V wi is pulled up to V C -V TMpost (V wi ); V TMpost (V wi ) is the threshold voltage associated with M post . The synaptic weight will be increased while V wi is greater than the trigger voltage of the output buffer.

The WP output buffer is constructed using two CMOS inverters with 3.3V and 10V V DD rails, as shown in Fig. 1(a). The MOSFETs are sized so as to produce the following operation: if V wi is greater than the trigger voltage of the first CMOS inverter, then the output from the second inverter, V CG , will be pulled up to 10V. If V wi is below the trigger voltage of the first CMOS inverter, then the output from the second inverter is held at ground. The pulse-width, τ cg , and magnitude of V CG determine how much charge is injected and stored on the FG. As ∆t s → ∆t s min , τ cg → max τ cg . Similarly, as ∆t s → ∆t s max , τ cg → min τ cg . Finally, for a post-pre spiking event no update of the synaptic weight occurs since V C and V wi are low, regardless of when the pre-synaptic spike occurs.

The operation of the WD block is similar to that of the WP block, with post-pre spiking causing a decrease in synaptic weight. The WD output buffer is constructed using a single CMOS inverter with 3.3V and -10V supply rails, as shown in Fig. 1(b). The inverter MOSFETs are sized so as to produce the following operation: when V wd is greater than the threshold voltage, the output of the buffer is pulled down to -10V. If V wd is less than the threshold voltage of the inverter, then the output is 0V. For the case of pre-post spiking, the pre-synaptic spike causes V C and V wd to be pulled low and there is no update of the synaptic weight. It should be noted that if ∆t s = 0 (a pre- and post-synaptic spike occurring at the same time) then ∆w = 0, because both the WP and WD circuits will be 'on' during this event, causing node V CG (Fig. 1) to be set at 0V. This is consistent with biophysical experiments where it has been reported [50, 51] that synaptic communication between pre- and post-synaptic neurons is inherently delayed by axon or dendrite latencies, and thus the actual strongest and weakest synapse efficacy does not occur at the absolute temporal difference (∆t s = 0).
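The dependence of the V CG pulse width on spike timing can be illustrated with a purely behavioural model. This is a rough sketch, not the transistor-level circuit: it assumes V C falls linearly under a constant leak current, that V wi follows V C with no threshold drop, and that the buffer switches at a single trigger voltage. The function name, the post-spike width and the default values (2.43V, 1.2V and 20µs, borrowed from figures quoted elsewhere in this paper) are illustrative choices.

```python
def wp_pulse_width(dt_s, t_post_width, v_m=2.43, v_trig=1.2, t_cw=20.0):
    """Idealised V_CG pulse width (us) for the WP block.

    dt_s         : delay from end of pre spike to start of post spike (us)
    t_post_width : duration of the post-synaptic spike (us), assumed value
    v_m          : voltage on C when the pre spike ends (V)
    v_trig       : trigger voltage of the output buffer (V)
    t_cw         : 90%-10% discharge time of V_C (us)
    """
    if dt_s <= 0:
        return 0.0                        # post-pre or coincident: no update
    slope = 0.8 * v_m / t_cw              # V/us for a constant-current discharge
    t_cross = (v_m - v_trig) / slope      # time at which V_C falls to v_trig
    end = min(dt_s + t_post_width, t_cross)
    return max(0.0, end - dt_s)           # overlap of post spike and V_C > v_trig


for dt in (1.0, 7.0, 11.0, 13.0):
    print(dt, wp_pulse_width(dt, t_post_width=12.0))
```

In this idealisation the pulse width falls one-for-one with ∆t s until it reaches zero, which is the qualitative behaviour described for τ cg above. The WD block would follow the same expression with the roles of the pre and post spikes exchanged.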
II.B Critical Timing Window
The critical timing window (CTW) is crucial in biology because it determines the time window over which synaptic modification can occur, and is typically 20-25ms for potentiation and depression [7, 9]. However, in hardware the computational speed is greatly accelerated, with average spike train frequencies in the MHz range. We therefore implement an equivalent timing window of 20-25µs in this work although, as will be shown, the window can be programmed to accommodate a wide temporal range. We define the critical timing window, t cw , as the time it takes for V C to fall from 90% to 10% of its initial value, for both the WP and WD blocks. The rate at which the sub-threshold current reduces V C is set by V leak and the aspect ratio of M leak , S Mleak . The sub-threshold current, I leak , given by equation (1), is constant for V DS = V C > 3kT/q and can be used to determine the critical timing window, t cw , through equation (2), where V M (= 3.3V-V TMpos ) is the maximum value of V C . The window can therefore be adjusted using V leak .
Substituting equation (1) into (2) and rearranging allows a value for V leak to be calculated for the required t cw . In this study, t cw is chosen to be 20µs, giving V leak =410mV.
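The 90% to 10% definition of t cw can be turned into a small numerical sketch. It assumes a constant leak current, so that V C falls linearly, and a generic exponential sub-threshold law; neither expression is claimed to be the exact form of equations (1) and (2). The capacitance `c`, pre-factor `i0` and slope factor `n` are hypothetical values chosen only to make the example run, so the resulting bias is not expected to reproduce the 410mV quoted above.

```python
import math

U_T = 0.02585  # thermal voltage kT/q near 300 K (V)


def leak_current_for_window(t_cw, c, v_m):
    """Constant current that takes V_C from 0.9*v_m to 0.1*v_m in t_cw."""
    return 0.8 * c * v_m / t_cw


def subthreshold_current(v_leak, i0, n):
    """Generic sub-threshold law (assumed form): I = i0 * exp(v_leak / (n * U_T))."""
    return i0 * math.exp(v_leak / (n * U_T))


def v_leak_for_window(t_cw, c, v_m, i0, n):
    """Gate bias giving the requested window under the assumed law above."""
    return n * U_T * math.log(leak_current_for_window(t_cw, c, v_m) / i0)


# Hypothetical device values: c, i0 and n are NOT taken from the paper.
c, i0, n = 1e-12, 1e-12, 1.5
v_m, t_cw = 2.43, 20e-6            # V_M and target window

i_leak = leak_current_for_window(t_cw, c, v_m)
v_leak = v_leak_for_window(t_cw, c, v_m, i0, n)
print(i_leak, v_leak)
```

Because the current depends exponentially on V leak in this picture, a few tens of millivolts of bias change moves t cw by a large factor, which is consistent with the wide programmable range claimed for the window.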
The important effects of process variation upon the critical timing window are now considered. Process variation can affect most parameters of the MOSFET and these can conveniently be represented by the transconductance factor (β) and threshold voltage, V t [31-43]. Subthreshold MOSFETs are particularly sensitive to process variation because of the exponential relationship between drain current and gate voltage (equation 1). The threshold voltage is also strongly related to several device parameters which are prone to variation during the fabrication process.
For M leak operating in subthreshold, only V t is considered [35, 38, 43, 44], as this incorporates variations in both off-current and subthreshold slope, as shown in equation (3) for an n-channel device, where N a is the acceptor doping concentration, t ox the oxide thickness, φ F the Fermi potential, Φ MS the work function difference, Q t the trapped oxide charge density, C o the oxide capacitance and ε 0 , ε s , ε ox are the permittivity of free space and the relative permittivities of silicon and silicon dioxide respectively.
The variation in V t is written as V t = V t0 ± ∆V t , where V t0 is the nominal threshold voltage for the AMS process, V t0 = 0.48V, and ±∆V t is the change in V t due to process variations. For the AMS process ∆V t = ±17.5mV. A simple model for the effect of process variation on t cw can therefore be written as:
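As a rough illustration of this sensitivity (and not a reproduction of equation (4)), suppose the leak current scales as exp(-V t /(n kT/q)), so that t cw scales as exp(+∆V t /(n kT/q)). Under that assumption, the corner values reported in the Monte Carlo results that follow (30.86µs and 12.21µs around a nominal 20µs) can be inverted to give the slope voltage each corner implies. The function names are hypothetical.

```python
import math


def window_after_vt_shift(t_cw_nominal, delta_vt, n_ut):
    """Window after a threshold shift delta_vt (V).

    Assumes I_leak ~ exp(-V_t / n_ut), hence t_cw ~ exp(+delta_vt / n_ut),
    where n_ut is the sub-threshold slope voltage n*kT/q in volts.
    """
    return t_cw_nominal * math.exp(delta_vt / n_ut)


def implied_slope_voltage(t_cw_nominal, t_cw_shifted, delta_vt):
    """Slope voltage n*kT/q that would explain one reported corner."""
    return delta_vt / math.log(t_cw_shifted / t_cw_nominal)


# Corner values quoted in the text for V_leak = 410 mV (times in us).
slow = implied_slope_voltage(20.0, 30.86, +17.5e-3)
fast = implied_slope_voltage(20.0, 12.21, -17.5e-3)
print(slow, fast)
```

The two corners imply slightly different slope voltages, which suggests that a single exponential factor is only an approximation to the full model.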
Monte Carlo analysis was undertaken in Cadence to assess the effects of inter-die/die-to-die process variation on the critical timing window and results are presented in Fig. 4. The results of Fig. 4 compare the Monte-Carlo simulations with equation (4), and good agreement is apparent with ∆V t = ±17.5mV. The results also show a considerable change in the critical timing window, t cw , from the ideal value of 20µs, due to process variation for V leak = 410mV. For ∆V t = +17.5mV, t cw = 30.86µs, and for ∆V t = -17.5mV, t cw = 12.21µs. The effects of process variation on t cw are discussed further later, where it will be shown (Fig. 19) that this variation can be offset by adjusting the learning duration.

III.A WP Results
Fig. 3(a) presents simulation and measured results of a post-pre spiking event, where the pre-synaptic spike occurs 5µs after the end of the post-synaptic spike, |∆t s | = 5µs. In this case no weight update occurs. This is because C is initially discharged, with V C = 0V, due to the occurrence of the post spike before the pre spike.

Results are now presented in Fig. 3(b), Fig. 4 and Table 1 for a series of pre-post spiking events where the time difference, ∆t s , between pre- and post-synaptic spikes is increased from 1µs to 15µs. Fig. 3(b) indicates that V pre causes C to be charged to voltage V C = V M , after which C discharges to give t cw = 20µs. Voltage V wi tracks V C after V post occurs, triggering a weight update. It should be noted that V wi is only pulled down to about V t . For ∆t s = 1µs, the maximum weight update occurs, ∆w = ∆w max . This occurs as V wi is above the trigger voltage of the output buffer while V post is still high. Thus V CG is at its maximum pulse width, τ cg = 10.91µs (simulated) and 10.75µs (measured). In both cases V CG has a magnitude of 10V. Fig. 5(b) shows good agreement between the measured value of V C and the simulation results.

In Fig. 4(a), ∆t s is increased to 7µs and again V CG is pulled high to 10V. However, τ cg is reduced compared to ∆t s = 1µs and is now 4.92µs (simulated) and 4.60µs (measured). The reduction in τ cg occurs because V post coincides with the linearly decreasing V C . Voltage V wi now tracks the decreasing V C until, eventually, V wi is pulled below the trigger voltage of the first CMOS inverter while V post is still high, Fig. 4(a). Finally, in Fig. 4(b), ∆t s = 11µs further reduces τ cg to 0.91µs (simulated) and 0.65µs (measured). The magnitude of V CG is slightly reduced to 9.6V. This corresponds to the minimum weight update, ∆w = ∆w min . Table 1 presents the results of increasing ∆t s on τ cg for both simulation and experiment. Table 1 indicates that once ∆t s ≥ 12µs no update in the synaptic weight takes place, as V CG ≈ 0 due to V wi being less than the threshold voltage of the first CMOS inverter when V post is high. The results presented in Table 1 represent the upper left hand quadrant of the STDP curve presented later in Fig. 6.
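The simulated WP pulse widths reported above (τ cg = 10.91µs, 4.92µs and 0.91µs at ∆t s = 1µs, 7µs and 11µs) can be checked for linearity with an ordinary least-squares fit. The sketch below uses only those three quoted points; the helper `linear_fit` is a generic routine written for this illustration and is not part of the original analysis.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Simulated WP points quoted in the text: (dt_s, tau_cg), both in us.
dt_s = [1.0, 7.0, 11.0]
tau_cg = [10.91, 4.92, 0.91]

slope, intercept = linear_fit(dt_s, tau_cg)
cutoff = -intercept / slope   # dt_s at which the fitted tau_cg reaches zero
print(slope, intercept, cutoff)
```

A slope close to -1 means that each extra microsecond of delay removes about one microsecond of V CG pulse, and the fitted zero crossing falls just below the ∆t s ≥ 12µs limit stated for no weight update.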
III.B WD Results
As the WD circuit block is identical to the WP circuit, with the exception of the application of V pre and V post , its operation is also identical. Fig. 5(a) presents simulation and measured results of a pre-post spiking event, where the post-synaptic spike occurs 5µs after the end of the pre-synaptic spike, ∆t s = 5µs. In this case no weight update occurs. Table 2 presents the simulation results for a series of post-pre spiking events upon the WD circuit. |∆t s | is once again increased from 1µs to 15µs. Referring to Fig. 5(b), ∆t s = -7µs; as V post is pulled high, C is charged to voltage V M = 2.43V. As V post goes low, C discharges (initially) linearly via M leak . When V pre goes high, nodes V C and V wi are connected such that V wi ≈ 1.70V. A weight decrease is triggered as V CG is pulled down to -10V. When V pre goes low, both V wi and V CG are pulled back to 0V, ending the synaptic weight update. This is consistent with the theoretical operation outlined previously.
For ∆t s = -1µs, the maximum value of the weight decrease occurs, ∆w = ∆w max . V CG is at its maximum pulse width, τ cg = -11.31µs, and magnitude, V CG = -10V. Table 2 shows that further increasing |∆t s |, to ∆t s = -5µs, -7µs and -8µs, causes τ cg to be reduced to 8.14µs, 6.16µs and 5.15µs respectively. For ∆t s = -13µs, τ cg ≈ 0.53µs and the magnitude of V CG is slightly reduced to -9.6V. This corresponds to the minimum weight update, ∆w = ∆w min . Table 2 indicates that once |∆t s | ≥ 14µs no update in the synaptic weight takes place, as V CG ≈ 0 due to V wd being less than the threshold voltage of the CMOS inverter when V pre is high. The results presented in Table 2 represent the lower right hand quadrant of the STDP curve presented later in Fig. 6.

Fig. 6 is a plot of τ cg against ∆t s which [...]. As ∆t s is increased from 1µs to 15µs, τ cg decreases from 11.31µs to [...]. As ∆t s is decreased from -1µs to -15µs, τ cg decreases from 11.31µs to [...] function since τ cg ∝ ∆w, where Q inj ∝ ∆w.
IV. PHYSICAL MODELLING OF WEIGHT STORAGE
Fig. 6. Asymmetric STDP curve.

The STDP circuit is to be used with FG devices, therefore we next consider the sensitivity of the weight charge injection to the FG, in relation to the STDP curve presented in Fig. 6 and charging time. The charge injected onto the FG, Q inj , represents the change in the associated weight; Q inj ∝ ∆w. The charge is injected by the Fowler-Nordheim mechanism [48], equation (5), where A = 1.54x10 [...], [...] V/cm, m o is the mass of an electron at rest, m ox is the effective mass of an electron in the insulator and φ is the barrier height for injection from semiconductor to oxide. It should be noted that the constants A, B are strictly for tunneling from a metal contact but are similar to the case of injection from a semiconductor [49] and serve our purpose for illustrating the model and method.

Fig. 7 presents the cross-section of a FG device constructed using a poly-silicon and MOS capacitor. The charge injected onto the FG, Q inj , can be found from consideration of the current in the thin tunneling oxide, t ox , over a time step, ∆t. We now derive a model to allow the determination of Q inj (∆w) and the associated potential of charge stored on the FG, V ∆w .

Fig. 7. Cross section of FG device, constructed using polysilicon and MOS capacitors. Qinj represents the charge stored on the FG and Qrem represents the charge removed from the FG, both due to FN tunneling. Equivalent capacitor diagram of FG device, CFG; CFG = (Cpoly -1 +Cox [...], where Cpoly is the capacitance of the interpoly oxide, Cox is the capacitance [...] are the voltages applied to the control gate and coupled onto the FG respectively.

The capacitively coupled voltage, V FG , which falls across t ox is shown in Fig. 7, and given by [...], where [...] capacitive coupling coefficient, defined as α = C poly /(C ox + C poly ). The electric field in the oxide, E ox , is given [...]; it is assumed that there is no parasitic charge in the oxide or initially stored on the FG. V FG is the potential of the FG and φ s is the surface potential at the oxide-semiconductor interface. The field at successive time steps, ∆t, can be found from equation (6) (see appendix for derivation).

The associated change in potential is calculated by finding the difference between successive steps of field, equation (7). The charge per unit area injected onto the FG for the duration of the pulse width ∆t is then found as ∆w ∝ Q = C V ∆w .

Fig. 8 presents plots of (a) Q inj against ∆t s and (b) V ∆w against ∆t s . Fig. 8(a) presents the STDP curve for increasing tunneling area. The increment of charge injected decreases for increasing ∆t because the stored charge serves to reduce the electric field. Similarly, as ∆t s is decreased below -1µs, the amount of charge removed is also decreased.

The results indicate that Q inj (and V ∆w ) v ∆t s and τ cg v ∆t [...] STDP plots. Increasing the device tunneling area causes a shift in the STDP curve. Specifically, this is a shift in the magnitude of the charge injected/removed for the same ∆t value.

The effect of process variation (PV) on the STDP curves is now considered. [...] shows the effect of PV upon the output characteristics of the STDP circuit, τ cg against ∆t s . The plot concurs with the earlier statement that PV can either increase or decrease t cw . The effect of this is to cause a shift in the ideal τ cg against ∆t s curve. If PV causes t cw < t cw ideal (20µs) the curve is shifted to the left. Conversely, if t cw > 20µs the curve is shifted to the right. The effect of PV is to vary the amount of charge (hence potential of charge) injected/removed from the FG. For t cw < 20µs the ∆w (V ∆w ) curve is shifted to the left. Conversely, if t cw > 20µs the ∆w (V ∆w ) curve is shifted to the right. Specifically, there is no overall change in the magnitude of ∆w, Q inj . Rather, there is a shift in the magnitude of the charge injected/removed for the same ∆t s value. This does not affect the overall operation of the STDP circuit in that it still follows the STDP rule. However, the amount of charge injected can be compensated for by altering the learning duration.
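The time-stepped Fowler-Nordheim charging described in this section can be sketched numerically. This is a simplified illustration rather than the model of equations (5) to (7): it uses the commonly quoted textbook form J = A E² exp(-B/E), ignores the surface potential, and works per unit area. The barrier height (3.2eV), mass ratio (0.42), tunnel oxide thickness (7nm) and coupling coefficient (0.7) are assumed values, not parameters taken from the paper.

```python
import math

EPS0 = 8.854e-14   # permittivity of free space (F/cm)
EPS_OX = 3.9       # relative permittivity of silicon dioxide


def fn_constants(phi_ev=3.2, m_ratio=0.42):
    """Textbook Fowler-Nordheim constants; m_ratio = m_ox / m_o (assumed)."""
    a = 1.54e-6 / (m_ratio * phi_ev)                     # A/V^2
    b = 6.83e7 * math.sqrt(m_ratio) * phi_ev ** 1.5      # V/cm
    return a, b


def fn_current_density(e_ox, a, b):
    """J = A * E^2 * exp(-B / E) in A/cm^2 for a field e_ox in V/cm."""
    if e_ox <= 0.0:
        return 0.0
    return a * e_ox ** 2 * math.exp(-b / e_ox)


def injected_charge(v_cg, pulse_width, alpha=0.7, t_ox=7e-7, steps=100):
    """Charge per unit area (C/cm^2) injected during one V_CG pulse.

    alpha is the coupling coefficient C_poly / (C_ox + C_poly) and
    t_ox the tunnel oxide thickness in cm; both are assumed values.
    """
    a, b = fn_constants()
    c_ox = EPS0 * EPS_OX / t_ox          # tunnel oxide capacitance per area
    c_total = c_ox / (1.0 - alpha)       # C_ox + C_poly, from the alpha definition
    dt = pulse_width / steps
    q, increments = 0.0, []
    for _ in range(steps):
        v_fg = alpha * v_cg - q / c_total        # stored charge lowers V_FG
        dq = fn_current_density(v_fg / t_ox, a, b) * dt
        q += dq
        increments.append(dq)
    return q, increments


q_short, _ = injected_charge(10.0, 1e-6)     # 1 us pulse at 10 V
q_long, inc = injected_charge(10.0, 11e-6)   # 11 us pulse at 10 V
print(q_short, q_long)
```

Each step injects slightly less charge than the one before because the stored charge lowers the oxide field, so the total charge grows sub-linearly with pulse width.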
V. CONCLUSION
Compact STDP circuit blocks have been proposed which can control weight increase and decrease within a hardware neural network. Simulation and experimental results of the WP circuit are presented which indicate that, for a post-pre spiking event, no update of the synaptic weight occurs. A pre-post spiking event will, however, cause the synaptic weight, which is represented as charge on the FG of the synapse, to be increased. The amount by which the synaptic weight is changed, ∆w, is determined by the duration for which V wi is greater than 1.2V and by the magnitude of V CG . The maximum weight update, ∆w max , is obtained when V CG has a pulse width of ≈11µs and a constant magnitude of 10V. The minimum weight update, ∆w min , prior to V wi being less than 1.2V, is achieved when V CG has a pulse width of 0.9µs and a magnitude of 9.6V. Furthermore, the critical timing window within which synaptic modification takes place can also be controlled with voltage V leak . The key issue of the significant influence of process variations for devices operating in subthreshold has been modeled. We show that process variations do not adversely affect the learning dynamics because the weight changes depend on the temporal difference within the STDP window. Changes in charging/discharging duration can also be compensated for within the learning algorithm. Additionally, a model correlating charge alterations within the FG as a function of the charging/discharging duration was presented, and this relationship was extended to show the dependency of the weight changes on the temporal difference between pre- and post-synaptic spikes.
