I. INTRODUCTION
Over the past ten years, numerous experimental studies [1] - [4] have shown that synaptic strength varies as a function of the precise spike timing difference ∆t = t post − t pre between the firing times t pre and t post of the presynaptic and postsynaptic neurons, respectively. This synaptic plasticity rule, called Spike Time-Dependent Plasticity (STDP), has emerged as one of several unsupervised plasticity rules that play an important role in learning and memory in the brain. The mathematical model of STDP based on a pair of pre- and postsynaptic spikes is referred to as doublet STDP (D-STDP), while the one based on a triplet of synaptic spikes [5] - [8], i.e., either a pre-post-pre or a post-pre-post spike sequence, is referred to as triplet STDP (T-STDP). Several variants of these rules have also been proposed for pattern classification tasks [9] , [10] .
Experimental results [11] indicate that D-STDP models based on pairs of spikes are not sufficient to explain synaptic changes due to triplets or quadruplets of spikes. The D-STDP model also fails to reproduce frequency effects. However, the T-STDP model can reproduce frequency effects and also explains the synaptic changes due to triplets and quadruplets of spikes. It is also important to be able to replicate rate based plasticity experiments. It has been demonstrated that the Bienenstock-Cooper-Munro (BCM) learning rule [12] based on firing rates can be obtained from D-STDP when presynaptic and post-synaptic neurons fire uncorrelated or weakly correlated Poisson spike trains, and only nearest-neighbor spike interactions are taken into account [13] . However, it is not possible to make a strict theoretical mapping from the nearest-spike interaction D-STDP models to the BCM rule [11], whereas T-STDP allows for such a theoretical mapping. Apart from these experimentally observed benefits of the T-STDP model, there are some computational advantages as well [14] . It has been shown that T-STDP can detect input correlations higher than the second order ones to which D-STDP is sensitive. Hence, it has been shown to drive direction and orientation selectivity [14] . Further, it can be shown to reproduce a more generalized version of BCM, overcoming the limitations of the original one. Though there are several models of plasticity with varying degrees of bio-realism [15] , T-STDP is a good compromise between simplicity and richness of function. Its simplicity also lends itself to easy analysis, making it a good choice for a plasticity rule with functionality beyond D-STDP.
Roshan Gopalakrishnan and Arindam Basu are with VIRTUS, IC design centre of excellence, School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore 639798 (email: arindam.basu@ntu.edu.sg).
Copyright (c) 2010 IEEE. Personal use of this material is permitted. However, permission to use this material for any other purposes must be obtained from the IEEE by sending an email to pubs-permissions@ieee.org.
STDP has recently become so popular in computational neuroscience that neuromorphic engineers, who try to emulate brain function using VLSI, have also tried to emulate this behaviour in silicon. However, implementing a compact learning synapse continues to be one of the big challenges in the field [16] . Several recent papers have reported D-STDP implementations [17] - [21] and a T-STDP implementation [22] ; however, these synapses could either store states only in a transient fashion (using charge on a capacitor) or hold only two states in the long term. These synapses are also large, which hinders the scalability of the designs. A promising solution for nonvolatile analog weight storage in a very small area is provided by a floating-gate (FG) device [16] , [23] - [30] . This concept was utilized recently to show weight storage and adaptation due to quantum phenomena based on input signal timing [30] .
Compared to other work, here we demonstrate for the first time the implementation of the T-STDP rule in a FG synapse by appropriately modifying the drain voltage pulse based on spike triplets. From chip measurement results, we show that FG synapse in Fig. 3 (b) can reproduce (1) D-STDP learning window and (2) T-STDP results when appropriate control signals are applied on its terminals. Some initial results for triplet experiment based on this work were presented in [31] . In this paper, we present results for quadruplet experiments and frequency effects as well as a new drain voltage generation scheme. We also present circuits for VLSI implementation of the drain voltage waveforms.
The paper is organized as follows: Section II provides a brief 
II. STDP SYNAPTIC MODIFICATION RULE
In biology, synapses are specialized structures that permit the transfer of signals between two neurons with an associated synaptic strength or weight. Learning typically implies the modification of synaptic weight due to the activities of the pre- and post-synaptic neurons. The STDP models are explained in detail next.
A. Doublet STDP (D-STDP) Model
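In D-STDP, potentiation occurs when a postsynaptic spike succeeds a presynaptic spike; otherwise depression happens. The weight changes are governed by a temporal learning window, which can be expressed as [11] , [22]
$$\Delta w = \begin{cases} \Delta w^+ = A^+ e^{-\Delta t/\tau_+}, & \Delta t > 0 \\ \Delta w^- = -A^- e^{\Delta t/\tau_-}, & \Delta t \le 0 \end{cases} \tag{1}$$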
where ∆t = t post − t pre is the time difference between a post-synaptic and pre-synaptic spike, τ + and τ − are the time constants of the learning window, and A + and A − are the maximal weight changes for potentiation and depression, respectively. The theoretical graph for the above equation is simulated using MATLAB and is shown in Fig. 1 with the parameters being obtained by data fitting as explained in [11] . As mentioned in [2] , τ + and τ − are taken as 16.8ms and 33.7ms respectively for the simulation.
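The learning window described above can be sketched in a few lines of Python. This is a minimal illustration, assuming the standard pair-based exponential window of [11]; the time constants are the values quoted in the text, while the amplitudes `a_plus` and `a_minus` are placeholder values chosen for illustration, not fitted parameters.

```python
import math

TAU_PLUS_MS = 16.8    # potentiation time constant quoted in the text
TAU_MINUS_MS = 33.7   # depression time constant quoted in the text


def dstdp_window(dt_ms, a_plus=1.0, a_minus=1.0,
                 tau_plus=TAU_PLUS_MS, tau_minus=TAU_MINUS_MS):
    """Pair-based weight change for dt = t_post - t_pre (in ms).

    a_plus and a_minus are hypothetical amplitudes (assumption).
    """
    if dt_ms > 0:
        # Post after pre: potentiation decaying with tau_plus.
        return a_plus * math.exp(-dt_ms / tau_plus)
    # Post before (or with) pre: depression decaying with tau_minus.
    return -a_minus * math.exp(dt_ms / tau_minus)


if __name__ == "__main__":
    for dt in (-60, -30, -10, 10, 30, 60):
        print(f"dt = {dt:4d} ms -> dw = {dstdp_window(dt):+.4f}")
```

Sweeping `dt_ms` over a symmetric range and plotting the result gives the familiar asymmetric window, with the depression lobe roughly twice as wide as the potentiation lobe because of the larger depression time constant.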
B. Triplet STDP (T-STDP) Model
Previous studies [7] , [32] show that the D-STDP model fails to reproduce experimental outcomes involving higher order spike patterns such as triplets and quadruplets of spikes and, furthermore, fails to account for the observed dependence of the weight on the repetition frequency of pairs of spikes. To resolve these issues, the D-STDP model was extended in [11] to include spike triplets, resulting in the T-STDP model, which could sufficiently reproduce physiological experiments.
The T-STDP rule is written as a function of the differences in spike timings as [11] , [22]
$$\begin{cases} \Delta w^+ = e^{-\Delta t_1/\tau_+}\left(A_2^+ + A_3^+ e^{-\Delta t_2/\tau_y}\right) \\ \Delta w^- = -e^{\Delta t_1/\tau_-}\left(A_2^- + A_3^- e^{-\Delta t_3/\tau_x}\right) \end{cases} \tag{2}$$
where $A_2^+$ and $A_2^-$ denote the amplitude of the weight change whenever there is a pre-post pair or a post-pre pair, respectively. Similarly, $A_3^+$ and $A_3^-$ denote the amplitude of the triplet term for potentiation and depression, respectively. $\Delta t_1 = t_{post}(n) - t_{pre}(n)$, $\Delta t_2 = t_{post}(n) - t_{post}(n-1)$ and $\Delta t_3 = t_{pre}(n) - t_{pre}(n-1)$ are the time differences between combinations of pre- and post-synaptic spikes, as shown in Fig. 2. $\tau_-$, $\tau_+$, $\tau_x$ and $\tau_y$ are the time constants for the above spike pairings.
In [11] , though the T-STDP rule above is introduced first, it is shown later that not all terms are needed to explain biological data. Thus two different minimal models are defined: (1) $A_2^+ = 0$ and $A_3^- = 0$ for the visual cortex data and (2) $A_3^- = 0$ for the hippocampal culture data set. The hippocampal culture data set [7] is used for obtaining the results for triplets of spikes, whereas the visual cortex data [32] is used for showing the frequency effects of the T-STDP rule. For the visual cortex data set, equation (2) simplifies to
$$\begin{cases} \Delta w^+ = A_3^+ e^{-\Delta t_1/\tau_+} e^{-\Delta t_2/\tau_y} \\ \Delta w^- = -A_2^- e^{\Delta t_1/\tau_-} \end{cases} \tag{3}$$
On the other hand, for the hippocampal culture data set, equation (2) simplifies to
$$\begin{cases} \Delta w^+ = e^{-\Delta t_1/\tau_+}\left(A_2^+ + A_3^+ e^{-\Delta t_2/\tau_y}\right) \\ \Delta w^- = -A_2^- e^{\Delta t_1/\tau_-} \end{cases} \tag{4}$$
For both cases, $\Delta w^-$ is exactly the same as the long term depression (LTD) term in D-STDP shown in equation (1) . Hence, to implement the triplet rule in circuits, we only need to modify the pre-existing FG design to add the extra term in the potentiation case.
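As an illustration of the triplet rule and its minimal models, the following Python sketch evaluates the potentiation and depression terms. It assumes the product-of-exponentials form of the triplet rule from [11]; the function names are hypothetical, and the example values ($A_3^+$ = 9.1 × 10⁻³, $A_2^+$ = 4.6 × 10⁻³, τ y = 48 ms) are the hippocampal nearest-spike parameters quoted later in this paper, combined with the τ + value quoted for the D-STDP simulation.

```python
import math


def tstdp_potentiation(dt1_ms, dt2_ms, a2_plus, a3_plus, tau_plus, tau_y):
    """Weight increase at a post spike.

    dt1 = t_post(n) - t_pre(n) > 0, dt2 = t_post(n) - t_post(n-1) > 0.
    Passing dt2 = math.inf removes the triplet contribution (pure doublet).
    """
    triplet = a3_plus * math.exp(-dt2_ms / tau_y)
    return math.exp(-dt1_ms / tau_plus) * (a2_plus + triplet)


def tstdp_depression(dt1_ms, dt3_ms, a2_minus, a3_minus, tau_minus, tau_x):
    """Weight decrease at a pre spike.

    dt1 = t_post(n) - t_pre(n) <= 0, dt3 = t_pre(n) - t_pre(n-1) > 0.
    Both minimal models set a3_minus = 0, leaving the D-STDP depression term.
    """
    triplet = a3_minus * math.exp(-dt3_ms / tau_x)
    return -math.exp(dt1_ms / tau_minus) * (a2_minus + triplet)


# Hippocampal minimal-model values quoted in the paper (tau_plus from Section II-A).
A2_PLUS, A3_PLUS, TAU_PLUS, TAU_Y = 4.6e-3, 9.1e-3, 16.8, 48.0

if __name__ == "__main__":
    doublet = tstdp_potentiation(5.0, math.inf, A2_PLUS, A3_PLUS, TAU_PLUS, TAU_Y)
    triplet = tstdp_potentiation(5.0, 10.0, A2_PLUS, A3_PLUS, TAU_PLUS, TAU_Y)
    print(f"doublet dw+ = {doublet:.5f}, triplet dw+ = {triplet:.5f}")
```

Setting `a2_plus=0` reproduces the visual cortex minimal model, in which potentiation vanishes unless a previous post-synaptic spike is recent enough to contribute through the triplet term.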
III. FLOATING GATE SYNAPSE
Fig. 3(a) shows the architecture of a single floating gate synapse in prior work [30] . It has three main terminals for programming, as shown in the dashed box: the gate, drain and tunnel terminals, with the respective voltages denoted as V g , V d and V tun . A "non-STDP" behaviour seen in that work is ameliorated in our previous work [25] , [26] , along with a detailed analysis of the operation of the D-STDP learning rule in a floating gate synapse. In previous works [25] , [26] , [30] , the quantum mechanism of tunneling is spread across a larger time scale (illustrated in Fig. 4(a) and (b)), which makes it difficult to analyze the effect of triplets and quadruplets of spikes in a floating gate synapse. Similar to the approach in [28] , [29] , the effect of tunneling can be localized at the occurrence of pre-synaptic spikes with the modification (red blocks) shown in the architecture of Fig. 3(b), making it easier to mathematically analyze the weight change for triplets. In the new architecture, whenever a pre-synaptic spike occurs, a triangular gate voltage waveform is generated, which creates an exponential excitatory post-synaptic current (EPSC), similar to biology, because of the exponential relationship between the gate voltage and drain current of the MOS transistor in the subthreshold region. The current at the maximum gate voltage is nearly zero. Similarly, whenever a post-synaptic spike arrives, a global triangular tunnel voltage waveform and an inverted pulse drain voltage waveform are generated. The global triangular tunnel voltage waveform is then sampled at the occurrence of a pre-synaptic spike with the help of the pulse extender block and the multiplexer. This creates a voltage waveform, V tun eff , at the tunnel terminal. Thus, in the new architecture, the gate voltage V g and tunnel voltage V tun eff waveforms are generated during a pre-synaptic spike, and the drain voltage V d waveform is generated during a post-synaptic spike. In the previous architecture, by contrast, a gate voltage waveform is generated during a pre-synaptic spike and both the drain voltage and tunnel voltage waveforms are generated during a post-synaptic spike.
Fig. 2. A comparison of the temporal notations in this paper and [11] (mentioned as Pfister Paper in figure) is shown.
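The link between a triangular gate waveform and an exponential EPSC can be illustrated with a short Python sketch. It is an idealized model, not the measured device: it assumes a subthreshold pFET current proportional to exp(−κ V g / U T ), perfect coupling from the gate to the floating gate, and a gate voltage that ramps linearly upward from its minimum after the spike. The values of `kappa`, `U_T` and the ramp slope are hypothetical.

```python
import math

U_T = 0.0258  # thermal voltage in volts near room temperature (assumed value)


def epsc_time_constant(slope_v_per_s, kappa=0.7):
    """Decay time constant (s) of the current for a linear gate ramp."""
    return U_T / (kappa * slope_v_per_s)


def epsc_current(t_s, slope_v_per_s, kappa=0.7, i_peak=1.0):
    """Normalized subthreshold current at time t_s after the pre-synaptic spike.

    The gate sits at its minimum at t = 0 (peak current) and then rises
    linearly, so the exponential I-V relation turns the ramp into an
    exponential decay in time.
    """
    return i_peak * math.exp(-kappa * slope_v_per_s * t_s / U_T)


if __name__ == "__main__":
    slope = 2.0  # V/s, hypothetical ramp slope
    tau = epsc_time_constant(slope)
    print(f"EPSC time constant: {tau * 1e3:.2f} ms")
    for k in range(4):
        print(f"t = {k} tau -> I/I_peak = {epsc_current(k * tau, slope):.3f}")
```

Under these assumptions the EPSC time constant is set directly by the ramp slope, which is why a slower gate ramp yields a wider learning window.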
The equation for drain current of a subthreshold saturated pFET whose well is tied to V dd is given by [30] 
where U T is the thermal voltage and κ is the gate coupling coefficient.
Weight modification in a FG synapse uses a combination of hot-electron injection (HEI) and Fowler-Nordheim tunneling [30] . HEI adds electrons onto the floating gate node, which reduces the floating gate voltage, resulting in more current through the transistor and hence increasing the weight of the synapse. Tunneling, on the other hand, removes electrons from the FG node to reduce the synaptic weight. In our previous paper [25] , for the case of D-STDP, a gate voltage waveform is generated at every pre-synaptic spike, while at every post-synaptic spike a tunneling voltage and a drain voltage are generated, as illustrated in Fig. 4(a). This means that at every pre-synaptic spike only tunneling happens, while at every post-synaptic spike there is both injection and tunneling. Because both tunneling and injection occur at the arrival of the post-synaptic spike, the effect of injection must be greater than that of tunneling to obtain potentiation as in the traditional D-STDP learning window. Moreover, since the effect of tunneling is spread over a long time, mathematical analysis or intuitive understanding of the effect of multiple pre- and post-synaptic spikes becomes difficult. In contrast, the mathematical versions of both STDP models have potentiation and depression of weights localized at pre- and post-synaptic events. Hence, to relate the mathematical analysis of the FG synapse to the T-STDP model easily, we also decided to localize the effects of tunneling and injection at pre- and post-synaptic pulses, respectively. A similar technique has also been used in [28] , [29] .
An illustration of this is shown in Fig. 4(b), where a tunneling voltage waveform is still created at every post-synaptic pulse as in earlier work [25] , [30] but is not applied directly to the tunneling junction of the FG. Instead, it is sampled by the pre-synaptic pulse through a multiplexer to create a new waveform V tun eff , which is applied to the FG (refer to Fig. 3). Fig. 4(b) also shows the specifications of the terminal voltage waveforms for the case of triplets of spikes. The well for the high voltage PMOS in the multiplexer can be shared with the tunneling junction; nevertheless, this incurs an area penalty, motivating us to look at possibilities of using non-localized tunneling for T-STDP in the future. The governing equations for injection and tunneling are given as [30] , [33]
where I d is the drain current, α = 1-U T /V inj , V ox and V inj are process dependent parameters.
A. D-STDP Model on a Floating Gate Synapse (FG D-STDP)
The weight of the synaptic device can be defined as [30] :
Hence, equations to predict the change in FG voltage effectively predict the change in weight. The following assumptions are made in deriving the theoretical equations [25] :
1) To ensure a small change in weight at very large negative and positive values of ∆t, V g init has to be high enough so that V tun max − V g init is small enough for negligible tunneling. Similarly, ∆V g = V g init − V g min should be small enough so that V tun init − V g min is small enough for negligible tunneling.
2) There is strong coupling from the gate to the floating gate node, whereas there is weak coupling from the tunneling node to the floating gate node. This is justified since typically the gate capacitance C g >> the tunneling capacitance C tun .
3) The gate voltage waveform falls to its minimum value instantaneously. In other words, S 1 >> S 2 , S 3 as shown in Fig. 4(a).
One difference from the earlier case in [25] is that now we only have injection for ∆t > 0 and only tunneling for ∆t < 0 due to the localization of tunneling effects.
We can now derive the slow time scale equation [30] for change in FG voltage due to tunneling and injection as:
where C T is the total capacitance on the FG node and V f g denotes change on a slow time scale.
1) Case 1: ∆t > 0: First, we consider the case of ∆t > 0, i.e., the positive axis of the STDP curve, and combine equations (6) and (5) to get:
where V f g inj is the slow time scale change in V f g due to injection only. Since change in V f g on the RHS happens due to coupling from the gate voltage, we can write:
where S 2 is the positive slope of V g and C g is the capacitance connected between the gate terminal and the floating gate terminal. Here V f g min is not the same as V g min due to the initial charge stored on the FG. Substituting equation (10) in equation (9), we get
where
Referring to Fig. 4(a), a significant amount of injection happens in the time from ∆t to ∆t + T d (circled in red), where ∆V ds is constant and significant. Also, since T d is very small compared to T g , we can assume that V g is constant during the drain pulse. Hence, we finally get:
For more details of the derivation, we refer the interested readers to [25] .
With reference to Fig. 4(b), since we have modified the tunneling voltage waveform from V tun to V tun eff , the effect of tunneling is not present for ∆t > 0. Hence, we have ∆V f g = ∆V f g inj .
2) Case 2: ∆t < 0: Now we consider the case of ∆t = (t post − t pre ) < 0 i.e the negative axis of STDP curve. Similar to what we have done above, let us see the effect of tunneling and injection separately.
For the contribution of injection to V f g , we could see that injection happens only during the initial small period, T d of time axis where the drain current of the MOS is almost zero. So we can completely neglect the effect of injection on V f g in this case. Now let us consider the contribution of tunneling to V f g . Similar to the analysis above, using assumption 1, we have:
Also, similar to the case of positive ∆t, here we have:
$$V_{tun\_eff} = \begin{cases} V_{tun}, & -\Delta t < t < -\Delta t + T_{tun\_pulse} \\ V_{tun\_init}, & \text{for other values of } t \end{cases} \tag{15}$$
Substituting equations (15) and (6) into equation (14), we get
Since injection is negligible for ∆t <0, we have ∆V f g = ∆V f g tun .
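The sampling of the global tunnel waveform described by equation (15) can be sketched as follows. This is an illustrative model only: the triangular shape of `v_tun` (an instantaneous rise to `v_max` at the post-synaptic spike followed by a linear decay to `v_init` over `t_tun_ms`) and all numeric values are assumptions, and the time origin is placed at the post-synaptic spike so that the pre-synaptic spike arrives at t = −∆t for ∆t < 0.

```python
def v_tun(t_ms, v_init, v_max, t_tun_ms):
    """Global triangular tunnel waveform triggered by a post spike at t = 0.

    Assumed shape: jump to v_max, then linear decay back to v_init.
    """
    if 0.0 <= t_ms < t_tun_ms:
        return v_max - (v_max - v_init) * t_ms / t_tun_ms
    return v_init


def v_tun_eff(t_ms, dt_ms, v_init, v_max, t_tun_ms, t_pulse_ms):
    """Effective tunnel voltage seen by the FG for dt = t_post - t_pre < 0.

    The multiplexer passes v_tun only during the extended pre-synaptic
    pulse, i.e. for -dt < t < -dt + t_pulse; otherwise it holds v_init.
    """
    t_pre_ms = -dt_ms
    if t_pre_ms < t_ms < t_pre_ms + t_pulse_ms:
        return v_tun(t_ms, v_init, v_max, t_tun_ms)
    return v_init


if __name__ == "__main__":
    # Normalized voltages and hypothetical timings, for illustration only.
    for t in (5.0, 10.5, 12.0):
        v = v_tun_eff(t, dt_ms=-10.0, v_init=0.0, v_max=1.0,
                      t_tun_ms=100.0, t_pulse_ms=1.0)
        print(f"t = {t:5.1f} ms -> V_tun_eff = {v:.3f}")
```

Because the sampled value decreases the later the pre-synaptic spike arrives, the amount of tunneling, and therefore the depression, shrinks as |∆t| grows.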
B. T-STDP Model on a Floating Gate Synapse (FG T-STDP)
Comparing equation (1) with equation (4), the extra triplet term appears in the potentiation term, which occurs at the arrival of the post-synaptic spike. Intuitively, for the case of a triplet of spikes, there should be some modification to the drain voltage pulse compared to the doublet case, since it is related to potentiation and is generated at the post-synaptic spike. There are two ways in which we can have more potentiation due to injection: (1) increase ∆V ds or (2) increase the injection pulse width T d . Based on this, we propose the following two drain voltage waveforms that account for triplet spike interactions:
1) Single-pulsed V d : This case implements the entire potentiation term in equation (4) by a single pulse of width T d at t post (Fig. 4(c)); however, the voltage V d depends on the time difference ∆t 2 between successive post spikes. So from Fig. 4(c) the expression for V d becomes:
2) Double-pulsed V d : This case (Fig. 4(c)) implements the doublet term with the first pulse of width T d and the same amplitude as in FG D-STDP. It then creates the extra triplet term with a following pulse, again of width T d but with a different amplitude. The entire waveform is given by:
Next, we compare the FG weight change with the T-STDP model to understand the kind of modification to be done. Here, we consider only the FG phenomenon of injection that happens at post-synaptic pulses, since our modification has localized tunneling to pre-synaptic events. Then, we can equate this to the weight change from the T-STDP model for any specific case (e.g. the hippocampal culture data set in [11] ) to extract the exact parameters of the pulse drain waveform. Here, we express the weight change for T-STDP normalized to D-STDP for ease of understanding; this is intuitive since T-STDP adds modifications on top of D-STDP. Let us consider this case by case.
1) Case 1: FG T-STDP for spike doublet inputs
The FG behaviour for doublets of spikes is identical to the derivation shown in subsection III-A for both the single- and double-pulsed V d cases. Therefore,
2) Case 2: FG T-STDP for spike triplet inputs
a) Single-pulsed V d : Similar to the mathematics in case 1, we get:
where A and X are same as above. The ratio of change in weight due to FG T-STDP on triplets and doublets of spike is given as,
where X is same as above. Here, the term e −XT d arises since the gate voltage has changed a little bit at the start of the second pulse compared to the first. Since e −XT d ≈ 1 due to small value of T d , the ratio of change in weight due to FG T-STDP on triplets and doublets of spike can be written as:
3) T-STDP rule for spike doublets and triplets
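For the T-STDP and D-STDP models, the ratio Y of the potentiation due to a triplet to that due to a doublet can be obtained directly from equation (4) as:
$$Y_{theory} = \frac{\Delta w^+_{triplet}}{\Delta w^+_{doublet}} = 1 + \frac{A_3^+}{A_2^+}\, e^{-\Delta t_2/\tau_y} \tag{26}$$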
Now, we can equate the Y f g values in equations (23) and (25) with the Y theory value in equation (26) to get the desired drain voltage parameters. a) Single-pulsed V d :
C. Parameter Translation from Theory to FG model
Given values of parameters such as A However, in our case, the measurement setup restricted the temporal width of voltage waveforms to a maximum value of 300ms (note the maximum time duration used for T tun ). Hence, compared to the learning window τ + (refer to Fig.  1 ) shown in [11] , our learning window τ + f g (Fig. 5) has lesser width. This implies that we need to apply a compression factor on the time scale to match our results with those in [11] . We define the compression factor r as follows:
For the present case, we get the value of r ≈ 2. Thus, to replicate the weight changes in [11] , we need to reduce all the temporal dimensions in our case by the factor of r. This can be formally written as:
Next, we show this translation explicitly for the three experimental protocols considered in [11] .
1) Triplet experiments:
For the different parameters of hippocampal culture data set from table 4 in [11] i.e A The definitions of ∆t 1 and ∆t 2 here are similar to [22] and differ from the notation given in [11] . For clarity, the difference is shown in Fig. 2. As an example, the data points chosen in protocol 2 are given by (∆t 1 /r, ∆t 2 /r), where (∆t 1 , ∆t 2 ) = (-5,5), (-10,10), (-5,15) and (-15,5) ms correspond respectively to the data points in [11] .
2) Frequency effects of pairing protocol:
In [11] , ∆t used for the case of frequency effects is 10ms which is halved to 5ms for obtaining the measurement results in this paper. Also, ρ Hz mentioned in frequency effects of pairing protocol results of [11] will be equal to r × ρ Hz in our measurement results.
3) Quadruplet experiments:
Quadruplet experiments contain two temporal variables, ∆t and T, as shown in Fig. 4(c). ∆t = 5ms in [11] ; hence, we have used ∆t = 2.5ms for measurements. The chip measurement values for quadruplets are then taken for different values of T /r up to 80ms. To compare these measurement results with the mathematical model, we simulated the mathematical model for double the value of T, i.e., up to 160ms.
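The translation between the time scale of [11] and the compressed time scale of the measurements can be summarized in a short Python sketch. It uses the compression factor r ≈ 2 and the protocol values stated above; the helper names are hypothetical.

```python
R = 2.0  # compression factor r stated in the text


def compress_time(t_ms, r=R):
    """Map a time interval from the scale of [11] to the measurement scale."""
    return t_ms / r


def scale_rate(rho_hz, r=R):
    """Map a pairing frequency from the scale of [11] to the measurement scale."""
    return rho_hz * r


# Triplet protocol 2 data points (dt1, dt2) in ms, as listed in the text.
TRIPLET_POINTS_MS = [(-5, 5), (-10, 10), (-5, 15), (-15, 5)]

if __name__ == "__main__":
    scaled = [(compress_time(a), compress_time(b)) for a, b in TRIPLET_POINTS_MS]
    print("triplet points on chip (ms):", scaled)
    print("pairing dt on chip (ms):", compress_time(10.0))
    print("quadruplet dt on chip (ms):", compress_time(5.0))
    print("max quadruplet T on chip (ms):", compress_time(160.0))
    print("10 Hz in [11] corresponds to", scale_rate(10.0), "Hz on chip")
```

Every time interval is divided by r and every repetition frequency is multiplied by r, so the product of frequency and interval, which determines how strongly successive pairs interact, is preserved.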
IV. RESULTS
The measurement results shown in this section are obtained from a single FG synapse fabricated in a TSMC 0.35µm CMOS process with C g = 5.5 pF. Though not optimized for synaptic density, the results here serve as a proof of concept for our FG T-STDP implementation.
A. FG D-STDP learning window
The first set of testing on FG synapse is to check whether the FG T-STDP can reproduce D-STDP learning window. 
B. Failure of FG D-STDP
We applied D-STDP rule on FG synapse to the pre-and post-synaptic spike pairs as done in [11] . For obtaining the measurement results in Fig. 6 , we set the voltage and timing parameters as given in Fig. 4(b) i.e, same as mentioned in subsection IV-A.
1) FG D-STDP fails to reproduce frequency effects:
To obtain the measurement results, we set ∆t = ±5ms as shown in Fig. 4(c). The measurement results are plotted as the weight change under the D-STDP rule as a function of frequency, ρ. The measurement results (red lines) shown in Fig. 6(d) closely follow the mathematical D-STDP model (blue lines) in [11] . As mentioned in [11] , the failure to match experiments can be attributed to the following reasons. First, as pointed out in [32] , at low repetition frequency ρ there is no potentiation. This cannot be captured by FG D-STDP, because the pulsed drain voltage waveform V d , generated when a post-synaptic spike follows a pre-synaptic spike by a few milliseconds, induces LTP. Second, in experiments, for ∆t > 0, potentiation increases when frequency increases. This trend also cannot be reproduced by FG D-STDP. In the D-STDP model, as the frequency increases, the pre-post spike pairs approach each other and the post-synaptic spike of one spike pair interacts with the pre-synaptic spike of the subsequent spike pair. With increasing frequency, the post-pre spike interaction increases and therefore depresses the synapse, which is not observed in experiments.
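The second failure mode can be reproduced with a small numerical sketch of the pair-based rule. It assumes nearest-neighbor interactions only, the exponential window with the time constants quoted in Section II-A, ∆t = 10ms as in [11] (uncompressed time scale), and equal placeholder amplitudes `a_plus` and `a_minus`; it is a model illustration, not a fit to the chip data.

```python
import math


def pairing_weight_change(rho_hz, dt_ms=10.0, a_plus=1.0, a_minus=1.0,
                          tau_plus=16.8, tau_minus=33.7):
    """Net D-STDP weight change per pre-post pair repeated at rate rho_hz.

    Each pair contributes LTP from its own pre->post interval dt, and LTD
    from the post spike of one pair to the pre spike of the next pair,
    which are separated by (period - dt).
    """
    period_ms = 1000.0 / rho_hz
    if period_ms <= dt_ms:
        raise ValueError("pairing period must exceed dt")
    ltp = a_plus * math.exp(-dt_ms / tau_plus)
    ltd = a_minus * math.exp(-(period_ms - dt_ms) / tau_minus)
    return ltp - ltd


if __name__ == "__main__":
    for rho in (1, 5, 10, 20, 40):
        print(f"rho = {rho:2d} Hz -> net dw per pair = {pairing_weight_change(rho):+.3f}")
```

With these assumptions the net change per pair falls monotonically with frequency and eventually turns negative, the opposite of the experimentally observed increase in potentiation.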
2) FG D-STDP fails to reproduce triplet experiments:
In triplet experiments, as shown in Fig. 6(a) and (b) , there is a clear asymmetry between two protocols mentioned (black bars). 60 repetitions of pre-post-pre triplet yields less weight change, whereas 60 repetitions of post-pre-post triplet yields a weight change of ∼30% (black bars). However, FG D-STDP shown in red bars, predicts almost same result for both protocols, because the mechanism of potentiation and depression happening due to the generation of different voltage waveforms at pre-or post-synaptic spikes are similar in both protocols. Therefore, triplet results cannot be explained by a sum of pre-post injection mechanism and a post-pre tunneling mechanism.
3) FG D-STDP fails to reproduce quadruplet experiments:
The quadruplet experiments, shown in Fig. 6(c), also pose a problem for FG D-STDP. A quadruplet consists of a pre-post-post-pre sequence, where T < 0, or a post-pre-pre-post sequence, where T > 0, as shown in Fig. 4(c). Here, |T | denotes the interval between the first and last pair of spikes within the quadruplet. The pre-post-post-pre sequence consists of two pre-post interactions and one post-pre interaction, whereas for the post-pre-pre-post sequence the opposite occurs, i.e., two post-pre interactions and only one pre-post interaction. This clearly leads to an asymmetry which is not seen in experiments [11] .
C. Success of FG T-STDP
FG T-STDP is implemented with the help of a drain voltage waveform generator, explained in Section V. As mentioned in Section III-B, the mathematical analysis provides an intuitive understanding of how the V d pulse waveform is used to obtain the extra triplet term in the T-STDP rule (equation (2)). As the extra triplet term can be achieved with either of the two proposed pulse drain voltage waveforms, we show measurement results for both cases of the V d waveform. The measurement results obtained for both cases reproduce the behavior seen in the experimental results [11] . The timing specifications for the frequency effects of the pairing protocol, triplets of spikes and quadruplets are shown in Fig. 4(c).
1) FG T-STDP can reproduce triplet and quadruplet experiments:
The T-STDP model on the floating gate synapse not only reproduces the learning window (Fig. 5) but also reproduces most of the triplet and quadruplet experiments, as shown in Fig. 7(a), (b) and (c) and Fig. 8(a), (b) and (c). The voltage and timing parameters for the case of the single-pulsed drain voltage waveform are given in Fig. 4(b), i.e., the same as mentioned in subsection IV-A. Here, V d min = 0.3V is set with respect to the ∆V d (∆t 2 ) value (equation (27)) for the parameters of the Nearest-Spike minimal model, hippocampal culture data set, from table 4 in [11] , i.e., $A_3^+$ = 9.1 × 10⁻³, $A_2^+$ = 4.6 × 10⁻³ and τ y = 48ms. For the case of the double-pulsed drain voltage waveform, the voltage and timing parameters are the same as in the single-pulse case, but ∆V d (∆t 2 ) is calculated using equation (28) . In the triplet measurement results, the red bars closely capture the experimental black bars, i.e., more potentiation in the case of protocol 1 and less potentiation in the case of protocol 2. The blue bars show the result of the mathematical model in [11] . The quadruplet measurement results also replicate the symmetry seen in the experimental results [11] .
2) FG T-STDP model can reproduce frequency effects:
FG T-STDP, shown as red lines in Fig. 7(d) and Fig. 8(d), improves on the results obtained with FG D-STDP (Fig. 6(d)). The trend of increase in potentiation with frequency is seen for both cases of pulse drain voltage waveform: single and = 9.1x10 −3 of the hippocampal culture data set, to ensure more potentiation with increase in frequency. Also note that the measurement results for the frequency effect of the pairing protocol are obtained with the hippocampal culture data set instead of the visual cortex data set mentioned in [11] . Again, for the case of the double-pulsed drain voltage waveform, the voltage and timing parameters are the same as in the single-pulse case, but ∆V d (∆t 2 ) is calculated using equation (28) with parameters $A_3^+$ = 28.7 × 10⁻³, $A_2^+$ = 4.6 × 10⁻³ and τ y = 48ms. As in [11] , there is a limitation here as well: the absence of potentiation at low frequency is not observed in the measurement results for either the single-pulsed or the double-pulsed drain waveform (Fig. 7(d) and Fig. 8(d)). This is because the single pulse at the first post-synaptic spike is itself enough to generate some injection.
V. DRAIN VOLTAGE GENERATOR: VLSI IMPLEMENTATION
The drain voltage waveform generator is the key block for generating the voltage pulses according to spike timing, as shown in Fig. 3. The input to the generator is a post-synaptic pulse from a neuron, as shown in Fig. 11. The output pulse from the generator is fed back to the drain terminal of the FG synapse. The number of drain waveform generators in a system depends on the number of neurons present in the neural network architecture. We propose circuits for single- and double-pulsed drain voltage waveform generators below, along with their SPICE simulation results.
A. VLSI implementation of single-pulsed drain voltage waveform
From equation (27) , we need to create an exponentially decaying voltage trace for ∆V s d (∆t 2 ) for large values of ∆t 2 . Also, V d min is the default value of V d when there is no post synaptic pulse for a long time (∆t 2 → ∞). In order to create the exponential voltage trace, we can use a capacitor, C and a switched capacitor resistor, R sc , where the capacitor charges from the lowest voltage, V d min -∆V dmax to V d min through the resistor (Fig. 9(a) ). Here, ∆V dmax is given by the equation (27) , where ∆t 2 → 0. The operation of the circuit is as follows: at every post-synaptic pulse denoted as CLK in Fig. 9(a) , the voltage across the capacitor V 
Here, C sc is the switched capacitor and 1/T sc is the frequency of the non-overlapping clocks used in the switched capacitor. T sc is limited by the desired time resolution for ∆t 2 , which is around 2ms. Thus, C/C sc = 24 and C sc ≈ 42fF for τ y = 48ms. It can be seen that the circuit simulation does not exactly match the theory for moderate values of ∆t 2 due to our approximation of equation (27) by an exponential.
B. VLSI implementation of double-pulsed drain voltage waveform
Fig. 9(b) shows the circuit implementation of the double-pulsed drain voltage waveform according to equations (28) and (19). The circuit operation is identical to that of the single-pulsed drain waveform generator except that ∆V d d (∆t 2 ) is linearly dependent on ∆t 2 . Hence, the resistor is replaced with a current source, I p . Since we need to create a double pulse, an extra clock is used here. CLK1 creates the first pulse, during which V d = V d min , and CLK2 samples the voltage, V 
also from capacitor charging,
Thus equating both the slopes of equations (32) and (33) , we get:
Hence, I p = -5.2 pA for τ y = 48ms and V inj = 0.25V. The simulation result of ∆V 
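The component values quoted in this section can be cross-checked with a few lines of Python. The sketch relies on assumptions that are not spelled out in the surviving text: the switched-capacitor resistor is modelled by the standard equivalent resistance R sc = T sc /C sc , so that τ y = R sc C; the double-pulsed circuit is assumed to use the same capacitor C; and the required ramp slope is assumed to have magnitude V inj /τ y . Only magnitudes are computed; the sign of I p depends on the current direction convention.

```python
import math

TAU_Y_S = 48e-3    # tau_y from the text
T_SC_S = 2e-3      # switched-capacitor clock period from the text
C_SC_F = 42e-15    # switched capacitor value from the text
V_INJ_V = 0.25     # V_inj from the text


def capacitor_ratio(tau_s, t_sc_s):
    """C / C_sc needed so that (T_sc / C_sc) * C equals tau."""
    return tau_s / t_sc_s


def ramp_current(c_f, v_inj_v, tau_s):
    """Magnitude of the current that ramps a capacitor at slope v_inj / tau."""
    return c_f * v_inj_v / tau_s


if __name__ == "__main__":
    ratio = capacitor_ratio(TAU_Y_S, T_SC_S)
    c_total = ratio * C_SC_F
    i_p = ramp_current(c_total, V_INJ_V, TAU_Y_S)
    print(f"C / C_sc = {ratio:.1f}")
    print(f"C        = {c_total * 1e12:.3f} pF")
    print(f"|I_p|    = {i_p * 1e12:.2f} pA")
```

Under these assumptions the ratio comes out as 24, C is about 1 pF, and the current magnitude is within about 1% of the 5.2 pA quoted above.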
VI. DISCUSSION
An important consideration for synapse designs is scalability to large arrays. Fig. 11 shows the system level architecture of a neuromorphic hardware device with N neurons, M inputs and M × N synapses (the synapses far outnumber the neurons). It shows the connection between FG synapses and neurons and gives an idea of the number of voltage waveform generators needed to generate the terminal voltages for the FG synapse shown in Fig. 3. For the entire system shown, we need only N drain voltage, N tunnel voltage and M gate voltage waveform generators. Thus the synaptic area overhead compared to earlier implementations [23] is only the added multiplexer for switching tunneling voltages. Table I compares this work with other reported implementations of plastic synapses; a detailed review of these circuits can be found in [37] . It can be seen that our work is the first that combines high resolution non-volatile storage with sophisticated plasticity rules. The term normalized area denotes the ratio of the synapse area to the square of the process technology feature size. It is a normalized metric to compare the size of a synapse circuit independent of process technology; smaller numbers indicate more compact designs. The floating-gate device used in our test chip is quite large. However, this is not a fundamental problem since we have earlier demonstrated STDP in much smaller floating-gate devices [23] ; the only difference between this work and our earlier one is in the peripheral circuits that control the drain and tunnel waveforms. Hence, after this proof of concept work, we can make a dedicated chip with floating gates occupying ≈ 100µm 2 area. Some emerging devices like memristors are also showing promise as a compact learning synapse [38] for spiking systems.
Though we are not yet aware of reports of dense arrays of memristive STDP synapses integrated with CMOS neurons in hardware, this seems like a promising area for the future, when scaling of flash memory or floating gates becomes limited.
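The normalized area metric used in the comparison above is simple enough to state as code. The sketch below assumes, purely for illustration, that a ≈ 100µm 2 floating-gate synapse would be built in the same 0.35µm process as the test chip; the paper does not state the process for such a dedicated chip.

```python
def normalized_area(area_um2, feature_um):
    """Synapse area divided by the square of the process feature size."""
    return area_um2 / feature_um ** 2


if __name__ == "__main__":
    # Hypothetical example: 100 um^2 synapse in a 0.35 um process.
    print(f"normalized area = {normalized_area(100.0, 0.35):.0f}")
```

Because the metric is dimensionless, it allows synapses fabricated in different technology nodes to be compared on equal footing.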
One of the important aspects of a circuit implementation of a learning rule is the ease with which its parameters can be tuned. We showed in Section III-B how the parameter A + 3 can be tuned based on ∆V d . Other learning rule parameters are also directly related to parameters of the control voltage waveforms. For example, τ + can be modified using T g , τ − using T tun and A − 2 using T tun pulse and V tun max . This is evident from the mathematical analysis of section III. Similarly, from equation (27) and equation (28) Some results for the variation of learning window to different parameters can be seen in our previous paper [26] . Also, other minor changes to the learning rule can be done by modifying the circuits at the neuron and periphery that generate the gate, drain and tunnel waveforms.
After this proof of concept, we will continue the work by simulating a spiking neural network (SNN) in SPICE to understand the difference between the T-STDP and D-STDP learning rules, and then extend our work in the future by fabricating chips with thousands of neurons and millions of learning synapses to perform tasks like rapid and robust pattern recognition [39] , [40] . Such neuromorphic chips that incorporate noise and heterogeneity are useful for understanding the principles used by the brain to compute using imprecise elements [41] , [42] as well as for accelerated simulations of neural networks [19] , [43] . Moreover, we hope to use neuromorphic systems as the "brain" for real-time behaving systems like robots [41] , where both low power dissipation and real-time operation are necessary. In these cases, using a traditional computer for the implementation is inefficient due to the mismatch between the von Neumann computing model of digital computers and the massively parallel analog computing of the brain, where memory and computing are closely intermixed [44] . Hence, it is useful to be able to mimic biological neural networks closely in circuits to enable experimental paradigms as well as low-power intelligence.
VII. CONCLUSION
We have presented a spike triplet based learning rule using a single FG transistor as the synapse for VLSI spiking neural networks. The spike triplet affects the setting of the drain voltage; we presented a single-pulse and a double-pulse drain voltage method to obtain the desired dependence of weight on spike timing. We presented a method to calculate the parameters of the drain voltage pulse so that the results match the original theoretical T-STDP rule. We also showed FG measurement results in comparison with biological experimental observations for (1) the original doublet protocol, (2) two protocols of spike triplets, (3) frequency effects of the pairing protocol and (4) quadruplet experiments. The failure of the FG D-STDP rule to replicate the biological results is also included. Possible hardware implementations of the drain voltage waveform generator are also proposed and verified through SPICE simulation results. It was shown that the voltage waveform for the double-pulse case can be generated more accurately due to its simplicity.
ACKNOWLEDGEMENT
Financial support from MOE through grant ARC 8/13 is acknowledged. The authors thank Prof. Jennifer Hasler for providing access to FG transistor.
