Abstract-Neuromorphic systems consist of a framework of spiking neurons interconnected via plastic synaptic junctures. The discovery of a two-terminal passive nanoscale memristive device has spurred great interest in the realization of memristive plastic synapses in neural networks. In this work, a synapse structure is presented that utilizes a pair of memristors to implement both positive and negative weights. The working scheme of this synapse as an electrical interlink between neurons is explained, and the relative timing of their spiking events is analyzed, which leads to a modulation of the synaptic weight in accordance with the spike-timing-dependent plasticity (STDP) rule. A digital pulse width modulation technique is proposed to achieve these variable changes to the synaptic weight. The synapse architecture presented is shown to have high accuracy when used in neural networks for classification tasks. Lastly, the energy requirement of the system during various phases of operation is presented.
Recently, a two-terminal nanoscale device known as the memristor has been realized and shown to have adjustable electrical conductivity [2]. This property enables it to be used as a synapse in CMOS-memristor hybrid circuits [3]. While the non-volatility of its conductance state and its nanoscale dimensions enhance the density and performance metrics of such realizations, its programmability is particularly well suited to neural network circuits. This work leverages these characteristics of memristors to implement spike-timing-dependent plasticity (STDP), a biological learning rule. The synapse is also implemented in a spiking neural network for classification tasks, yielding accuracies of 96%, 84% and 73% for the Iris, Wisconsin Breast Cancer and Pima Indian Diabetes datasets, respectively.
The remainder of the paper is organized as follows: Section II provides the background information on STDP and memristors. Section III presents the working scheme of the twin memristor synapse. Section IV delineates the digital control circuit realizing STDP in the synapse. Section V details the design parameters that influence the learning behavior of the synapse. Section VI exemplifies the impact of on-chip STDP learning on the classification accuracy produced by spiking neural networks. Section VII summarizes the energy requirements of our synapse. Section VIII concludes the paper.
II. BACKGROUND

A. Spike-Timing-Dependent Plasticity
Spike-timing-dependent plasticity is a process in which the synaptic strength of the connections between neurons is adjusted as a function of temporal differences between neuron spiking events. In this work, we refer to the neuron that precedes a synapse as the pre-neuron and the succeeding one as the post-neuron. In general, if the pre-neuron fires within a reasonable time window before the post-neuron fires, long-term potentiation (LTP) takes place. Conversely, if the pre-neuron fires within a reasonable time window after the post-neuron fires, long-term depression (LTD) takes place. The largest change in synaptic weight occurs when the time difference between the pre- and post-neuron fires is small; as this difference grows, the synaptic weight change diminishes [4]. The STDP behavior, captured in a piecewise exponential relation, is described by equation 1 and Fig. 1.
where Δt is the time difference between the post-neuron and pre-neuron fires ($\Delta t = t_{pre} - t_{post}$) and Δw is the change in synaptic weight, expressed as a percentage of a predefined maximum weight ($w_{max}$). $A_{+}$ and $A_{-}$ define the maximum synaptic modification. $\tau_{+}$ and $\tau_{-}$ are the time windows over which potentiation and depression may occur, respectively.
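A common way to write the piecewise exponential rule with the symbols defined above is sketched below. The exact form of equation 1 is an assumption here; the sign convention is chosen so that a pre-neuron fire preceding the post-neuron fire (Δt < 0) yields potentiation, as described in the text.

```latex
\Delta w =
\begin{cases}
A_{+}\, e^{\Delta t/\tau_{+}}, & \Delta t < 0 \quad \text{(LTP)} \\
-A_{-}\, e^{-\Delta t/\tau_{-}}, & \Delta t > 0 \quad \text{(LTD)}
\end{cases}
```

In this form the magnitude of Δw is largest as Δt approaches zero from either side and decays with time constants $\tau_{+}$ and $\tau_{-}$.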
B. Memristive Device and Model
In 1971, Leon O. Chua postulated a fourth fundamental device that can implement any given charge-flux curve [5]. This two-terminal device was named a memristor since it acts as a resistor with memory. Recent findings [2] confirm the existence of passive devices that have properties similar to those predicted by Chua. Memristors are of particular interest in neuromorphic circuits because synaptic weights can be encoded as memristance values. Incremental memristance changes are also vital for learning. In this work, the memristor model used for simulation is based on a hafnium oxide (HfO$_2$) memristor designed and fabricated at SUNY Polytechnic Institute [6]. The empirical compact model [7] fits the experimental data from the HfO$_2$ device. The model is given by equations 2 and 3.
where $M$ is the memristance, $dM/dt$ is the rate of change of memristance, $C_{LRS}$ and $C_{HRS}$ are fitting coefficients incorporating switching time, $V(t)$ is the voltage applied to the memristor, $V_{tp}$ and $V_{tn}$ are the switching thresholds, and $P_{LRS}$ and $P_{HRS}$ are the polynomial coefficients controlling the nonlinearity of the model.
Window functions $f_{LRS}(M(t))$ and $f_{HRS}(M(t))$ account for the resistance saturation and are defined by equation 3.
where β and θ are fitting parameters, Δr is the absolute difference between the High Resistance State (HRS) and Low Resistance State (LRS) values. Fig. 2a shows the circuit symbol for the memristor. The pinched hysteresis curve of current vs voltage is shown in Fig. 2b for both the model and experimental data. 
C. Memristor Based Approach to Neuromorphic Systems
There are numerous projects dedicated to actualizing neuromorphic systems that utilize a memristive device as the synapse [8]. The properties of memristive devices facilitate the realization of several kinds of learning mechanisms seen in biological synapses, such as Long Term Potentiation (LTP), Long Term Depression (LTD) and Spike-Timing-Dependent Plasticity. Learning rules were demonstrated by applying a Pulse Width Modulated (PWM) signal to a TiO$_2$ memristor in [3]. A Time Division Multiplexing (TDM) approach on an Ag/Si memristive synapse showed similar learning patterns in [9]. A 1T1R-based HfO$_2$ memristor implementation used pulse shape filters in the pre- and post-neuron to achieve synaptic behavior in [10].
All of the aforementioned neuromorphic systems rely on analog shaping of the neuron spikes. The accuracy of the learning behavior hence depends on meticulous design of the precise spike shape and its generation circuit. Such analog spikes are prone to the detrimental effects of noise and also pose a significant challenge under high fan-out conditions, when the same analog signal has to be propagated to a large load across the chip.
In the following sections, a digital approach to STDP is presented. Our approach utilizes digital pulses for performing STDP, hence overcoming the need for meticulous analog spike shaping. Such a digital approach provides precise control over the magnitude of the synaptic weight change during online learning. Moreover, the synchronous approach adopted here provides the flexibility to modify the magnitude of weight change during the online learning process by changing the clock frequency. Additionally, digitizing the spike instead of using analog spikes provides immunity to noise while communicating signals across the chip.
III. TWIN MEMRISTOR SYNAPSE
The twin memristor synapse presented in this work is derived from [11]. It comprises two memristors connected with opposite polarity, as shown in Fig. 3. This combination of memristors (which forms the synapse) is interposed between the pre-neuron and the post-neuron. While one pair of memristor terminals is driven separately by the pre-neuron, the other pair, which connects to the post-neuron, is shorted to form the post-synaptic node. The synapse operates in two phases: accumulation and learning. The control circuit in the neuron provides appropriate voltage levels to drive the memristors during each phase.
1) Accumulation:
When the pre-neuron fires, the control circuit drives opposite polarity voltages on the nodes $V_p$ and $V_n$, while the post-synaptic node is held at virtual ground by the post-neuron. The virtual ground is provided by the integrator op amp of the post-neuron [11]. As shown in Fig. 3, positive current flows through $M_p$ while negative current flows through $M_n$. These currents are summed at the post-synaptic node, and the summed current can be positive, negative or zero, depending on the values of $M_p$ and $M_n$. Hence, both positive and negative weights can be realized using the twin memristor without any additional circuitry. The weight of the synapse, which is the effective conductance of the memristors, is:
It may be noted that positive current through the synapse leads to accumulation of charge in the post-neuron while negative current causes the dissipation of charge from it.
2) Learning: In this phase, the synapse adjusts its weight according to the STDP learning rule. The control circuit in the pre-neuron drives the same voltage on both pre-synaptic terminals, i.e., $V_p = V_n$. The post-synaptic node is driven by the post-neuron. The polarity of the voltage across the synapse depends on whether the synapse is being potentiated or depressed. During potentiation, the synaptic weight must increase. To achieve the increase, the control circuit provides a negative voltage to the pre-synaptic memristor terminals ($V_p$, $V_n$) while the post-synaptic node is held at a positive voltage by feedback from the post-neuron. This causes a voltage difference across the memristors that surpasses their switching threshold. Since the memristors are connected with reversed polarity, the memristance of one memristor ($M_p$) decreases, while that of the other ($M_n$) increases. This results in an overall increase in effective conductance and synaptic weight. The new conductance due to potentiation is given by:
Depression implies a decrease in the synaptic weight. In this case, the voltages on $V_p$, $V_n$ and the post-synaptic node are reversed. Now the memristance of $M_p$ increases, while that of $M_n$ decreases. This results in an overall decrease in effective conductance and synaptic weight.
The amount of increase or decrease in the memristances (ΔM) in both cases depends upon the timing of the firing events of the pre- and post-neurons; the further the fires are separated in time, the smaller the change in memristance.
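The weight and its update in the two learning cases can be illustrated with a short Python sketch. It assumes the synaptic weight is the difference of the two conductances, $1/M_p - 1/M_n$, which is consistent with the current summation described for the accumulation phase. The function names, the hard clamping to the LRS/HRS bounds (a simple stand-in for the window functions of the device model) and the example ΔM values are hypothetical; the 5 kΩ and 50 kΩ bounds are the device values quoted in Section V.

```python
# Illustrative sketch only; not the simulated circuit from the paper.
LRS = 5e3    # ohms, low resistance state quoted in Section V
HRS = 50e3   # ohms, high resistance state quoted in Section V


def synaptic_weight(m_p, m_n):
    """Assumed effective conductance of the twin memristor synapse."""
    return 1.0 / m_p - 1.0 / m_n


def _clamp(m):
    """Keep a memristance within [LRS, HRS] (assumed hard saturation)."""
    return min(HRS, max(LRS, m))


def potentiate(m_p, m_n, delta_m):
    """Potentiation: M_p decreases and M_n increases by delta_m."""
    return _clamp(m_p - delta_m), _clamp(m_n + delta_m)


def depress(m_p, m_n, delta_m):
    """Depression: M_p increases and M_n decreases by delta_m."""
    return _clamp(m_p + delta_m), _clamp(m_n - delta_m)


if __name__ == "__main__":
    m = 27.5e3  # equal memristances give a zero weight
    print(synaptic_weight(m, m))
    print(synaptic_weight(*potentiate(m, m, 1e3)))
    print(synaptic_weight(*depress(m, m, 1e3)))
```

Starting from equal memristances, potentiation and depression by the same ΔM produce weights of equal magnitude and opposite sign, and a smaller ΔM (fires further apart in time) produces a smaller weight change.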
IV. STDP IN TWIN MEMRISTOR SYNAPSE
In this mixed-signal neuromorphic system, the analog voltage accumulated in the neuron, upon reaching its threshold, is sampled to produce a digital rectangular spike. Depending on the relative timing of the spikes of the pre- and post-neuron, a synapse is either potentiated or depressed. A voltage greater than the threshold is applied across the memristor for different periods of time. If the pre-neuron fire is further apart in time from the post-neuron fire, the voltage across the memristor is applied for a shorter period of time and causes weaker potentiation or depression. As the pre-neuron fire gets closer to the post-neuron fire, the voltage across the memristor is applied for a longer period of time, causing stronger potentiation or depression.
Proper circuit operation is ensured by the synaptic control block of the pre-neuron. The schematic diagram of the control circuit is shown in Fig. 3. The serial-in parallel-out shift registers store the firing information of both the pre- and post-neuron. This helps in tracking the temporal difference between the firing events of these neurons. The shift register block also communicates the firing events to the pulse generator block, a CMOS combinational logic circuit that produces the proper digital voltage pulses for the twin memristors in the synapse, taking into account the timing of the firing events of the pre- and post-neuron. The control logic determines whether the outputs to the synapse should be turned on or off. If the synapse is neither in the accumulation phase nor in the learning phase, it is considered idle. The outputs $V_p$ and $V_n$ are turned off during the idle state, so that there is no inadvertent accumulation or learning.
For simulations in Cadence Spectre, we have used the memristor model discussed in Section II-B for the synapse and the Integrate and Fire neuron shown in [11]. In addition to accumulation and firing, the neuron also feeds its fire back to the pre-neuron and drives the post-synaptic node during the refractory period. During the refractory period, the neuron accumulation path is cut off and hence it does not accumulate charge. Instead, it provides a digital signal to the post-synaptic node that stays at the positive rail for half the refractory period and goes to the negative rail for the other half. The first half of the refractory period is used to perform potentiation while the second half is used to perform depression.

Fig. 4. Waveforms for different LTP/LTD conditions. $F_{Pre}$ and $F_{Post}$ are the fires from the pre- and post-neuron respectively, $V_{PSN}$ is the voltage at the post-synaptic node, which is driven by the post-neuron, $V_p$ and $V_n$ are the waveforms generated by the pulse generator block driving the memristors at the synapse side, and $V_{diff}$ is the voltage difference across the twin memristors during the learning period only. As $V_p$ and $V_n$ are the same only during the learning period, we can consider the twin memristors to be shorted on both ends and can find $V_{diff}$.

The output control block and the pulse generator block are explained with an example. Here we provide waveforms to perform STDP with 2-clock-cycle tracking ability. This essentially means that the STDP circuit can account for temporal differences of up to 2 clock cycles between the pre- and post-neurons' firing events. For this scenario, there can be 4 different combinations of firing events: (1) the post-neuron fires after a 1 clock cycle delay from the pre-neuron fire, (2) the post-neuron fires before the pre-neuron with a one cycle gap between them, (3) the post-neuron fires just after the pre-neuron and (4) the post-neuron fires just before the pre-neuron. Conditions 1 and 3 are potentiation events while conditions 2 and 4 are depression events, but the amounts of change in conductance are different. These scenarios are best illustrated by the waveforms of Fig. 4. Conditions 1 and 2 are represented by Fig. 4a and conditions 3 and 4 by Fig. 4b.
If there is a pre-neuron fire, $V_p$ goes to the positive rail and $V_n$ goes to the negative rail, causing a net current flow for accumulation. The voltages across the memristors never go above the switching threshold during the accumulation phase and hence there is no change in memristances. Whenever there is a post-neuron fire, the neuron drives $V_{PSN}$ to the positive rail for 2 clock cycles and to the negative rail for the next 2 clock cycles. These also constitute the 4-cycle refractory period, during which the neuron does not accumulate. This is the learning phase. During the learning phase $V_p$ and $V_n$ are equal, as can be seen in Fig. 4. For conditions 1 and 3, which are LTP conditions, $V_p$ and $V_n$ are driven to the negative rail but for different amounts of time. The voltage across the memristor crosses the negative threshold and, due to the opposite polarity connections of the memristors $M_p$ and $M_n$, the effective conductance increases according to equation 5. When the fires are closer together, the voltage across the memristor crosses the threshold for a longer period of time, causing a greater change in memristance (condition 3) and hence a greater increase in conductance (synaptic weight). Conditions 2 and 4 are LTD conditions; now $V_p$ and $V_n$ are driven to the positive rail and the opposite scenario occurs. The digital logic circuitry used for the 2-cycle STDP shown here realizes the following Boolean functions for $V_p$ and $V_n$:

$$V_p = F_{Post} \cdot (F_{Pre,t1} + F_{Pre,t2}) + F_{Post,t1} \cdot F_{Pre,t2} \qquad (6)$$

$$V_n = F_{Post,t2} \cdot F_{Pre,t1} + F_{Post,t3} \cdot (F_{Pre,t1} + F_{Pre,t2}) \qquad (7)$$

where $F_{Pre,t1}$, $F_{Post,t1}$, $F_{Pre,t2}$, $F_{Post,t2}$, $F_{Pre,t3}$ and $F_{Post,t3}$ represent the 1-, 2- and 3-cycle delayed signals of $F_{Pre}$ and $F_{Post}$, respectively. These timing states are generated by the shift register block shown in Fig. 3.
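The pulse widths produced by equations 6 and 7 can be checked with a small cycle-by-cycle Python sketch. It models the shift register as indexing into past samples of the fire signals and evaluates the two Boolean expressions as printed. Reading the first expression as active during potentiation and the second as active during depression is an interpretation, not something stated in the text, and the helper names and the 12-cycle window are hypothetical.

```python
# Illustrative sketch of the 2-cycle STDP pulse logic (equations 6 and 7).

def delayed(signal, k, n):
    """Value of `signal` n clock cycles before cycle k (0 before the start)."""
    idx = k - n
    return signal[idx] if idx >= 0 else 0


def pulse_logic(f_pre, f_post):
    """Evaluate the right-hand sides of equations 6 and 7 at every cycle."""
    eq6, eq7 = [], []
    for k in range(len(f_pre)):
        pre_t1 = delayed(f_pre, k, 1)
        pre_t2 = delayed(f_pre, k, 2)
        post_t0 = f_post[k]
        post_t1 = delayed(f_post, k, 1)
        post_t2 = delayed(f_post, k, 2)
        post_t3 = delayed(f_post, k, 3)
        eq6.append((post_t0 & (pre_t1 | pre_t2)) | (post_t1 & pre_t2))
        eq7.append((post_t2 & pre_t1) | (post_t3 & (pre_t1 | pre_t2)))
    return eq6, eq7


def spike_train(length, fire_cycle):
    """Single-spike train: 1 at fire_cycle, 0 elsewhere."""
    return [1 if k == fire_cycle else 0 for k in range(length)]


def pulse_widths(pre_cycle, post_cycle, length=12):
    """Number of active cycles of each expression for one pre/post fire pair."""
    eq6, eq7 = pulse_logic(spike_train(length, pre_cycle),
                           spike_train(length, post_cycle))
    return sum(eq6), sum(eq7)


if __name__ == "__main__":
    for label, pre, post in [("condition 1", 5, 7), ("condition 2", 7, 5),
                             ("condition 3", 5, 6), ("condition 4", 6, 5)]:
        print(label, pulse_widths(pre, post))
```

Under this reading, the two conditions with adjacent fires (3 and 4) produce a two-cycle pulse while the conditions with a one-cycle gap (1 and 2) produce a one-cycle pulse, matching the statement that closer fires cause a longer above-threshold interval.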
The output control block turns off the pulse generator output to realize the idle state. It is another simple combinational logic block implementing the following function:

Following a similar approach, different versions of the synapse can be implemented, with each version capable of tracking a different number of cycles before and after a post-neuron fire. The example discussed in Fig. 4 is defined as 2-cycle STDP.
V. DESIGN PARAMETERS AND STDP PERFORMANCE
In this section, the impact of memristor device properties and design parameters is analyzed on a 5-cycle STDP architecture. The memristor device parameters used for the simulation are as follows: HRS = 50 kΩ, LRS = 5 kΩ, $V_{tp}$ = 0.75 V, $V_{tn}$ = -0.75 V, $t_{swp} = t_{swn}$ = 1 μs. The assumed parameters resemble the memristive device of [6]. For characterizing the conductance change, the maximum conductance was taken to be

$$G_{max} = \frac{1}{LRS} - \frac{1}{HRS}$$

and the change in conductance is expressed as a percentage of $G_{max}$. The percentage change in conductance with respect to the timing difference of the fires is presented in Fig. 5a. As can be seen from the figure, as the fires get closer together, the change in conductance increases non-linearly, which is similar to the exponential STDP explained in Section II-A.
The memristor device parameters, and in turn the choice of design parameters such as clock frequency, affect the STDP behavior of the twin memristor synapse. Using the same switching time parameter, the clock frequency was increased to analyze its effect on the STDP curve. With a frequency of 100 MHz, the generated STDP curve shows a linear relation (Fig. 5b). With a higher frequency, the width of the voltage pulse applied to the twin memristor decreases. Hence there is only a small change in memristance and conductance. This can also be inferred by looking closely at the synaptic conductance equation.
Assume that initially the synapse has a weight of zero, implying that both memristors have equal memristance. So,
For a small change in memristance, ΔM is small compared to M. As a result, the ratio ΔM/M is very small and the term within the bracket reduces to 1,
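The two expressions referred to above can be written out as follows. This is a sketch, assuming the synaptic weight is the difference of the two conductances, $G = 1/M_p - 1/M_n$, and that a potentiation step lowers $M_p$ and raises $M_n$ by the same ΔM starting from $M_p = M_n = M$; the exact typeset form in the original derivation may differ.

```latex
G = \frac{1}{M - \Delta M} - \frac{1}{M + \Delta M}
  = \frac{2\,\Delta M}{M^{2} - \Delta M^{2}}
  = \frac{2\,\Delta M}{M^{2}}
    \left[ \frac{1}{1 - \left( \Delta M / M \right)^{2}} \right]

G \approx \frac{2\,\Delta M}{M^{2}}
  \qquad \text{for } \Delta M \ll M
```

The bracketed factor carries the non-linear dependence on ΔM; once it is dropped, the conductance change is proportional to ΔM.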
According to the memristor model of equation 2, and ignoring the window function for simplicity, ΔM depends linearly on the pulse width of the voltage applied across the memristors, provided the voltage is above the threshold ($V_{th}$). Hence, synaptic conductance changes linearly with smaller pulse widths, causing the STDP curve to be linear for a 100 MHz clock frequency. As the pulses become wider (lower clock frequency), the non-linear terms in equation 9 become significant. As a result, conductance changes non-linearly, which can be approximated as an exponential change. The simplifying assumptions made here are sufficient to explain qualitatively the shift in STDP behavior for different clock frequencies.
Mismatch in the switching time and threshold voltage of memristors due to process variation has been reported in the literature [11], [6], [12]. Typically, the HRS to LRS switching time ($t_{swn}$) is faster than the LRS to HRS switching time ($t_{swp}$), and the LRS to HRS transition threshold ($V_{tp}$) is higher than the HRS to LRS transition threshold ($V_{tn}$). The effects of variation are shown in Fig. 6, keeping the other parameters the same as in the original assumption. The results point to the immunity of the twin memristor to threshold voltage variation. Switching time variation starts affecting the STDP curve severely when the mismatch reaches 50%, as the conductance starts to saturate. The drastic conductance change of one device due to its low switching time dominates the overall conductance change, and hence this faster switching device dictates the observed (crippled) STDP behavior in such a case. However, the anti-symmetric STDP curve is always retained because the two memristors are connected in opposite polarity.
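The frequency dependence argued above can be illustrated numerically with a short Python sketch. It assumes a purely linear switching model, ΔM = (HRS − LRS) · (pulse width) / (switching time), a zero-weight starting point with both memristances at the midpoint of the range, and a weight of the form $1/M_p - 1/M_n$. The 10 MHz comparison frequency, the midpoint value and the function names are hypothetical; the device values are those listed at the start of this section.

```python
# Illustrative sketch only: linear vs. non-linear weight change with pulse width.
LRS = 5e3      # ohms
HRS = 50e3     # ohms
T_SW = 1e-6    # seconds, switching time


def delta_m(pulse_width_s):
    """Assumed linear memristance change for an above-threshold pulse."""
    return (HRS - LRS) * pulse_width_s / T_SW


def weight_change(m, dm):
    """Exact and linearized weight after a step of dm from M_p = M_n = m."""
    exact = 1.0 / (m - dm) - 1.0 / (m + dm)
    linear = 2.0 * dm / m ** 2
    return exact, linear


def nonlinearity(clock_hz, cycles, m=27.5e3):
    """Ratio exact/linear for a pulse lasting `cycles` clock periods."""
    dm = delta_m(cycles / clock_hz)
    exact, linear = weight_change(m, dm)
    return exact / linear


if __name__ == "__main__":
    for f in (100e6, 10e6):
        print(f, [round(nonlinearity(f, n), 3) for n in range(1, 6)])
```

With these assumed numbers, the ratio stays within about 1% of unity at 100 MHz for pulses of up to five cycles, so the curve looks linear, while at the lower frequency the ratio grows well above unity as the pulse widens, which is the non-linear regime described in the text.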
VI. STDP IMPACT ON NEURAL NETWORK LEARNING
In order to demonstrate the consequence of the STDP described herein on a neural network, we consider three classification tasks, namely the Iris, Wisconsin Breast Cancer and Pima Indian Diabetes datasets taken from the UCI machine learning repository [13]. The Iris dataset consists of 150 entries, each with four properties of the Iris flower and three output labels. The Breast Cancer dataset contains 699 entries, each with 9 features. The Diabetes dataset has 768 patient entries, each covering 8 input features. An Evolutionary Optimization (EO) algorithm is applied to generate the networks for each classification task, the details of which can be found in [14]. EO starts off with a fixed number of input and output neurons, but the number of hidden neurons and the synaptic connections between all the neurons can change over generations. During each generation, every network of the population is evaluated in simulation, and a score is assigned to each network based on how well it performed on a particular application. Better performing networks are selected to serve as parent networks for the next generation. Parent networks are then probabilistically recombined through a crossover process and mutated, where mutations can update network structure (e.g., adding or deleting a neuron or synapse) or parameters (e.g., changing a synaptic delay or synaptic weight). The EO method currently utilizes a high-level software simulation of the memristive network during the evaluation of the network, which allows us to design a network off-chip. Twenty networks of each kind were trained and tested, with half of the dataset used for training and the other half for testing. Fig. 7 shows the testing accuracy of the best performing networks of each kind. In general, the more cycles are tracked, the better the accuracy of the network. One exception is the Iris dataset, where there is only marginal improvement. This can be attributed to the network size, which is much smaller than that of the other two.
The Iris network has 4 input neurons and 3 output neurons, whereas the Breast Cancer network has 27 input neurons and 2 output neurons and the Diabetes network has 24 input neurons and 2 output neurons. Although the overall accuracy of the network is dictated by numerous network metrics such as the number of neurons, synapses and feedback loops, the results in Fig. 7 lead us to the general conclusion that larger networks have more room for accuracy improvement due to STDP.
Fig. 7. Testing accuracy of the best performing networks (panels: Iris, Breast Cancer, Diabetes).
Another point worth noting is the circuit overhead, which increases as more cycles are tracked. While there is a significant increase in accuracy from the simple 1-cycle tracking approach to the 3-cycle tracking approach, the same cannot be said for the 5-cycle tracking approach over 3-cycle tracking. In this regard, the 3-cycle tracking approach can be a viable option for designing networks, considering the trade-off between circuit overhead and classification accuracy.
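The generation loop described for EO earlier in this section (evaluate, select parents, recombine, mutate) can be sketched in Python as below. This is a generic toy and not the method of [14]: it evolves a fixed-length parameter vector rather than a network whose structure can change, and the fitness function, population sizes, mutation scale and all names are hypothetical.

```python
# Toy evolutionary loop; a stand-in for illustration, not the EO of [14].
import random


def evolve(fitness, genome_len, pop_size=20, generations=30,
           n_parents=5, seed=0):
    """Return the best genome found and the best score of each generation."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-1.0, 1.0) for _ in range(genome_len)]
           for _ in range(pop_size)]
    best_history = []
    for _ in range(generations):
        # Evaluate every candidate and rank by score (higher is better).
        ranked = sorted(pop, key=fitness, reverse=True)
        best_history.append(fitness(ranked[0]))
        # Better performing candidates become parents and are kept as-is.
        parents = ranked[:n_parents]
        children = []
        while len(children) < pop_size - n_parents:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, genome_len)   # single-point crossover
            child = a[:cut] + b[cut:]
            i = rng.randrange(genome_len)        # mutate one parameter
            child[i] += rng.gauss(0.0, 0.1)
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness), best_history


def toy_fitness(genome):
    """Hypothetical score: closeness to an arbitrary target vector."""
    target = [0.5, -0.25, 0.0, 0.75]
    return -sum((g - t) ** 2 for g, t in zip(genome, target))


if __name__ == "__main__":
    best, history = evolve(toy_fitness, genome_len=4)
    print(history[0], history[-1])
```

Because the parents are carried over unchanged, the best score in this toy never decreases from one generation to the next; whether the EO of [14] keeps parents in the same way is not stated in the text.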
VII. ENERGY ESTIMATES
In terms of energy requirements, an analog version of the synapse consumes 1.9 mW for a system of synapses, which gives 0.23 nJ to 0.23 mJ per synapse [15]. Pulses applied directly to the memristive synapse give energy values ranging from 11 pJ per spike to 0.1 pJ per spike for a memristance range of 1 kΩ to 1 MΩ [16], whereas another approach reports 36.7 pJ per spike for a memristance range of 70 Ω to 670 Ω [17]. Energy requirements for this implementation with different versions of the synapse are presented in Table I. As more cycles are tracked, the circuit complexity increases and consequently the energy requirement also increases. The energy requirements listed in the table were calculated for a memristance range of 5 kΩ to 50 kΩ, following the practical device of [6]. It is possible to achieve even lower energy values with higher LRS and HRS values (such as 500 kΩ and 200 MΩ, respectively [18]).
VIII. CONCLUSION
In this paper, a synapse consisting of a complementary connection of two memristors has been proposed for use in neuromorphic circuits with on-chip STDP based learning. It has been shown that the learning behavior of this synapse can be controlled by a set of circuit parameters that together comprise the designer's arsenal. Precise control over change in weight of the synapse was achieved by the modulation of width of the digital pulse applied across the synapse. This digital approach was shown to have implications not only on energy dissipation, but also on the learning pattern, resulting in varying accuracy for different classification tasks.
