Abstract: Resistive switching memory (RRAM) has been proposed as an artificial synapse in neuromorphic circuits due to its tunable resistance, low power operation, and scalability. For the development of high-density neuromorphic circuits, it is essential to validate the state-of-the-art bistable RRAM and to introduce small-area building blocks serving as artificial synapses. This paper introduces a new synaptic circuit consisting of a one-transistor/one-resistor structure, where the resistive element is a HfO$_2$ RRAM with bipolar switching. Spike-timing-dependent plasticity is demonstrated in both the deterministic and stochastic regimes of the RRAM. Finally, a fully connected neuromorphic network is simulated showing online unsupervised pattern learning and recognition for various voltages of the POST spike. The results support bistable RRAM for high-performance artificial synapses in neuromorphic circuits.
networks [10]-[12]. Neuromorphic computing can even take advantage of stochastic variations, which contribute to the normal operation of fuzzy neural networks in animals and humans [13].
In this paper, we present a new synapse circuit with a one-transistor/one-resistor (1T1R) structure that is used as a tunable connection between a presynaptic neuron (PRE) and a postsynaptic neuron (POST). The RRAM synapse both transmits spikes passively and updates its weight according to a spike-timing-dependent plasticity (STDP) protocol. The STDP characteristics are measured and modeled for deterministic and stochastic switching. Finally, we simulate a two-layer neuromorphic network based on the experimentally observed STDP characteristics, considering resistance-dependent STDP [12], [14], [15] and demonstrating online pattern learning and recognition with deterministic and stochastic switching. These results support the state-of-the-art RRAM for neuromorphic circuits capable of learning, updating, and recognizing real-world visual and auditory patterns.
II. RRAM SAMPLES AND CHARACTERISTICS
Our RRAM devices consist of a Si-doped HfO$_2$ layer with a TiN bottom electrode (BE) and a Ti top electrode (TE) [4]. 1T1R structures, as shown in Fig. 1(a), were used to conduct pulsed experiments by driving the TE and gate nodes with an arbitrary waveform generator, while the TE voltage and the RRAM current were monitored by an oscilloscope, as in Fig. 1(b) [16]. Fig. 1(c) shows a typical $I$-$V$ curve obtained in response to bipolar triangular pulses for set (positive voltage) and reset (negative voltage) [16]. The pulsewidth $t_P$ was 1 ms, while the compliance current $I_C$ was adjusted to 50 μA by proper tuning of the gate voltage $V_G$. The set transition from the high-resistance state (HRS) to the low-resistance state (LRS) takes place at $V_{\mathrm{set}} \approx 1.5$ V. On the other hand, the onset of the reset transition from LRS to HRS is seen at $V_{\mathrm{reset}} \approx -1$ V and is completed at $V_{\mathrm{stop}} = -1.5$ V, which is the maximum voltage in the negative sweep, as shown in Fig. 1(c) [9]. Note that both set and reset transitions are rather abrupt, which contrasts with the gradual adjustment of synaptic weight observed in the biological STDP [17], [18]. Complementary switching during the set process was avoided in our devices by the use of an asymmetric RRAM structure with a Ti oxygen exchange layer at the TE, and of a relatively low $I_C$ [19], [20].

0018-9383 © 2016 IEEE. Translations and content mining are permitted for academic research only. Personal use is also permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
III. 1T1R SYNAPSE
The 1T1R structure in Fig. 1(a) can be adopted as a synapse circuit, as shown in Fig. 2(a). This is a simplified version of the two-transistor/one-resistor synapse [12], where one transistor activated the communication of the PRE spike to the POST, while the other transistor was responsible for updating the synaptic weight according to STDP. The 1T1R circuit in Fig. 2(a) is capable of both functions with just one transistor, which alternately activates communication or plasticity in the synapse. As shown in Fig. 2(a), the PRE spike controls the gate voltage $V_G$ of the transistor, while the TE voltage $V_{\mathrm{TE}}$ is controlled by the POST and is generally biased to a relatively low constant voltage. As a result, every PRE spike activates a current that is inversely proportional to the 1T1R resistance. The 1T1R current is collected by the virtual-ground input node of the POST neuron, which also collects the current from other synapses. When the integrated current exceeds an internal threshold, the POST fires, according to the typical integrate-and-fire behavior of the neuron [21]. Upon fire, besides sending a spike pulse to the subsequent layer of neurons, the POST also delivers a pulse back to the TE of the 1T1R synapse according to the waveform in Fig. 2(b). The TE waveform has two phases: the first consists of a positive voltage pulse of 1 ms followed by a 9 ms pause, while the second consists of a negative pulse of 1 ms width followed by a 9 ms pause. Before and after the backward spike, the same low-amplitude $V_{\mathrm{TE}}$ is maintained with the purpose of activating current spikes to the POST. In our experiments, the $V_G$ spike of the PRE consists of a first phase with a positive voltage of 2.1 V and a width of 10 ms followed by a second phase of zero voltage for 10 ms.
The value of $V_G$ was chosen to give a compliance current $I_C = 50$ μA, which is small enough to keep the power consumption during set/reset transitions relatively low. The TE voltage during communication was kept constant at a relatively low value $V_{\mathrm{TE}} = 20$ mV, which is low enough to induce no change in the RRAM resistance. The positive and negative peaks during the fire events were $V_{\mathrm{TE+}} = +2.5$ V and $V_{\mathrm{TE-}} = -1.6$ V, respectively.
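As a rough illustration of the communication phase, the read current can be estimated from Ohm's law with the TE bias quoted above. This is only a sketch: it assumes the transistor on-resistance is negligible compared with the RRAM resistance, it borrows the approximate LRS and HRS values of 25 and 500 kΩ reported in Section IV, and the function name `read_current` is hypothetical.

```python
def read_current(v_te: float, r_rram: float) -> float:
    """Current (A) through the 1T1R synapse during communication.

    Assumption: the transistor on-resistance is negligible, so I = V_TE / R.
    """
    if r_rram <= 0:
        raise ValueError("resistance must be positive")
    return v_te / r_rram


V_TE_READ = 20e-3   # V, constant TE bias during communication (from the text)
R_LRS = 25e3        # ohm, approximate low-resistance state (Section IV)
R_HRS = 500e3       # ohm, approximate high-resistance state (Section IV)

i_lrs = read_current(V_TE_READ, R_LRS)   # about 0.8 uA
i_hrs = read_current(V_TE_READ, R_HRS)   # about 40 nA
```

Under these assumptions, a potentiated synapse contributes roughly 20 times more current per PRE spike than a depressed one.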
The large values of $V_{\mathrm{TE+}}$ and $V_{\mathrm{TE-}}$, in contrast to the low value of $V_{\mathrm{TE}} < V_{\mathrm{set}}$ in the communication stage, activate STDP according to the timing between the PRE and POST spikes. We define the relative delay $t$ as

$$t = t_{\mathrm{post}} - t_{\mathrm{pre}} \tag{1}$$

where $t_{\mathrm{pre}}$ and $t_{\mathrm{post}}$ are measured at the onset of the PRE and POST pulses, respectively, as shown in Fig. 2(b). The sign of $t$ dictates the change of RRAM resistance. For $t > 0$, the positive $V_{\mathrm{TE}}$ pulse overlaps with the $V_G$ spike, thus resulting in a set transition corresponding to long-term potentiation (LTP). On the other hand, for $t < 0$, the negative $V_{\mathrm{TE}}$ peak overlaps with the $V_G$ spike, thus resulting in a reset transition and consequent long-term depression (LTD) [12].
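The overlap rule can be sketched as a small classifier. This is an illustrative simplification, not the authors' model: it assumes that a TE pulse is effective only if its onset falls inside the 10 ms gate window, it groups the edge case of zero delay with LTP, and the name `stdp_event` is hypothetical.

```python
GATE_WIDTH_MS = 10.0        # width of the PRE gate pulse (from the text)
NEG_PULSE_DELAY_MS = 10.0   # negative TE pulse starts after the 1 ms pulse + 9 ms pause


def stdp_event(t_pre_ms: float, t_post_ms: float) -> str:
    """Classify the plasticity event for one PRE/POST spike pair.

    Assumption: a TE pulse acts on the RRAM only if its onset falls inside
    the gate window [t_pre, t_pre + GATE_WIDTH_MS).
    """
    gate_start = t_pre_ms
    gate_end = t_pre_ms + GATE_WIDTH_MS
    pos_onset = t_post_ms                        # positive pulse starts at the POST onset
    neg_onset = t_post_ms + NEG_PULSE_DELAY_MS   # negative pulse follows 10 ms later
    if gate_start <= pos_onset < gate_end:
        return "LTP"    # delay in [0, 10 ms): set transition
    if gate_start <= neg_onset < gate_end:
        return "LTD"    # delay in [-10 ms, 0): reset transition
    return "none"       # no overlap, resistance unchanged
```

For example, `stdp_event(0.0, 3.0)` gives a positive delay of 3 ms and is classified as LTP, while `stdp_event(0.0, -4.0)` gives a negative delay and is classified as LTD.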
IV. STDP CHARACTERISTICS
To validate the proposed 1T1R synapse, we applied the $V_G$ and $V_{\mathrm{TE}}$ pulse waveforms in Fig. 2(b) to a 1T1R device with variable $t$ and initial resistance $R_0$, with the purpose of collecting the STDP characteristic. After every combined gate/TE pulse application, the new resistance $R$ of the device was measured. Fig. 3(a) shows $R_0/R$, namely, the relative increase in conductance induced by the application of the two pulses, as a function of the pulse delay $t$. Various curves are reported corresponding to increasing initial resistance $R_0$, which was changed in a range from 25 to 500 kΩ by initially preparing the device by a partial-reset operation with variable voltage $V_{\mathrm{stop}}$ [22]. The curves show STDP with LTP and LTD at positive and negative delay $t$, respectively. As previously noted [12], the STDP depends on the initial resistance: for instance, virtually no LTP can be observed on LRS [$R_0 = 25$ kΩ in Fig. 3(a)], since this state already has a very low resistance. In fact, the resistance after set transition is controlled by the size of the conductive filament (CF), which is in turn controlled by the compliance current $I_C$ [23]. Since a constant $V_G$ was used in the scheme of Fig. 2(b), no variation in the maximum size of the CF could be obtained, thus resulting in no possible potentiation of LRS. Similarly, no substantial LTD is possible for HRS [$R_0 = 500$ kΩ in Fig. 3(a)]. Note that a similar dependence on the initial state was observed in biological systems, where a synapse conductance change cannot exceed minimum and maximum values [24]. On the other hand, intermediate resistance states can achieve both LTP and LTD. In any case, the STDP characteristics show constant $R_0/R$ for $t < 0$ and $t > 0$, as a result of the constant $V_{\mathrm{TE+}}$, $V_{\mathrm{TE-}}$, and $V_G$ in Fig. 2(b).
The STDP curves were reproduced by a Simulink circuit model that is able to simulate the 1T1R device. The RRAM in the 1T1R was described by our previous analytical model [25], where a set transition consists of the growth of the CF diameter, while the reset transition occurs via the formation and growth of a depleted gap, in agreement with the results of numerical simulations of set/reset processes [26]. In the simulations, we applied the same pulses shown in Fig. 2(b) and used in Fig. 3(a), assuming variable $t$ and variable $R_0$, as in Fig. 3(a). Fig. 3(b) shows the calculated $R_0/R$ as a function of $t$ at increasing $R_0$, indicating a close agreement with data. Fig. 4 shows the calculated STDP characteristics in a 3-D plot, where $R_0/R$ on the z-axis is reported as a function of $R_0$ (x-axis) and $t$ (y-axis). LTP occurs for $t > 0$ and increases with $R_0$, while LTD occurs for $t < 0$ and is more pronounced for low $R_0$.
Note that the maximum relative LTP is around a factor of 20, while the maximum relative LTD is around a factor of 1/20, corresponding to the overall resistance window between HRS (∼500 kΩ) and LRS (∼25 kΩ) in our device. This indicates that the synapse shows a bistable behavior, in contrast with the generally assumed analog behavior of biological synapses [17], [18], [24].
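The factors of 20 and 1/20 follow directly from an idealized bistable picture, which can be sketched as follows. The sketch assumes that any LTP event drives the device fully to the LRS (about 25 kΩ) and any LTD event fully to the HRS (about 500 kΩ), whatever the initial state. It is a simplification of the measured curves, not the analytical model of [25], and `conductance_ratio` is a hypothetical name.

```python
R_LRS = 25e3    # ohm, approximate LRS resistance (from the text)
R_HRS = 500e3   # ohm, approximate HRS resistance (from the text)


def conductance_ratio(r0: float, event: str) -> float:
    """Return R0/R after one idealized bistable STDP event.

    Assumption: "LTP" always ends in R_LRS and "LTD" always ends in R_HRS;
    any other event leaves the resistance unchanged.
    """
    if not (R_LRS <= r0 <= R_HRS):
        raise ValueError("r0 must lie between R_LRS and R_HRS")
    if event == "LTP":
        r_new = R_LRS
    elif event == "LTD":
        r_new = R_HRS
    else:
        r_new = r0
    return r0 / r_new


max_ltp = conductance_ratio(R_HRS, "LTP")   # HRS -> LRS, factor 20
max_ltd = conductance_ratio(R_LRS, "LTD")   # LRS -> HRS, factor 1/20
```

In this picture, the LRS cannot be potentiated further and the HRS cannot be depressed further (both ratios equal 1), while an intermediate state such as 100 kΩ can move in either direction.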
V. PATTERN LEARNING WITH DETERMINISTIC STDP
To demonstrate the functionality of the bistable 1T1R synapse for unsupervised pattern learning, we simulated a two-layer neuromorphic network with 64 PREs in the first layer and one POST connected to the first layer with 64 synapses [12]. As schematically shown in Fig. 5(a), the first layer acts as a retina, emitting spikes corresponding to a visual pattern, e.g., an X, as shown in Fig. 5(b), alternated with random noise [Fig. 5(c)]. The currents generated in each activated 1T1R synapse are collected by the POST, which is modeled as a leaky integrate-and-fire (LIF) neuron, integrating the currents and delivering a spike when the internal potential exceeds a fixed threshold. Either the pattern or the noise was presented by the PRE layer at every epoch, corresponding to a period of 10 ms. Pattern and noise were submitted with equal probabilities of 50%, and noise had an average density of 9% activated PREs in the first layer. The RC time constant of the LIF was $\tau = 45$ ms. Fig. 5(d) shows the spiking activity of the first layer, reporting the active channel (PRE) as a function of discrete time (epoch). Either noise or pattern events occur randomly at each epoch. Fig. 5(e) shows the corresponding internal potential in the POST, namely, the output potential of the leaky integrator, which is the equivalent of the membrane potential in biological neurons [24]. The internal potential increases due to the integration of spiking currents, and eventually exceeds the threshold, resulting in a POST fire event. This triggers the generation of a POST spike and the discharge of the internal potential.

... and 500 epochs is shown in Fig. 6(b)-(d), respectively. All weights were initially prepared in a random state uniformly distributed between HRS and LRS. The weights corresponding to the input pattern show a fast potentiation due to LTP in the initial 50 epochs. On the other hand, background synapses display gradual depression toward low conductance due to LTD.
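The input layer and the LIF neuron described in this section can be sketched in a few lines. The sketch uses the stated numbers (64 PREs, 10 ms epochs, 50% pattern probability, 9% noise density, and a 45 ms time constant) but makes several assumptions that are not in the text: an 8 x 8 pixel layout with the X on the two diagonals, weights and threshold in arbitrary normalized units, and an exponential leak applied once per epoch. All function names are hypothetical.

```python
import math
import random
from typing import List, Tuple

N_SIDE = 8                 # assumed 8 x 8 layout for the 64 PREs
N_PRE = N_SIDE * N_SIDE
EPOCH_MS = 10.0            # epoch period (from the text)
TAU_MS = 45.0              # LIF time constant (from the text)
NOISE_DENSITY = 0.09       # average fraction of active PREs in a noise frame
P_PATTERN = 0.5            # probability of presenting the pattern


def x_pattern() -> List[int]:
    """Return the X pattern: 1 on both diagonals of the grid, 0 elsewhere."""
    return [1 if (r == c or r + c == N_SIDE - 1) else 0
            for r in range(N_SIDE) for c in range(N_SIDE)]


def noise_frame(rng: random.Random) -> List[int]:
    """Return a random frame where each PRE is active with NOISE_DENSITY."""
    return [1 if rng.random() < NOISE_DENSITY else 0 for _ in range(N_PRE)]


def next_input(rng: random.Random) -> Tuple[List[int], bool]:
    """Return (frame, is_pattern) for one epoch."""
    if rng.random() < P_PATTERN:
        return x_pattern(), True
    return noise_frame(rng), False


def lif_step(v: float, frame: List[int], weights: List[float],
             threshold: float) -> Tuple[float, bool]:
    """Advance the LIF neuron by one epoch: leak, integrate, then threshold.

    Returns (new_potential, fired). The potential is discharged to zero on fire.
    """
    leak = math.exp(-EPOCH_MS / TAU_MS)   # about 0.80 per epoch
    v = v * leak + sum(w * x for w, x in zip(weights, frame))
    if v >= threshold:
        return 0.0, True
    return v, False
```

With unit weights and a hypothetical threshold of 20, a single presentation of the X (16 active PREs) does not trigger a fire event, while two consecutive presentations do, which illustrates how the leaky integration accumulates correlated input.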
Noise is functional in depressing background synapses, since LTD generally takes place in synapses excited by noise soon after a fire event induced by the presentation of the pattern. Because of uncorrelated noise behavior, depression of background synapses is relatively slow, taking approximately 150 epochs in Fig. 6(a) . These results support unsupervised pattern learning in the RRAM-based synaptic network via STDP.
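The interplay between fire events and noise can be sketched as a per-epoch weight update. The rule below is an assumed discretization, not the authors' simulator: synapses whose PRE is active in the epoch of a fire event are potentiated to the LRS, and synapses whose PRE is active in the epoch right after a fire event are depressed to the HRS. When both conditions hold, LTP is given priority, which is an arbitrary choice. The name `apply_stdp` is hypothetical.

```python
from typing import List

G_LRS = 1.0 / 25e3    # S, conductance of the approximate LRS
G_HRS = 1.0 / 500e3   # S, conductance of the approximate HRS


def apply_stdp(weights: List[float], frame: List[int],
               fired_now: bool, fired_prev: bool) -> List[float]:
    """Return updated conductances after one epoch (deterministic, bistable).

    Assumptions: a fire in this epoch potentiates all active synapses;
    a fire in the previous epoch depresses all synapses active now;
    inactive synapses never change.
    """
    new_weights = list(weights)          # leave the caller's list untouched
    for i, active in enumerate(frame):
        if not active:
            continue
        if fired_now:
            new_weights[i] = G_LRS       # LTP: PRE spike precedes the POST fire
        elif fired_prev:
            new_weights[i] = G_HRS       # LTD: PRE spike follows the POST fire
    return new_weights
```

Under this rule, a noise frame that happens to follow a pattern-induced fire depresses only the few synapses it activates, which is consistent with the slow depression of the background described above.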
VI. LEARNING WITH STOCHASTIC SYNAPSES
The abrupt set/reset processes in our RRAM device cause bistable STDP, in contrast with the gradual weight tuning that is believed to occur in biological STDP. It was previously reported that gradual switching can be mimicked in bistable synapses via stochastic switching, where the set/reset process is induced randomly (instead of deterministically) in the STDP protocol [14], [15]. To study the impact of stochastic switching on pattern learning, we changed the $V_{\mathrm{TE+}}$ and $V_{\mathrm{TE-}}$ voltages to explore both the random-set transition and the partial-reset transition of RRAM.
A. Partial-Reset Characteristics
We characterized the partial-reset process in our RRAM by applying a sequence of triangular $V_{\mathrm{TE}}$ pulses, as shown in Fig. 7(a). First, the device was initialized in the full reset state (HRS) by a reset pulse, and then a set pulse was applied to induce set transition to the LRS. The compliance current was 50 μA during the set pulse, obtained by properly limiting the gate voltage (not shown). Finally, a partial-reset pulse with variable $V_{\mathrm{stop}}$ was applied to induce transition to the partial-reset state. The sequence was repeated $10^3$ times for each value of $V_{\mathrm{stop}}$ to gain sufficient statistics. Fig. 8(a) shows the distribution of $R$ measured after the partial-reset pulse with $V_{\mathrm{stop}} = -0.7$, $-1$, $-1.1$, $-1.2$, $-1.3$, and $-1.6$ V. For $V_{\mathrm{stop}} = -0.7$ V, the $R$ distribution coincides with the LRS distribution, since the voltage is too small for the reset transition. As the magnitude of $V_{\mathrm{stop}}$ is increased, first a high-$R$ tail appears with increasing amplitude, and then the full distribution moves toward high $R$ [16]. The distribution dependence on $V_{\mathrm{stop}}$ can be captured by an empirical model, where we described each distribution by combining two subdistributions, one for HRS and one for LRS. Both subdistributions were modeled as lognormal distributions defined by an average value $\mu$ and a slope (or standard deviation) $\sigma$. We extracted the average value $\mu_{\mathrm{HRS}}$ and its slope $\sigma_{\mathrm{HRS}}$ on the lognormal scale, which are shown in Fig. 8(b) and (c), respectively. We also extracted the average value $\mu_{\mathrm{LRS}}$ of the set-state distribution [i.e., the one in Fig. 8(a)] and its slope $\sigma_{\mathrm{LRS}}$ on the lognormal scale, which are also shown in Fig. 8(b) and (c), respectively. Based on the extracted parameters in Fig. 8(b) and (c), we obtained the partial-reset distributions at any $V_{\mathrm{stop}}$ by combining HRS and LRS distributions with a Monte Carlo approach, as shown by calculations in Fig. 8(a). Note that $\mu_{\mathrm{HRS}}$ can be smaller than $\mu_{\mathrm{LRS}}$ in Fig. 8(b), as a result of extrapolating the HRS tail to lower resistance in the lognormal scale. Such low values of $\mu_{\mathrm{HRS}}$ have no physical meaning, but they serve the accurate description of the overall $R$ distribution in Fig. 8(a).
B. Random-Set Characteristics
Fig. 7(b) shows the triangular pulse sequence for studying random-set distributions, similar to the partial-reset distributions in Fig. 8. The $V_{\mathrm{TE}}$ waveform in Fig. 7(b) includes an initialization set pulse, a full reset pulse with $V_{\mathrm{stop}} = -1.6$ V, and a final pulse for random-set transition with a variable voltage $V_A$ [9]. As a result of the large stochastic cycle-to-cycle fluctuation of the set voltage $V_{\mathrm{set}}$, the voltage $V_A$ can be above or below the actual value of $V_{\mathrm{set}}$ in a given cycle, thus inducing set transition in only a fraction of cycles. Fig. 9(a) shows the cycle-to-cycle distributions of measured $R$ for the initial HRS and after random-set transition at variable $V_A$. The application of $V_A$ might lead to set transition for $V_A > V_{\mathrm{set}}$ [state A in the distribution of Fig. 9(a)], or the device might remain in the HRS for $V_A < V_{\mathrm{set}}$ (state B). In some cases, for $V_A \approx V_{\mathrm{set}}$, the set transition might be stopped at the end of the $V_A$ pulse, thus resulting in an intermediate state, indicated as state C. Fig. 9(b) shows the $I$-$V$ curves captured during the random-set pulse for states A, B, and C in Fig. 9(a). As $V_A$ increases, the set probability increases, as shown in Fig. 9(c), which reports the fraction of cells with $R < 80$ kΩ in the distributions in Fig. 9(a). We chose 80 kΩ as a threshold for separating LRS and HRS cells. Data in Fig. 9(c) can be described by the fraction of cells undergoing set transition, i.e., those falling below $V_A$ in the Gaussian distribution $P(V_{\mathrm{set}})$ of $V_{\mathrm{set}}$. As a result, the set probability $P_{\mathrm{set}}$ can be obtained as
$$P_{\mathrm{set}} = \int_{-\infty}^{V_A} P(V_{\mathrm{set}})\, dV_{\mathrm{set}} = \frac{1}{2}\left[1 + \mathrm{erf}\left(\frac{V_A - \mu}{\sigma\sqrt{2}}\right)\right] \tag{2}$$

where $\mu = 1.3$ V is the average value of $V_{\mathrm{set}}$ and $\sigma = 0.193$ V is the standard deviation of $V_{\mathrm{set}}$. Similar to partial reset, the random-set distribution was modeled by a Monte Carlo approach combining the full (initial) HRS distribution and the full LRS distribution with a random-set probability given by (2). Calculations by (2) are shown in Fig. 9(c), in good agreement with the observed $P_{\mathrm{set}}$. Equation (2) was used in STDP simulations by assuming $V_A = V_{\mathrm{TE+}}$, namely, the positive peak of the POST spike in Fig. 2.
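A minimal numerical sketch of the random-set probability is given below. It assumes that (2) is the cumulative distribution of a Gaussian set voltage evaluated at the applied voltage, as the description of the fraction of cells falling below the applied voltage suggests, and it uses the quoted mean of 1.3 V and standard deviation of 0.193 V. The Monte Carlo helper only counts set events and does not reproduce the resistance distributions of Fig. 9(a). The names `p_set` and `random_set_fraction` are hypothetical.

```python
import math
import random

MU_VSET = 1.3        # V, average set voltage (from the text)
SIGMA_VSET = 0.193   # V, standard deviation of the set voltage (from the text)


def p_set(v_a: float) -> float:
    """Probability that a Gaussian-distributed V_set falls below v_a."""
    z = (v_a - MU_VSET) / (SIGMA_VSET * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))


def random_set_fraction(v_a: float, n_cycles: int, rng: random.Random) -> float:
    """Monte Carlo estimate: fraction of cycles in which the cell sets."""
    hits = 0
    for _ in range(n_cycles):
        v_set = rng.gauss(MU_VSET, SIGMA_VSET)   # cycle-to-cycle set voltage
        if v_set < v_a:                          # applied voltage exceeds V_set
            hits += 1
    return hits / n_cycles
```

At an applied voltage equal to the mean set voltage the probability is 0.5 by symmetry, and at the deterministic value of +2.5 V it is practically 1, so the bistable STDP of Section IV is recovered as a limiting case.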
VII. PATTERN LEARNING WITH STOCHASTIC SYNAPSES
To study the impact of stochastic switching on pattern learning efficiency, we simulated the neuromorphic circuit of Fig. 5(a) by changing the values of $V_{\mathrm{TE+}}$ and $V_{\mathrm{TE-}}$ of the POST spike in Fig. 2(b). An X pattern was presented to the PRE layer with a noise occurrence probability of 50% and a noise average density of 9%, as in the simulations of Fig. 6. After each applied pulse at voltage $V_{\mathrm{TE+}}$ or $V_{\mathrm{TE-}}$, the resistance was updated according to the Monte Carlo model for partial reset and random set of Section VI. After 1000 simulated epochs starting from a random distribution of synaptic weights, we defined the learning efficiency $P_{\mathrm{learn}}$ as the ratio of the number $n_{p,f}$ of fire events occurring at the presentation of a pattern to the number $n_p$ of total appearances of the pattern. Note that $P_{\mathrm{learn}}$ should ideally be one, i.e., the POST fires systematically at the presentation of the pattern. We also calculated the error probability $P_{\mathrm{err}}$ as the ratio of the number $n_{n,f}$ of fire events occurring at the presentation of noise to the number $n_n$ of total appearances of noise. Note that $P_{\mathrm{err}}$ should ideally be zero, i.e., the POST never fires at the presentation of noise. Fig. 10 shows the calculated $P_{\mathrm{learn}}$ and $P_{\mathrm{err}}$ in color maps as a function of $V_{\mathrm{TE-}}$ on the x-axis and $V_{\mathrm{TE+}}$ on the y-axis. $V_{\mathrm{TE+}}$ controls the probability of synapse potentiation, while $V_{\mathrm{TE-}}$ is responsible for synapse depression. From the maps, the region with the highest $P_{\mathrm{learn}}$ and lowest $P_{\mathrm{err}}$ is for $V_{\mathrm{TE+}}$ ranging between 1.2 and 1.6 V and for $|V_{\mathrm{TE-}}|$ above 1.3 V. LTP is too weak for $V_{\mathrm{TE+}} < 1.1$ V, thus causing a generalized depression of all synapses. On the other hand, synapses cannot be depressed for $|V_{\mathrm{TE-}}| < 1.2$ V, thus causing a generalized potentiation of all synapses and systematic spiking in response to both pattern and noise.
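The two figures of merit defined above reduce to simple counting over a log of simulated epochs. In the sketch below, the event-log format (one pair of booleans per epoch) and the function name `learning_metrics` are assumptions made for illustration.

```python
from typing import Iterable, Tuple


def learning_metrics(events: Iterable[Tuple[bool, bool]]) -> Tuple[float, float]:
    """Compute (P_learn, P_err) from per-epoch records.

    Each record is (is_pattern, fired): whether the pattern (True) or noise
    (False) was presented, and whether the POST fired in that epoch.
    """
    n_p = n_pf = n_n = n_nf = 0
    for is_pattern, fired in events:
        if is_pattern:
            n_p += 1
            n_pf += int(fired)    # fire at pattern presentation
        else:
            n_n += 1
            n_nf += int(fired)    # fire at noise presentation
    p_learn = n_pf / n_p if n_p else float("nan")
    p_err = n_nf / n_n if n_n else float("nan")
    return p_learn, p_err
```

For instance, a log with three pattern epochs of which two produce a fire, and three noise epochs of which one produces a fire, yields a learning efficiency of 2/3 and an error probability of 1/3.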
We studied a possible improvement of learning by using multiple 1T1R cells for each synapse, each connecting the same PRE to the POST. Fig. 10(c) and (d) show the calculated $P_{\mathrm{learn}}$ and $P_{\mathrm{err}}$, respectively, for the case of two cells per synapse, while Fig. 10(e) and (f) show the calculated $P_{\mathrm{learn}}$ and $P_{\mathrm{err}}$, respectively, for the case of four cells per synapse. The learning/error performance slightly improves due to averaging within the resistance distributions after partial reset and random set. In fact, the regions of high $P_{\mathrm{learn}}$ and the regions of low $P_{\mathrm{err}}$ show an increasing area for an increasing number of cells per synapse in Fig. 10. Fig. 11(a) shows the calculated $P_{\mathrm{learn}}$ and $P_{\mathrm{err}}$ as a function of $V_{\mathrm{TE+}}$ for $|V_{\mathrm{TE-}}| = 1.6$ V (full reset) and for a variable number of cells per synapse, indicating a slight improvement obtained by redundant RRAM cells.

Fig. 10. Color maps of calculated learning efficiency $P_{\mathrm{learn}}$ and error probability $P_{\mathrm{err}}$ for (a) and (b) one cell per synapse, (c) and (d) two cells per synapse, and (e) and (f) four cells per synapse.
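One way to see why redundant cells average out the variability is the toy calculation below. It rests on assumptions that are not stated in the text: every cell of a synapse switches independently with the same set probability, and every cell ends up either in the LRS or in the HRS. The fraction of potentiated cells is then binomially distributed, and its spread shrinks with the square root of the number of cells. Both function names are hypothetical.

```python
import math
from typing import List


def lrs_count_pmf(n_cells: int, p_set: float) -> List[float]:
    """Binomial probabilities of having k = 0..n_cells cells in the LRS."""
    return [math.comb(n_cells, k) * p_set ** k * (1.0 - p_set) ** (n_cells - k)
            for k in range(n_cells + 1)]


def weight_spread(n_cells: int, p_set: float) -> float:
    """Standard deviation of the fraction of LRS cells in one synapse."""
    pmf = lrs_count_pmf(n_cells, p_set)
    mean = sum((k / n_cells) * p for k, p in enumerate(pmf))
    var = sum((k / n_cells - mean) ** 2 * p for k, p in enumerate(pmf))
    return math.sqrt(var)
```

With a set probability of 0.5, the spread of the normalized weight is 0.5 for one cell, about 0.35 for two cells, and 0.25 for four cells, which gives a qualitative sense of the averaging effect; it is not meant to reproduce the maps of Fig. 10.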
We also studied the impact of noise on learning efficiency. Fig. 11(b) shows $P_{\mathrm{learn}}$ and $P_{\mathrm{err}}$ as a function of the noise activity within the PRE array, namely, the average fraction of firing PREs while presenting a noise image. In the simulations, noise was presented randomly in 50% of all epochs. For zero noise activity, $P_{\mathrm{learn}}$ is ∼60% due to the lack of background depression. As noise is increased, $P_{\mathrm{learn}}$ increases, reaching a maximum value around 95.3% at 9% firing PREs. A further increase in noise activity leads to performance degradation, where $P_{\mathrm{learn}}$ decreases and $P_{\mathrm{err}}$ increases. This is because excessive noise may cause a noise-induced fire of the POST, immediately followed by pattern presentation, which results in the LTD of all pattern synapses. The results in Fig. 11(b) suggest that noise should be carefully tuned to maximize the learning efficiency in the neuromorphic network.
VIII. CONCLUSION
We presented a novel 1T1R synapse using bipolar RRAM as tunable resistance for neuromorphic learning circuits. The STDP behavior in the synapse arises from the overlap of PRE and POST pulses across the RRAM. We demonstrated STDP characteristics by experiments and unsupervised learning in a fully connected neuromorphic network of 64 PREs and one POST. The impact of stochastic switching was studied by implementing an empirical Monte Carlo model for switching variability during partial-reset and random-set processes. Stochastic switching simulations of learning show a large region of operation with optimum learning at large TE voltages. Optimization of noise for best learning efficiency is finally discussed.

He is currently a Post-Doctoral Researcher with the Politecnico di Milano. His current research interests include electrical characterization, modeling, and development of novel applications of resistive switching memory.
