This PDF file includes:
Note S1. STDP characteristics of single RRAM synapse.
Note S2. Inference of the network.
Note S3. Learning algorithm.
Note S4. Hardware implementation of POSTs and weight updates.
Supplementary Notes
Note S1. STDP characteristics of single RRAM synapse. In the case of potentiation, which applies to false silence, the resistive switching synapse was first initialized to the HRS, then an exponentially decaying axon potential was applied to the FET gate (V_axon) and a rectangular spike with positive voltage V_TE+ = 3 V was applied to the TE with variable delay Δt (see also fig. S3). The axon spike had a decay time τ_RC = 8 ms, while the POST spike was 1 ms long. Data show that the synapses are increasingly potentiated as Δt decreases, because the compliance current I_C is higher at high V_axon. Note that synapses are no longer potentiated for Δt > 4 ms, because I_C drops below the minimum current needed to induce the set transition in the device. Data are also shown for time-dependent depression, which applies in the case of false fire. In this case, RRAM synapses were first initialized in the LRS, then an exponentially decaying axon potential and a negative-voltage POST spike (V_TE- = -1.6 V) were applied with variable delay Δt. Similar to potentiation, depression decreases for increasing Δt, because the transistor resistance increases with time, causing a smaller voltage drop across the RRAM device and a correspondingly smaller depression. The time-dependent potentiation and depression processes were also simulated by a physics-based model of RRAM switching (34), showing good agreement with the experimental characteristics in Fig. 3B.
Note S2. Inference of the network. The input spatio-temporal spike pattern can be mathematically represented as T = {t_i | i ∈ 1, 2, …, N}, where N is the total number of PREs and t_i is the precise spiking time of the i-th PRE (38). For the case of one POST, the synaptic network can be represented by its synaptic weight matrix W = {w_i | i ∈ 1, 2, …, N}, where w_i is the weight of the i-th synapse, given by the conductance of the RRAM device.
To process the temporal correlation within the spatio-temporal pattern, each spike was converted to an exponentially decaying signal in the corresponding axon (39):
$$V_{\mathrm{axon},i}(t) = V_{\max}\, e^{-(t - t_i)/\tau}\, H(t - t_i) \qquad (1)$$
Here, V_max is the maximum value of V_axon,i(t), τ denotes the decay time constant, and H(·) is the Heaviside function. In the inference stage, a small voltage V_read is applied to the synaptic TE, so that the axon signal is weighted by the synaptic conductance and generates a synaptic current I_syn,i [Eq. (2)],
where k_mos is the FET characteristic parameter and V_T is the FET threshold voltage. The currents are collected at the virtual-ground input node of a TIA (21), thus yielding the internal voltage V_int of the POST according to
$$V_{\mathrm{int}}(t) = R_n \sum_{i=1}^{N} I_{\mathrm{syn},i}(t) \qquad (3)$$
where R_n is the feedback resistance of the TIA circuit. From Eqs. (1)-(3), it follows that V_int is the convolution of the input spatio-temporal spike pattern T with the synaptic network matrix W in the time domain, with the exponential axon potential in Eq. (1) serving as the convolutional kernel (39).
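The inference chain of Note S2 can be illustrated with a minimal Python sketch. The axon kernel follows the exponential decay with Heaviside onset described for Eq. (1), and the summation follows V_int = R_n ΣI_syn. The synaptic current model, the read voltage and the LRS/HRS conductance values are assumptions made only for illustration: the sketch uses a simple current proportional to the weight and to the normalized axon potential, which is a placeholder and not the FET-based expression of Eq. (2).

```python
import math

V_MAX = 2.5     # V, peak axon potential (value quoted in the fig. S3 caption)
TAU = 8e-3      # s, axon decay time constant (Note S1)
R_N = 50e3      # ohm, TIA feedback resistance (Note S4)
V_READ = 0.1    # V, ASSUMED read voltage (not given in the text)


def axon_potential(t, t_spike):
    """Exponential kernel: zero before the spike, decaying after it."""
    if t < t_spike:
        return 0.0
    return V_MAX * math.exp(-(t - t_spike) / TAU)


def synaptic_current(weight, v_axon):
    """ASSUMED placeholder for Eq. (2): current scales with weight and gate drive."""
    return weight * V_READ * (v_axon / V_MAX)


def v_int(t, spike_times, weights):
    """Internal POST voltage: R_n times the sum of the synaptic currents."""
    total = sum(
        synaptic_current(weights[ch], axon_potential(t, ts))
        for ch, ts in spike_times.items()
    )
    return R_N * total


def peak_v_int(spike_times, weights, t_end=20e-3, dt=0.1e-3):
    """Maximum of V_int over a sampled time window."""
    steps = int(round(t_end / dt))
    return max(v_int(k * dt, spike_times, weights) for k in range(steps + 1))


# 16 synapses: channels 1, 4, 9, 16 in LRS, all others in HRS (ASSUMED conductances).
weights = {ch: (100e-6 if ch in (1, 4, 9, 16) else 1e-6) for ch in range(1, 17)}
true_seq = {1: 0.0, 4: 2e-3, 9: 4e-3, 16: 6e-3}     # channel -> spike time (s)
other_seq = {2: 0.0, 7: 2e-3, 11: 4e-3, 13: 6e-3}

peak_true = peak_v_int(true_seq, weights)
peak_other = peak_v_int(other_seq, weights)
```

With these assumed values, the sequence that addresses the LRS synapses yields a much larger peak V_int than a sequence addressing only HRS synapses, which is the property the POST threshold relies on.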
Note S3. Learning algorithm. In the training stage, the learning goal for spatio-temporal computing is to adjust the synaptic weights in W so that the true pattern T generates the highest V_int and causes fire. The learning algorithm follows the Widrow-Hoff (WH) rule (37), where each synaptic weight is updated according to a weight change given by
$$\Delta w_i = \eta\, x_i\, (y_d - y_o) \qquad (4)$$
where η is the learning rate, and x_i, y_d and y_o refer to the input, the correct output and the actual output, respectively. In conventional perceptron networks, the variables in Eq. (4) are regarded as real-valued vectors. In a spiking network, instead, the input and output signals are described by spike timing, thus Eq. (4) can be replaced by a precise learning algorithm equation, Eq. (5) (39, 40),
where y_d(t) is the teacher signal, a spike at the teacher spike time t_d, and y_o(t) is the actual output signal, a spike at the actual output spike time t_o. The value of y_d(t) − y_o(t) can be 0, -1 or 1, denoting true fire, false fire or false silence, respectively, and setting the polarity of the weight update. The weight update rule in Eq. (5) is realized by the STDP-type potentiation and depression of the 1T1R synapse, as shown in Fig. 3.
Note S4. Hardware implementation of POSTs and weight updates. The POST circuit shown in fig. S2A is implemented by a mixed-signal printed circuit board (PCB) including an analog circuit and a digital microcontroller (μC), organized in three stages, as shown in fig. S4A. The first stage is an analog TIA collecting the currents I_syn from the 1T1R synapses and converting them into a voltage according to V_int = R_n ΣI_syn, where R_n = 50 kΩ is the feedback resistance. The second stage is the digital μC, which reads the value of V_int through an analog-to-digital converter (ADC). When V_int exceeds a threshold, the μC updates the state of a fire variable, which is then compared to the teacher signal. The third stage is a μC-controlled multiplexer dictating the TE voltage at the 1T1R synapses depending on false-fire or false-silence events.
(9), where neurons corresponding to the stimuli of higher intensity spike earlier, while neurons corresponding to the stimuli of lower intensity spike later. The analog information contained in the external stimuli is represented by the precise spike timing among neurons.
Fig. 1d for various Δt = t_PRE - t_POST between PRE and POST spikes, in the case of long-term potentiation. The device was first initialized in the HRS. The axon spike applied to the FET gate had a peak voltage of 2.5 V and a decay time τ_RC = 8 ms, while the POST spike had an amplitude of V_TE+ = 3 V and a pulse width of 1 ms (see inset).
The resistance decreases at decreasing delay, evidencing STDP where more potentiation takes place as the spike delay gets shorter. (B) CDF of measured R for various Δt = t_PRE - t_POST between the PRE and POST spikes, in the case of long-term depression. The device was first initialized in the LRS. The axon spike applied to the FET gate had a peak voltage of 2.5 V and a decay time τ_RC = 8 ms, while the POST spike had an amplitude of V_TE- = -1.6 V and a pulse width of 1 ms (see inset). The resistance increases at decreasing delay, evidencing STDP where more depression takes place as the spike delay gets shorter.
Simulation results by our previous Monte Carlo model of STDP (32).
Each spatio-temporal sequence consisted of a 4-spike train, where each spike was 1 ms long and was separated from the following spike by a 1 ms delay. Between one train and the next, a delay of 50 ms was inserted to allow the PREs and POST to recover to their rest states. The submission of each spatio-temporal sequence corresponds to one cycle in the figure. The red spots indicate the true sequence 1-4-9-16, which the SNN should recognize after training. To accelerate the supervised learning process, the true sequence was submitted more frequently (at least one true sequence in 50 cycles) than random sequences. (C) Axon potentials generated in response to the submitted spikes between the 41st and 53rd training cycles. (D) Detailed axon potential during the submission of a random sequence and (E) during the submission of the true sequence.
Synapses #1, #4, #9 and #16 correspond to the true pattern 1-4-9-16, indicating potentiation in correspondence with false-silence events in (A). Synaptic weights of channels that do not belong to the true pattern, such as synapses #2, #7, #11, #13 and #14, show depression in correspondence with false-fire events in (A).
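The supervised training illustrated by these captions, with potentiation at false silence and depression at false fire, can be sketched in Python as a Widrow-Hoff-style update. This is a simplified illustration, not the hardware procedure: weights are normalized, the learning rate, threshold and weight bounds are assumed values, the distractor sequences are hypothetical, the POST is read only at the time of the last spike, and the 4 ms potentiation window of Note S1 is ignored.

```python
import math

TAU = 8e-3                  # s, axon decay time constant (Note S1)
SPIKE_STEP = 2e-3           # s, 1 ms spike plus 1 ms gap between spikes
ETA = 0.2                   # ASSUMED learning rate
THETA = 1.5                 # ASSUMED firing threshold on the normalized V_int
W_MIN, W_MAX = 0.05, 1.0    # ASSUMED normalized HRS and LRS weights
TRUE_SEQ = (1, 4, 9, 16)
# Hypothetical distractor sequences, chosen only for this example.
DISTRACTORS = [(2, 7, 11, 13), (3, 5, 6, 8), (4, 1, 10, 12)]


def axon_traces(seq):
    """Normalized axon potential of each spiking channel at the last spike time."""
    t_read = (len(seq) - 1) * SPIKE_STEP
    return {ch: math.exp(-(t_read - k * SPIKE_STEP) / TAU)
            for k, ch in enumerate(seq)}


def fires(weights, seq):
    """POST fires when the weighted sum of axon traces exceeds the threshold."""
    return sum(weights[ch] * x for ch, x in axon_traces(seq).items()) > THETA


def train_cycle(weights, seq):
    """One cycle: delta_w = ETA * x * (y_d - y_o), clipped to [W_MIN, W_MAX]."""
    y_d = 1 if tuple(seq) == TRUE_SEQ else 0    # teacher signal
    y_o = 1 if fires(weights, seq) else 0       # actual output
    error = y_d - y_o    # +1 false silence, -1 false fire, 0 correct
    for ch, x in axon_traces(seq).items():
        weights[ch] = min(W_MAX, max(W_MIN, weights[ch] + ETA * x * error))
    return error


def train(initial_weight, sequences, epochs=10):
    """Train 16 synapses that all start from the same initial weight."""
    weights = {ch: initial_weight for ch in range(1, 17)}
    for _ in range(epochs):
        for seq in sequences:
            train_cycle(weights, seq)
    return weights


w_from_hrs = train(W_MIN, [TRUE_SEQ] + DISTRACTORS)   # all synapses start in HRS
w_from_lrs = train(W_MAX, [TRUE_SEQ] + DISTRACTORS)   # all synapses start in LRS
```

Starting from HRS, only the synapses of the true sequence are potentiated; starting from LRS, the synapses addressed by the distractors are depressed until those sequences no longer cause fire.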
Fig. S7. Convergence of the network training for various initial conditions.
(A) Calculated V_int and synaptic weights as a function of time during supervised training of the SNN, assuming that synapses are all initially prepared in the HRS. The training converges to synapses 1, 4, 9 and 16 with increasing LRS weight as a result of time-dependent potentiation at false-silence events. (B) Same as (A), but assuming that synapses are all initially prepared in the LRS. The training causes initial weight depression in response to false-fire events, followed by weight potentiation at synapses 1, 4, 9 and 16 in response to false-silence events.
Fig. 4 and fig. S6). (B) Same as (A), but for weights initially prepared in HRS (same as fig. S7A). (C) Same as (B), but for weights initially prepared in LRS (same as fig. S7B).
DL distance for all permutations of four letters in the word "word", and (D) corresponding correlation plot of the calculated V_int as a function of the DL distance after supervised training with the true pattern "word". The DL distance for a sequence of four elements is computed by assuming a substitution cost of 1 and a transposition cost of 0.5. The DL distance shows 9 levels of values, which provide an estimate of the similarity between a generic test sequence and the true sequence. Calculation of the DL distance requires many steps for comparing each element of the sequences within two programming loops. On the other hand, the POST potential V_int can assess the similarity between patterns with analog behavior and requires only one inference step in a spatio-temporal SNN after training, which strongly improves energy efficiency and processing time thanks to the highly parallel computation in the SNN.
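The two nested loops mentioned for the DL (Damerau-Levenshtein) distance can be made concrete with a short Python sketch. It uses the substitution cost of 1 and the transposition cost of 0.5 stated in the caption. The exact variant used for the figure is not specified, so two choices here are assumptions: the restricted (optimal string alignment) form of the algorithm, and an insertion/deletion cost of 1. The sketch is therefore not claimed to reproduce the reported levels.

```python
from itertools import permutations


def dl_distance(a, b, sub_cost=1.0, transp_cost=0.5, indel_cost=1.0):
    """Restricted Damerau-Levenshtein distance (indel_cost is an ASSUMED value)."""
    n, m = len(a), len(b)
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = i * indel_cost
    for j in range(1, m + 1):
        d[0][j] = j * indel_cost
    # Two nested loops compare every element of a with every element of b.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0.0 if a[i - 1] == b[j - 1] else sub_cost
            best = min(
                d[i - 1][j] + indel_cost,    # deletion
                d[i][j - 1] + indel_cost,    # insertion
                d[i - 1][j - 1] + cost,      # match or substitution
            )
            # Transposition of two adjacent elements.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                best = min(best, d[i - 2][j - 2] + transp_cost)
            d[i][j] = best
    return d[n][m]


# Distance from the true pattern "word" to each of its 24 permutations.
distances = {"".join(p): dl_distance("word", "".join(p))
             for p in permutations("word")}
```

Each call fills an (n+1) × (m+1) table element by element, whereas the trained SNN obtains its similarity estimate from a single evaluation of V_int.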
