# A Hybrid CMOS-Memristor Spiking Neural Network Supporting Multiple Learning Rules

Davide Florini, Daniela Gandolfi, Jonathan Mapelli, Lorenzo Benatti, Paolo Pavan, *Senior Member, IEEE*, and Francesco Maria Puglisi, *Senior Member, IEEE*

Abstract: Artificial intelligence (AI) is changing the way computing is performed to cope with real-world, ill-defined tasks for which traditional algorithms fail. AI requires significant memory access, thus running into the von Neumann bottleneck when implemented in standard computing platforms. In this respect, low-latency energy-efficient in-memory computing can be achieved by exploiting emerging memristive devices, given their ability to emulate synaptic plasticity, which provides a path to design large-scale brain-inspired spiking neural networks (SNNs). Several plasticity rules have been described in the brain and their coexistence in the same network largely expands the computational capabilities of a given circuit. In this work, starting from the electrical characterization and modeling of the memristor device, we propose a neuro-synaptic architecture that co-integrates, in a single platform and with a single type of synaptic device, two distinct learning rules, namely, spike-timing-dependent plasticity (STDP) and the Bienenstock-Cooper-Munro (BCM) rule. This architecture, by exploiting the aforementioned learning rules, successfully addresses two different unsupervised learning tasks.

Index Terms: Bienenstock-Cooper-Munro (BCM), memristor, resistive memory, spiking neural network (SNN), spike-timing-dependent plasticity (STDP).

### I. INTRODUCTION

The demand for ubiquitous edge computing, e.g., due to the deployment of Internet-of-Things devices, calls for energy-efficient computing solutions.
In this respect, the need for constantly moving data between the CPU and the memory in the von Neumann architecture has been identified as the main limitation of traditional systems, known as the von Neumann bottleneck [1]. A three-order-of-magnitude gap in delay and energy consumption between the actual computation and the complete von Neumann pipeline has been estimated [2]. This limitation becomes of primary importance if artificial intelligence (AI) algorithms must be run locally in low-power systems (so-called edge intelligence), since AI algorithms are typically energy-hungry [3]. Recently, a large variety of approaches have been proposed to reduce power consumption while maintaining (or even increasing) performance for AI applications [4].

Manuscript received 24 May 2021; revised 15 March 2022 and 21 June 2022; accepted 22 August 2022. (Corresponding author: Francesco Maria Puglisi.) Davide Florini and Lorenzo Benatti are with the Dipartimento di Ingegneria "Enzo Ferrari," Università di Modena e Reggio Emilia, 41125 Modena, Italy. Daniela Gandolfi and Jonathan Mapelli are with the Dipartimento di Scienze Biomediche, Metaboliche e Neuroscienze, Università di Modena e Reggio Emilia, 41125 Modena, Italy, and also with the Centro Interdipartimentale di Neuroscienze e Neurotecnologie, Università di Modena e Reggio Emilia, 41125 Modena, Italy. Paolo Pavan and Francesco Maria Puglisi are with the Dipartimento di Ingegneria "Enzo Ferrari," Università di Modena e Reggio Emilia, 41125 Modena, Italy, and also with the Centro Interdipartimentale di Neuroscienze e Neurotecnologie, Università di Modena e Reggio Emilia, 41125 Modena, Italy (e-mail: francescomaria.puglisi@unimore.it). Color versions of one or more figures in this article are available at https://doi.org/10.1109/TNNLS.2022.3202501. Digital Object Identifier 10.1109/TNNLS.2022.3202501
Hardware-based neuromorphic spiking neural networks (SNNs), i.e., ad hoc non-von Neumann solid-state circuits emulating some neurobiological functionalities, seem among the most promising. The advantages brought to the fore by SNNs are, in fact, related to their intrinsic functionality that mimics the processes involved in biological neural computation. The latter is indeed very efficient and leverages a distributed network of elements in which computation and memory functions are co-located, i.e., synapses take an active part in both information processing and storage [5], [6], [7]. In recent years, novel memristive devices emerged as key enablers for fabricating very large-scale SNNs [8], [9] since they can serve as local computational and storage units, emulating specific properties found in the biological realm such as homo-synaptic plasticity [8].

In an SNN, when a presynaptic neuron (i) (hereafter called pre) fires a spike, an excitation (or inhibition in the case of an inhibitory neuron) is observed on the postsynaptic neuron (j) (hereafter called post). The amplitude of the response at the post side is mediated by the synaptic efficacy or weight $w_{ij}$. In fact, $w_{ij}$ is not constant, and its change is referred to as synaptic plasticity. Memristive devices are employed to mimic synaptic weights. When the pre neuron fires a spike, the spike voltage is applied to the memristive device, and the resulting current is integrated by the post neuron. Since, in a first approximation, a single spike provides an all-or-none type of information, the role of the synaptic weight is to introduce analog modulation in the information transferred to the post neuron. In memristor-based SNNs, this task is performed by the conductance of the memristive element. In neurobiology, the change of synaptic weights is called learning and the conditions leading to the expression of plasticity are the learning rules [5].
The implementation of learning rules in SNNs optimizes the networks' ability to perform a given task. Several learning rules have been derived and encoded, and here, we focus on those involved in biologically plausible mechanisms characterized by the absence of supervision or input labeling. Unsupervised learning is indeed the most intriguing feature pertaining to human cognition and perception by virtue of its independence from teaching signals at each learning event [10]. The learning rules instantiating unsupervised learning can be classified as being timing or rate based. In timing-based rules, synaptic plasticity is expressed in terms of the time at which individual spikes at the pre and/or post are produced, whereas in rate-based rules, the relation is expressed in terms of the spiking rates at the pre and/or post. The functional properties emerging from the activity of plastic networks are related to specific learning rules. Two of the most studied are spike-timing-dependent plasticity (STDP) [10] and the Bienenstock-Cooper-Munro (BCM) rule [11], [12], [13]. STDP is known to develop an asymmetric weight connectivity when the network is repeatedly exposed to inputs with a consistent temporal structure [14], [15]. BCM, on the other hand, has been shown to underlie the emergence of neuronal selectivity [11], [12], [13].

While STDP is sometimes used for pattern classification tasks, both using mathematical models of neurons (e.g., leaky integrate-and-fire) and synapses [16] and memristive devices [8], [17], [18], [19], we limit our discussion of STDP to its timing properties. This choice is based on the idea that BCM can provide a better implementation of pattern classification due to its intrinsic development of selectivity. While the developmental equation for weights under the STDP rule predicts Hebbian-like weight dynamics [5], STDP does not entail competition between input synapses associated with different patterns.
When using STDP, this type of competition is typically added through several mechanisms at the neuron and/or network level, such as an adaptive firing threshold, lateral inhibition [16], and synaptic normalization. On the other hand, BCM has competition between synapses intrinsically built in (discussed briefly in Section III-B), making it better suited for pattern separation and classification.

The coexistence of multiple learning rules is of primary importance to provide neuromorphic hardware with the ability of the brain to adapt to complex scenarios. Several works have shown the possibility of implementing STDP with either analog [8], [18], [19], [20] or binary [21] memristive devices, and a few others tackled the problem of other long-term plasticity rules [22], [23], [24], [25], [26], [27]. At first glance, mimicking several plasticity rules would require the use of different classes of synaptic elements integrated in the same chip by means of different synaptic devices. Nonetheless, this poses serious challenges to process engineers, dramatically increasing the technological complexity and the overall cost of the circuit. For this reason, in this work, we address this challenge by exploring the possibility of implementing two different learning rules, namely, STDP and BCM, in a single neuromorphic platform using a single type of synaptic device.

Previous works [24], [25], [26], [27] have shown the co-integration of rate and temporal learning on memristive elements, but with several limitations. He et al. [24] showed the integration of two learning rules (rate- and timing-based) on the very same synaptic element, assuming cumulative effects of presynaptic stimulation. In their implementation, depending on the pre rate, either STDP or rate-based (SRDP) plasticity is observed.
However, no correlation between pre and post neuron activities is considered for the rate-based plasticity, as instead required by many learning rules such as BCM [5], [28]. Ahmed et al. [25] implemented different plasticity rules by using specialized writing circuits to program the memristive elements and not directly exploiting the actual properties of the spikes. A time-to-digital-to-voltage amplitude circuit reads pre and post spike timings and subsequently updates the memristor conductance using analog multiplexers, allowing fine-tuning of the programming voltage in relation to the pre and post timings. Wang et al. [26] were able to reproduce both STDP and SRDP with a series connection of volatile (diffusive) and nonvolatile (drift) memristors. The use of the volatile memristor temporal dynamics achieves STDP with nonoverlapping spikes as well as a rate rule with a monotone relationship between pre stimulation rate and potentiation. While it is an interesting approach, it suffers from several drawbacks. To demonstrate STDP, they have exploited the delay and relaxation time of the diffusive memristor. The waveforms consist of a short high-voltage pulse followed by a long low-voltage one. The former can induce plasticity in the drift memristor (i.e., long-term plasticity), but due to the delay time, it is unable to activate the diffusive memristor, which, when OFF, effectively impedes plasticity. The latter instead is unable to modify the drift memristor but switches ON the diffusive memristor. Thus, a couple of spikes close to each other are needed to induce long-term plasticity. A careful design of the pre and post spikes allows achieving STDP. In addition to the common STDP, however, this approach includes a non-Hebbian plasticity term due to multiple pre (or post) spikes close to each other. 
Furthermore, their implementation of SRDP does not consider correlation between pre and post rates (i.e., it is a non-Hebbian plasticity), and its monotonic relation between weight and stimulation rate is different from the BCM rule. Milo et al. [27] integrated STDP and SRDP with memristors using a one-transistor one-resistor (1T1R) and a 4T1R architecture, respectively. For the SRDP, long-term potentiation (LTP) is obtained when the post neuron is active and, at the same time, the pre stimulation rate is sufficiently high. Long-term depression (LTD) is achieved by using random post neuron depression back-spikes, which are effective only when the pre neuron is also active. The three main drawbacks of their implementation are: the nonintegration by the post neuron of low-frequency pre spikes, the quadratic relation between the weight dynamics and the presynaptic rate (rather than the postsynaptic rate as in the BCM rule [12]), and the lack of threshold adaptation to move from LTD to LTP. As explained briefly in Section III-B, the last two properties are important for the robust development of selectivity in the post neuron.

The novelty of this work is the integration of different learning rules using the same 1T1R synaptic circuit, allowing different synapses to follow specific learning rules. No specialized programming circuit is used, with the spikes produced by the proposed neuron design being directly used as the programming voltages for the memristors. To this end, each of the two learning rules is tied to a different input of the proposed neuron. Different back-spikes at the two inputs are used to program the memristor synapses to follow different rules. While our implementation of STDP is similar to those in [8], [9], [18], [19], [20], [29], and [30], we propose a new architecture for a detailed implementation of the BCM rule.
The implemented learning rules are successfully tested by circuit simulations of artificial SNNs and are verified to be consistent with the detailed features of the plasticity models employed. A thorough electrical characterization of commercially available packaged memristive devices is performed, and the results are used to calibrate the UniMORE resistive random access memory (RRAM) compact model [31]. The calibrated model is then used to 1) verify the conditions to be satisfied by the chosen memristive device in order to exhibit the specific type of plasticity; 2) design a full hybrid CMOS-memristive circuit that can simultaneously support both learning rules; and 3) perform comprehensive circuit simulations of two different tasks that leverage the two different bio-inspired learning rules. Results show that using the same components (artificial neurons and synapses) and only changing the organization of the network, i.e., the connectivity and the plasticity rules, we can optimize the architecture to perform different tasks.

# II. DEVICES, EXPERIMENTS, AND COMPACT MODELING

# A. Devices and Experiments

The devices we used are the carbon-doped self-directed channel (SDC) memristors fabricated and commercialized by Knowm Inc. [32], [33]. In SDC memristors, Ag agglomerated in a Ge<sub>2</sub>Se<sub>3</sub> layer is used to modulate the conductance of the device [32], [33]. The structure of the memristor is a multilayer stack composed of W/Ge<sub>2</sub>Se<sub>3</sub>/Ag/Ge<sub>2</sub>Se<sub>3</sub>/SnSe/Ge<sub>2</sub>Se<sub>3</sub>:C/W, where Ge<sub>2</sub>Se<sub>3</sub>:C is the active layer [32]. During fabrication, the first three layers below the top electrode (TE) are mixed and form the Ag source [33]. The role of the SnSe layer is twofold: 1) it acts as a barrier to avoid Ag saturation in the active layer; and 2) the production of Sn ions and their migration into the active layer during the device "forming" promote Ag agglomeration in specific sites [32], [33].
The electrical measurements were performed using the Keithley 4200-SCS semiconductor parameter analyzer on dual in-line package (DIP) devices. To verify the functionality and basic operation of the memristors [34], [35], [36], we performed a sequence of 15 I-V measurements in dc sweep (quasi-static) mode. The applied voltage ramp extended from -0.8 to 0.4 V, and the parameter analyzer was set to enforce a 10-$\mu$A current compliance. Results are shown in Fig. 1(a) (red traces): the switching curves are characterized by an abrupt transition from the low-resistive state (LRS) to the high-resistive state (HRS) with a marked cycle-to-cycle variability on the transition voltage and a more predictable and gradual transition from HRS to LRS.

Fig. 1. (a) Experimental (red curves) and simulated (black lines) quasi-static I-V characteristics of the memristive device. (b) Pulse waveforms used to potentiate and depress the memristive device. (c) Experimental (symbols) and simulated (solid line) pulsed response of the memristive device when subject to sequences of potentiation (red circles) and depression (blue diamonds) pulses, shown in (b). The device is initially driven into LRS by means of 20 "initial set" rectangular pulses, also shown in (b). The resistance read after each pulse by means of a read pulse, also shown in (b), is computed as $(V_{\rm READ}/I) - R_s$ ($R_s = 10~{\rm k}\Omega$ series resistance).

Then, the synaptic functionality of the memristors was experimentally verified by applying a suitable pulsed voltage sequence to the device. In this experiment, a 10-k$\Omega$ resistor was connected in series with the device to prevent accidental current overshoots (with the role of the series resistor being negligible in normal operating conditions). The device was initially driven into LRS by means of 20 rectangular pulses ($V=0.6~\rm V$ and $T=100~\mu s$).
Then, depression and potentiation were verified by means of trains of 20 identical rectangular depression pulses ($V=-0.2~\rm V$ and $T=10~\mu s$) followed by 20 identical rectangular potentiation pulses ($V=0.55~\rm V$ and $T=30~\mu s$). Each potentiation or depression pulse is followed by a small reading pulse ($V_{\rm READ}=50~\rm mV$ and $T_{\rm READ}=50~\mu s$) that is used to retrieve the resistance value. The pulses are shown in Fig. 1(b), while the synaptic response of the device to the pulse sequence is reported in Fig. 1(c), in which a gradual and reproducible resistance change caused by the cumulative effect of identical pulses is evidenced.

# B. Compact Model

The experimental data are used to calibrate the UniMORE RRAM compact model [31]. The latter is a physics-based compact model supported by the results of advanced multiscale simulations [37] that has been shown to reproduce both the quasi-static and the dynamic behavior of different memristor technologies with a single set of parameters [38] and considers the intrinsic device stochasticity and random telegraph noise [39], [40], providing a strong link to the device physics. Coded in Verilog-A, the model can be seamlessly employed in SPICE circuit simulations. The model includes the main mechanisms governing the transition between HRS and LRS by means of field- and temperature-assisted ions/defects drift and recombination (LRS-to-HRS transition) and by means of field- and temperature-driven bond breaking and related defect generation/motion (HRS-to-LRS transition). The internal temperature dynamics modeling includes the effect of thermal capacitance. The details of the model are discussed in [31].

TABLE I. Values of the Parameters of the Compact Model as Listed and Described in [31], Calibrated to Reproduce the Data in Figs. 1 and 2

| $\rho$ ($\Omega \cdot$nm) | $t_{ox}$ (nm) | $S_0$ ($\mu m^2$) | $E_a$ (eV) | $V_0$ (V) | $\alpha$ ($K^{-1}$) | $\beta$ | $c_0$ (Hz) | $C_{pb}$ ($JK^{-1}$) |
|---|---|---|---|---|---|---|---|---|
| $14 \cdot 10^{9}$ | 40 | 25 | 0.12 | 0.353 | 0 | 0.0022 | $2.5 \cdot 10^{14}$ | $2 \cdot 10^{-11}$ |

| $C_{pcf}$ ($JK^{-1}$) | $k_{cf}$ ($WK^{-1}$) | $E_{ad}$ (eV) | $g$ (e·nm) | $gg$ (e·nm) | $E_{ag}$ (eV) | $a$ (e·nm) | $b$ | $k_{bar}$ ($WK^{-1}$) |
|---|---|---|---|---|---|---|---|---|
| $4 \cdot 10^{-9}$ | $9 \cdot 10^{-8}$ | 1 | 79 | 0.99 | 1 | 2 | 7.3 | $4 \cdot 10^{-7}$ |

The model calibration, as well as all circuit simulations, is carried out using the Cadence Virtuoso ADE tool. Initially, the model parameters are adjusted to reproduce the results of both quasi-static switching and pulsed response experiments performed during the characterization stage, using the same voltage waveforms used in the experiments. The results of both experiments, as shown in Fig. 1(a)-(c), are correctly reproduced, confirming that the calibrated model can be effectively used in circuit simulations for dependable results.
Specifically, the model properly accounts for the nonlinearity of charge transport observed in both HRS and LRS, as well as for the quite abrupt transition from LRS to HRS and the more gradual transition from HRS to LRS in quasi-static operation, together with the observed cycle-to-cycle variability. In addition, the model reproduces well the typical potentiation and depression patterns in Fig. 1(c), which is essential for the simulation of spike-driven local learning processes. The optimized values of the model parameters (as given in [31]) are shown in Table I.

# III. BIO-INSPIRED LEARNING RULES

In an SNN, action potentials propagate through the network conveying information and potentially determining changes in the weights of the encountered synapses, giving rise to synaptic plasticity. The existence of various learning rules has been reported in almost all brain areas according to their role in brain computation. For instance, the cerebellar cortex, which is primarily involved in motor learning, can express different plasticity rules. In particular, in the same glomerulus, a small and compact synaptic hub collecting axons and dendrites from tens of neurons, Hebbian [41], non-Hebbian [42], STDP [43], and BCM-like [44] synaptic rules can be simultaneously expressed. This rich repertoire of mechanisms allows the circuit to exploit complex spatiotemporal input computation. Similar mechanisms have been observed in visual cortical systems [45] and in hippocampal formations [46]. In a wider perspective, timing-based learning models are typically associated with tasks in which the precise timing of events assumes significant importance, such as velocity estimation or spatial navigation [46]. Conversely, rate-based models have been used to explain computational primitives such as statistical learning and neural selectivity [5], [28], as in the case of pattern discrimination of sensory stimuli operated by cortical columns [47].
We focus on two well-known plasticity rules, STDP and BCM, since they are typical examples of timing- and rate-based rules. In a circuit simultaneously supporting both learning rules, an important objective is to achieve a design in which information is conveyed through the network in a learning-rule-agnostic fashion. Indeed, since spikes are stereotyped events, only their timings and rates should, at least in first approximation, carry the information. In fact, in theoretical neuroscience, it is typical to model spikes using Dirac's delta function, implying that the shape of the spike carries no information [28]. Overall, at the post neuron, the effect of a pre spike should only depend on the synaptic efficacy and not on the learning rule used to modify the weight. This constrains the (forward propagating) spikes to be identical in synapses learning by different rules.

#### A. Spike-Timing-Dependent Plasticity Learning Rule

In STDP, the sign of plasticity depends on the relative timing between the spikes at the pre and the post neurons. The focus is on causality: the synapses connected to pre neurons that cause the post neuron to fire undergo LTP, while the synapses that experience an anticausal relation undergo LTD. From a neuromorphic device standpoint, STDP can be implemented by exploiting the overlap of presynaptic and postsynaptic spikes [9], [29]. Using overlapping waveforms from pre and post neurons to obtain STDP is a common approach [8], [9], [18], [19], [29], [30], though other approaches are possible. For example, Wang *et al.* [26] used nonoverlapping spikes leveraging the temporal dynamics of volatile memristors. In [20] and [27], the waveforms applied across the memristor are driven by the post neuron only, while the pre neuron just drives the selector transistor. To study the feasibility of overlap STDP with the calibrated memristor model, the setup in Fig. 2(a) has been considered.
The memristor device with its nMOS selector is placed between two spike-waveform voltage sources that mimic the effective behavior of the pre and post neurons, respectively, labeled as forward spike $(V_{\rm FS})$ and backward spike $(V_{\rm BS})$, see Fig. 2(b). The forward and backward spikes are designed to be different from each other to optimize the STDP relation. Note that only the forward spike causes excitation on other neurons (i.e., it is the actual output spike of a neuron), while the backward spike, on the other hand, can solely travel backward to implement plasticity.

Fig. 2. (a) Test circuit used to verify the correct implementation of the STDP rule. The two voltage waveform generators mimic the pre and post neurons. (b) Forward spike ($V_{\rm FS}$) and backward spike ($V_{\rm BS}$) waveforms. (c) Relative change of the synaptic weight (the memristor conductance) as a function of the relative timing of pre and post spikes, showing similarity to the STDP relation found in biological synapses. Simulations are repeated starting from different initial resistance values (different symbols). Experimental results obtained on an SDC memristor are reported as black hollow circles.

The memristor is placed with its bottom electrode connected to $V_{\rm FS}$, with the top electrode facing the selector, in turn connected to $V_{\rm BS}$. The SpikePRE and SpikePOST signals in Fig. 2(a) are rectangular pulses that control the voltage sources representing the pre and post neurons, respectively. Specifically, SpikePRE is driven high when the pre neuron fires a spike and is driven low at the end of the spike. In addition, SpikePRE also drives the selector, connecting the memristor between $V_{\rm FS}$ and $V_{\rm BS}$. SpikePOST is driven high when the post neuron fires a spike and is driven low at the end of the spike. When SpikePRE and SpikePOST are low, $V_{\rm FS}$ and $V_{\rm BS}$ are, respectively, zero.
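The gating just described can be summarized in a few lines of Python. This is only an illustrative sketch, not the circuit used in this work: the function name, the boolean encoding of SpikePRE and SpikePOST, the sign convention (top electrode minus bottom electrode), and the numeric values in the usage lines are all assumptions made for the example.

```python
from typing import Optional


def voltage_across_memristor(spike_pre: bool, spike_post: bool,
                             v_fs: float, v_bs: float) -> Optional[float]:
    """Instantaneous voltage across the memristor (top minus bottom electrode).

    v_fs and v_bs are the instantaneous values of the forward- and
    backward-spike waveforms. Returns None when the selector is off,
    i.e., when the top electrode is left floating.
    """
    if not spike_pre:
        # SpikePRE drives the selector: without it the device is disconnected.
        return None
    bottom = v_fs                        # pre source is active while SpikePRE is high
    top = v_bs if spike_post else 0.0    # post source sits at 0 V when idle
    return top - bottom


# Hypothetical waveform values, chosen only to exercise the three cases.
only_pre = voltage_across_memristor(True, False, v_fs=0.25, v_bs=0.75)
only_post = voltage_across_memristor(False, True, v_fs=0.25, v_bs=0.75)
overlap = voltage_across_memristor(True, True, v_fs=0.25, v_bs=0.75)
```

With these placeholder values, the device sees only the small forward-spike voltage when the pre neuron fires alone, is disconnected when the post neuron fires alone, and sees the difference of the two waveforms during the overlap.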
Two possible conditions can occur depending on the relative spike delay.

1) The pre neuron spikes more than 10 ms before or more than 10 ms after the post neuron spikes: in this case, there is no time overlap between the two spikes. When only SpikePRE is high, the memristor will experience $V_{\rm FS}$ and 0 V at its bottom and top electrodes, respectively. In this case, no conductance modulation, i.e., plasticity, shall be caused by the forward spike, which is therefore designed to be small enough. When only SpikePOST is high, the selector is open and the memristor top electrode will be floating, naturally preventing any conductance modification.

2) The time interval between firing events at the pre and post neurons is between -10 and 10 ms. The overall time can be divided into three periods: only SpikePRE is active, both SpikePRE and SpikePOST are active, and only SpikePOST is active. Their sequence is dictated by whether the pre neuron firing precedes or follows the post neuron firing activity. As outlined in the previous point, no plasticity will occur when only SpikePRE or only SpikePOST is active. However, when both are active, i.e., during the overlap, the total voltage across the memristor is $V_{\rm BS} - V_{\rm FS}$, which must be designed to be sufficiently high to cause conductance modulation.

The two voltage waveforms in Fig. 2(b) have been carefully designed to yield an STDP-like relation between the relative conductance variation and the relative delay between pre and post spikes. Fig. 2(c) shows the relative conductance (i.e., weight) change in the synaptic element when subject to spikes with different delay times, which shows the typical pattern of STDP response [14]. These curves are obtained by simulating the circuit in Fig.
2(a) with different delay times between pre and post neuron spikes (with positive delay meaning that the spike at the post neuron precedes the one at the pre neuron) and iterated for different initial resistance states. Experimental data [black circles in Fig. 2(c)] are obtained using pulses identical to those in Fig. 2(b) applied to the SDC memristor using the Keithley 4200-SCS. Simulations and experimental results are in good agreement with each other, increasing the dependability of the memristor model and the simulations in this work. Simulation results also show the saturating effect on plasticity caused by the bounded resistance dynamics. Such a saturating effect on the weights is sometimes regarded as a weight-dependent plasticity rate [5]. The curves in Fig. 2(c) indeed show that the more a device is potentiated (depressed), which corresponds to low (high) $R_0$, the harder it is to potentiate (depress) it further, which has been reported as the optimal choice for synaptic behavior to maximize memory capacity in recursive networks [48]. However, the dynamic range that we show in Fig. 1(c) is not representative of the one that can be obtained in general, as it results from the choice of the voltage pulse amplitude, pulsewidth, and the number of consecutive potentiation or depression pulses [that was set to 20 in the example of Fig. 1(c)]. If more pulses are delivered, potentiation and depression can continue (although eventually saturating) and a much larger dynamic range (about 9× in the networks simulated in this study) is achieved.

#### B. BCM Learning Rule

Unlike STDP, which relates individual spike timing to the change in synaptic efficacy, BCM is a rate-based learning rule, relating plasticity to the average firing activity measured over time. The typical function used to describe BCM is $\dot{w}_{ij} = \eta \cdot \Phi(v_j, \theta) \cdot v_i$, where $\dot{w}_{ij}$ is the time derivative of the synaptic weight connecting pre neuron $i$ to post neuron $j$, $\eta$ is the learning rate, $v_i$ is the pre firing rate, $v_j$ is the post firing rate, and $\theta$ is the post firing rate threshold separating LTD from LTP (i.e., for $v_j < \theta$, LTD will occur, LTP otherwise). In many theoretical works, $\Phi(v_j, \theta)$ is typically written as $\Phi(v_j, \theta) = v_j(v_j - \theta)$, which provides the characteristic nonmonotonic relation between postsynaptic firing rate and plasticity [5], [12], [28]. The distinctive trait of the BCM rule is its ability to provide selectivity. It can be proven that, under certain conditions, the only stable point in the dynamics of the weights leads to post neurons responding selectively to the input patterns used during training [12].

Fig. 3. (a) Test circuit used to verify the correct implementation of the t-STDP rule. The two voltage waveform generators mimic the pre and post neurons. A limiter circuit limits the maximum voltage of the backward spike. The maximum voltage allowed by the limiter, $V_{\rm sat}$, is set on its control port. (b) Relative change of the synaptic weight (the memristor conductance) as a function of the relative timing of pre and post spikes, showing similarity with the t-STDP curves reported in theoretical works to implement the BCM rule [50], where only the potentiation window of the t-STDP (negative delays) is affected by the triplet term.

Given the features of the BCM rule outlined above, it follows that its implementation on a memristor-based SNN requires specialized circuitry to monitor the firing rates of both pre and post neurons and then update the conductance value of the memristor. It has been shown that the BCM rule can be obtained with a triplet-STDP (t-STDP) protocol [23], [49], [50]. The principle behind emulating BCM with t-STDP can be summarized as follows.
In the framework of the STDP rule, if the delay between pre and post neurons firing activity is uniformly distributed over time, then the average effect on the synaptic weight will be either potentiation or depression depending on the area underneath the STDP curve [5]. Therefore, if a viable strategy to modulate the area beneath the STDP curve depending on the spiking activities of the neurons can be found, then a rate-based plasticity rule can be effectively implemented. In our implementation, the STDP curve is varied by tuning the maximum positive voltage of the back-spike, affecting only the LTP region of the curve, by means of a limiter circuit [additional block at the post neuron side in Fig. 3(a)] controlled by an ad hoc control voltage generated by a dedicated circuitry (described later in this section). An important feature of this implementation is that only the post back-spike is modified, whereas the forward spike is unchanged compared to the STDP case, making information propagation independent of the learning rule. To test this approach, a setup like the one used to show STDP in Section III-A is used, with the addition of a limiter block with external control that limits the maximum voltage of the back-spike, as shown in Fig. 3(a). The results of the simulations are shown in Fig. 3(b), which demonstrates that the total area beneath the STDP curve can be effectively modulated by tuning the maximum positive voltage of the back-spike. The obtained results are in accordance with the theoretical predictions reported in [50]. In this implementation of the t-STDP protocol, the area beneath the LTP region should be modulated by the temporal distance between successive postsynaptic spikes. The circuit in Fig. 4(a) has been designed to drive the limiter circuit to implement BCM through the t-STDP protocol. 
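The area argument above can be illustrated with a toy calculation. The sketch below is not the simulated device response of Fig. 3(b): it assumes a purely hypothetical STDP curve made of two exponential lobes with a common time constant, with LTP on negative delays and LTD on positive delays (the sign convention used in this section), and it models the limiter only as a reduction of the LTP lobe amplitude. The function name and all parameter values are placeholders.

```python
import math


def net_stdp_area(a_ltp: float, a_ltd: float,
                  tau_ms: float = 3.0, window_ms: float = 10.0) -> float:
    """Signed area under a toy STDP curve over delays in [-window, +window].

    Assumed curve (illustrative only):
      delay < 0 : +a_ltp * exp(-|delay| / tau)   (potentiation lobe)
      delay > 0 : -a_ltd * exp(-|delay| / tau)   (depression lobe)
    Each lobe integrates to amplitude * tau * (1 - exp(-window / tau)).
    """
    lobe = tau_ms * (1.0 - math.exp(-window_ms / tau_ms))
    return (a_ltp - a_ltd) * lobe


# Lowering the LTP amplitude (the role played by the limiter) moves the
# average effect of uniformly distributed delays from net LTP to net LTD.
high_limit = net_stdp_area(a_ltp=1.0, a_ltd=0.5)
low_limit = net_stdp_area(a_ltp=0.2, a_ltd=0.5)
balanced = net_stdp_area(a_ltp=0.5, a_ltd=0.5)
```

Under these assumptions the sign of the net area, and hence the average direction of plasticity for uniformly distributed delays, is set entirely by how far the LTP lobe is clipped, which is the handle the limiter control voltage provides.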
Starting from the part of the circuit in Fig. 4(a) labeled as Fast, each time the postsynaptic neuron fires a spike, the signal Spike POST is set high and the capacitor $C_{\text{fast}}$ is charged through the pMOS $M_{p3}$. The capacitor $C_{\text{fast}}$ is discharged through the nMOS $M_{n3}$ at a rate that depends on the voltage applied to its gate, $V_p$. The capacitor voltage is buffered ($V_{\rm sat}$) and used to drive the limiter circuit. The capacitor $C_{\text{fast}}$ is chosen such that it is fully charged after just one spike, i.e., under normal conditions, $V_{\rm sat}$ is at its maximum value, $V_{\text{max}2}$, at the end of a spike. During the interspike interval, $V_{\rm sat}$ decreases, and at the successive spiking event, its value will set the maximum amplitude of the back-spike through the limiter circuit.

To test the ability of the t-STDP implementation described above to emulate the BCM rule, the setup reported in Fig. 4(b) is simulated. Two independent random spike trains are generated and applied at the presynaptic and postsynaptic sides. Spike trains are generated by dividing the time into discrete 1-ms temporal bins; a random binary value is assigned to each bin, signaling whether a spike is generated in that time interval (bin = "1") or not (bin = "0"). Since spikes are 10 ms long while bins are 1 ms long, a nine-bin absolute refractory period is inserted, preventing a new spike from being generated before the previous one is completed. The firing rate of a train can be modulated by changing the probability of a bin being "1." The simulation consists in applying 10-s samples of presynaptic and postsynaptic spike trains, with rates $v_i$ and $v_j$, respectively, to the circuit and extracting the relative conductance variation due to the presentation of the two trains. The postsynaptic rate $v_j$ is then varied to obtain the relation between plasticity and postsynaptic spiking rate.
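The binned spike-train generation described above can be sketched as follows. This is a minimal illustration; the function names, the use of Python's `random` module, and the convention that a "1" marks the onset bin of a 10-ms spike are assumptions made here, not details taken from the simulation setup.

```python
import random

def binned_spike_train(n_bins, p_spike, refractory_bins=9, rng=None):
    """Return a list of 0/1 values, one per 1-ms bin.

    A "1" marks the onset of a spike; the following `refractory_bins`
    bins are forced to "0" so that a 10-ms spike can complete.
    """
    rng = rng or random.Random()
    train = [0] * n_bins
    blocked = 0
    for b in range(n_bins):
        if blocked > 0:
            blocked -= 1          # still inside the previous spike
            continue
        if rng.random() < p_spike:
            train[b] = 1
            blocked = refractory_bins
    return train

def firing_rate_hz(train, bin_ms=1.0):
    """Average rate of a binned train in spikes per second."""
    return 1000.0 * sum(train) / (len(train) * bin_ms)

if __name__ == "__main__":
    # One 10-s sample; raising p_spike raises the resulting rate.
    sample = binned_spike_train(10_000, 0.02, rng=random.Random(1))
    print(firing_rate_hz(sample))
```

Because of the refractory bins, the rate saturates at 100 Hz when every free bin spikes, so the bin probability and the resulting rate are not proportional at high probabilities.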
Due to the intrinsic stochasticity introduced by the t-STDP protocol, the simulation is repeated 15 times for each postsynaptic rate, each time with a different realization of the presynaptic and postsynaptic trains. Fig. 4(c) shows the results of such simulations, where the characteristic nonmonotonic relation between postsynaptic firing rate and plasticity is visible.

Fig. 4. (a) Schematic of the circuit used to generate the limiting voltage, $V_{\rm sat}$, for the limiter circuit used to implement the BCM rule. The two parts of the circuit, namely, the slow and the fast part, are evidenced. The fast part is used to convert the temporal distance between two post spikes, $\Delta T_{\rm post}$, into the limiting voltage, $V_{\rm sat}$. The slow part monitors the average firing rate of the post neuron and adjusts, through the $V_p$ voltage, the gain function relating $\Delta T_{\rm post}$ to $V_{\rm sat}$ in the fast part of the circuit, therefore implementing the adaptive threshold mechanism. (b) Schematic of the circuit used to verify that the implemented t-STDP protocol gives rise to BCM-like synaptic plasticity when pre and post neurons fire stochastically in time (as Poisson spike trains). These results are obtained by removing the slow part of the $V_{\rm sat}$ generator circuit in (a) and controlling $V_p$ manually to resolve its contribution in the threshold adaptation mechanism. (c) Relative change of the synaptic weight (memristor conductance) as a function of the post spike rate obtained with different values of $V_p$ (different symbols). Symbols identify the average value obtained across many simulations, and the error bars report the $\pm 1~\sigma$ extension. (d) Simulated output of the slow circuit, $V_p$, as a function of the post neuron rate. (e) Relation between LTD-LTP modification threshold [i.e., the post rate at which the curves in (c) cross the x-axis] and post neuron firing rate.

Specifically, the symbols in Fig.
4(c) represent the average relative conductance variation computed over different realizations, with the error bars placed at $\pm$ one standard deviation. Fig. 4(c) also shows the effect of the gate voltage $V_p$ of the discharge nMOS $M_{n3}$ in shaping the BCM curve.

This design yields, on average, a BCM-like relative conductance response, i.e., a nonmonotonic function of the post firing rate, but lacks another important feature of BCM, namely, the long-term threshold adaptation [11], [12]. The use of a fixed post firing rate threshold (i.e., $\theta = \text{const.}$) does not allow for the robust emergence of pattern selectivity. For instance, in a feedforward network with a single post neuron that is excited by two input patterns (each delivered through an arbitrary number of synapses), the two input patterns will cause two distinct responses in terms of post neuron rate, $v_{j1}$ and $v_{j2}$. However, selectivity to the two patterns emerges only if $v_{j1}$ and $v_{j2}$ are not both above or both below the threshold. In that case, the synapses related to the pattern evoking a high rate at the post neuron will experience LTP that, over time, further increases the post rate, whereas the response to the other pattern will be silenced by LTD. If the above relation between $v_{j1}$, $v_{j2}$, and the threshold is not satisfied, i.e., if $v_{j1}$ and $v_{j2}$ are both above or both below the threshold, all the synapses related to both input patterns will experience LTP (LTD) until saturation, driving the post neuron to fire at the maximum (minimum) rate irrespective of the pattern applied. This problem is addressed by letting the threshold be a function of the average post rate [i.e., $\theta = \theta(v_j(t))$]. The threshold adaptation mechanism is typically assumed to have a sufficiently long time constant to average the post rate response over all the input patterns [28].
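The difference between a fixed and an adaptive threshold can be illustrated with a rate-based toy model. The sketch below is not the circuit of Fig. 4(a): it assumes a single post neuron driven by two orthogonal patterns, each through one synapse with unit pre rate, the update $\Delta w = \eta\, v_i\, v_j (v_j - \theta)$, and a textbook sliding threshold that low-pass filters $v_j^2 / y_0$. All parameter values (`eta`, `rho`, `y0`, `theta_fixed`, `w_max`, initial weights) are illustrative assumptions.

```python
def train_bcm(adaptive, steps=20000, eta=0.01, rho=0.05, y0=1.0,
              theta_fixed=0.3, w_max=3.0):
    """Toy BCM learning with two alternating orthogonal input patterns.

    Pattern k drives only synapse k with unit pre rate, so the post rate
    is simply w[k]. Returns the final weights and threshold.
    """
    w = [0.6, 0.5]                       # both initial responses above theta_fixed
    theta = 0.0 if adaptive else theta_fixed
    for n in range(steps):
        k = n % 2                        # alternate pattern 0 and pattern 1
        y = w[k]                         # post rate evoked by pattern k
        w[k] += eta * y * (y - theta)    # Phi(v_j, theta) = v_j (v_j - theta)
        w[k] = min(max(w[k], 0.0), w_max)
        if adaptive:
            # Sliding threshold: running average of the super-linear y^2 / y0
            theta += rho * (y * y / y0 - theta)
    return w, theta

if __name__ == "__main__":
    print("fixed threshold:   ", train_bcm(adaptive=False)[0])
    print("adaptive threshold:", train_bcm(adaptive=True)[0])
```

In this toy setting, the fixed threshold drives both weights to saturation because both responses start above it, whereas the sliding threshold lets the initially stronger response win and silences the other. The threshold update is made faster than the weight update (`rho > eta`) here only to keep this simple discrete model stable.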
If the adaptation time constant is long enough, the threshold function $\theta(v_j(t))$ can be designed to robustly separate the highest post rate response from the other, i.e., adaptively setting the threshold such that $v_{j1}$ and $v_{j2}$ are not both above or both below it. If so, after a learning period, the post neuron will respond selectively to only one of the patterns used during learning, irrespective of the post neuron rates they initially evoked. The adaptive threshold mechanism effectively implements competition between input synapses. In fact, the potentiation of a subset of synapses (e.g., associated with an input pattern) makes the post neuron more active, which raises the BCM threshold and, in turn, hinders the further potentiation of the other synapses (i.e., those associated with different input patterns) [28]. In addition, according to the theoretical foundation of the BCM learning rule, if the function that relates the threshold to the average post neuron firing rate is super-linear, a feedback effect that stabilizes the average post neuron firing rate emerges. More details on the advantages and the requirements for an adaptive threshold can be found in [11] and [12].

The adaptive threshold can be easily added to the proposed circuit design by modulating the discharging current through $M_{n3}$. The circuit used to estimate a long-term firing rate of the post neuron, labeled as slow in Fig. 4(a), is similar to the one used to measure the temporal distance between two spikes [i.e., fast in Fig. 4(a)], with the main difference being a longer time constant in charging and discharging the capacitor $C_{\text{slow}}$. This circuit is used to automatically modulate the gate voltage $V_p$ of the discharging nMOS $M_{n3}$, which effectively provides a sliding threshold, as shown in Fig. 4(c). Fig. 4(d) shows the steady-state $V_p$ voltage generated by the circuit at different post spiking rates, while in Fig.
4(e), the threshold–frequency relation as obtained from the set of simulations of Fig. 4(b) is shown. Notably, the proposed circuit, capacitors included, can be seamlessly integrated within each neuron design, slightly increasing its footprint (estimated from transistor-level simulations at about 100 $\mu$m<sup>2</sup> when implementing $R_{\text{slow}}$ with switched capacitors), without affecting the advantageous simplicity of the compact 1T1R synaptic architecture, which is agnostic of the specific learning rule adopted. This is important as the synapse-to-neuron ratio can be on the order of 10<sup>4</sup> [5], [9]. Thus, extra circuit area can be tolerated in the neuron circuit but not in the synaptic element.

### IV. From Rules to Spiking Networks

#### A. Neuron Architecture

The implementation of an artificial neural network that can simultaneously support both BCM and STDP synapses requires designing a suitable neuron architecture. For this purpose, a neuron circuit is proposed based on a leaky integrate-and-fire neuron model with two inputs, where all the BCM and STDP synapses from pre neurons converge. The circuit has one output line for the spike waveform along with a digital signal, driving an arbitrary number of both BCM and STDP 1T1R synapses connected to its output. In general, a neuron is embedded into a network, asynchronously receiving and providing inputs from and to other neurons. For example, in a layered structure such as that in Fig. 5(a), a neuron in Layer 1 is a post neuron for some of the neurons in Layer 0 and a pre neuron for some neurons in Layer 2.

The internal neuron architecture is shown in Fig. 5(b). The neuron can be in two states, namely, the integration state and the firing state, depending on the value of the logic signal "Spike" generated by the SR latch. During integration (i.e., "Spike" at logic zero), both the inputs are connected to the leaky current integrator that locally provides virtual ground.
At the same time, the neuron output is grounded. When the integrator voltage reaches the threshold, the "Spike" signal is raised, and the neuron is set in the firing state. The inputs are then disconnected from the integrator and connected to dedicated spike generators that are triggered by the same "Spike" signal, providing the back-spikes to the input synapses.

Fig. 5. (a) Framework of an SNN that supports both learning rules. A neuron in layer n behaves as a post neuron for neurons in layer n-1 and as a pre neuron for neurons in layer n+1. Pre neurons drive the 1T1R synaptic element through two signals, the forward spike waveform [neuron out in (b)] that carries the information to the post (next layer), and a logic signal [spike in (b)] that enables the selector transistor. Two inputs are used to connect the synapses following different learning rules since, as described in Section III, they require different backward spikes to be implemented. (b) Leaky integrate-and-fire neuron architecture simultaneously supporting BCM and STDP rules on different synapses. (c) Simulated time diagrams of critical nodes of a neuron in an SNN architecture. While the neuron is in the integration state (Spike = "0"), both inputs are tied to 0 V by the integrator virtual ground, so incoming spikes are not seen in the voltage traces of panels 3 and 4. When the integrator voltage reaches the threshold (first panel), the "Spike" signal is raised (second panel), disconnecting the inputs from the integrator and triggering all the waveform generators (at the inputs and output) to provide backward and forward spikes (panels 3-5). In panel 4, the saturating effect of the limiter circuit involved in the implementation of the BCM is shown. The temporal distance between two spikes modulates the maximum voltage of the BCM backward spike. The negative slope of $V_{\text{sat}}$ is modulated on a longer time scale by the $V_p$ voltage (not shown) to implement the BCM adaptive threshold; see Section III-B.
At the output, the "Spike" signal triggers the spike generator that transmits the forward spike. The "Spike" signal is also used to drive the $V_{\rm sat}$ generator circuit of Fig. 4(a) (in which the control signal is labeled "Spike POST") that controls the voltage limiter that in turn tunes the maximum positive voltage of the BCM back-spike, as described in Section III-B. As all the generated spikes share the same time duration, $T_{\rm Spike} = 10$ ms, it is sufficient to introduce a delay circuit that resets the SR latch after $T_{\rm Spike}$ , which discharges the integrator and brings the neuron back into the integration phase. Fig. 5(c) shows the above process in terms of the time evolution of the voltage at specific nodes of the internal neuron architecture. Except for the subcircuit in Fig. 4 that was simulated at the transistor level, the components of the neuron were simulated with Verilog-A behavioral models. As such, no detailed figures of merit, such as silicon footprint, power consumption, and circuit complexity, can be provided. However, we suggest possible low-power implementations of the most critical constituent blocks. Spike generators can be implemented using a digital pulsewidth modulation circuit using a simple free-run digital counter and a few simple parallel registers accessed sequentially. The counter and the memory output can be wired together in an n-bit AND gate that would drive a toggle flip-flop to produce a square wave with controlled duty cycle. A passive low-pass filter would complete the circuit producing the spike waveform. The clock driving the counter can run at low frequency given the low bandwidth of the output pulse, limited in the kilohertz range. Another possibility, with less control on the spike shape (i.e., more distorted STDP window), entails analog multiplexing of different voltage sources [20], [30]. 
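The integration/firing cycle of the neuron described in this subsection can be summarized with a behavioral sketch. This is a discrete-time illustration under stated assumptions, not the Verilog-A model used in the simulations: the leak time constant `tau_ms`, the threshold, the time step, and the unitless input current are hypothetical, and only the 10-ms spike duration comes from the text.

```python
def simulate_neuron(input_current, dt_ms=1.0, tau_ms=50.0,
                    threshold=1.0, t_spike_ms=10.0):
    """Behavioral leaky integrate-and-fire cycle; returns spike onset times.

    Integration state: the input current is integrated with leak.
    Firing state: inputs are ignored for t_spike_ms (they would receive
    back-spikes instead), then the integrator is discharged.
    """
    v = 0.0
    firing_left = 0.0
    onsets = []
    for n, i_in in enumerate(input_current):
        if firing_left > 0.0:
            firing_left -= dt_ms          # "Spike" high: inputs disconnected
            if firing_left <= 0.0:
                v = 0.0                   # latch reset discharges the integrator
            continue
        v += dt_ms * (-v / tau_ms + i_in)  # leaky integration
        if v >= threshold:
            onsets.append(n * dt_ms)       # "Spike" raised
            firing_left = t_spike_ms
    return onsets

if __name__ == "__main__":
    # A constant drive produces regular firing with a period set by the
    # integration time plus the fixed spike duration.
    print(simulate_neuron([0.1] * 100))
```

In this sketch, the minimum distance between two spike onsets is the spike duration plus the time needed to recharge the integrator, which mirrors the role of the delay circuit that resets the SR latch.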
In a transistor-level implementation of the neuron circuit that we propose, we expect that the largest fraction of the area would be contributed by the three op-amps needed to implement the leaky integrator, the comparator, and the buffer in the saturation voltage generator circuit for the BCM. In our network, all the signals are limited in bandwidth to a few kilohertz, as can be intuitively seen from Fig. 5(c). This allows using ultralow-power (10–100 nW) op-amps based on subthreshold transistors, which have notoriously limited bandwidth [51], [52], [53]. The overall neuron power consumption would be dominated by the saturation voltage generation circuit (simulated to be around 1.6 $\mu$W) and by the power needed to effectively potentiate and depress the synapses (simulated to be, on average, in the 10–100 nW range per synaptic connection).

#### B. Unsupervised Motion Detection Task With STDP

The effectiveness of the proposed STDP implementation is verified by simulating a circuit inspired by the one in [14], in which Dan and Poo present a simple visual cortical network model with two groups of direction-selective neurons connected through two STDP synaptic layers to a readout neuron. Each neuron is placed in a specific spatial position, i.e., has a distinct receptive field, which is the region of space in which a stimulus must occur for the neuron to respond with a spike. Therefore, a stimulus (e.g., an object) that moves across the receptive fields of the different neurons will cause a time- and space-dependent neuron activity. According to [14], the interaction of motion stimuli and STDP leads, after several presentations of a moving object, to an asymmetry in the weights connecting input and output neurons. This asymmetry causes the output neuron to receive strengthened excitation from input neurons that respond early to the moving object, making the output neuron progressively anticipate its spiking activity over time [14], [15], [28].
In this study, we reproduced the circuit, limiting the number of neurons per group to 32. This is schematically shown in Fig. 6(a), in which two groups of 32 direction-selective neurons, 32 left to right and 32 right to left, respectively, labeled as $\rightarrow$ (red neurons) and $\leftarrow$ (blue neurons) enclosed by circles, provide the excitatory inputs through two STDP synaptic layers to a readout neuron. Input neurons with the same indices (e.g., $\text{in}_{LR1}$ and $\text{in}_{RL1}$) have the same spatial receptive field, while the preferred direction (LR or RL) specifies the motion direction for which the neuron reaches its maximum firing rate, i.e., the neuron spikes at its maximum rate when an object at the maximum of its spatial receptive field moves in the preferred direction.

Fig. 6. (a) Network used to perform the STDP-based unsupervised motion detection task, inspired by the example in [14]. Two sets of 32 direction-selective neurons labeled as $\rightarrow$ (red neurons) and $\leftarrow$ (blue neurons) enclosed by circles (red preferring L-to-R motion and blue R-to-L) excite a single post neuron. All the inputs converge at the STDP input of the post neuron. (b) Spatiotemporal stimulation provided by the two sets of input neurons (red circles for L-to-R neurons and blue crosses for R-to-L) caused by the passage of an object (black dot), from L to R (0–40 ms) and from R to L (from 90 to 130 ms). (c) and (d) Similar to (b) for L-to-R (c) and R-to-L (d) neurons only, when several stimulations are repeated. (e) Asymmetric weight development caused by the repetition of the spatiotemporal patterns in (b)–(d). The transition between the set of weights that gets potentiated ($\Delta G/G > 0$) and depressed ($\Delta G/G < 0$) shifts over time in the opposite direction of the one identified by the selectivity of the neurons, which causes the anticipated output response.
In general, direction selectivity in neurons can be obtained by exploiting delayed inhibitions and STDP [54], [55]. We considered input neurons described by the following rule:

$$f_i = k \cdot \left[ f_0 + \frac{\alpha + \operatorname{sign}(v)}{\alpha + 1} \cdot K(x - x_{0i}) + \eta \right]_{+}$$

where $f_i$ is the firing rate of the ith neuron, $f_0$ is the base firing rate, k is a gain factor, $\alpha$ modulates the direction selectiveness, K is a Gaussian kernel for the spatial receptive field centered at $x_{0i}$, x and v are the position and the velocity of the object, respectively, and $\eta$ represents noise in the firing rate. The $[\ ]_+$ operator describes the rectified linear unit (ReLU) function, which returns its argument when the latter is positive and zero otherwise. This rate is used to generate a Poisson spike train for the input neurons.

Initially, all the synaptic weights are randomly assigned with values drawn from the same distribution. The simulation consists in the presentation to the input neurons of an object moving left to right and back several times, i.e., scanning across the receptive fields of all input neurons. Fig. 6(b) shows the spiking wave of the $\rightarrow$ (red circles) and $\leftarrow$ (blue crosses) neurons as the object is moved one time from left to right (from 0 to 40 ms) and then from right to left (from 90 to 130 ms), reflecting the direction selectivity of the input neurons. In Fig. 6(c) and (d), the spiking activities recorded across several presentations of the object are superimposed to show the input variability. These spikes are then applied to the corresponding STDP synaptic layers. The output neuron integration time constant is set to generate a single spike per presentation with the initial values of the synaptic weights.
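The input-neuron rule above can be sketched in a few lines. The parameter values (`f0`, `k`, `alpha`, `sigma`, `k_peak`) are illustrative assumptions, as is the `preferred` argument, which is introduced here only to flip the sign of the velocity for right-to-left neurons; the Poisson train uses a simple per-step Bernoulli approximation.

```python
import math
import random

def firing_rate(x, v, x0, preferred=1, f0=5.0, k=1.0, alpha=0.2,
                sigma=1.0, k_peak=100.0, noise=0.0):
    """Rectified rate of a direction-selective neuron centered at x0.

    preferred=+1 for left-to-right neurons, -1 for right-to-left ones
    (an assumption of this sketch). A zero velocity is treated as positive.
    """
    kernel = k_peak * math.exp(-0.5 * ((x - x0) / sigma) ** 2)  # Gaussian K
    direction = (alpha + math.copysign(1.0, v * preferred)) / (alpha + 1.0)
    return k * max(0.0, f0 + direction * kernel + noise)        # [ ]_+ operator

def poisson_spikes(rate_hz, duration_ms, dt_ms=1.0, rng=None):
    """Approximate Poisson train: one Bernoulli draw per time step."""
    rng = rng or random.Random()
    p = rate_hz * dt_ms / 1000.0
    return [1 if rng.random() < p else 0
            for _ in range(int(duration_ms / dt_ms))]

if __name__ == "__main__":
    # Object at the receptive-field center, moving in either direction.
    print(firing_rate(x=0.0, v=+1.0, x0=0.0))   # preferred direction
    print(firing_rate(x=0.0, v=-1.0, x0=0.0))   # opposite direction
```

With $\alpha < 1$, motion in the non-preferred direction makes the kernel term negative, so the rate is pushed below the base rate and clipped at zero by the rectification.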
As reported in [14], with this setup, STDP is expected to break the initial weight symmetry and develop an asymmetric weight distribution such that the synapses connected to input neurons that fire before the post neuron are potentiated, while those connected to input neurons that fire later are depressed. As shown in Fig. 6(e), this is effectively achieved with our implementation of the STDP protocol. Initially, the integration of a relatively high number of input spikes is needed for the output neuron to reach the threshold, which means that the output neuron will spike when the moving object has already passed by a relatively large number of input neurons. The asymmetric plasticity provided by the STDP potentiates synaptic connections related to inputs that spiked shortly before the output and depresses those related to inputs that spiked shortly after the output. Accordingly, when the moving object is presented again, the output neuron will require fewer input spikes to reach the threshold and will therefore fire earlier, anticipating its response. The depression of synaptic weights related to inputs that spiked shortly after the post limits the excitation of the post due to these inputs, which hinders the possibility for the output neuron to fire multiple spikes per object presentation. As predicted, in the sequence of Fig. 6(e), the transition between the sets of weights that are potentiated and depressed, which is dictated by the post spike, recedes over time for the weights connected to $\rightarrow$ neurons (advances for weights connected to $\leftarrow$ neurons), consistent with the anticipated response of the output neuron. In fact, at T = 1 s, around 20 input spikes from the $\rightarrow$ neurons are necessary for the output to reach the threshold, while at T = 50 s, around 14 input spikes are sufficient.

#### C. Unsupervised Multipattern Recognition Task With BCM

The effectiveness of the proposed implementation of the BCM rule is verified by simulating a feedforward network composed of 32 pre and 4 post neurons. Four orthogonal input patterns defined in terms of input spiking rates were designed to stimulate the network. Each pattern consists of 32 independent Poisson spike trains with high (for 8 specific neurons) and low (for the remaining 24) firing rates. The pre neurons are divided into four groups of eight neurons (here selected to be adjacent to one another for simplicity and without loss of generality): when the *i*th pattern is to be presented to the network, the *i*th group will generate the high-rate trains, with the remaining groups generating the low-rate trains, as shown in Fig. 7(a). This stimulation protocol has been chosen to achieve the fastest convergence of the network and an easier interpretation of the results. Partially overlapping patterns, input firing rates varying continuously between high and low, and patterns presented in random order over time are also viable stimulations. If the overlap of the input patterns is not too extreme, the effectiveness of the rule is not significantly affected (not shown for brevity). The patterns are presented sequentially in time, with a pattern duration of 0.5 s, with an epoch being one presentation of the four patterns (2 s).

The four output neurons are connected to each other through fixed-weight inhibition. To this end, an inhibitory input is added to the neuron design in Fig. 5(b), whose input current, instead of being summed by the integrator, is subtracted, effectively reducing the neuron excitation. The overall architecture with the feedforward excitation and the lateral inhibition is shown in Fig. 7(a). All the 128 excitatory weights are initialized at random values, all drawn from the same distribution.
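The stimulation protocol described above (four orthogonal rate patterns over 32 pre neurons, 0.5 s per pattern, 2 s per epoch) can be sketched as follows. The high and low rate values are placeholders, since they are not stated here, and the function names are hypothetical.

```python
def pattern_rates(pattern, n_pre=32, n_groups=4, high_hz=50.0, low_hz=5.0):
    """Target rate of each pre neuron when `pattern` (0..3) is presented.

    Neurons are grouped in adjacent blocks of n_pre // n_groups; only the
    block with the same index as the pattern fires at the high rate.
    """
    size = n_pre // n_groups
    return [high_hz if i // size == pattern else low_hz
            for i in range(n_pre)]

def active_pattern(t_s, pattern_s=0.5, n_patterns=4):
    """Index of the pattern presented at time t_s (sequential presentation)."""
    return int(t_s // pattern_s) % n_patterns

if __name__ == "__main__":
    # One epoch lasts 4 x 0.5 s = 2 s, after which the sequence repeats.
    for t in (0.0, 0.6, 1.2, 1.9, 2.1):
        print(t, active_pattern(t), pattern_rates(active_pattern(t))[:10])
```

The resulting rates can then be fed to any Poisson spike generator to produce the 32 independent input trains of each pattern.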
Synaptic modification via BCM naturally starts with a transitory period where the combined effect of plasticity and BCM threshold adaptation spontaneously brings the firing rates of all the postsynaptic neurons and the BCM threshold close to each other. At the end of this initial transitory period, all the input patterns evoke similar responses in all neurons. This quasi-symmetric condition with respect to the input patterns is broken as soon as one output neuron starts to respond systematically with higher rates to a particular pattern. In that case, lateral inhibition lowers the net excitation on the other output neurons, which in turn lower their spiking rates. Since they are all close to the threshold, when that happens, the winner output neuron tends to potentiate the synaptic inputs related to that input pattern, whereas the other output neurons tend to depress the same synaptic inputs due to the lower spiking rate. This process consolidates over time until each output neuron spontaneously develops univocal input pattern selectivity, as shown in Fig. 7(b). Fig. 7(c) highlights the effect of pattern learning and selectivity in terms of post neuron rates. At the first epoch, all the input patterns evoke a similar spiking activity in all the output neurons, whereas at the final epoch, each input pattern evokes high activity in only one output neuron and that neuron responds with high activity for only that input pattern. Fig. 7(d) shows the firing rates of output neuron 1 for the four input patterns at different epochs. From about the 12th epoch on, the response to pattern 2 intensifies, while the others abate. The same happens for all the output neurons. Defining the selectivity of a generic output neuron as [11]

$$\text{Selectivity} = 1 - \frac{\text{mean(rate wrt input patterns)}}{\text{max(rate wrt input patterns)}}$$

Fig. 7(e) shows a marked increase in selectivity over time for all the neurons. The classification accuracy is 95.75%.
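The selectivity measure defined above is straightforward to compute from the per-pattern firing rates of one output neuron. The sketch below assumes equiprobable patterns (a plain mean) and returns zero for a silent neuron, which is a convention chosen here and not specified by the definition.

```python
def selectivity(rates):
    """Selectivity = 1 - mean(rates) / max(rates) over the input patterns."""
    peak = max(rates)
    if peak == 0:
        return 0.0            # silent neuron: convention of this sketch
    return 1.0 - (sum(rates) / len(rates)) / peak

if __name__ == "__main__":
    print(selectivity([10.0, 10.0, 10.0, 10.0]))  # no preference
    print(selectivity([40.0, 0.0, 0.0, 0.0]))     # responds to one pattern only
```

For a neuron that responds to exactly one of four equiprobable patterns, the mean is one quarter of the maximum, so the measure reaches its upper bound of 1 − 1/4 = 0.75.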
The accuracy was evaluated as the number of spikes from the specialized neurons for each specific input pattern (i.e., successful classification) over the total number of output spikes in the last 25 epochs, when the network had reached the maximum selectivity [i.e., after learning, Fig. 7(e)]. A guard interval of 50 ms is used to mitigate the effect of spikes from neurons excited during the presentation of the previous input pattern.

Fig. 7. (a) Schematic illustration of the 32 pre by 4 postsynaptic neurons feedforward network with lateral inhibition among the postsynaptic neurons, used to implement the BCM-based unsupervised multipattern recognition task. Each input pattern excites eight pre neurons at a high rate, while the other neurons fire at a low rate. (b) Time sequence of the patterns applied to the network and spiking activity of the four output neurons. Initially, all the synaptic weights are randomly assigned and no relation between the applied pattern and the output neurons' activity is present (see 0–4 s zoomed inset). After learning through BCM, a one-to-one relation between input patterns and output neurons is achieved (see 96–100 s zoomed inset). (c) Color maps of the initial and final spiking rates of the output neurons for the different input patterns. (d) Evolution over the learning epochs of output neuron 1 firing rate for the different input patterns. (e) Evolution of the output neurons' selectivity. At around 20 epochs, selectivity reaches its maximum at 0.75 for a four-input environment: $\max (1 - (\sum p_i f_i)/(\max(f_i))) = 1 - p_i = 1 - (1/4) = 0.75$.

Simulations employing memristors as synapses in SNNs for pattern classification have already been shown [8], [17], [18]. In [17], a $64 \times 4$ memristive network was used to perform recognition on four different characters. STDP was used with a winner-take-all paradigm for the output neurons, achieving 96% accuracy.
Though that work shares similarities with ours, directly comparing the performance would be misleading since we employed: 1) a single pattern per class and 2) different accuracy evaluation methods. However, we expect our implementation to be less prone to runaway dynamics due to the self-stabilization of BCM and less dependent on lateral inhibition. As selectivity is emergent in BCM, in our network, lateral inhibition is mainly used to nudge different neurons to learn different patterns and not to make them selective to a single pattern. Similar arguments hold true for other works where pair-based STDP is used for classification [8], [18].

### V. CONCLUSION

In this work, we proposed a new hybrid CMOS-memristor SNN architecture simultaneously supporting two learning rules. For this purpose, a memristor was electrically characterized in terms of its I-V and pulsed response, as well as its response to the spike waveforms of the proposed neuron. In simulation, a neuron architecture supporting the two learning rules was designed after both STDP and BCM had been verified independently. Then, the proposed architecture was verified to successfully solve different classes of unsupervised learning tasks. A motion detection task exploiting the timing nature of STDP was shown: the shift of the receptive field of the output neuron was observed when subjecting the network to repeated motion stimuli. Then, a multipattern recognition task exploiting the properties of the BCM rule with adaptive threshold was performed. In particular, the experiment showed good performance in terms of selectivity, with a classification accuracy of 95.75%. Due to the simultaneous support of multiple learning rules, the proposed architecture is an important step toward the implementation of biologically plausible SNNs able to adapt to complex scenarios.

#### REFERENCES

- [1] A. Boroumand *et al.*, "Google workloads for consumer devices: Mitigating data movement bottlenecks," *ACM SIGPLAN Notices*, vol. 53, pp.
316–331, Mar. 2018, doi: 10.1145/3173162.3173177. - [2] T. Zanotti, F. M. Puglisi, and P. Pavan, "Reliability-aware design strategies for stateful logic-in-memory architectures," *IEEE Trans. Device Mater. Rel.*, vol. 20, no. 2, pp. 278–285, Jun. 2020, doi: 10.1109/TDMR.2020.2981205. - [3] Z. Zhou, X. Chen, E. Li, L. Zeng, K. Luo, and J. Zhang, "Edge intelligence: Paving the last mile of artificial intelligence with edge computing," *Proc. IEEE*, vol. 107, no. 8, pp. 1738–1762, Aug. 2019, doi: 10.1109/JPROC.2019.2918951. - [4] M. A. Talib, S. Majzoub, Q. Nasir, and D. Jamal, "A systematic literature review on hardware implementation of artificial intelligence algorithms," *J. Supercomput.*, vol. 77, no. 2, pp. 1897–1938, Feb. 2021, doi: 10.1007/s11227-020-03325-8. - [5] W. Gerstner, W. M. Kistler, R. Naud, and L. Paninski, *Neuronal Dynamics*. Cambridge, U.K.: Cambridge Univ. Press, 2014, doi: 10.1017/CBO9781107447615. - [6] Q. Yu, H. Tang, J. Hu, and K. T. Chen, "Neuromorphic cognitive systems," in A Learning and Memory Centered Approach. Cham, Switzerland: Springer, 2017, doi: 10.1007/978-3-319-55310-8. - [7] W. Bialek, *Biophysics: Searching for Principles*. Princeton, NJ, USA: Princeton Univ. Press, 2012. - [8] E. Covi, S. Brivio, A. Serb, T. Prodromakis, M. Fanciulli, and S. Spiga, "Analog memristive synapse in spiking networks implementing unsupervised learning," *Frontiers Neurosci.*, vol. 10, p. 482, Oct. 2016, doi: 10.3389/fnins.2016.00482. - [9] D. Ielmini and S. Ambrogio, "Emerging neuromorphic devices," Nanotechnology, vol. 31, no. 9, Dec. 2019, Art. no. 092001, doi: 10.1088/1361-6528/ab554b. - [10] H. Markram, J. Lübke, M. Frotscher, and B. Sakmann, "Regulation of synaptic efficacy by coincidence of postsynaptic APs and EPSPs," *Science*, vol. 275, no. 5297, pp. 213–215, Jan. 1997, doi: 10.1126/science.275.5297.213. - [11] E. L. Bienenstock, L. N. Cooper, and P. W. 
Munro, "Theory for the development of neuron selectivity: Orientation specificity and binocular interaction in visual cortex," *J. Neurosci.*, vol. 2, no. 1, pp. 32–48, Jan. 1982, doi: 10.1523/jneurosci.02-01-00032.1982. - [12] L. N. Cooper, N. Intrator, B. S. Blais, and H. Z. Shouval, *Theory of Cortical Plasticity*. Singapore: World Scientific, Apr. 2004, doi: 10.1142/5462. - [13] L. N. Cooper and M. F. Bear, "The BCM theory of synapse modification at 30: Interaction of theory with experiment," *Nature Rev. Neurosci.*, vol. 13, no. 11, pp. 798–810, Nov. 2012, doi: 10.1038/nrn3353. - [14] Y. Dan and M.-M. Poo, "Spike timing-dependent plasticity: From synapse to perception," *Physiol. Rev.*, vol. 86, no. 3, pp. 1033–1048, Jul. 2006, doi: 10.1152/physrev.00030.2005. - [15] T. Masquelier, R. Guyonneau, and S. J. Thorpe, "Spike timing dependent plasticity finds the start of repeating patterns in continuous spike trains," *PLoS ONE*, vol. 3, no. 1, p. e1377, Jan. 2008, doi: 10.1371/journal.pone.0001377. - [16] P. U. Diehl and M. Cook, "Unsupervised learning of digit recognition using spike-timing-dependent plasticity," *Frontiers Comput. Neurosci.*, vol. 9, p. 99, Aug. 2015, doi: 10.3389/FNCOM.2015.00099. - [17] X. Wu, V. Saxena, and K. Zhu, "Homogeneous spiking neuromorphic system for real-world pattern recognition," *IEEE J. Emerg. Sel. Topics Circuits Syst.*, vol. 5, no. 2, pp. 254–266, Jun. 2015, doi: 10.1109/JETCAS.2015.2433552. - [18] D. Querlioz, O. Bichler, and C. Gamrat, "Simulation of a memristor-based spiking neural network immune to device variations," in *Proc. Int. Joint Conf. Neural Netw.*, Jul. 2011, pp. 1775–1781, doi: 10.1109/IJCNN.2011.6033439. - [19] A. Serb, J. Bill, A. Khiat, R. Berdan, R. Legenstein, and T. Prodromakis, "Unsupervised learning in probabilistic neural networks with multistate metal-oxide memristive synapses," *Nature Commun.*, vol. 7, no. 1, pp. 1–9, Sep. 2016, doi: 10.1038/ncomms12611. - [20] G. 
Pedretti et al., "Memristive neural network for on-line learning and tracking with brain-inspired spike timing dependent plasticity," *Sci. Rep.*, vol. 7, no. 1, pp. 1–10, Jul. 2017, doi: 10.1038/s41598-017-05480-0.
- [21] X. Wu and V. Saxena, "Dendritic-inspired processing enables bioplausible STDP in compound binary synapses," *IEEE Trans. Nanotechnol.*, vol. 18, pp. 149–159, 2019, doi: 10.1109/TNANO.2018.2871680.
- [22] S. Bang et al., "Validation of spiking neural networks using resistive-switching synaptic device with spike-rate-dependent plasticity," in *Proc. Int. Conf. Electron., Inf., Commun. (ICEIC)*, Jan. 2020, pp. 1–4, doi: 10.1109/ICEIC49074.2020.9051210.
- [23] Z. Wang et al., "Toward a generalized Bienenstock-Cooper-Munro rule for spatiotemporal learning via triplet-STDP in memristive devices," *Nature Commun.*, vol. 11, no. 1, pp. 1–10, Mar. 2020, doi: 10.1038/s41467-020-15158-3.
- [24] W. He et al., "Enabling an integrated rate-temporal learning scheme on memristor," *Sci. Rep.*, vol. 4, no. 1, pp. 1–6, Apr. 2014, doi: 10.1038/srep04755.
- [25] T. Ahmed et al., "Time and rate dependent synaptic learning in neuro-mimicking resistive memories," *Sci. Rep.*, vol. 9, no. 1, pp. 1–11, Oct. 2019, doi: 10.1038/s41598-019-51700-0.
- [26] Z. Wang et al., "Memristors with diffusive dynamics as synaptic emulators for neuromorphic computing," *Nature Mater.*, vol. 16, pp. 101–108, Jan. 2017, doi: 10.1038/NMAT4756.
- [27] V. Milo et al., "Demonstration of hybrid CMOS/RRAM neural networks with spike time/rate-dependent plasticity," in *IEDM Tech. Dig.*, Dec. 2016, p. 16, doi: 10.1109/IEDM.2016.7838435.
- [28] P. Dayan and L. F. Abbott, *Theoretical Neuroscience*. Cambridge, MA, USA: MIT Press, 2001.
- [29] C. Zamarreño-Ramos, L. A. Camuñas-Mesa, J. A. Pérez-Carrasco, T. Masquelier, T. Serrano-Gotarredona, and B. Linares-Barranco, "On spike-timing-dependent-plasticity, memristive devices, and building a self-learning visual cortex," *Frontiers Neurosci.*, vol. 5, p.
26, Mar. 2011, doi: 10.3389/fnins.2011.00026.
- [30] X. Wu, V. Saxena, K. Zhu, and S. Balagopal, "A CMOS spiking neuron for brain-inspired neural networks with resistive synapses and in situ learning," *IEEE Trans. Circuits Syst. II, Exp. Briefs*, vol. 62, no. 11, pp. 1088–1092, Nov. 2015, doi: 10.1109/TCSII.2015.2456372.
- [31] F. M. Puglisi, T. Zanotti, and P. Pavan, "Unimore resistive random access memory (RRAM) Verilog-A model," nanoHUB, Jun. 2019, doi: 10.21981/15GF-KX29.
- [32] (Oct. 6, 2019). Knowm. Self Directed Channel Memristors. Knowm memristor Datasheet Rev. 3.2. Accessed: May 17, 2021. [Online]. Available: https://knowm.org/downloads/Knowm\_Memristors.pdf
- [33] K. A. Campbell, "Self-directed channel memristor for high temperature operation," *Microelectron. J.*, vol. 59, pp. 10–14, Jan. 2017, doi: 10.1016/j.mejo.2016.11.006.
- [34] A. Grossi et al., "Fundamental variability limits of filament-based RRAM," in *IEDM Tech. Dig.*, Dec. 2016, pp. 4.7.1–4.7.4, doi: 10.1109/IEDM.2016.7838348.
- [35] C. Nail et al., "Understanding RRAM endurance, retention and window margin trade-off using experimental results and simulations," in *IEDM Tech. Dig.*, Dec. 2016, pp. 1–4, doi: 10.1109/IEDM.2016.7838346.
- [36] M. Zhao et al., "Investigation of statistical retention of filamentary analog RRAM for neuromophic computing," in *IEDM Tech. Dig.*, Dec. 2017, p. 39, doi: 10.1109/IEDM.2017.8268522.
- [37] A. Padovani, L. Larcher, F. M. Puglisi, and P. Pavan, "Multiscale modeling of defect-related phenomena in high-K based logic and memory devices," in *Proc. IEEE 24th Int. Symp. Phys. Failure Anal. Integr. Circuits (IPFA)*, Jul. 2017, pp. 1–6, doi: 10.1109/IPFA.2017.8060063.
- [38] T. Zanotti, F. M. Puglisi, and P. Pavan, "Reliability and performance analysis of logic-in-memory based binarized neural networks," *IEEE Trans. Device Mater. Rel.*, vol. 21, no. 2, pp. 183–191, Jun. 2021, doi: 10.1109/TDMR.2021.3075200.
- [39] F. M. Puglisi, L. Larcher, A. Padovani, and P.
Pavan, "A complete statistical investigation of RTN in HfO<sub>2</sub>-based RRAM in high resistive state," *IEEE Trans. Electron Devices*, vol. 62, no. 8, pp. 2606–2613, Aug. 2015, doi: 10.1109/TED.2015.2439812.
- [40] F. M. Puglisi, N. Zagni, L. Larcher, and P. Pavan, "Random telegraph noise in resistive random access memories: Compact modeling and advanced circuit design," *IEEE Trans. Electron Devices*, vol. 65, no. 7, pp. 2964–2972, Jul. 2018, doi: 10.1109/TED.2018.2833208.
- [41] D. Gandolfi, J. Mapelli, and E. D'Angelo, "Long-term spatiotemporal reconfiguration of neuronal activity revealed by voltage-sensitive dye imaging in the cerebellar granular layer," *Neural Plasticity*, vol. 2015, pp. 1–13, Oct. 2015, doi: 10.1155/2015/284986.
- [42] J. Mapelli, D. Gandolfi, A. Vilella, M. Zoli, and A. Bigiani, "Heterosynaptic GABAergic plasticity bidirectionally driven by the activity of pre- and postsynaptic NMDA receptors," *Proc. Nat. Acad. Sci. USA*, vol. 113, no. 35, pp. 19383–19388, Aug. 2016, doi: 10.1073/pnas.1601194113.
- [43] C. Piochon, P. Kruskal, J. MacLean, and C. Hansel, "Non-Hebbian spike-timing-dependent plasticity in cerebellar circuits," *Frontiers Neural Circuits*, vol. 6, p. 124, Jan. 2013, doi: 10.3389/fncir.2012.00124.
- [44] J. Mapelli and E. D'Angelo, "The spatial organization of long-term synaptic plasticity at the input stage of cerebellum," *J. Neurosci.*, vol. 27, no. 6, pp. 1285–1296, Feb. 2007, doi: 10.1523/JNEUROSCI.4873-06.2007.
- [45] A. Maffei and G. G. Turrigiano, "Multiple modes of network homeostasis in visual cortical layer 2/3," *J. Neurosci.*, vol. 28, no. 17, pp. 4377–4384, Apr. 2008, doi: 10.1523/JNEUROSCI.5298-07.2008.
- [46] R. K. Mishra, S. Kim, S. J. Guzman, and P. Jonas, "Symmetric spike timing-dependent plasticity at CA3–CA3 synapses optimizes storage and recall in autoassociative networks," *Nature Commun.*, vol. 7, no. 1, May 2016, doi: 10.1038/ncomms11552.
- [47] D. H. Hubel, T. N. Wiesel, and S.
LeVay, "Plasticity of ocular dominance columns in monkey striate cortex," *Philos. Trans. Roy. Soc. London B, Biol. Sci.*, vol. 278, no. 961, pp. 377–409, 1977, doi: 10.1098/rstb.1977.0050.
- [48] J. Frascaroli, S. Brivio, E. Covi, and S. Spiga, "Evidence of soft bound behaviour in analogue memristive devices for neuromorphic computing," *Sci. Rep.*, vol. 8, no. 1, pp. 1–12, May 2018, doi: 10.1038/s41598-018-25376-x.
- [49] J.-P. Pfister and W. Gerstner, "Triplets of spikes in a model of spike timing-dependent plasticity," *J. Neurosci.*, vol. 26, no. 38, pp. 9673–9682, Sep. 2006, doi: 10.1523/JNEUROSCI.1425-06.2006.
- [50] J. Gjorgjieva, C. Clopath, J. Audet, and J.-P. Pfister, "A triplet spike-timing-dependent plasticity model generalizes the Bienenstock-Cooper-Munro rule to higher-order spatiotemporal correlations," *Proc. Nat. Acad. Sci. USA*, vol. 108, no. 48, pp. 19383–19388, Nov. 2011, doi: 10.1073/pnas.1105933108.
- [51] J. M. Cruz-Albrecht, M. W. Yung, and N. Srinivasa, "Energy-efficient neuron, synapse and STDP integrated circuits," *IEEE Trans. Biomed. Circuits Syst.*, vol. 6, no. 3, pp. 246–256, Jun. 2012, doi: 10.1109/TBCAS.2011.2174152.
- [52] L. Magnelli, F. A. Amoroso, F. Crupi, G. Cappuccino, and G. Iannaccone, "Design of a 75-nW, 0.5-V subthreshold complementary metal-oxide-semiconductor operational amplifier," *Int. J. Circuit Theory Appl.*, vol. 42, no. 9, pp. 967–977, Sep. 2014, doi: 10.1002/CTA.1898.
- [53] F. Centurelli, R. D. Sala, G. Scotti, and A. Trifiletti, "A 0.3 V, rail-to-rail, ultralow-power, non-tailed, body-driven, sub-threshold amplifier," *Appl. Sci.*, vol. 11, no. 6, p. 2528, Mar. 2021, doi: 10.3390/APP11062528.
- [54] W. Wang et al., "Neuromorphic motion detection and orientation selectivity by volatile resistive switching memories," *Adv. Intell. Syst.*, vol. 3, no. 4, Nov. 2020, Art. no. 2000224, doi: 10.1002/aisy.202000224.
- [55] M. Honda, U. Hidetoshi, T. Keiko, and K.
Shinya, "Analysis of development of direction selectivity in retinotectum by a neural circuit model with spike timing-dependent plasticity," *J. Neurosci.*, vol. 31, no. 4, pp. 1516–1527, Jan. 2011, doi: 10.1523/JNEUROSCI.3811-10.2011.

**Davide Florini** received the B.S. and M.S. degrees in electronics engineering from the University of Modena and Reggio Emilia, Modena, Italy, in 2016 and 2020, respectively. He is currently pursuing the Ph.D. degree in electrical engineering with the Université de Sherbrooke, Sherbrooke, QC, Canada. His current research interests include the fabrication and characterization of memristive devices on CMOS neuromorphic chips for brain-inspired computing systems.

**Daniela Gandolfi** received the B.S. degree in physics from the University of Modena and Reggio Emilia, Modena, Italy, in 2004, the M.S. degree in biophysics from the University of Parma, Parma, Italy, in 2006, and the Ph.D. degree in physiology and neuroscience from the University of Pavia, Pavia, Italy, in 2010. She is currently an Assistant Professor of bioengineering with the University of Modena and Reggio Emilia, where she is responsible for the Neurocomputational Unit, Neuromorphic Intelligence Laboratory. Her current research interests include neuronal circuits and artificial neuronal networks.

**Jonathan Mapelli** received the M.Sc. degree in physics from the University of Milan, Milan, Italy, in 2002, and the Ph.D. degree in physiology from the University of Pavia, Pavia, Italy, in 2006. He is currently an Associate Professor of physiology with the University of Modena and Reggio Emilia, Modena, Italy, where he is responsible for the Experimental Neurophysiology Unit, Neuromorphic Intelligence Laboratory. His current research interests include neurotransmission and synaptic plasticity. Dr. Mapelli received the Young Investigator Award from the Italian Physiological Society in 2011.

**Lorenzo Benatti** received the B.Sc. and M.Sc.
degrees in electronic engineering from the University of Modena and Reggio Emilia (UNIMORE), Modena, Italy, in 2018 and 2020, respectively. In 2021, he joined UNIMORE as a Research Fellow at the H2020 BeFerroSynaptic Project. His current research interests include the electrical characterization and compact modeling of emerging non-volatile memories, e.g., resistive memories [resistive random access memory (RRAM)] and ferroelectric tunnel junction (FTJ), for ultralow-power and brain-inspired computing. Mr. Benatti was a recipient of the Best Student Paper Award at the 2021 IEEE International Integrated Reliability Workshop (IIRW).

**Paolo Pavan** (Senior Member, IEEE) has been the Dean of the Electronics Engineering Program. He is currently a Professor of electronics with the University of Modena and Reggio Emilia, Modena, Italy, and the Rector's Delegate for Scientific Research. His current research interests include the characterization, modeling, and optimization of nonvolatile memory devices, more recently resistive random access memories (RRAMs). From this last activity, he started to investigate logic-in-memory and neuromorphic architectures. He is also involved in the development of safety-critical and energy-aware applications for low-power computing and automotive electronics. Prof. Pavan was on the Technical and Executive Committee of the International Electron Devices Meeting (IEDM) and the Technical Committee of the International Symposium on VLSI Technology, Systems and Applications (VLSI-TSA), the IEEE International Reliability Physics Symposium (IRPS), the European Solid-State Device Research Conference (ESSDERC), and the European Symposium on Reliability of Electron Devices, Failure Physics and Analysis (ESREF). He has been the Technical Program Chair of ESSDERC 2014 and is currently on the Steering Board of ESSDERC. He has been a Guest Editor of the IEEE TRANSACTIONS ON DEVICE AND MATERIALS RELIABILITY. He is an Associate Editor of the IEEE JOURNAL OF THE ELECTRON DEVICES SOCIETY.

**Francesco Maria Puglisi** (Senior Member, IEEE) was born in Cosenza, Italy, in 1987. He received the Ph.D. degree in information and communication technology from the University of Modena and Reggio Emilia, Modena, Italy, in 2015. He is currently an Associate Professor of electronics with the University of Modena and Reggio Emilia. His activity concerns the characterization and compact modeling of novel nonvolatile memories, focusing on noise, reliability, and variability. He is also interested in the design of new circuit paradigms for low-power and brain-inspired computing. He has authored or coauthored more than 100 technical papers. Prof. Puglisi was a co-recipient of the Best Student Paper Award at the IEEE International Conference on IC Design and Technology (ICICDT) 2013, IEEE International Integrated Reliability Workshop (IIRW) 2020 and 2021, and IEEE International Reliability Physics Symposium (IRPS) 2022, and the Best Paper Award at IEEE European Solid-State Device Research Conference (ESSDERC) 2016 and 2019. He is an Associate Editor of Elsevier's *Microelectronic Engineering Journal* and is active in several technical and/or management committees of prestigious international IEEE conferences, such as International Electron Devices Meeting (IEDM), ESSDERC, IRPS, IIRW, International Symposium on the Physical and Failure Analysis of Integrated Circuits (IPFA), and Electron Devices Technology and Manufacturing (EDTM). He is the Technical Program Co-Chair of ESSDERC 2022 and the Technical Program Chair of IIRW 2022.