The advent of the first TiO2-based memristor in 2008 revived the interest of both academia and industry in this device technology, with several emerging applications including logic circuits. Several memristive logic families have been proposed, each with different attributes, in the current quest for energy-efficient computing systems of the future. However, the limited endurance of memristor devices and their variations (both cycle-to-cycle and device-to-device) are important parameters to be considered in the evaluation of such logic families. In this work we build upon an accurate physics-based model of a bipolar metal-oxide resistive RAM device (supporting parasitics of the device structure and variability of switching voltages and resistance states) and use it to show how the performance of memristor-based logic circuits can be degraded owing to both variability and state drift. Based on previous work on CMOS-like memristive logic circuits, we propose a memristive ratioed logic scheme which is crossbar-compatible, i.e. suitable for in-/near-memory computing, and tolerant to device variability; it also does not affect device endurance, since computations do not involve switching the memristor states. As a point of comparison, we evaluate the new logic scheme against MAGIC, focusing on the universal NOR logic gate.
INTRODUCTION
In 1971, Leon Chua proposed the existence of the memristor as the fourth fundamental circuit element [2]. Unprecedented attention was drawn to this emerging device technology much later, though, after 2008, when the first demonstration of the TiO2-based memristor by Hewlett-Packard Laboratories (HP Labs) took place [10], connecting the nature of such devices with Chua's earlier theory. Owing to their analog nature, nonvolatility, high integration density potential and CMOS compatibility, memristors constitute an emerging trend in modern electronics [1], representing a very promising technology which has extended its influence beyond memory [15] to logic and in-memory computing [14], [5].
The development of several behavioral models that capture the basic characteristics of memristors [8] (e.g. threshold-based switching and nonlinearities near the resistive boundaries), following that of HP Labs published in [10], contributed to the progress of this emerging research field, being adequate to demonstrate the impact and usefulness of memristors in a variety of applications [6], [7]. Among them, memristive logic design [14], i.e. the methodology of designing logic circuits using memristors, is an emerging concept in the constant quest for energy-efficient post-CMOS computing systems. Many such memristive logic families have been proposed: IMPLY, CNIMP, MAGIC and MRL [14], to name a few, use resistance to represent data and are thus suitable for crossbar-based resistive computing. Crossbar compatibility is considered a requirement for real in-/near-memory computing, since the topology of the logic circuits to be implemented should fit in the crossbar memory array.
Moreover, the metrics proposed so far for comparing such families naturally focus on latency, energy, and area efficiency [9]. However, most relevant works omit crucial factors such as variability (both cycle-to-cycle and device-to-device, evident in all experimental works) and the endurance of memristors, a major limitation to be considered if frequent switching is necessary during computations. In this context, Scouting Logic [16] was recently proposed to alleviate the endurance requirement by executing logic operations through just reading the memristor state, even though this scheme eventually suffers from device variability. It is worth mentioning, though, that the idea of performing logic computations without switching the states of the involved memristors was proposed much earlier by Vourkas et al. with the so-called CMOS-like memristive logic [12], revisited recently in [11]. Nevertheless, the CMOS-like scheme was not given much attention owing to the complexity of the pull-up/down memristive networks, which inhibits its implementation inside a crossbar memory array.
In this context, we present here a crossbar-compatible variation of the CMOS-like concept, namely a ratioed logic scheme, which is variation-tolerant and does not impact device endurance. We performed simulations of the proposed scheme using Cadence Virtuoso and a well-known accurate physics-based model of a bipolar metal-oxide resistive RAM device [3], [4], which supports, among several other features, parasitics of the MIM device structure and variability of switching voltages and resistance states. We highlight the importance of taking device variability into account in circuit simulations, guiding the reader through the key model parameters, to eventually show how the performance of some known memristor-based logic circuits can be degraded owing to both variability and state drift. Finally, we compare them with the proposed novel ratioed logic scheme, focusing on the universal NOR gate. Simulation results show that the proposed scheme outperforms them in terms of robustness and viability, leading us to the simple conclusion that "rethinking of memristive logic design from a practical point of view" is necessary if we aspire to enable in-memory computing.
INSERTING VARIABILITY IN DEVICE MODELS FOR REALISTIC CIRCUIT SIMULATIONS
Let us briefly outline the basics of the target model used in this work, the Stanford-PKU ReRAM device model [3]. It is a compact physics-based model which captures the typical DC and AC electrical behavior of metal-oxide based ReRAM devices. The model assumes a conductive filament (CF) growth process described by a change of the CF geometry during the SET and RESET processes under various bias conditions. The core of the model is a two-dimensional description of a single CF, which includes both the CF gap region and the CF width as control variables [15]. Most importantly, the model includes parasitic effects such as the parasitic resistance of the switching layer and the electrodes, as well as the parasitic MIM capacitance. Furthermore, the model supports intrinsic variation effects, such as the statistical distribution of switching thresholds and resistance states, temperature dependency and dynamic current fluctuations for the RESET process, thus supporting practically all the ReRAM device variation effects known to date. As far as operation is concerned, a positive applied voltage produces a SET process, in which the oxide layer undergoes a soft breakdown; the CF is formed and the device ends up in a low resistive state (LRS or R_ON). Conversely, a negative applied voltage causes a RESET process in which the CF is dissolved through ion diffusion or drift and the device ends up in a high resistive state (HRS or R_OFF). The model was used in the Cadence Virtuoso suite. Most parameters were kept at their default values [3], except those directly affecting the switching thresholds, i.e. the average active energy of oxygen vacancies (E_a), the hopping barrier of O²⁻ (E_h) and the energy barrier between the electrode and the oxide (E_i). Tuning these parameters is recommended to adjust the overall behavior according to the application requirements.
For instance, assuming that a particular application requires V_SET > 2 × |V_RESET|, the aforementioned parameters could be tuned as follows: E_a = 0.9 eV, E_h = 0.9 eV and E_i = 0.7 eV. Figure 1 shows i-v curves for 20 cycles of a device under a triangular applied voltage. The compliance current (cc) is set by tuning the gate voltage of a series 0.35 μm NMOS transistor. We define the SET threshold V_SET as the voltage at which the current reaches 90% of the cc. Likewise, we define the RESET threshold V_RESET as the voltage at which the current first experiences a sudden decrease. These statistics give the following mean values: V_SET = 2 V and V_RESET = -0.5 V, which constitute a good approximation for later establishing the minimum required amplitude of the programming voltage pulses. Figure 2 shows how the effective R_OFF/R_ON ratio can change depending on the applied voltage. The device is first set to R_ON and then a positive voltage is applied. As expected, R_ON shows the ohmic conduction of the CF, since the memristance cannot be lowered beyond this value. Nevertheless, when we reset the device to R_OFF and then apply a negative voltage, we notice a highly nonlinear behavior of the effective R_OFF owing to the hopping current through the tunneling gap [15], [3]. This dependency of the R_OFF state on the voltage across the device marks a significant difference compared to other device models or analyses where the R_OFF state is treated as purely ohmic, and it could have a significant impact on the efficiency of memristive applications. Binary encoding of memristance is necessary for the target application here. Given the parameter values used, the model exhibits a memristance range from 5 kΩ to 3 MΩ.
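The threshold definitions given above can be illustrated with a minimal Python sketch that extracts V_SET and V_RESET from one sampled i-v sweep. The function names, the sample data and the `drop_ratio` criterion used to detect a "sudden decrease" are illustrative assumptions and are not taken from the original simulation flow.

```python
from typing import Optional, Sequence


def set_threshold(voltages: Sequence[float], currents: Sequence[float],
                  cc: float, fraction: float = 0.9) -> Optional[float]:
    """Return the first voltage at which the current reaches `fraction` of cc."""
    for v, i in zip(voltages, currents):
        if i >= fraction * cc:
            return v
    return None  # the sweep never reached the requested fraction of cc


def reset_threshold(voltages: Sequence[float], currents: Sequence[float],
                    drop_ratio: float = 0.5) -> Optional[float]:
    """Return the voltage of the last sample before |I| first drops sharply.

    `drop_ratio` is a hypothetical criterion: a drop is flagged when the
    current magnitude falls below drop_ratio times the previous sample.
    """
    for k in range(1, len(currents)):
        prev, cur = abs(currents[k - 1]), abs(currents[k])
        if prev > 0 and cur < drop_ratio * prev:
            return voltages[k - 1]
    return None  # no sudden decrease found in this sweep


# Made-up sweep data for illustration only (volts, amperes).
v_up = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
i_up = [0.0, 1e-6, 2e-6, 5e-6, 95e-6, 100e-6]
v_down = [0.0, -0.25, -0.5, -0.75]
i_down = [0.0, -50e-6, -100e-6, -10e-6]

v_set = set_threshold(v_up, i_up, cc=100e-6)
v_reset = reset_threshold(v_down, i_down)
```

Averaging the values obtained over many cycles would give mean thresholds of the kind quoted in the text.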
Following the results in Figure 2, we define values above 1 MΩ as HRS and values below 100 kΩ as LRS, whereas all values within the intermediate guard band are treated as undefined states. The memristor state is read in voltage mode with a 100 kΩ series resistor by applying a small voltage pulse (0.5 V amplitude, 40 ns wide). In our case, the 0.5 V amplitude is low enough not to change the memristance of the device. However, lower voltages could be used if a larger safety margin or energy optimization is required. For the purposes of our circuit simulations, we implemented a simple state decoder (adapted from [13]) which provides a digital output, being either '0' (HRS), '1' (LRS) or 'X' (undefined state).
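As an illustration of the binary encoding and read scheme just described, the following Python sketch maps a memristance value to '0', '1' or 'X' and computes the read voltage of an ideal resistive divider. Treating the memristor as ohmic during the read and sensing the voltage across the series resistor are simplifying assumptions of this sketch; the constant and function names are hypothetical.

```python
R_HRS_MIN = 1e6     # ohms; values above this are HRS (logic '0')
R_LRS_MAX = 100e3   # ohms; values below this are LRS (logic '1')
R_SERIES = 100e3    # ohms; series read resistor
V_READ = 0.5        # volts; read pulse amplitude


def decode_state(r_mem: float) -> str:
    """Map a memristance to '0' (HRS), '1' (LRS) or 'X' (guard band)."""
    if r_mem > R_HRS_MIN:
        return '0'
    if r_mem < R_LRS_MAX:
        return '1'
    return 'X'


def series_resistor_voltage(r_mem: float, v_read: float = V_READ,
                            r_series: float = R_SERIES) -> float:
    """Voltage across the series resistor for an ideal (ohmic) divider.

    Assumption: the memristor is treated as a linear resistor at the read
    voltage, which the text notes is not strictly true for the HRS.
    """
    return v_read * r_series / (r_series + r_mem)


print(decode_state(5e3), decode_state(500e3), decode_state(3e6))
```

Under these assumptions, a device at the LRS boundary (100 kΩ) produces half of the read amplitude across the series resistor, and the sensed voltage decreases as the memristance grows.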
The Stanford-PKU ReRAM model supports state variability as well as switching voltage variability. During the switching process a random variable is added to the rate of change of the tunneling gap distance g between the electrode and the tip of the conductive filament (CF), and to that of the CF width w. This random variable is a zero-mean Gaussian sequence χ(t) with deviation δg and δw, respectively. In our study, δg = k × δg_0 and δw = k × δw_0 (δg_0 = 10⁻⁴ m/s and δw_0 = 5 · 10⁻⁴ m/s are the default values), where k = 1, 2, 3, ... is a variability factor that allows the desired amount of variability to be configured easily. This state variability affects the memristance value as well as the switching thresholds (as noticed previously in Figure 1 and Figure 2 for k = 1).
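The role of the variability factor k can be sketched in a few lines of Python. The additive form used here (nominal rate plus deviation times a unit Gaussian sample) is an assumed reading of the description above, not a transcription of the model source, and all names are hypothetical.

```python
import random

DELTA_G0 = 1e-4   # m/s, default deviation for the gap growth rate
DELTA_W0 = 5e-4   # m/s, default deviation for the CF width growth rate


def noise_deviations(k: int) -> tuple:
    """Return (delta_g, delta_w) scaled by the variability factor k."""
    return k * DELTA_G0, k * DELTA_W0


def noisy_rate(nominal_rate: float, deviation: float,
               rng: random.Random) -> float:
    """Add a zero-mean Gaussian perturbation to a nominal rate of change."""
    return nominal_rate + deviation * rng.gauss(0.0, 1.0)


rng = random.Random(0)           # fixed seed for a reproducible illustration
delta_g, delta_w = noise_deviations(5)
sample = noisy_rate(nominal_rate=0.0, deviation=delta_g, rng=rng)
```

Because the perturbation is zero-mean, increasing k widens the spread of the resulting states and thresholds without shifting their nominal values in this simplified picture.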
State programming of the devices may be influenced by the past history of their state. However, since our objective here is to show the impact of device variability, we suppress any dependency on the previous device history via a two-step initialization process: programming the device to the LRS (HRS) is done by first performing a hard RESET (SET) and then a soft SET (RESET). A hard SET/RESET completely forms/destroys the CF, which eliminates the previous history of the memristor and also suppresses the cycle-to-cycle variability. The soft programming step, in turn, initializes the memristor to a state within the LRS or HRS range, thus including the desired variability effect in the initialized state. The voltage pulses applied for the HRS initialization are: 3 V amplitude, 200 ns width and 500 μA cc for the hard SET; -2 V amplitude and 100 ns width for the soft RESET. For the LRS initialization they are: -2.5 V amplitude and 200 ns width for the hard RESET; 3 V amplitude, 100 ns width and 50 μA cc for the soft SET. Figure 3 shows simulation results for the memristance distribution of R_LRS and R_HRS after the initialization.
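The two-step initialization can be summarized as a small lookup of pulse sequences. The sketch below encodes the pulse parameters listed above; the `Pulse` class and `init_sequence` function are hypothetical helpers for illustration and do not correspond to any simulator API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Pulse:
    name: str
    amplitude_v: float                      # pulse amplitude in volts
    width_ns: float                         # pulse width in nanoseconds
    compliance_ua: Optional[float] = None   # compliance current in uA, SET only


def init_sequence(target: str) -> List[Pulse]:
    """Return the hard-then-soft pulse sequence for the target state."""
    if target == "HRS":
        return [Pulse("hard SET", 3.0, 200, 500),
                Pulse("soft RESET", -2.0, 100)]
    if target == "LRS":
        return [Pulse("hard RESET", -2.5, 200),
                Pulse("soft SET", 3.0, 100, 50)]
    raise ValueError(f"unknown target state: {target!r}")


for pulse in init_sequence("LRS"):
    print(pulse)
```

In both cases the first pulse drives the device to the opposite extreme, so that the final soft pulse always starts from a known condition.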
VARIABILITY-AWARE ASSESSMENT OF MAGIC NOR GATES
We now proceed to evaluate the possible impact of device variations on the performance of memristive logic gates. In this context, we focus on Memristor-Aided loGIC (MAGIC) [14], a logic design scheme well known for its crossbar compatibility (only for the NOR operation). In MAGIC, every logic computation is evaluated in just two steps, regardless of the number of inputs. For instance, the NORn gate consists of n input memristors m_x1 ... m_xn plus an output memristor m_y, as shown in Figure 4. The logic operation is performed as follows: first, the output memristor m_y is set to LRS. Next, a voltage pulse of amplitude V_0 is applied to the top electrode (TE) of every input memristor while the TE of the output memristor is connected to ground. This operation is equivalent to a conditional RESET of m_y when at least one input device stores a logic '1'. V_0 is selected so as to guarantee that m_y switches only in the appropriate case and that the operation is not destructive, i.e. the states of the input memristors are not affected. The MAGIC NOR2 logic gate was designed and simulated using the Stanford-PKU model in the Cadence Virtuoso suite. Considering a 200 ns-wide voltage pulse, a sweep of the amplitude V_0 was performed to determine which values guarantee a successful NOR2 operation.
The resistance values of m_x1, m_x2 and m_y were stored after every logic operation and are shown in Figure 5(a) for each swept V_0 value. As can be observed, there is some unintended state drift causing either the input or the output memristor state to approach the undefined region. In fact, the upper boundary for V_0 is defined by the 00-input case as V_0 ≈ 2.19 V, where both input memristors cross the lower HRS limit, so they no longer hold an acceptable '0' logic level. Likewise, the lower boundary for V_0 is defined by the 11-input case as V_0 ≈ 1.89 V, where the state of the output memristor exceeds the lower HRS limit.
Having located the appropriate range for V_0, we applied variability to explore its potential impact. We assume that an error occurs if, after the logic operation, the state of any of the devices involved is not the expected one. Figure 5(b) shows the evolution of the average error of NOR2 for different V_0 values and a variability factor k = 5, each point comprising 4000 evaluations with random initialization of the input memristors. The contribution of each input combination is also shown separately: the 00-input case practically dominates as V_0 increases, whereas for low V_0 values it is the 11-input case that dominates; the minimum error rate is observed at V_0 = 1.95 V. Consequently, unless variability is properly taken into consideration in the design space exploration, high error rates can appear, resulting in malfunction of MAGIC NOR gates.
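The error criterion used above (an error is counted whenever any involved device ends in an unexpected state) can be expressed as a short Python sketch. The trial data are made up for illustration, and the assumption that the inputs must keep their original logic value while the output must hold the NOR result follows the non-destructive operation described for MAGIC; function names are hypothetical.

```python
from typing import Iterable, Tuple


def nor2(a: str, b: str) -> str:
    """Ideal NOR of two logic values given as '0' or '1'."""
    return '1' if (a == '0' and b == '0') else '0'


def expected_states(a: str, b: str) -> Tuple[str, str, str]:
    """Expected decoded states (input 1, input 2, output) after the gate."""
    return (a, b, nor2(a, b))


def is_error(expected: Tuple[str, ...], observed: Tuple[str, ...]) -> bool:
    """True if any device state differs from the expected one ('X' counts)."""
    return any(e != o for e, o in zip(expected, observed))


def error_rate(trials: Iterable[Tuple[Tuple[str, str], Tuple[str, str, str]]]) -> float:
    """Fraction of trials in which at least one device state is wrong."""
    trials = list(trials)
    if not trials:
        raise ValueError("at least one trial is required")
    errors = sum(is_error(expected_states(*inputs), observed)
                 for inputs, observed in trials)
    return errors / len(trials)


# Hypothetical decoded outcomes: (inputs, observed states after operation).
demo = [(('0', '0'), ('0', '0', '1')),   # correct
        (('0', '0'), ('X', '0', '1')),   # input drifted into the guard band
        (('1', '1'), ('1', '1', '0')),   # correct
        (('1', '0'), ('1', '0', 'X'))]   # output did not fully switch
print(error_rate(demo))
```

Grouping the trials by input combination before calling `error_rate` would give per-combination contributions of the kind discussed for the 00-input and 11-input cases.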
VARIABILITY TOLERANT RATIOED LOGIC WITH MEMRISTORS
In this Section we present an alternative crossbar-compatible logic design scheme that overcomes the limitations caused by state drift and memristor variability. It is inspired by pseudo-NMOS logic design and works very similarly to CMOS-like memristive logic, but with much lower circuit complexity. Since the computation does not switch any memristor, we totally avoid the drift phenomena that inevitably appeared in the MAGIC case. The voltage pulses do not need to be as long as in MAGIC since no switching is required, so they are set to 20 ns just like in the reading operation. Once the gate was designed, the impact of device variability was also evaluated for this logic scheme. The evaluation consisted of repeated runs of the NOR2 logic gate operation, 1000 for every different input combination, while applying three different variability factors: k = 1, 5, and 10 (shown in blue, red and black color, respectively). Regardless of the variability factor, which was set even higher for this gate, the output logic levels are unequivocally identified and the error rate is zero. These results confirm that the proposed ratioed logic scheme is very robust against variability compared to MAGIC or Scouting Logic, while also being crossbar-compatible and potentially improving other characteristics of MAGIC gates, such as delay and unwanted state drift. As possible drawbacks, there may be concerns regarding the pull-up PMOS and the delivery of the output data. The proposed topology requires both NMOS and PMOS transistors, which at first glance makes it difficult to scale the circuit properly. Nevertheless, only one PMOS device per row is required, so in terms of area the inclusion of PMOS devices has a minimal cost. Also, the fact that input data is stored in the form of resistance whereas the output signal is a voltage implies that the result of the operation has to be stored later in a memristor, or used as an intermediate result and then potentially discarded, depending on the application.
This may require enabling a programming operation by means of a signal converter. All in all, compared to the MAGIC gate, the extra step required here to store the intermediate result back to the memory will incur some area overhead in the periphery; in terms of delay, however, it is compensated by the fact that the output memristor does not need to be initialized here, as it does in MAGIC.
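The ratioed principle discussed in this Section can be illustrated with an idealized resistive divider. The Python sketch below is one plausible reading of a pseudo-NMOS-like NOR gate, with a pull-up element on top of the input memristors connected in parallel to ground. The linear pull-up resistance (300 kΩ), the 0.5 V supply amplitude, the 0.2 V decision threshold and the ohmic treatment of the memristors are all assumptions of this sketch and are not values reported in this work.

```python
from typing import Sequence


def parallel(resistances: Sequence[float]) -> float:
    """Equivalent resistance of resistors connected in parallel."""
    return 1.0 / sum(1.0 / r for r in resistances)


def ratioed_output(r_inputs: Sequence[float], v_dd: float = 0.5,
                   r_pullup: float = 300e3) -> float:
    """Output node voltage of an ideal pull-up / parallel pull-down divider.

    Assumptions: the PMOS pull-up is modeled as a fixed resistance and the
    memristors as linear resistors; both are simplifications.
    """
    r_pd = parallel(r_inputs)
    return v_dd * r_pd / (r_pd + r_pullup)


def nor_from_voltage(v_out: float, v_threshold: float = 0.2) -> str:
    """Decode the output voltage: high means all inputs were HRS ('0')."""
    return '1' if v_out > v_threshold else '0'


# Boundary cases built from the state definitions used in the text:
# HRS above 1 MOhm, LRS below 100 kOhm, maximum memristance 3 MOhm.
v_high_worst = ratioed_output([1e6, 1e6])     # inputs 00, weakest HRS
v_low_worst = ratioed_output([100e3, 3e6])    # inputs 10, weakest LRS
print(v_high_worst, v_low_worst)
```

Under these assumptions, the two boundary cases remain on opposite sides of the decision threshold, which is consistent with the output depending on the resistance ratio rather than on exact memristance values.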
CONCLUSIONS
Simulation results for a well-known memristive logic design scheme confirmed that variability-aware design and more realistic circuit simulations using physics-based device models are necessary. Our analysis showed that MAGIC behavior is sensitive to design parameters, with the error rate increasing significantly for small deviations. State drift is also observed in the design process when using a realistic memristor model. Unless variability is properly taken into consideration in the design flow, unacceptably high error rates could appear and cause malfunction. On the other hand, the proposed ratioed logic scheme with memristors proved much more robust against device variability; it is also crossbar-compatible and fast, and it preserves device endurance, making it a good candidate for near-memory computing systems.
