ABSTRACT Large-scale 3-D crossbar arrays offer a path to both high-density storage class memory and novel non-Von Neumann computation. However, such arrays require each non-volatile memory (NVM) element to have its own non-linear access device (AD), which must pass high currents through one or more selected cells yet maintain ultra-low leakage through all other cells. Using circuit-level SPICE simulations, we explore design constraints on crossbar arrays composed of a generic NVM element (+1R) together with the novel AD developed by our group, based on Cu-containing mixed-ionic-electronic-conduction (MIEC) materials. We show that power consumption during write, not read margin, is the most stringent constraint for large 1AD+1R crossbar arrays. As array size grows, in order to keep NVM write power-efficient, the voltage at which the AD "turns on" must outpace the NVM switching voltage. Failure to achieve this condition causes the total array power, injected into the array to ensure the success of the worst-case single-bit write, to greatly exceed the actual NVM write power. Extensive tolerancing results show that NVM switching current and other AD parameters (subthreshold slope and series resistance) are also important, but not to the same degree as AD and NVM voltage characteristics. We show that scaled MIEC devices (Voltage Margin V m ∼ 1.54V) can support 1 Mb arrays for NVM switching voltages up to 1.2V, and that stacking two MIEC devices could enable ∼2.4V. The impact of V m variability is quantified: we show that there is minimal degradation in write power and read margin at variabilities (standard deviation in V m ) not very different from those already demonstrated experimentally.
I. INTRODUCTION
Resistive RAM, PCM and STT-MRAM are emerging as promising non-volatile memory (NVM) candidates for Storage Class Memory: technologies that could combine the low latency and robustness of a solid-state memory with the non-volatility and low cost of storage technologies such as NAND Flash or magnetic hard disks [1], [2]. One possible path towards manufacturing such memories at low cost per bit is a 3D crossbar array of NVM devices.
To minimize sneak path currents, each NVM must be in series with an Access Device (AD) with a strongly non-linear I-V characteristic. For 3D stacking, an AD must also be Back-End-Of-the-Line (BEOL) compatible, and fit within the same 4F² footprint as the NVM in a crossbar array. A number of potential AD candidates have been proposed [3]. Without an AD, "sneak path" currents can easily mask read currents, closing the read margin needed to distinguish between NVM elements in their high and low resistance states [4]. Worse yet, NVM writes tend to require high current density through a small number of selected cells. This means that the maximum achievable array size, while still ensuring that all write operations are successful, rapidly diminishes without a similarly strong nonlinearity [5]. The larger such arrays can be made, the lower the area overhead due to peripheral circuitry, and the higher the efficiency in terms of number of accessible bits per unit area of silicon (and thus, at least to first order, per unit fabrication cost).
Maximum array size is constrained by numerous device and circuit level parameters, including NVM switching currents and voltages, AD non-linearity, off-state leakage, AD series resistance, and interconnect resistance. Such a design space, extending across NVM, AD and circuit parameters, can be quantitatively explored through circuit analysis, in order to identify critical design parameters and constraints, to study the compatibility of an AD against a range of NVM switching properties, and to quantify the impact of technology scaling.
To be maximally useful, such a study should:
1) Use SPICE [6], [7] or another framework [8] capable of modeling non-linear ADs and NVMs;
2) Consider both write and read operations self-consistently, addressing the constraints for each;
3) Consider large array sizes (∼1Mb or higher);
4) Account for interconnect resistance, which becomes important at higher (write) currents and larger array sizes, and in scaled technology nodes;
5) Provide design guidance with every datapoint, computing injected power required to ensure NVM write success, and best-case read margin while avoiding read disturb;
6) Use accurate behavioral representations of NVM and AD I-V characteristics to determine exact switching conditions, power consumption, and read voltages; and
7) Employ crossbar-bias schemes that can offer better leakage power mitigation than simple fixed-ratio (e.g., V/2 or V/3) schemes.
In this paper, we present a SPICE-based circuit analysis of 1AD+1R crossbar arrays that meets all of these criteria, expanding upon our earlier brief summary [9]. The ADs used in these simulations are copper-containing Mixed Ionic Electronic Conduction (MIEC) ADs [10]-[15], previously shown to have many desirable characteristics for integration into 1AD+1R crossbar arrays. A conference paper extending the present analysis to other published access devices has also been published [16], but that work is not included in the present manuscript due to space considerations. Similarly, peripheral circuitry for NVM technology is not discussed; the interested reader is referred to References [17]-[20].
In Section II we present our DC behavioral model for the bipolar selectors and a generic NVM element, and a circuit reduction technique for simulation of large arrays with reasonable run times. We then present results for write power consumption (Section III) and read margin (Section IV) and their dependence on various device and circuit parameters. In Section V, we compare this work to previous efforts at crossbar analysis, and show that no prior work has achieved all seven of the items listed above. In Section VI, we conclude the paper. Fig. 1 shows a circuit schematic of a 1AD+1R crossbar array with an Access Device and an NVM at each intersection node. In this array, the nonlinearity of the AD helps it maintain ultra-low leakage on a large number of unselected cells and low to moderate leakage on a small number of partially selected cells, while remaining capable of passing much higher read or write currents through one or more selected cells.
II. SIMULATION FRAMEWORK

A. MIEC ACCESS DEVICES
Cu-containing Mixed Ionic Electronic Conduction (MIEC) ADs have previously been shown to have a range of desirable characteristics [10] , including BEOL compatibility, large ON/OFF ratios, high voltage margin V m , the high current densities needed for PCM, and the fully bipolar operation needed for high performance RRAM and MRAM [11] - [15] . V m is always measured at the 10nA current level, roughly corresponding to the partial-select condition. Integration of these selectors at 100% yield (at 512kBit scale) has been shown [13] . These devices can provide write level (>100μA) currents within 15ns [14] , read level (5-10μA) currents within 50ns [15] , and can be scaled to the <30nm CDs and <12nm thicknesses found in advanced technology nodes [14] .
The I-V characteristic of a scaled MIEC device (top CD ∼ 35nm, Fig. 2) has a high-current response dominated by an effective series resistance of 2.85kΩ. This I-V characteristic can be modeled in SPICE using a simple equivalent circuit (inset, Fig. 3) of two anti-parallel diodes with fitted parameters (I S , n) and series resistors (R series ), together with a current source (I NF ) to model the low-voltage leakage current (3pA).
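As an illustration of the equivalent circuit described above, the following minimal Python sketch evaluates the DC current of two anti-parallel diodes in series with a resistor, plus a small leakage term. It is not the SPICE model used in this work. The series resistance (2.85kΩ), leakage floor (3pA) and voltage margin (1.54V) come from the text. The ideality factor, the smooth form of the leakage term, and the reading of V m as the full width between the −10nA and +10nA points (used here to calibrate I S ) are assumptions made only for this example.

```python
import math

VT = 0.02585          # thermal voltage near 300 K, in volts
R_SERIES = 2.85e3     # effective series resistance from the text, in ohms
I_NF = 3e-12          # low-voltage leakage floor from the text, in amps
V_M = 1.54            # voltage margin from the text, in volts
N_IDEALITY = 1.4      # ASSUMED ideality factor (the fitted value is not given)

# ASSUMPTION: V_m spans the -10 nA to +10 nA points, so +10 nA sits at V_m / 2.
# The saturation current is calibrated to that point (series drop neglected).
I_S = (10e-9 - I_NF) / (2.0 * math.sinh((V_M / 2) / (N_IDEALITY * VT)))


def junction_current(vd):
    """Current of the anti-parallel diode pair plus the leakage source."""
    diodes = 2.0 * I_S * math.sinh(vd / (N_IDEALITY * VT))
    leakage = I_NF * math.tanh(vd / VT)  # ASSUMED smooth, odd leakage term
    return diodes + leakage


def device_current(v):
    """Solve vd + R_SERIES * I(vd) = |v| for the junction voltage by bisection."""
    sign = 1.0 if v >= 0 else -1.0
    target = abs(v)
    lo, hi = 0.0, target
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mid + R_SERIES * junction_current(mid) > target:
            hi = mid
        else:
            lo = mid
    return sign * junction_current(0.5 * (lo + hi))
```

Sweeping `device_current` over a voltage range reproduces the qualitative shape discussed in the text: a leakage floor near zero bias, an exponential turn-on, and a resistance-limited response at high current.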
B. MODELING NVM I-V CHARACTERISTICS
The I-V switching characteristics of a generic bipolar NVM element (Fig. 3) alternate between a positive-polarity SET operation (high to low resistance switching) and a negative-polarity RESET operation. SET is considered complete once the voltage across and the current through the NVM are equal to or greater than V LRS and I LRS , thus guaranteeing the delivery of the necessary switching power.
The RESET behavior is analogous, with the ohmic LRS state (R LRS = V LRS /I LRS ) switching into an HRS state when device voltage drops below −V LRS . The voltage across the device "snaps forward" (consistent with filament pinch-off in RRAM). At voltages beyond −V HRS the device is considered to be fully RESET. The exact RESET condition (e.g., the external voltage necessary to produce > V HRS at the selected device) is determined during post-processing. All transitions are modeled using conditional syntax available with standard SPICE simulators such as HSPICE [6] and LTSPICE [7] . Table 1 summarizes nominal values for all device and circuit parameters. These parameters are not intended to represent any particular NVM, but instead are chosen to represent a desired yet still reasonable target for either RRAM or PCM devices. In particular, our intent is to show what would be required from an NVM in order to achieve arrays of 1Mbit in size, not just for MIEC devices but for any access device [16] . Our choice of I LRS ensures reasonable IR drops even for large arrays (for a 1Mb array, total BL and WL IR drop is ∼68mV). While current density (1.46MA/cm 2 ) is close to the ITRS electromigration threshold (1.5MA/cm 2 [21] ), a non-volatile memory application will drive such peak currents less frequently than a logic circuit, postponing the onset of electromigration-induced damage.
C. SIMULATING LARGE ARRAYS
Simulation time for arrays with millions of individual SPICE nodes ( Fig. 1) can be prohibitively long. However, since current in unselected wordlines and bitlines is negligible and their interconnect IR drop can be neglected, the unselected portion of the array can be replaced by a single Access Device/NVM pair, with leakage equal to the aggregate unselect leakage (see Fig. 14) . This reduces the number of nodes and speeds up simulations considerably.
III. WRITE OPERATIONS
The metric of interest during a write operation is maximum total power consumption, which can occur at any one of the four switching points ( Fig. 3 ), for the worst-case device farthest from the voltage source (see Fig. 1 ), when all other devices are in the LRS state.
A. BIASING SCHEME FOR WRITE
During write operations, all unselected rows and columns are at voltages V R and V C , respectively. This inner voltage separation is chosen to ensure a total unselect leakage of 10μA. For a 1Mb array, this corresponds to ∼10pA per unselected cell, which requires V C − V R ∼ 0.6V (see Fig. 2), leading to 6μW of unselect power. Such a design point is only possible because of the ultra-low leakage offered by the MIEC-based AD. During each circuit simulation, voltages applied at the array edge (V W and V B ) are swept from low to high. Postprocessing identifies the applied voltage (and thus power) at which the selected NVM element switches. As shown in Fig. 4, total applied voltage is the sum of NVM switching voltage, selected diode voltage and IR drop across the wiring, and must be balanced by the sum of the voltages across the three types of non-selected devices: partially selected (same WL), unselected, and partially selected (same BL).
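The unselect budget quoted above can be checked with a few lines of arithmetic. The sketch below assumes a square 1024 × 1024 array (the text states only "1Mb") and divides the leakage budget evenly over all fully unselected cells; both are simplifications for illustration.

```python
ROWS = COLS = 1024            # ASSUMED square layout for a 1 Mb array
I_UNSELECT_TOTAL = 10e-6      # total unselect leakage budget from the text (A)
V_INNER = 0.6                 # inner separation V_C - V_R from the text (V)

# Cells that share neither the selected wordline nor the selected bitline.
n_unselected = (ROWS - 1) * (COLS - 1)

# Leakage each unselected cell may contribute, assuming an even split.
per_cell_leakage = I_UNSELECT_TOTAL / n_unselected

# Power dissipated in the unselected region: budgeted current times bias.
unselect_power = I_UNSELECT_TOTAL * V_INNER
```

With these inputs the per-cell leakage comes out just under 10pA and the unselect power at 6μW, consistent with the figures in the text.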
COMPARING POWER CONSTRAINT BIASING TO OTHER SCHEMES
Fixing the unselect bias based on power constraints provides a better approach than other biasing schemes such as V/2 and V/3. In a V/2 scheme, V R = V C = V TOTAL /2, eliminating leakage through unselected cells. However, since leakage through partially selected ADs increases rapidly with increasing voltage, the maximum V TOTAL this scheme can support, for a given power budget, is lower than in the constraint-based scheme described above. Without the ultra-low (≪1nA) leakage currents of the MIEC-based AD, however, our constraint-based scheme would choose a vanishingly small inner voltage separation V C − V R , becoming effectively indistinguishable from the V/2 scheme. In a V/3 scheme, the voltage drop across all non-selected cells is ∼ V TOTAL /3. While unselected and partially selected cell leakage is equal, there are quadratically more unselected cells, so the maximum supported V TOTAL is lower than with the constraint-based scheme.
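The three schemes compared above differ only in the inner voltage separation V C − V R . The following sketch makes this explicit by computing the bias seen by each class of cell, together with the number of cells in that class. It assumes that the selected BL is grounded, that the selected WL sits at V TOTAL , that the unselected line voltages are placed symmetrically about V TOTAL /2, and that interconnect IR drop is ignored; the function name and dictionary keys are hypothetical.

```python
def cell_biases(v_total, v_inner, n=1024):
    """Return {cell class: (voltage across cell, number of cells)}.

    v_inner is the separation V_C - V_R between unselected columns and rows.
    ASSUMPTION: V_C and V_R are placed symmetrically about v_total / 2.
    """
    v_c = 0.5 * (v_total + v_inner)   # unselected columns (bitlines)
    v_r = 0.5 * (v_total - v_inner)   # unselected rows (wordlines)
    return {
        "selected": (v_total, 1),
        "partial_same_wl": (v_total - v_c, n - 1),
        "partial_same_bl": (v_r, n - 1),
        "unselected": (v_r - v_c, (n - 1) ** 2),
    }


v_half = cell_biases(2.0, 0.0)         # V/2 scheme: no inner separation
v_third = cell_biases(3.0, 3.0 / 3)    # V/3 scheme: inner separation of V/3
v_constraint = cell_biases(2.0, 0.6)   # constraint-based: separation set by leakage budget
```

Setting the separation to zero recovers V/2 (no bias across unselected cells), and setting it to V TOTAL /3 recovers V/3 (equal magnitude across every non-selected cell). The constraint-based scheme picks an intermediate value, which lowers the bias on the 2(n − 1) partially selected cells at the cost of a small reverse bias on the (n − 1)² unselected cells.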
NOTE ON TRANSIENT RESPONSE
Our circuit analysis is based on the steady-state I-V response of the MIEC AD. However, fast switching (<10ns) of MIEC devices into high current states can require overvoltage acceleration [15] , which increases the total voltage V W − V B that must be applied at the edges of the array. We can consider write operations along a single WL held at V W , with multiple cells along that WL written in succession by activating and deactivating individual BLs. Unselected and partially selected ADs along this WL are held for long periods of time, at biases selected from the steady-state I-V response. However, bias points for ADs along the BL, both selected and partially selected, should use the relevant transient I-V for the desired access latency. Since, for MIEC-based ADs, transient I-Vs resemble a voltage-shifted version of the steady-state I-V [15] , the overvoltage added to the selected device is counter-balanced, at least to first order, by the transient nature of leakage buildup through the BL partially selected devices.
Another consideration for transient analysis is the impact of the brief (1-2μs) recovery period for an MIEC device to return to the low leakage condition. We have previously shown that, after a write operation of ∼50μA, MIEC devices return to low leakage only after 1μs held at zero volts (or faster for negative bias). The recovery after a read operation is markedly faster. This criterion might affect the order in which accesses ought to be arranged, and will likely affect the effective write bandwidth. However, we believe that the primary results of this paper, in terms of supported array sizes and NVM switching voltages, will remain unaffected to first order.
B. SIMULATION RESULTS
Simulations of write operations were carried out with individual parameters from Table 1 varied one at a time. Fig. 6 plots the total write power vs. individual NVM switching parameters for the scaled MIEC device. These results are a significantly different representation than earlier work on crossbar array design: every single data point in Fig. 6 represents an operating point that will result in successful write operations, albeit at a power cost that could easily be onerous. In contrast, earlier works frequently show plots in which write or read operations "work" at one end of the graph, and completely "fail to work" at the other end, thus constituting a large set of data that conveys only one piece of information: the location of the transition from "working" to "failing." Fig. 6 shows that small changes in voltage parameters V HRS or V LRS can lead to a drastic increase in power. These 'tipping points' arise from the turn-on of partially selected MIEC ADs, which exponentially increases leakage current, increasing the IR drop along the selected WL and BL, and preventing increases in applied voltage from reaching the selected cell. This runaway condition is illustrated in Fig. 7 , showing voltage along the selected WL and BL for varying V HRS . From V HRS =1.08V to 1.32V, the total voltage needed externally to ensure the switching of the NVM element increases dramatically from 2.44V to 10.05V, causing runaway power consumption if not outright device breakdown.
The inset of Fig. 6 plots total write power consumption vs. V HRS for different stored data patterns. While random patterns show 'tipping points' similar to the worst-case (all devices in LRS state), for the best-case data pattern (all HRS), power consumption remains manageable even at 50% higher V HRS . Thus NVM+AD design constraints could be partially relaxed for 'highly asymmetric' data such as sparse matrices or otherwise appropriately-coded data. Fig. 8 plots the dependence of total write power consumption on V HRS (X-axis) and array size (Y-Axis). Even a 10% reduction in V HRS enables significant benefits in array size (1Mb to 2.25Mb) for a given MIEC Access Device. Conversely, a 10% increase in V HRS forces a reduction in array size by 60%. From these results and the sensitivity analysis in Fig. 6 , we can conclude that reducing NVM switching voltages will be extremely critical in crossbar system design, since maximum achievable array size directly translates into array efficiency and cost-per-bit. Reducing switching currents is necessary only to ensure operation of the MIEC-based AD in its non-linear regime and to keep IR drops reasonable. Fig. 9 plots total power consumption as AD parameters (voltage margin V m , turn-on slope S, and series resistance R s ) are varied. Similar to V HRS (Fig. 6) , a tipping point exists for the V m parameter. By reducing the voltage across the selected AD, decreases in V m reduce the total switching voltage and thus might be assumed to lead to lower total leakage. However, the smaller V m means that the inner voltage separation for the un-selected devices must be decreased to maintain the same un-select leakage, which increases the voltage across (and thus the leakage through) the partially selected cells.
Turn-on slope S also has an exponential impact on overall power, because of the linear relationship between the slope VOLUME 3, NO. 5, SEPTEMBER 2015 427 There is also tangible benefit to improving the turn-on slope (∼2.5× larger arrays). The increase of line resistance with technology scaling is a key challenge for crossbar memories [22] , [23] . Fig. 10 (b) plots power consumption for increasing interconnect line resistance, for the same four AD parameter assumptions. Once again, the V m parameter is critical -increasing V m by 20% could counteract as much as a 5× increase in line resistance (∼11nm technology node per ITRS 2011 [21] ). Writing multiple bits in parallel into a subarray improves overall write bandwidth. However, the resulting larger wire currents increase interconnect IR drop. Fig. 11(a) plots a color map of total IR drop (along the selected wordline and bitline) as array size and the number of bits being written are increased. Here, we use a 'safe' write design point, with insignificant leakage current through partially selected cells, and select multiple bitlines, evenly spaced across the array, along a single wordline. Such even spacing incurs a lower cumulative IR drop than if the last k-out-of-M bitlines were to be selected. The plot highlights the tradeoff between area efficiency and write bandwidth -given a 'maximum' permissible IR drop (say, 1V) for a particular NVM+AD combination, large arrays require a commensurate reduction in the number of bits being written.
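The advantage of evenly spaced bitline selection can be seen from a simple estimate of the wordline IR drop. The sketch below assumes a wordline driven from one end, identical write current in every selected cell, uniform resistance per wordline segment, and no leakage through partially selected cells (the 'safe' design point described above). The function names are hypothetical, and any numerical inputs would need to come from the actual technology.

```python
def wordline_ir_drop(selected_positions, i_cell, r_segment):
    """IR drop from the wordline driver to the farthest selected cell.

    A selected cell at position p (counted in segments from the driver)
    pulls i_cell through the p segments between the driver and itself,
    so the far-end drop is r_segment * i_cell * sum(p).
    """
    return r_segment * i_cell * sum(selected_positions)


def evenly_spaced(k, m):
    """k bitline positions spread uniformly across an m-bitline wordline."""
    return [m * (j + 1) // k for j in range(k)]


def last_k(k, m):
    """The k bitlines farthest from the wordline driver."""
    return list(range(m - k + 1, m + 1))
```

For k selected bitlines out of M, even spacing gives a position sum of roughly M(k + 1)/2, whereas selecting the last k gives nearly kM, so the evenly spaced drop approaches half of the clustered drop as k grows. For k = 1 the two choices coincide.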
Increases in the voltage margin of the AD allow an increase in write bandwidth, all other parameters staying equal. Fig. 11(b) plots write power consumption for a 1Mb array with nominal, 10% improved V m , and 20% improved V m . For nominal V m , increasing the write parallelism from 1 to 4 causes a >12.5× increase in write power, indicating a significant increase in leakage current, whereas the corresponding number for 20% improved V m is only ∼4.5×.
All the above results indicate that array design can benefit from improved V m . One way to double the overall voltage margin is to use two MIEC ADs in series. This approach would also degrade the turn-on slope and AD series resistance by 2×. Before undertaking the difficult task of trying to integrate two such MIEC ADs, separated only by a thin shared metal electrode, it would be important to know whether the one expected positive effect (better V m ) is going to outweigh the two negative effects (shallower slope and larger series resistance).
Based on the sensitivity analysis above, these negative effects are indeed outweighed by benefits of the V m improvement. Fig. 12 plots the maximum array size that can be achieved for a given power budget as a function of the NVM switching voltage (V HRS ), both for the MIEC AD and a series stack of 2 MIEC ADs. Because the larger V m outweighs the increases in (and thus degradation of) turn-on slope and series resistance, the double MIEC could support NVMs with much higher switching voltages (1Mb arrays even with switching voltages ∼ 2.4V), and larger array sizes (up to 8Mb) at lower switching voltages. This provides a strong incentive for trying to experimentally implement such an integrated stacked-MIEC device pair.
C. IMPACT OF VOLTAGE MARGIN VARIABILITY
The impact of V m variability on write power was studied using Monte Carlo simulations. The intent is to study the impact of AD variability above and beyond any variability in the NVM switching voltage. Therefore, in these results, the NVM switching voltage parameter must be interpreted not as the voltage of the median device, but of the worst-case device (or the "six-sigma" device).
In keeping with experimental data [13] from large (512Kb) arrays, V m values for individual ADs in a 1Mb array were sampled independently in every Monte Carlo trial. A standard deviation of 3.7% tracks experimentally observed V m variation (Fig. 13 inset, [13]). Standard deviations of 2% and 10% were also studied. Results indicate that at 3.7% variability, almost 8% of the trials resulted in write power > 1mW, whereas at 2% variability, 99.8% of trials fall within the 1mW power budget. Further process optimization (e.g., improved Critical Dimension control) is expected to significantly reduce the V m variability beyond that achieved in [13], thus yielding tighter power CDFs.
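The sampling procedure can be illustrated with a toy Monte Carlo loop. This is only a sketch of the idea, not the SPICE-based flow used for the results above: it samples V m independently for each partially selected AD and sums an idealized exponential leakage, instead of solving the full array. The turn-on slope, the partial-select bias, the exponential leakage law, and the definition of V m as the full width between the ±10nA points are all assumptions for this example.

```python
import random

V_M = 1.54        # nominal voltage margin from the text (V)
I_REF = 10e-9     # current level at which V_m is defined, from the text (A)
SLOPE = 0.085     # ASSUMED turn-on slope in volts per decade
V_PS = 0.60       # ASSUMED bias across each partially selected cell (V)


def total_partial_leakage(sigma_frac, rng, n_cells=2046):
    """Sum leakage over partially selected cells for one sampled array."""
    total = 0.0
    for _ in range(n_cells):
        # Independent Gaussian V_m for every access device.
        vm = rng.gauss(V_M, sigma_frac * V_M)
        # ASSUMPTION: current reaches I_REF at V_m / 2 and changes by one
        # decade per SLOPE volts around that point.
        total += I_REF * 10 ** ((V_PS - vm / 2) / SLOPE)
    return total


def run_trials(sigma_frac, trials=100, seed=0):
    """Repeat the sampling and return the leakage of each trial."""
    rng = random.Random(seed)
    return [total_partial_leakage(sigma_frac, rng) for _ in range(trials)]


def fraction_over(values, budget):
    """Fraction of trials whose value exceeds the given budget."""
    return sum(1 for v in values if v > budget) / len(values)
```

Because leakage depends exponentially on V m , the low-V m tail of the distribution dominates the sum, which is why wider V m spreads push a growing fraction of trials past a fixed budget.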
IV. READ OPERATIONS
During read operations on a 1AD+1R array (Fig. 14) , the voltages applied at the extremities of the array develop a potential V READ across an external load resistance R LOAD that depends upon the resistance state of the selected NVM element. This potential can be measured in various ways using an appropriate sense amplifier (not shown). Read margin is the difference between the large voltage that appears across R LOAD when the selected NVM is the LRS state, and the smaller voltage when the selected NVM is in the HRS state. Read margin can be diminished, increasing the possibility of bit-errors during read, if there is significant sneak path leakage through partially selected and unselected cells.
Design considerations for read operations include the applied voltages, the biasing scheme for unselected lines, the impact of stored data patterns, and the value of R LOAD .
While a higher applied voltage increases read margin, it is critical to ensure that no device could possibly change state (undergo read disturb) despite exposure to a large number of such read events. Experimental work with Resistive RAM has shown that read disturbs are possible with repeated voltage stress over multiple cycles, even with applied voltages significantly lower than the switching voltage [24] . In this paper, we assume that the read disturb condition for 10 6 read operations permits a maximum exposure of any NVM to the voltage V DIS = 0.25×V HRS . While we consider the HRS-to-LRS disturb for positive read voltage (inadvertent SET), our analysis is unchanged if it were the LRS-to-HRS transition that constrained the value of V DIS .
The easiest (and thus worst-case) read disturb occurs at the cell closest to the voltage sources (Fig. 14, inset A) when all other NVMs are in the HRS state. With this maximum permissible applied voltage for read operations, the worstcase read margin can be calculated by reading out LRS and HRS states from the worst-case cell (the one farthest from the voltage sources, Fig. 14) under different stored data conditions (e.g., in Fig. 14, insets B and C) . 
A. IMPACT OF BIAS SCHEME AND STORED DATA PATTERNS
Unlike the write operation, the worst-case stored data patterns for read can depend on the biasing scheme. Fig. 15 compares eight different read bias schemes, with circuit definitions shown in inset, each named according to how unselected WLs and BLs are biased. Options include Float, All-GND (identical to the V/2 scheme discussed earlier), all line pull-up (ALPU), all line pull-down (ALPD), and various combinations. For each read scheme, all 16 possible combinations of worst-case stored data patterns were evaluated and the minimum read margin extracted, for three different R LOAD values (R LRS = 26.67kΩ, R HRS = 400kΩ and √(R LRS × R HRS ) = 103kΩ). Four different read schemes (Float, All-GND, BLPU and WLPD) are shown to have similar read margin. There is also no strong dependence of read margin on stored data patterns. In the remainder of this section, the All-GND (V/2) bias scheme is used for simulating read operations, along with the data patterns shown in Fig. 14, insets B and C.
B. CHOICE OF LOAD RESISTANCE
The inset of Fig. 16 plots the dependence of the read margin on the choice of R LOAD . Read margin is maximized at a larger R LOAD [4], [25], but then deteriorates due to voltage division between R LOAD and R HRS . When the NVM is in its HRS state, a large voltage drop across the NVM and a small voltage drop across the load resistance are desirable. However, at high R LOAD values, this condition can no longer be satisfied, leading to the observed drop-off in read margin. This maximum occurs at R LOAD = 8MΩ, a relatively high value because of the Poole-Frenkel (PF) non-linearity of the HRS state. While R HRS under switching conditions (V HRS = 1.2V) is 400kΩ, at lower read voltages the actual sensed R HRS can be much higher (we assume a maximum R HRS of 10MΩ at 0.1V).
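The voltage-division argument can be made concrete with an idealized resistive divider. The sketch below treats the selected NVM as a fixed ohmic resistance in series with R LOAD , and ignores the AD, sneak paths, interconnect, and the non-linear HRS behavior described above. Under those assumptions the margin peaks at the geometric mean of R LRS and R HRS , rather than at the much larger value found in the full simulation. The resistance values are those quoted in the text; the function name is hypothetical.

```python
import math

R_LRS = 26.67e3   # low-resistance state from the text (ohms)
R_HRS = 400e3     # high-resistance state at switching conditions (ohms)


def read_margin(r_load, v_applied=1.0):
    """Difference in load voltage between LRS and HRS for an ohmic divider."""
    v_lrs = v_applied * r_load / (r_load + R_LRS)
    v_hrs = v_applied * r_load / (r_load + R_HRS)
    return v_lrs - v_hrs


# Geometric mean of the two states; maximizes read_margin in this ideal case.
r_geo = math.sqrt(R_LRS * R_HRS)
```

Evaluating `read_margin` on either side of `r_geo` shows the margin falling off in both directions, the same qualitative trade-off described in the text, though the location of the peak shifts once the HRS non-linearity is included.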
Large R LOAD also increases read times and read power. Therefore, rather than operating at maximum R LOAD , it is more judicious to select the minimum R LOAD required for accurate detection. In the remainder of this section, an R LOAD of √(R LRS × R HRS ) = 103kΩ is used. Fig. 16 shows how read margin depends on NVM disturb voltage V DIS and resistances, R HRS and R LRS . As R HRS and R LRS are varied, NVM currents change but the disturb voltage (and thus the maximum allowable applied voltage during read) remains unchanged. While read margin is insensitive to R HRS and R LRS , the V DIS parameter has a more significant effect, since it directly impacts the maximum permissible applied voltage. Fig. 17(a) plots read margin sensitivity to changes in AD voltage margin V m and turn-on slope S. Read margin can degrade with increasing sneak currents as V m is decreased by 30% or more. However, this effect is significantly more subtle than the commensurate write power constraint, where a decrease in V m by even 10% increased total injected power by orders of magnitude. Similarly, the linear degradation in read margin with increasing turn-on slope is far less critical than the exponential increase in write power (see Fig. 9).
C. SENSITIVITY ANALYSIS
The inset of Fig. 17(b) plots the impact of increasing interconnect resistance on the read margin. Once again, a linear impact on read margins is observed, compared to the orders of magnitude increase in write power seen, for even 2× increase in interconnect resistance. Read margin can be significantly degraded by a reduction in the contrast ratio between R HRS and R LRS [ Fig. 17(b) ]. Here, the ON resistance is progressively increased, reducing the OFF/ON resistance contrast from 15× to 1.5×. (An ohmic I-V is assumed for both R HRS and R LRS states, which further diminishes the read margin.) With low resistance contrast, as is common with many MRAM elements [2] , even "good" ADs are of little help and read-out becomes a significant challenge that must be addressed through a combination of novel read schemes and careful sense amplifier design. Fig. 18 plots the impact of V m variation on read margins. Under experimentally characterized V m variability (3.7%), 99.8% of trials yielded read margins in excess of 110mV. Even under relatively high V m variation of 10%, 99.8% of trials yielded read margins in excess of 75mV. By contrast, under the same variability assumption, write operations would be untenable with more than 50% of trials falling outside the 1mW power budget.
From all of the above results, it is clear that constraints for read operations are far less stringent than those for write. Furthermore, read margins can potentially be optimized by design of the external load resistance and the sense amplifier. In terms of the Access Device, if the AD supports large arrays at reasonable write power, it is quite likely to do an excellent job at the much easier task of suppressing sneak currents from affecting read operations. Conversely, any sign of sneak currents (such as read margins that depend on the data patterns) during read probably indicates that the Access Device will simply not be capable of supporting low power write operations.
V. RELATED WORK
Building large arrays would be impossible without a non-linear I-V characteristic at every crosspoint [26]. Non-linearity can be part of the RRAM itself (selector-less crossbar), or incorporated through a discrete AD/selector device.
The impact of RRAM non-linearity on read margin of selector-less crossbar arrays was studied in [4], but write operations and interconnect resistance were not considered. References [5] and [27] considered both write and read operations, but non-linearity was studied only with respect to read. Despite considering only small arrays (up to 64×64), write power was prohibitively high (>10mW). Reference [25] considered arrays up to 16Kb, but did not study the impact of write power. All of the above papers also used percentage read margin as their defining metric, which is less important than the absolute sensing voltage difference across an external resistance.
Reference [28] considers optimal bias schemes that could enable larger arrays than a fixed ratio scheme but does not study the impact of varying NVM switching current and voltage.
Reference [23] includes large arrays, the impact of interconnect resistance, absolute read margins and non-linearity impact for write operations. However, they also do not study sensitivity to NVM switching voltage nor do they explicitly address write power (instead using a normalized write energy metric). A key finding is that non-linearity of the LRS state needs to be carefully tuned to simultaneously meet requirements for write and read -higher LRS non-linearity, while benefiting write operations by reducing sneak path leakage, can diminish read margins by also reducing the effective resistance contrast of the RRAM. The requirement of optimum non-linearity at the appropriate partial-select voltage, added on top of all other requirements for RRAM including low voltage operation, high endurance and retention characteristic, low variability etc. makes selector-less RRAM manufacturing and array design extremely challenging.
Reference [29] presented a selector-less RRAM crossbar (8×8 sub-arrays) with multi-level SET in one polarity and rectifying behavior with abrupt RESET in the opposite polarity. This NVM is not expected to scale up to large array sizes given the ohmic LRS state and low RESET voltage. Reference [30] showed a 2Mb chip with 54nm technology with 16Kb sub-arrays. The authors mention that lower reset current and higher non-linearity would enable larger arrays, though rigorous circuit analysis is not performed. Another selector-less RRAM [31] is also expected to suffer from high sneak path leakage, and the authors discuss a variety of circuit techniques such as active tuning of the analog bias voltage, highly asymmetric arrays etc. to manage this [17] .
1AD1R crossbars can decouple the issue of conflicting non-linearity requirements -AD characteristics must be optimized for sneak path suppression during write, whereas accurate sensing is possible so long as the NVM has a sufficient resistance contrast. References [32] - [37] all quantify read margins on 1AD1R crossbar arrays without considering write power. As stated earlier, the pitfall is that if sneak path leakage is appreciable at read voltages, it would be exponentially higher for write operations making power consumption prohibitive. Some papers also state that higher AD non-linearity degrades read margins, but this conclusion is often a consequence of assuming that the applied read voltage will not change with increasing AD non-linearity (which implies lower effective voltage across the NVM).
Reference [8] does consider read and write operations but does not study write power and array sizes are small (4Kb).
References [22] and [38] consider the impact of interconnect resistance but do not study the NVM or AD design space. Similar to our own findings, Mandapati et al. [39] conclude that write power poses the most stringent constraint for 1AD1R design. However, neither that paper nor Zhang et al. [40] study sensitivity to individual NVM parameters such as the switching voltage. While the latter paper does consider optimal bias schemes and choice of read voltage based on disturb conditions, it does not provide justifications for some of its figures of merit such as a high percentage (25%) read margin requirement. These metrics lead to the conclusion that read margin can be more critical than write power at certain design points. A 1AD1R, 2-layer RRAM was presented in [18] and [19] . The select device provides a relatively low half-select nonlinearity ratio of 150 which implies relatively small sub-array sizes (16 WLs × 576 BLs across 2 layers) and poor area efficiency. Reference [20] presented circuit designs for large 1D1R sub-arrays with up to 62% area efficiency. However NVM and selector characteristics that could accommodate such large arrays were not demonstrated.
VI. CONCLUSION
We have explored the NVM-AD design space, using extensive sensitivity analyses across device and circuit parameters. For both write and read, we carefully considered biasing schemes, interconnect resistance, and the impact of stored data patterns. From these evaluations, we derive the following key insights:
1) Power consumption during write operations, not read margin, poses the most stringent constraint in determining the maximum array size that can be achieved for a given AD-NVM combination.
2) Power consumption during write is highly sensitive to voltage parameters such as the switching voltages of the NVM and the voltage margin of the AD. For these parameters, our simulations show clear 'tipping points,' beyond which even a small change in the voltage parameter leads to a several orders-of-magnitude increase in write power.
3) Developing ADs with wide voltage margins, and NVM devices with low switching voltages and reasonable switching currents, are key requirements for crossbar system design. We show that experimentally demonstrated scaled MIEC ADs can support 1Mb arrays of NVMs with switching voltages up to 1.2V. Small improvements in V m can yield significant benefits in array size. For example, a 20% increase in V m was shown to either enable a 4× increase (1Mb to 4Mb) in array size, or to counteract up to a 5× increase in the interconnect line resistance.
4) While variability can in fact cause the worst-case write power to increase and the worst-case read margin to degrade, these effects remain within tractable bounds for selector variability not much different from what has already been achieved experimentally.
