Abstract-The voltage margin of a resistor-logic demultiplexer can be improved significantly by basing its connection pattern on a constant-weight code. Each distinct code determines a unique demultiplexer, and therefore a large family of circuits is defined. We consider using these demultiplexers for building nano-scale crossbar memories, and determine the voltage margin of the memory system based on a particular code. We determine a purely code-theoretic criterion for selecting codes that will yield memories with large voltage margins, which is to minimize the ratio of the maximum to the minimum Hamming distance between distinct codewords. For the specific example of a 64×64 crossbar, we discuss what codes provide optimal performance for a memory.
I. INTRODUCTION
A nanoelectronic demultiplexer (demux) circuit [1] can be laid out on a crossbar [2] [3] using configurable resistors [4] or configurable diodes [1] at the crosspoint junctions. In both cases, error-correcting codes can be used to improve the circuit performance. In comparison with diode-based circuits, demux circuits based on resistors have an inherent problem in achieving an adequate voltage margin to distinguish between their activated and non-activated output lines. On the other hand, resistor-based demux circuits have better defect-tolerance properties, in some respects, than diode-based demuxes. This defect tolerance arises from the redundancy (redundant signal paths) introduced by the coding (not from "error correction" in the traditional sense, since there is in fact no decoder in the circuit to perform correction). These defect-tolerance properties of resistor-based demuxes are described elsewhere [5]; in this paper, we focus on an improved means of handling the resistor demux's primary problem: achieving an adequate voltage margin. Further motivation for choosing resistor-based demuxes is that nanoscale crossbars containing resistors (rather than diodes) are far more feasible to manufacture with current processes. In [6], it was shown that error-correcting codes of a certain type can be employed to construct resistor-based demuxes with superior analog circuit properties, specifically good voltage margins. In this paper, we improve on the performance of the circuit construction in [6] by extending it to a broader class of codes: the constant-weight codes (also called "N-hot codes" or "m-of-n codes").
Constant-weight codes have been previously presented in a nano-scale demux design [7] (called a "decoder" in that paper), using a specific code with weight $w = 2$ and size $M = 6$. In this paper, we expand the analysis of constant-weight codes. We (a) present the entire family of demux circuit designs based on constant-weight codes; (b) present a quantitative model for the voltages occurring on each of the output lines of a resistor-logic demux based on an arbitrary constant-weight code; (c) analyze the use of these demuxes in a nano-scale crossbar memory to develop a performance measure; and (d) characterize the best possible codes for 1-of-64 demuxes.
We briefly review the terminology of coding theory [8] [9] relevant to this paper. The Hamming weight $w(x)$ of a bitstring $x$ is the number of ones it contains. The Hamming distance $d(x, y)$ between two bitstrings $x$ and $y$ of equal length is the weight of their bitwise sum modulo 2, which is the number of bit-positions in which the two bitstrings differ. A binary code of size $M$ and length $n$ is a set consisting of $M$ length-$n$ bitstrings (or codewords) that we use as nanowire addresses in our demux circuits. The minimum Hamming distance between any pair of distinct codewords is denoted by $d_{\min}$. Any code may be described using these parameters as an $(n, M, d_{\min})$ code. The maximum Hamming distance between any pair of codewords is denoted by $d_{\max}$. The dimension of a code is defined as $\log_2 M$. An important type of code is the binary linear code, which consists of a set of $M = 2^k$ length-$n$ codewords that is closed under component-wise addition modulo 2, and thus forms a linear vector space over the integers modulo 2. Another important type of code is the constant-weight code, in which all codewords have the same weight $w$; such a code is described as an $(n, M, d_{\min}, w)$ code. A subtype of the constant-weight code is the balanced code, in which each codeword is "balanced" by having an equal number of ones and zeroes, and which therefore has the property that $w = n/2$.
II. DEMUX CIRCUIT
The fundamental design pattern of these code-based demuxes is that each codeword defines the connection pattern for one nanowire in the array of output lines. The code has $M$ codewords and the demux circuit has $M$ output lines. Fig. 1 shows the circuit diagram for an example demux based on a particular $M = 4$ code. The encoder circuit and configuration pattern must match one another, since they are determined by the same code. The code can be read directly from the configuration pattern of the crossbar, one codeword per output line; in Fig. 1, the code used is {000111, 011100, 101010, 110001}, and each output line is labeled with its encoded address $h$. The encoding function $E$, implemented by the encoder circuit, must produce codewords $u$ matching those defined in the configuration pattern. Thus the specific configuration pattern shown for the crossbar in Fig. 1a implies a particular code which must be used in the encoder.
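As a quick check of the terminology, the parameters of the code listed for Fig. 1 can be computed directly from its codewords. The following is a minimal Python sketch; the helper names (`weight`, `distance`, `code_parameters`) are chosen for this illustration and do not come from the paper.

```python
from itertools import combinations

def weight(x: str) -> int:
    """Hamming weight: number of ones in the bitstring."""
    return x.count("1")

def distance(x: str, y: str) -> int:
    """Hamming distance: number of positions in which x and y differ."""
    return sum(a != b for a, b in zip(x, y))

def code_parameters(code):
    """Return length, size, distance extremes and the set of codeword weights."""
    dists = [distance(x, y) for x, y in combinations(code, 2)]
    return {
        "n": len(code[0]),
        "M": len(code),
        "d_min": min(dists),
        "d_max": max(dists),
        "weights": {weight(c) for c in code},
    }

# Code read from the configuration pattern of Fig. 1.
FIG1_CODE = ["000111", "011100", "101010", "110001"]
params = code_parameters(FIG1_CODE)
print(params)
```

Every codeword has weight 3 and every pair of codewords is at distance 4, so in the notation above this is a (6, 4, 4, 3) constant-weight code, and it is also balanced.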
The design pattern illustrated above may be stated formally as follows. First, the length $k$ of the address vector $a$ must be a whole number of bits,

$$k = \operatorname{ceiling}(\log_2 M) \qquad (1)$$

where $\operatorname{ceiling}(r)$ is the smallest integer $i$ with $i \ge r$. The encoding function $E$ takes the length-$k$ vector $a$ as input, and produces a length-$n$ codeword $u$ as output:

$$u = E(a) \qquad (2)$$

We enumerate the demux's output lines using a $k$-bit index vector $i$ (the unencoded address of each output line) and we may then express the encoded address $h$ of each output line as

$$h = E(i) \qquad (3)$$

Valid values of $a$ fall in the range $0 \le a < M$, with $a$ interpreted as a $k$-bit integer. Likewise, valid values of $i$ fall into the same range. We will assume the values of $a$ and $i$ are valid in the rest of the paper.

For an arbitrary output line $S_i$ associated with codeword $h = E(i)$, each bit of the $n$-bit vector $h$ specifies whether a connection is configured at the corresponding one of the $n$ junctions where the $n$ CMOS address lines cross this particular nanowire (see Fig. 1a). A "1" specifies a configured resistor at the junction, and a "0" specifies no connection.
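As a small illustration of the address-length rule, the following Python sketch computes the number of address bits for a few code sizes using integer arithmetic rather than floating-point logarithms. The function name `address_bits` is an assumption made for this example.

```python
def address_bits(M: int) -> int:
    """Smallest k with 2**k >= M, i.e. ceiling(log2(M)), for M >= 1."""
    if M < 1:
        raise ValueError("code size M must be at least 1")
    return (M - 1).bit_length()

for M in (4, 64, 66, 1024):
    print(M, address_bits(M))
```

A code size that is not a power of 2 (such as 66) simply leaves some of the $2^k$ address values unused, which is why only a restricted range of addresses is valid.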
In Fig. 1, for example, the top output line $S_{00}$ has index $i = 00$ and encoded address $h = 000111$. It is useful to label each nanowire output line with an encoded address $h$, since the operation of the AND gate on an output line $S_i$ may be thought of as recognizing a particular encoded address $h = E(i)$ when it appears on the CMOS address lines (as signal vector $u = E(a)$). Output line $S_i$ thus turns on when it recognizes the condition $h = u$. This condition occurs if and only if $i = a$, showing that each input address $a$ causes exactly one output line ($S_a$) to turn on.
A. Demux Output Voltages: Demux Based on General Code
We consider a single arbitrary demux output line, similar to those shown in Fig. 1, in a demux based on an arbitrary code. We assume that all the resistors in the crossbar are linear and have the same resistance value $R$. The current input address signal $a$ produces an output vector $u = E(a)$ from the encoder, which drives all the CMOS wires in the crossbar. However, for a particular nanowire output line, not all of the junctions on that nanowire are configured, and so only a subset of these address bits affects the output voltage of the nanowire. The voltage on an arbitrary output line may be calculated by counting up the number of ones ($n_{\text{ones}}$) and zeroes ($n_{\text{zeroes}}$) driving this output line's configured resistors, and analyzing the circuit as a voltage divider, as shown in Fig. 2 for an output line of the demux of Fig. 1. The ones in $h$ specify which resistors are connected; this subset of the signals in $u$ forms a voltage divider. Within that subset, the ratio of ones to total connections determines the output voltage. The resistors connected to the "logic 1" voltage form the upper parallel bundle, and the resistors connected to "logic 0" form the lower parallel bundle.
We assume normalized driving voltages in which a "zero" bit in the address is at GROUND = 0 V, and a "one" bit is at $V_{DD}$ = 1 V, with the output voltage measured with respect to GROUND. The normalized output voltage $v$ of the voltage divider may be calculated as the ratio of the equivalent resistance of the lower parallel bundle of resistors, $R_{\text{lower}}$, to the total resistance of both parallel bundles, $R_{\text{upper}} + R_{\text{lower}}$. This simplifies to the ratio of the number of ones to the total number of connections (which is equal to the total number of ones and zeroes):

$$v = \frac{R_{\text{lower}}}{R_{\text{upper}} + R_{\text{lower}}} = \frac{R / n_{\text{zeroes}}}{R / n_{\text{ones}} + R / n_{\text{zeroes}}} = \frac{n_{\text{ones}}}{n_{\text{ones}} + n_{\text{zeroes}}} \qquad (4)$$
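The simplification to a ratio of counts can be checked numerically. The sketch below evaluates the divider from the conductances of the two parallel bundles, which is equivalent to the resistance ratio whenever both bundles are non-empty and also covers the edge cases in which one bundle is empty. The function name `divider_voltage` and the default resistance are assumptions made for illustration.

```python
from fractions import Fraction

def divider_voltage(n_ones: int, n_zeroes: int, R: Fraction = Fraction(1)) -> Fraction:
    """Normalized output voltage of the divider formed by identical resistors R.

    n_ones resistors go to the logic-1 rail (1 V), n_zeroes to GROUND (0 V).
    """
    if n_ones + n_zeroes == 0:
        raise ValueError("at least one configured resistor is required")
    g_upper = Fraction(n_ones) / R    # conductance of the bundle tied to logic 1
    g_lower = Fraction(n_zeroes) / R  # conductance of the bundle tied to logic 0
    return g_upper / (g_upper + g_lower)

print(divider_voltage(1, 2))                  # one "1" and two "0" inputs
print(divider_voltage(1, 2, Fraction(1000)))  # same ratio for a different R
```

The result does not depend on the value of $R$, only on the two counts, which is the property the rest of the analysis relies on.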
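We consider the output line $S_i$ labeled by encoded address $h = E(i)$, while the encoder drives the address lines with $u = E(a)$. The line has $w(h)$ configured resistors, of which $w(u \wedge h)$ are driven by ones, where $\wedge$ denotes the bitwise AND. By (4), the normalized voltage on this line is

$$v_h = \frac{w(u \wedge h)}{w(h)} \qquad (5)$$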
This quantitatively characterizes the normalized voltage $v_h$ of all output lines of a demux based on an arbitrary code (in particular, also a constant-weight code), for any valid addresses $a$ that may occur as input signals to the demux. By (2) and (3), $u$ and $h$ are codewords of the code. Equation (5) is valid for any code whatsoever, and is not limited to constant-weight codes.
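To make the general voltage model concrete, the sketch below evaluates every output line of the Fig. 1 demux for a given input address, computing each voltage as the fraction of that line's configured junctions that are driven by a one. It assumes that the encoder maps index 0, 1, 2, 3 to the codewords in the order they are listed for Fig. 1; that ordering, and the function names, are assumptions made for illustration.

```python
from fractions import Fraction

FIG1_CODE = ["000111", "011100", "101010", "110001"]

def output_voltage(u: str, h: str) -> Fraction:
    """Normalized voltage on the line with encoded address h when u is applied."""
    n_connected = h.count("1")  # configured junctions on this line
    n_ones = sum(1 for ub, hb in zip(u, h) if hb == "1" and ub == "1")
    return Fraction(n_ones, n_connected)

def demux_outputs(a: int, code):
    """Voltages on all output lines when address a is applied (E(a) = code[a])."""
    u = code[a]
    return [output_voltage(u, h) for h in code]

for a in range(len(FIG1_CODE)):
    print(a, [str(v) for v in demux_outputs(a, FIG1_CODE)])
```

For each address, only the matching line reaches the full normalized voltage of 1; the other three lines sit at 1/3, so the selected line is distinguished by an analog margin rather than by a clean logic level.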
B. Demux Output Voltages: Demux Based on Constant-Weight Code
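We now derive a version of (5) specialized to constant-weight codes. We denote the complement of a codeword $c$ by $\bar{c}$. For any two codewords $u$ and $h$ of a constant-weight $(n, M, d_{\min}, w)$ code, considered together as shown in Fig. 3, we partition the 2-element columns into four subsets, based on the four possible values of the corresponding bit-pairs; thus the weights of these subsets are

$$w_{11} = w(u \wedge h), \quad w_{10} = w(u \wedge \bar{h}), \quad w_{01} = w(\bar{u} \wedge h), \quad w_{00} = w(\bar{u} \wedge \bar{h}).$$

The distance between $u$ and $h$ is thus

$$d = w_{10} + w_{01},$$

and, since $w(u) = w_{11} + w_{10}$ and $w(h) = w_{11} + w_{01}$ are both equal to $w$, we have $w_{10} = w_{01}$ and so

$$w_{10} = w_{01} = \frac{d}{2}. \qquad (6)$$

We denote the weight of the intersection as

$$w(u \wedge h) = w_{11}.$$

Therefore, for constant-weight codes with weight $w$, and any pair of codewords $u$ and $h$, it is always true that

$$w(u \wedge h) = w - \frac{d}{2}. \qquad (7)$$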
The distance $d$ between codewords in a constant-weight code is always even, and so $d/2$ is always an integer. For a constant-weight code with weight $w$, we may combine (5), (6) and (7) to get

$$v_h = \frac{w - d/2}{w} = 1 - \frac{d}{2w} \qquad (8)$$

This characterizes the normalized voltage at any demux output line (labeled with its encoded address $h$), for any input signal (represented by the encoded input address $u$), valid for a demux based on any constant-weight code with weight $w$. The distance $d$ between codewords of a constant-weight code must fall in the interval $[0, 2w]$, and therefore the normalized output voltage remains in the unit interval:

$$0 \le v_h \le 1.$$
We denote the distance profile of a code (the distinct distances that occur between pairs of codewords, sorted into ascending order) as integer vector $\mathbf{d}$, and the (normalized) distinct voltages that appear on the output lines of the demux as real vector $\mathbf{v}$. From (8) we get

$$\mathbf{v} = \mathbf{1} - \frac{\mathbf{d}}{2w} \qquad (9)$$

where $\mathbf{1}$ denotes the all-one vector.
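The mapping from a distance profile to the set of distinct output voltages, $v = 1 - d/(2w)$ for each distance $d$, can be sketched in a few lines of Python. The example code used here, all length-4 bitstrings of weight 2, is chosen only because it is small; it is not claimed to be the code of [7], and the function names are assumptions made for illustration. The profile below covers pairs of distinct codewords only; the selected line corresponds to $d = 0$ and a normalized voltage of 1.

```python
from fractions import Fraction
from itertools import combinations

def two_hot_code(n: int):
    """All length-n bitstrings of weight 2 (a constant-weight code with w = 2)."""
    return ["".join("1" if p in pair else "0" for p in range(n))
            for pair in combinations(range(n), 2)]

def distance_profile(code):
    """Distinct Hamming distances between distinct codewords, ascending."""
    return sorted({sum(a != b for a, b in zip(x, y))
                   for x, y in combinations(code, 2)})

def normalized_voltages(profile, w: int):
    """Apply v = 1 - d / (2w) to each distance in the profile."""
    return [1 - Fraction(d, 2 * w) for d in profile]

code = two_hot_code(4)
profile = distance_profile(code)
voltages = normalized_voltages(profile, w=2)
print(len(code), profile, [str(v) for v in voltages])
```

Here the six codewords are at distance 2 or 4 from one another, so the non-selected lines sit at normalized voltages 1/2 and 0.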
The normalized output voltage $v_h$ in (8) [...] (Fig. 1a), they can be freely specified by the circuit designer. Combining (9) and (10) gives
This characterizes the distinct output voltages of the demux in terms of the weight w and distance profile d of the code, and the two driving voltages 
III. NANO-SCALE CROSSBAR MEMORY APPLICATION
Our application is a pair of demux circuits that drive the rows and columns of a nano-scale crossbar memory array [6] . In this system, the two demuxes are implemented using mixed-scale crossbars (conventional wires cross nanowires) whereas the memory array itself is a pure nanoscale crossbar (Fig. 5a ). The crosspoint junctions of the memory array are assumed to be hysteretic resistors [10] , which can be dynamically reconfigured by applying certain voltage drops across them. The idea is to realize a memory system by storing bits in the individual junctions of the memory array, with the convention that high resistance represents a zero, and low resistance represents a one. The demuxes are configured once, at manufacturing time, and are assumed to be stable thereafter; their function is to get the proper voltages to the junctions of interest in the memory array. To make such a system work, we must be able, for an arbitrary junction in the memory array, to (a) write a one or (b) write a zero or (c) read the current state of the bit.
The behavior of a hysteretic resistor, which is a two-terminal device, is that it is controlled by the voltage drop that is put across it [10]. Putting large-magnitude voltage drops (either positive or negative) across the device will destroy it; putting a moderate-magnitude voltage drop across it will write either a one or a zero, depending on the polarity; and putting a low-magnitude voltage drop across it will not change its resistance. Specific voltage thresholds for recently manufactured devices are given in [10]; however, these values vary as the manufacturing process changes.
Because of the nature of the hysteretic resistors, the circuits used to perform READ and WRITE operations have quite different requirements. They must both access the memory array through the row and column demux circuits, but otherwise may be completely independent CMOS circuits that are alternately enabled, depending on whether reading or writing is currently occurring. In this paper, we focus on the WRITE operation, in which the primary problem is to deliver to the selected junction a voltage drop which is reliably above the write-threshold voltage and below the destruction voltage (while guaranteeing that all the non-selected junction voltage drops are well below the write threshold). Briefly, the main problem faced by the READ circuit is one of discrimination. It makes sense to use lower-magnitude voltages in reading to make sure that accidental writes do not occur. With a low-magnitude voltage drop across a hysteretic resistor, it will act as a normal resistor and its resistance can be measured to read the state of the bit stored in it. The problem is to design a circuit that can discriminate the resistance of the addressed junction from the other [...]

In order to analyze the voltages that can be applied to memory junctions by demuxes driving the rows and columns of a crossbar, we make some assumptions:
• The resistances of the nanowires and conventional wires composing the system are negligible compared with the resistances of the configured junctions, and the non-configured junctions are effectively open circuits.
• The loading of the demuxes by the memory array is negligible, so that the voltages that appear on the demuxes in isolation are very close to the voltages that actually appear when driving the memory array.
• The CMOS driving circuits can be treated as ideal voltage sources, which do not sag with increasing load.

This enables the voltage-divider circuit model expressed in (8), in which the voltage on each output line can be calculated independently of all other output lines. Given these assumptions, the voltage drop across a junction in the memory array is simply the difference between the voltages on the two crossing wires (see Fig. 5a),

$$v_{\text{drop}} = v_{\text{row}} - v_{\text{col}} \qquad (12)$$
By our assumptions, the voltages on each row or column wire are constant across the entire memory array, and thus the wires convey the demux output voltages (8), without distortion, to either side of the junctions. We write one bit at a time in the memory by using the two demuxes to activate a selected row and a selected column, which cross at a specified junction. The address input signals to the row demux and column demux determine which output line is selected, for each of the two demuxes. However, all the non-selected junctions are still active, and could be accidentally written to or destroyed if over-threshold voltage drops happen to be unintentionally applied to them. Thus our goal is to set up the demux driving circuits and voltages such that we deliver the intended voltage drop to the selected junction, and deliver voltage drops as small as possible to all the other nonselected junctions in the memory array.
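Under these assumptions, the full set of junction voltage drops is just the table of pairwise differences between row and column wire voltages. The Python sketch below builds that table and compares the worst non-selected drop with the selected one. The wire voltages in the example are hypothetical values picked for illustration (they are not taken from the paper), and the function names are likewise assumptions.

```python
from fractions import Fraction

def drop_matrix(v_row, v_col):
    """Voltage drop at every junction: row-wire voltage minus column-wire voltage."""
    return [[r - c for c in v_col] for r in v_row]

def worst_to_selected_ratio(v_row, v_col, sel_row: int, sel_col: int):
    """Largest non-selected drop magnitude divided by the selected drop magnitude."""
    drops = drop_matrix(v_row, v_col)
    selected = abs(drops[sel_row][sel_col])
    worst = max(abs(d)
                for i, row in enumerate(drops)
                for j, d in enumerate(row)
                if (i, j) != (sel_row, sel_col))
    return worst / selected

# Hypothetical wire voltages in volts: row 0 and column 0 are the selected lines.
v_row = [Fraction(5, 2), Fraction(3, 2), Fraction(1, 2)]
v_col = [Fraction(0), Fraction(1), Fraction(2)]
ratio = worst_to_selected_ratio(v_row, v_col, sel_row=0, sel_col=0)
print(ratio)
```

With these illustrative numbers the selected junction sees 2.5 V, the worst non-selected junction sees 1.5 V in magnitude, and the ratio is 3/5.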
Design Goal: We wish to minimize the ratio $q_v$,

$$q_v = \frac{\text{largest voltage-drop magnitude at a non-selected junction}}{\text{voltage-drop magnitude at the selected junction}} \qquad (13a)$$

of the largest magnitude of the voltage drops delivered to any non-selected junction with respect to the magnitude of the voltage drop delivered to the selected junction.

It is natural to define the normalized voltage margin $m_v$ as

$$m_v = 1 - q_v$$

since this is the (normalized) difference between the voltage drop we wish to deliver to the selected junction, and the worst-case voltage drop incidentally delivered to a non-selected junction.
Two versions of (11) are required for the row and column demuxes, with two separate pairs of driving voltages, producing one set of voltages $\mathbf{v}^{\text{out}}_{\text{row}}$ driving the rows, and another set of voltages $\mathbf{v}^{\text{out}}_{\text{col}}$ driving the columns. We use (12) to combine these absolute voltages to obtain a matrix specifying all voltage drops that can occur in the memory array. We use a graphical technique (Fig. 5b) to visualize the distinct voltage drops that occur.
We must first choose the four driving voltages for the row and column demuxes (logic-1 and logic-0 voltages for the row demux, and logic-1 and logic-0 voltages for the column demux). To keep things simple, it was valuable to use the digital circuit abstraction as much as possible, and to discuss the resistor-logic demux circuit as a demultiplexer (a bona fide digital circuit); that is why we used the names "logic-1" and "logic-0" for the two voltages driving the demux input lines. However, at this point, as we combine the outputs of two resistor-logic demux circuits, we leave the digital abstraction. The four voltages are no longer considered to be digital signals, but are four independently-adjustable voltages, which we need to control in order to deliver the desired voltage drops to certain junctions in the memory array. For simplicity, we assume that the row and column demuxes are identical (same code, same number of output and input lines, same encoder circuitry). This restriction is not necessary, since, for example, there may be a reason to implement a rectangular rather than a square crossbar memory.
There are only three degrees of freedom, since adding a constant to all four voltages has no effect on the circuit behavior. We can arbitrarily choose one of these voltages to be GROUND. Because the hysteretic resistors have fixed voltage-drop thresholds governing their behavior, the voltage difference between "logic-1" and "logic-0" for both demuxes is constrained; this consumes two of the three degrees of freedom. One degree of freedom remains: the voltage offset between the row voltages and the column voltages. The situation is diagrammed in Fig. 5c. Our goal is to optimize the variable $t$ in Fig. 5c, which represents the normalized offset voltage between the row and column voltages. The variable $s$ in Fig. 5c represents the splay (how spread out the non-selected output voltages of a demux are; specifically, the relative width of the voltage range occupied by the non-selected output lines); the splay $s$ is determined by the choice of code, and can be calculated as

$$s = \frac{d_{\max} - d_{\min}}{d_{\max}} = 1 - \frac{d_{\min}}{d_{\max}}.$$
The diagram of Fig. 5c provides a visual metaphor. To consider various voltage offsets $t$ between the row and column voltages, we imagine that there are two solid structures holding the row and column voltages, which may be slid up and down vertically with respect to one another, while the dotted lines (representing voltage drops) are rubber bands that stretch. We slide the two structures up and down until we have found an offset $t$ where the ratio $q_v$ (13a) is at a minimum. Excluding the selected-junction voltage drop in order to focus on non-selected junctions, we consider the largest-magnitude voltage drop of each polarity. The dotted line between the lowest row voltage (lowest hollow circle on the left) and the highest column voltage (highest hollow circle on the right) represents the maximum-magnitude voltage drop with negative polarity (using the polarity convention of (12)). As shown in Fig. 5c, this voltage drop has magnitude $1 - t$. For the positive polarity, since the selected-junction voltage drop is excluded, we must choose either the voltage drop between the highest row voltage and the second-lowest column voltage, or between the lowest column voltage and the second-highest row voltage. But by the symmetry of the system, these two voltage drops are equal. We arbitrarily choose the latter, which is shown in Fig. 5c as the dotted line between the highest hollow circle on the left, and the bottom filled circle on the right. The magnitude of this voltage drop is $s + t$. The quantity

$$\max(1 - t,\; s + t)$$

is equal to the numerator of (13a). The denominator of (13a) represents the magnitude of the voltage drop across the selected junction, which can be seen in Fig. 5c to be $1 + t$. We wish to minimize

$$q_v = \frac{\max(1 - t,\; s + t)}{1 + t}.$$

We can solve (16a) for $t$, the optimal voltage offset between demuxes. [...] This last normalization is dependent on both $d_{\max}$ and the optimal voltage offset $t$. These three normalizations were convenient for analyzing the memory circuit; to apply them to a real circuit, a final voltage scaling is needed to make the delivered voltage drop at the selected junction match the write thresholds of the hysteretic resistors in the junctions. To achieve our design goal of maximizing $m_v$, we must minimize

$$q_d = \frac{d_{\max}}{d_{\min}},$$

which is a purely code-theoretic quantity. This gives us a simple criterion for evaluating codes for this crossbar memory application.
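The optimization described above can be written out explicitly. The following is a reconstruction from the geometry of Fig. 5c as described in the text, not a quotation of the original equations. In particular, the expression used for the splay, $s = 1 - d_{\min}/d_{\max}$, is an assumption that follows from normalizing the output range of each demux to the unit interval.

```latex
\begin{align*}
q_v(t) &= \frac{\max(1 - t,\; s + t)}{1 + t}, \qquad 0 \le s < 1. \\
\intertext{The first branch, $(1-t)/(1+t)$, decreases with $t$, while the second,
$(s+t)/(1+t) = 1 - (1-s)/(1+t)$, increases with $t$, so the minimum lies where
the two branches are equal:}
1 - t = s + t \quad &\Longrightarrow \quad t^{*} = \frac{1 - s}{2}. \\
\intertext{Substituting the assumed splay $s = 1 - d_{\min}/d_{\max}$ and writing
$q_d = d_{\max}/d_{\min}$:}
t^{*} &= \frac{d_{\min}}{2\, d_{\max}} = \frac{1}{2 q_d}, \\
q_v(t^{*}) &= \frac{1 - t^{*}}{1 + t^{*}} = \frac{2 q_d - 1}{2 q_d + 1}, \\
m_v &= 1 - q_v(t^{*}) = \frac{2}{2 q_d + 1}.
\end{align*}
```

Since $m_v$ is a decreasing function of $q_d$ alone, maximizing the margin is equivalent to minimizing $q_d$. For $q_d = 2$ these expressions give $t^{*} = 1/4$ and $m_v = 0.4$.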
Code selection criterion for the memory application: minimize

$$q_d = \frac{d_{\max}}{d_{\min}} \qquad (18)$$

Codes that have low values of $q_d$ will produce memory systems with low values of $q_v$, which will have higher voltage margins $m_v$ than a code with a higher $q_v$.
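The selection criterion can be explored numerically. The Python sketch below computes the offset, ratio, and margin for a code from its extreme distances alone, and checks on a grid that no other offset does better. It assumes the Fig. 5c geometry described above and a splay of $1 - d_{\min}/d_{\max}$; both the splay expression and the function names are assumptions of this sketch rather than formulas quoted from the paper.

```python
from fractions import Fraction

def q_v_at(t: Fraction, s: Fraction) -> Fraction:
    """Worst non-selected drop over selected drop for offset t and splay s."""
    return max(1 - t, s + t) / (1 + t)

def memory_figures(d_min: int, d_max: int):
    """Code-quality figures derived from the extreme distances of a code."""
    q_d = Fraction(d_max, d_min)
    s = 1 - Fraction(d_min, d_max)  # assumed splay expression
    t = (1 - s) / 2                 # offset where the two worst drops balance
    q_v = q_v_at(t, s)
    return {"q_d": q_d, "s": s, "t": t, "q_v": q_v, "m_v": 1 - q_v}

figs = memory_figures(d_min=4, d_max=8)
print({k: str(v) for k, v in figs.items()})

# Grid check: no offset in [0, 1] gives a smaller ratio than the balanced one.
grid_ok = all(q_v_at(Fraction(k, 100), figs["s"]) >= figs["q_v"] for k in range(101))
print(grid_ok)
```

A code with $d_{\min} = 4$ and $d_{\max} = 8$ yields a margin of 2/5, while a code whose distances are all equal ($q_d = 1$) would reach 2/3 under the same model, which illustrates why a small $q_d$ is desirable.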
For a given memory size (or code size) $M$, very low $q_v$ can be achieved if the code length $n$ is allowed to grow without limit. However, increasing $n$ incurs a cost in chip area. Therefore there is a trade-off between $q_v$ and $n$; in other words, given a specification for the code size $M$ and the voltage margin $m_v$, we seek the shortest possible code satisfying the specification.
A. Example of Constant-Weight Code Demuxes Used in a Memory System
We will use an (11,64,4,5) code (also referred to in Figs. 4, 5 and 6) to illustrate the use of constant-weight codes for constructing demux circuits for memory systems. This code has $n = 11$ and yields a memory system with $m_v = 0.4$ (a voltage margin of 40%). The 40% voltage margin of this system corresponds to a 1.0 V difference between the 2.5 V delivered to the selected junction and the worst-case 1.5 V delivered to any other junction. This margin is allocated in this example so that we exceed the 2.0 V write threshold by 0.5 V for the selected junction, and the selected junction is 1.5 V below the destruction threshold of 4.0 V. For all the non-selected junctions, the worst-case delivered voltage drop is 1.5 V, which is 0.5 V below the write threshold.
The three different normalizations used in the analysis can be seen in Fig. 7. The input voltage range for each demux is 2.5 V. The range of output voltages for each demux is 2.0 V. The value $t = 1/4$ is calculated as the optimal normalized voltage offset for this code, which corresponds to an offset of 0.5 V. The voltage drop delivered to the selected junction is 2.5 V. In this example, there are only two distinct voltages among the four driving voltages, but in general the four voltages may be distinct.
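The voltages quoted in this example can be tied together with a short calculation. The Python sketch below starts from the 2.5 V selected-junction drop and the normalized offset $t = 1/4$ and recovers the other figures. It assumes the Fig. 5c geometry, in which the selected drop is $1 + t$ and the worst non-selected drop is $1 - t$ in units of the demux output range, and it assumes $d_{\max} = 8$ for the code; that value is not stated in the text but is the one implied by $t = 1/4$ and $d_{\min} = 4$ if $t = d_{\min}/(2 d_{\max})$. The function name is an illustrative assumption.

```python
from fractions import Fraction

def scale_to_volts(selected_drop: Fraction, t: Fraction, w: int, d_max: int):
    """Convert normalized quantities to volts for a given selected-junction drop."""
    unit = selected_drop / (1 + t)  # output-voltage range of one demux
    return {
        "output_range": unit,
        "offset": t * unit,
        "worst_nonselected": (1 - t) * unit,
        # The outputs span a fraction d_max / (2w) of the input swing.
        "input_range": unit * Fraction(2 * w, d_max),
    }

volts = scale_to_volts(Fraction(5, 2), Fraction(1, 4), w=5, d_max=8)
print({k: float(v) for k, v in volts.items()})
```

Under these assumptions the calculation gives a 2.0 V output range, a 0.5 V offset, a 1.5 V worst-case non-selected drop, and a 2.5 V input range, in agreement with the values given for this example.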
IV. WHAT ARE THE BEST CONSTANT-WEIGHT CODES FOR 64×64 MEMORIES?
For the nano-scale crossbar memory application, in the foreseeable future, practical circuit issues limit the size $M$ of the demuxes we might build to the range $64 \le M \le 1024$, where $M$ is not required to be a power of 2. As an example, we will choose a specific code size $M = 64$. Codes of this size determine 1-of-64 demux circuits, which could be used to make 64×64 memory systems. With the number of output lines $M$ held constant, and the number of CMOS address lines $n$ allowed to vary, we ask: what are the best constant-weight codes (we can find) for each value of $n$? The area of the mixed-scale crossbar implementing an $(n, M, d_{\min}, w)$ demux is proportional to $nM$, so codes with low $n$ are desirable to minimize chip area requirements. In [11], for $M = 64$ constant-weight codes, some lower bounds for $q_d$ were established for certain ranges of $n$. Some specific codes were also given, for which the $q_d$-values equal the lower bounds. These are the codes in Table I; the actual codes are explicitly described in the appendix. For codes shown in the [...]; the $n = 19$ code of row 5 of Table I is more efficient (shorter) than this code.
Constant-weight codes appropriate for other demultiplexer sizes (e.g., $M = 128$ or $256$) can be obtained similarly to those presented above for $M = 64$. The starting point is usually a "good" constant-weight code in the conventional sense of coding theory (e.g., with a large value of $M$ for given $n$ and $d_{\min}$), to which coding-theoretic operations of expurgation, shortening, and augmentation [7] are applied to optimize the code for our requirements (i.e., a low $q_d$ ratio). The mathematical tools from [10] still provide lower bounds on the best achievable value of $q_d$ given $M$ and $n$. However, as the code size increases, there might be gaps between the best parameters achieved by the constructions and the bounds of [10], which might leave the question of code optimality open. Nevertheless, from a practical point of view, the codes obtained still provide excellent performance for the application at hand.
In [6], we presented a demux circuit construction that utilized complementary repeated codes based on linear codes. They are constructed (see Eq. 8 of [6]) by taking a starting linear code and appending to each codeword its complement, an operation that was called "balancing" in [6]. When starting from an $(n_0, 2^k, d_0)$ linear code, this operation produces a $(2n_0, 2^k, 2d_0, n_0)$ constant-weight code. These are balanced codes, since they have equal numbers of ones and zeroes in each codeword. They are therefore a specialized subclass of the constant-weight codes. However, they cannot be efficient in terms of length $n$, because of the doubling of length inherent in their construction, and general constant-weight codes will be shorter. For example, the (11, 66, 4, 5) code from Row 3 of Table I is [...], but with $n = 22$ it is twice as long as the optimal code.
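The balancing operation is easy to demonstrate on a small case. The Python sketch below spans a tiny linear code from two generator words, appends the complement to each codeword, and checks that the result is a balanced constant-weight code whose distances are doubled. The generator words and function names are chosen for illustration; they are not taken from [6].

```python
from itertools import combinations, product

def span(generators):
    """All GF(2) linear combinations of the generator bitstrings, sorted."""
    n = len(generators[0])
    words = set()
    for coeffs in product([0, 1], repeat=len(generators)):
        word = [0] * n
        for c, g in zip(coeffs, generators):
            if c:
                word = [a ^ int(b) for a, b in zip(word, g)]
        words.add("".join(map(str, word)))
    return sorted(words)

def balance(code):
    """Append to each codeword its bitwise complement."""
    return [c + "".join("1" if b == "0" else "0" for b in c) for c in code]

def distances(code):
    """Set of Hamming distances between distinct codewords."""
    return {sum(a != b for a, b in zip(x, y)) for x, y in combinations(code, 2)}

linear = span(["011", "101"])  # the length-3 even-weight code
balanced = balance(linear)
print(linear, distances(linear))
print(balanced, distances(balanced))
```

The starting code has length 3, four codewords, and minimum distance 2; the balanced code has length 6, weight 3, and minimum distance 4. In this instance the balanced code coincides with the code listed for Fig. 1.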
V. ENCODER AREA
The complexity of the encoder circuit for a general constant-weight code may be greater than that of a complementary repeated code, resulting in a cost of increased chip area. However, there are several reasons why this may be irrelevant for code-based demux circuits. (a) A large number of memory arrays may be serviced by a single encoder, so that the encoder cost is amortized over many memory blocks [1] . (b) Even without amortization, the "good" codes described for this application are often constructed by starting from a good linear code (e.g., the Golay code [8] ) and modifying it. Thus, the structure of the original linear code can still be exploited to design encoders that are significantly simpler than a full lookup table. (c) The area of the possibly larger encoder is counterbalanced by the reduced area of the demux's mixed-scale crossbar (half the area, in the above example of the n=11 vs. n=22 codes). Therefore, a larger encoder area is not likely to be a significant disadvantage when using constant-weight codes in designing demux circuits for the nano-scale memory application.
VI. CONCLUSIONS
We analyzed demuxes based on general constant-weight codes and the entire memory circuit controlled by them. We were able to set lower bounds on the shortest codes (smallest demuxes) that could provide a particular voltage margin in a memory, and then we identified codes that satisfied these bounds for the case of a 64×64 crossbar. We made a number of simplifying assumptions in order to obtain a quantitative criterion (18) for evaluating the efficiency of the codes:
• The resistors in the crossbar are linear.
• The configured resistors in the crossbar all have the same value of resistance.
• The non-configured resistors in the crossbar are open (infinite resistance).
• The conventional wires and nanowires of the crossbar have negligible resistance.
• The CMOS voltage sources driving the demuxes do not sag under load.
• The loading by the memory array has a negligible effect on the demux output voltages.
• Two identical demuxes are used to drive the rows and columns.
• The hysteretic resistors in the crossbar all have identical write-thresholds and destruction thresholds, and the magnitudes of these thresholds are equal for both polarities.

Although not all of these assumptions will hold for real nano-scale crossbar memories, the constant-weight codes defined by the criterion of (18) will still be preferable in most such systems because the shorter codes will still be more efficient for memories in which the analog circuit properties deviate from our assumptions. We have therefore disentangled the issue of choosing a code from the complicated analog circuit issues that arise in designing a resistor-based demux.
ACKNOWLEDGMENT
We gratefully acknowledge J. Straznicky for valuable discussions, and the Defense Advanced Research Projects Agency (DARPA) of the United States for partial support. 
