Nanowire addressing with randomized-contact decoders  by Rachlin, Eric & Savage, John E.
Theoretical Computer Science 408 (2008) 241–261
Nanowire addressing with randomized-contact decoders
Eric Rachlin, John E. Savage ∗
Computer Science, Brown University, Providence, RI 02912-1910, United States






Abstract
Methods for assembling crossbars from nanowires (NWs) have been designed and
implemented, and methods for controlling individual NWs within a crossbar have also been
proposed. However, implementation remains a challenge. A NW decoder is a device that
controls many NWs with a much smaller number of lithographically produced mesoscale
wires (MWs). Unlike traditional demultiplexers, all proposed NW decoders are assembled
stochastically. In a randomized-contact decoder (RCD), for example, field-effect transistors
are randomly created at about half of the NW/MW junctions.
In this paper, we tightly bound the number of MWs required to produce a correctly
functioning RCD with high probability. We show that the number of MWs is logarithmic
in the number of NWs, even when manufacturing errors occur. We also analyze the
overhead associated with controlling a stochastically assembled decoder. As we explain,
lithographically-produced control circuitry must store information regarding which MWs
control which NWs. This requires more area than the MWs themselves, but has received
little attention elsewhere. Finally, we analyze several simple testing algorithms for
configuring this control circuitry. We demonstrate an unexpected tradeoff between testing
time and the number of MWs required by an RCD.
© 2008 Elsevier B.V. All rights reserved.
1. Introduction
The dream of nanoscale computing was first articulated by Richard Feynman in his 1959 speech to the American Physical
Society. He argued that no physical law prevented the room-sized computers of the 1950’s from being replaced with vastly
more powerful pin-sized computers built from billions of nanoscale devices. Since then computers have become orders of
magnitude smaller, faster and more powerful. Their wires and gates, however, have not yet reached the nanoscale (i.e., the
dimensions of individual molecules).
Although individual nanoscale devices have been demonstrated, we lack the ability to place these devices with nanoscale
precision. As a result, near-term nanoscale architectures will be assembled stochastically. The range of device variation these
architectures must tolerate makes them fundamentally different from today’s CMOS. Our approach to nanoscale circuit
design must change accordingly.
For the past 30 years, chips with ever-shrinking features have been produced using photolithography. Wires and gates are
defined using light on a silicon substrate. This allows many copies of a chip to be produced from a single set of costly masks.
Although photolithography allows for a very wide range of circuit designs, the wavelength of light (193 nm or greater) is
too large to allow for features on the order of a few nanometers, the range considered in this paper. Nanoscale architectures
require new manufacturing technology.
∗ Corresponding author.
E-mail addresses: eerac@cs.brown.edu (E. Rachlin), jes@cs.brown.edu (J.E. Savage).
doi:10.1016/j.tcs.2008.08.011
Fig. 1. A crossbar formed from two orthogonal sets of NWs with programmable molecules (PMs) at the crosspoints defined by intersecting NWs. NWs are
divided into contact groups by connecting them to ohmic contacts (OCs). To activate a NW in one dimension, a contact group is activated and MWs are
used to deactivate all but one NW in that group. Data is stored at a crosspoint by applying a large electric field across it. Data is sensed with a smaller field.
A particularly viable basis for nanoscale architectures that has received significant attention in the physical science and
engineering communities is the nanowire crossbar [1,2] (see Fig. 1). Here a grid of nanowires (NWs) provides control over
molecular devices that reside at their crosspoints. Like traditional crossbars, NW crossbars can act as memories (such as
RAM) and circuits (such as PLAs) [3,4]. Unlike traditional crossbars, however, their assembly is stochastic. This results in
three very important challenges:
(1) NWs are randomly assigned physical addresses.
(2) A testing procedure is required to configure a crossbar’s control circuitry.
(3) Permanent and transient faults must be tolerated.
To overcome these challenges, nanoscale crossbar-based architectures rely on stochastically assembled NW decoders.
A nanowire decoder is any device capable of controlling the resistances of individual NWs, using larger lithographically-
produced mesoscale wires (MWs) and ohmic contacts (OCs). In Section 2 we explain how NW decoders can be used to
control a NW crossbar-based memory.
To date, all proposed NW decoders rely on a stochastic assembly process. Three types of NW decoder have been analyzed
probabilistically: ‘‘encoded NW decoders’’, ‘‘mask-based decoders’’, and ‘‘randomized-contact decoders’’. In Section 3 we
describe these decoders and summarize their performance. We also provide a new bound on the number of MWs an encoded
NW decoder requires to control a large fraction of all NWs.
In Section 4 we examine in detail the addressing of NWs by MWs and derive conditions that must be met by the
resistances of NW/MW junctions in order to address NWs correctly. This allows us to give a model of NW decoders that
explicitly takes manufacturing errors into account.
In Section 5 we use this model to derive probabilistic bounds on the number of MWs needed to address NWs with the
randomized-contact decoder (RCD). We bound the number of MWs needed to address all NWs connected to a single OC. We
also bound the number of MWs required to address some fixed fraction of NWs across all OCs. Our analysis demonstrates
that RCDs are efficient and robust. They can reliably control a large number of NWs using a small number of MWs, even
when some fraction of contacts between MWs and NWs are defective.
Our bounds improve upon the analysis of [5]. They take errors into account, and also make explicit the probability that the
bounds hold. Additionally, their derivation is more precise, as it avoids an unnecessary independence approximation (see
Section 3.3). This ensures that our bounds apply even to small groups of NWs connected to OCs. In the case of a large number
of NWs connected to an OC, our bound on the number of MWs required to address all NWs agrees with the asymptotic
analysis in [5].
We also note that previous work on coping with manufacturing defects in NW decoders has focused on the use of error
correcting codes [6–8]. In these coding-based approaches, a specific error correcting code is used to determine which subsets
of MWs have the ability to address NWs. Viewed in this light, the bounds of Section 5 show that even randomly generated
codes provide good defect-tolerance, with only constant factor overhead.
This is a significant insight, since it eliminates the requirement that a NW decoder provide designers with a great deal
of control over which subsets of MWs can control NWs. This level of control is present in encoded NW decoders [8], or
programmable NW decoders [9,7] (which themselves require a second NW decoder to configure), but is not present in
RCDs, which are arguably simpler to manufacture.
Since we cannot predict in advance which MWs will control which NWs, NW addresses must be discovered after an RCD
is assembled. These addresses must then be stored and mapped to fixed binary addresses, using programmable circuitry.
Strategies for implementing this mapping and a method for estimating the area used by a crossbar as well as its addressing
circuitry are discussed in Section 6. The need for such circuitry is also mentioned in [3], but the varying overhead associated
with specific mapping strategies is not considered. In [10] specific mappings are considered in the context of encoded NW
decoders, but not the ‘‘Almost All Wires Addressable’’ and ‘‘Take What You Get’’ strategies presented here.
The problem of discovering which subsets of MWs address NWs is discussed in Section 7. An exhaustive search is
considered, as is a randomized search algorithm. The randomized algorithm is analyzed in Section 8. Again, we improve
on work done in [5], relaxing an assumption about what testing circuitry is present. We also correct an independence
assumption that is not met, unless the number of MWs used to control the NWs connected to an OC is much larger than
what the bounds of Section 5 require. Conclusions are drawn in Section 9.
2. Crossbar overview
To motivate our analysis, we briefly describe how NW crossbars can be used as memories. This approach can be extended
to circuits as well [4]. Since our research focuses on controlling individual NWs with MWs, crossbar-based memories offer
sufficient motivation.
2.1. Crossbar assembly
There are multiple approaches to constructing a NW crossbar. Undifferentiated NWs can be stamped onto a chip [11–13],
or alternatively, many types of differentiated NWs can be grown off chip, collected in a large ensemble, then deposited
fluidically [14,15]. In either approach, a molecular layer is deposited between two layers of parallel NWs. This layer can be
comprised of molecular diodes that switch between a low and high resistance in a large electric field [16,17]. A layer of
amorphous silicon has also been proposed as a storage medium [18]. Programmable devices that do not behave like diodes
(e.g. nanoscale resistors or transistors), have also been considered [19,9]. Some of these alternatives have been compared
to diodes with regard to their information storage capacity and ability to control NWs [20,7,21]. When used for information
storage, resistors are inferior to other alternatives [20].
Once a NW crossbar is assembled, g OCs and M MWs can be placed along each dimension of the crossbar, using
photolithography (see Fig. 1). Although each MW controls (makes nonconducting) a subset of the NWs, these subsets cannot
be chosen deterministically. For each NW, we can describe the subset of MWs that control it using a binary M-tuple. We call
this a NW codeword.
In the case of undifferentiated NWs, two methods have been proposed to control NWs with MWs. The first, the
randomized-contact decoder (RCD), is analyzed here. A proposed approach for producing an RCD is to randomly deposit
nanometer-sized particles between NWs and MWs, making each NW/MW junction controlling with some fixed probability
[5,22]. The second method for controlling undifferentiated NWs relies on randomly shifted lithographically-defined regions
of high-K dielectric material between NWs and MWs [23,24]. Here again, each MW is made to control some random subset
of the NWs. Additional methods for producing decoders are possible using differentiated NWs [25,26].
2.2. Crossbar operation
In a crossbar memory, NWs along each dimension are divided up into g contact groups of N NWs each. NWs in each
contact group are connected to a common OC. To use the crossbar as a memory, a voltage is applied to a single contact group
of NWs along each dimension of the crossbar. Subsets of MWs along each dimension are then used to address NWs within
each of the two groups. This operation can either read or write a single bit to the crosspoints of the NWs being addressed
(see Fig. 2). If multiple NWs connected to an OC are addressed by the same set of MWs, it may be acceptable to store the
same bit at multiple crosspoints.
In a write operation, the diodes at crosspoints are turned on or off by applying a large potential between one or more
pairs of orthogonal NWs, by addressing (giving low resistance to) one or more NWs in each dimension. Both ends of the
NWs are maintained at the same potential. The polarity of the potential determines the state of a crosspoint and the value
written.
In a read operation, a smaller voltage is used, allowing the decoder to detect the state of crosspoints. In a read operation
each NW is disconnected from one of its ohmic contacts. Current will either flow or not flow through a crosspoint, depending
on its state. The amount of current reveals the resistive state of the crosspoints, and thus the value being stored.
Both read and write operations require that the NWs being addressed have a significantly lower resistance than the other
NWs in the same contact group. This requirement is formalized at the beginning of Section 4.
2.3. Address translation circuitry
When a memory is supplied with a particular external binary address, address translation circuitry (ATC) along each
dimension of the crossbar maps that address to a contact group and MW input. This mapping depends on the stochastic
assembly of the decoder. To ensure each external address addresses some NW, the ATC must store information about which
MWs control which NWs. In the next subsection, we discuss how this information can be obtained. For now, assume it is
known and consider the storage overhead required.
Fig. 2. A crossbar-based memory in which OCs and MWs read and write data to programmable molecules at crosspoints. The darkened segments along
each NW indicate lightly doped regions. These regions become nonconducting when the adjacent MW is turned on. In a read operation an OC at each end
of a NW is disconnected from ground. Current flows through any conducting NW crosspoints that are addressed by MWs. The amount of current reveals
the value stored at the crosspoints. In a write operation, NWs along each dimension apply a larger electric field across their crosspoints. The direction of
the field determines the value stored at the crosspoints. In this figure, the same bit of information is stored at two crosspoints.
In order to make address translation circuitry fast, reliable, and easy to manufacture, it may be implemented in CMOS.
Any approximation of the area of the memory must take into account not just the area of MWs and ohmic contacts, but also
the area used to store physical NW addresses using CMOS. We explicitly model the size of address translation circuitry in
[10], but it has received less attention elsewhere. The appendix of [3] also estimates the area required for the ATC, but does
so without exploring how different address mapping strategies affect area requirements. The prospect of implementing the
ATC using nanoscale storage is considered in [9].
Most previous work on NW decoders has focused on the number of MWs required to control NWs. Although MWs are
much wider than NWs, they are still relatively small. In an RCD, however, each NW/MW junction corresponds to a bit of
storage in address translation circuitry. As a result, these bits, when stored in mesoscale devices, collectively take up far
more area than the NW/MW junctions.
The ATC must associate a contact group and codeword with each external address. In the worst case, this requires
$\log_2 g + M$ bits of storage per NW. In some cases fewer bits are required. For example, if all NWs can be addressed, and
the number of NWs per contact group is a power of 2, the high order bits of an external address can be used to index a
contact group. The address translation circuitry now requires only M bits of storage for each NW. The way in which external
addresses are mapped to NWs is called an addressing strategy. Later, several addressing strategies are discussed in detail.
As we explain, some addressing strategies require more overall area than others.
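As an illustration of the storage accounting above, the following Python sketch tallies ATC bits for a decoder with g contact groups, N NWs per group and M MWs. It is a minimal sketch under stated assumptions: the function names are hypothetical, and the contact-group index is rounded up to a whole number of bits when g is not a power of 2.

```python
def atc_bits_per_nw(g: int, M: int, group_from_address_bits: bool = False) -> int:
    """Bits of address-translation storage needed for one NW.

    Worst case: an index naming the contact group plus an M-bit codeword.
    If the high-order bits of the external address already select the
    contact group, only the M-bit codeword has to be stored.
    """
    if group_from_address_bits:
        return M
    group_bits = (g - 1).bit_length()  # ceil(log2 g) for g >= 1 (assumed rounding)
    return group_bits + M


def atc_total_bits(g: int, N: int, M: int, group_from_address_bits: bool = False) -> int:
    """Total ATC storage along one dimension, with N NWs in each of g groups."""
    return g * N * atc_bits_per_nw(g, M, group_from_address_bits)


worst_case = atc_total_bits(g=8, N=16, M=20)
indexed = atc_total_bits(g=8, N=16, M=20, group_from_address_bits=True)
print(worst_case, indexed)
```

With these illustrative parameters the worst case stores 23 bits per NW, while indexing the contact group from the external address stores 20.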
2.4. Address discovery
We also explore the problem of testing. In an RCD, each NW has a physical address determined by which MWs control
it. Since addresses are randomly generated during decoder assembly, they must be discovered. This is a difficult problem,
as some addresses mask others, and faults make test outputs unreliable. We evaluate the effectiveness of several simple
testing procedures that do not require read/write operations.
In [10], an efficient testing procedure involving read/write operations was given for differentiated NW decoders. The
algorithm’s reliance on nanoscale storage devices is a drawback. Read/write operations are relatively time consuming, and
possibly faulty. Also, in circuits, they may not be possible at all (as not all NWs will be used to control nanoscale storage
devices).
The testing algorithms we consider are only allowed to apply a voltage across a contact group, turn on a subset of the
MWs, and observe if any NW remains conducting. This test does not reveal which NW is on, nor does it reveal if multiple
NWs are on. Nonetheless, it is sufficiently powerful to determine which subsets of MWs address individual NWs. As it turns
out, the algorithm in [10] can also be adapted to this model. Unfortunately it does not work for RCDs. A discovery algorithm
for RCDs is given in [5]. We improve upon this algorithm in Section 7, and we also improve upon its analysis.
3. Decoding technologies
In this section we describe three types of NW decoder. Each type of decoder can itself be manufactured in multiple ways.
As shown in Section 4, however, all three decoders can be modeled in a unified way. Using this model, we analyze the
number of MWs required by an RCD in Section 5. In Section 6 we estimate the total amount of area RCDs require.
3.1. The encoded NW decoder
Encoded NW decoders work with two kinds of NWs: modulation-doped NWs [27,28], which have sequences of lightly
and heavily doped regions, and radially encoded NWs [29], which have removable shells. In both cases many NW types
are prepared separately, each with a different encoding. When modulation-doped NWs are used, encodings correspond
to patterns of lightly and heavily doped regions. When radially encoded NWs are used, encodings correspond to sequences
of shells. In either case many NWs of each type are all collected in a large ensemble then deposited onto a chip using fluidic
methods that align the NWs in parallel [30].
When MWs are placed across the NWs, any NW/MW junction comprised of a lightly doped region forms a field effect
transistor (FET). The application of an immobilizing electric field to the MW causes the resistance of the junction to become
high. A NW is addressed by applying fields to all MWs that do not significantly increase its resistance. If doping sequences are
properly chosen, only one type of NW will remain conducting. (See Figs. 1 and 2.) In practice, lightly doped NW regions
will not align perfectly with MWs [29]. Consequently, a MW’s control over a NW can be ambiguous. Several methods for
encoding modulation-doped NWs are studied in [10].
The encoded NW decoder also works with radially encoded NWs, that is, NWs that have shells composed of differentially
etchable materials [29]. There are several ways to control these NWs with MWs. The simplest method uses one MW to control
each type of NW. Each NW type is grown with a different sequence of s shells surrounding a lightly doped core. Once NWs
have been deposited, a different sequence of s shells is etched away in the space reserved for each MW. Under each MW,
only the lightly doped cores of one NW type are exposed, and if shells are sufficiently thick, each MW controls only NWs with
exposed cores. Radially encoded NWs do not suffer from misalignment but may require slightly larger radii than
modulation-doped NWs.
3.1.1. Known results
In an encoded NW decoder, the number of controllable NWs depends on the number of differently-encoded NW types,
C, being used. This, in turn, determines how many MWs, M, are required. In an encoded NW decoder, each NW is equally
likely to contain each encoding. NWs with different encodings can be addressed separately. When NWs are encoded using
‘‘binary reflected codes’’ [10], $M = 2\log_2 C$. Using ‘‘h-hot codes’’ [25], M can be reduced to close to $\log_2 C$. Despite using
more MWs, binary reflected codes require the same amount of ATC. The main advantage of binary reflected codes is that
they allow for ‘‘wildcarding’’, which in turn yields a simple testing algorithm [10]. If the N NWs in a given contact group
each have a different encoding, these encodings can be discovered in time O(NM), which is optimal.
The area required for an encoded NW decoder also depends on C. In order for all NWs in a contact group to be addressable
with probability $1 - \epsilon$, the number of NW types, C, must be at least $N(N-1)/(-2\ln(1-\epsilon))$ [10]. Half of the NWs in a
contact group are addressable with probability $1 - \epsilon$ if C is at least $e^{(N-1-2\ln\epsilon)/(N+1)}(N-1)/2$ [10]. It was demonstrated
in [10] that requiring half of the NWs be addressable requires significantly less total area than requiring that all NWs be
addressable. Additionally, it requires fewer NW encodings be manufactured.
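To get a feel for how quickly the required number of NW types grows, the bound for making all N NWs in a contact group addressable can be evaluated numerically. The sketch below is illustrative only: the function names are hypothetical, and the second function is a direct product over independent, uniformly chosen encodings (the assignment model stated for encoded NW decoders), included as a sanity check rather than as part of the cited derivation.

```python
import math


def min_types_all_addressable(N: int, eps: float) -> int:
    """Smallest integer C satisfying C >= N(N-1) / (-2 ln(1 - eps))."""
    return math.ceil(N * (N - 1) / (-2.0 * math.log(1.0 - eps)))


def prob_all_distinct(N: int, C: int) -> float:
    """Probability that N independent, uniform draws from C types all differ."""
    prob = 1.0
    for i in range(N):
        prob *= (C - i) / C
    return prob


C_min = min_types_all_addressable(N=10, eps=0.01)
p_distinct = prob_all_distinct(10, C_min)
print(C_min, round(p_distinct, 4))
```

For N = 10 and a 1% failure budget, several thousand NW types are needed, and the directly computed probability of ten distinct encodings lands very close to 0.99.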
As explained in [10], NWs in an encoded NW decoder are assigned encodings with equal probability. Since each encoding
can be addressed separately, the probability that a NW is individually addressable is $(1 - 1/C)^{N-1}$, and the expected number
of individually addressable NWs per contact group is $N(1 - 1/C)^{N-1}$. This observation allows us to directly apply the
derivations of Theorem 5.2 and Corollary 5.2 that appear in Section 5.1. Doing this gives the following result, which does not
appear elsewhere.
Theorem 3.1. Let $N'_a$ be the total number of individually addressable NWs in an encoded NW decoder with g contact groups, N
NWs per contact group, $N' = gN$ NWs in total and M MWs. Then
$$P(N'_a > \kappa N') \ge 1 - \epsilon$$
if $\kappa \le (1 - 1/C)^{N-1} - \sqrt{-\ln\epsilon/(2g^*)}$, where $g^* = g(N/(N-1))^2$.
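The condition in Theorem 3.1 is easy to evaluate for concrete parameters. The following sketch computes the expected number of individually addressable NWs per contact group and the largest fraction κ the theorem guarantees; the function names and the sample values of N, C, g and the failure probability are assumptions chosen purely for illustration.

```python
import math


def expected_individually_addressable(N: int, C: int) -> float:
    """Expected individually addressable NWs per contact group: N (1 - 1/C)^(N-1)."""
    return N * (1.0 - 1.0 / C) ** (N - 1)


def kappa_upper_limit(N: int, C: int, g: int, eps: float) -> float:
    """Largest kappa allowed by the theorem's condition.

    kappa <= (1 - 1/C)^(N-1) - sqrt(-ln(eps) / (2 g*)),  g* = g (N/(N-1))^2
    """
    g_star = g * (N / (N - 1)) ** 2
    return (1.0 - 1.0 / C) ** (N - 1) - math.sqrt(-math.log(eps) / (2.0 * g_star))


expected = expected_individually_addressable(N=10, C=1000)
kappa = kappa_upper_limit(N=10, C=1000, g=100, eps=0.01)
print(round(expected, 3), round(kappa, 3))
```

In this example about 9.9 of every 10 NWs are individually addressable in expectation, while the guaranteed fraction is noticeably lower (roughly 0.85) because of the square-root deviation term, which shrinks as the number of contact groups g grows.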
Also, as mentioned in Section 2.2, it is not always necessary to address NWs individually. If two NWs have the same
encoding (see Fig. 2), they can still be addressed collectively and used to store the same bit at multiple NW crosspoints. In
an encoded NW decoder, the expected number of different encodings per contact group is $C(1 - (1 - 1/C)^N)$. If addressing
groups of NWs is acceptable, the above theorem can be modified.
Theorem 3.2. Let $N'_a$ be the total number of addresses in an encoded NW decoder with g contact groups, N NWs per contact
group, $N' = gN$ NWs in total and M MWs. Then
$$P(N'_a > \kappa N') \ge 1 - \epsilon$$
if $\kappa \le (C/N)(1 - (1 - 1/C)^N) - \sqrt{-\ln\epsilon/(2g^*)}$, where $g^* = g(N/(N-1))^2$.
3.2. The mask-based decoder
Mask-based decoders [23] work with uniform NWs [13,31]. Lithographically-defined high-K dielectric rectangles are
deposited between NWs and MWs. A rectangle amplifies a MW’s electric field, allowing it to increase the resistance of the
lightly-doped NWs underneath. If rectangles can be as small as the pitch of NWs, they can be used with $M = 2\log_2 N$
MWs to cause all but one NW to have high resistance, as suggested in Fig. 3a. Unfortunately, rectangles cannot be made as
small as the pitch of NWs. Instead, many randomly shifted copies of the smallest lithographically-defined rectangles can be
deposited on a chip. The natural randomness in their locations provides control over NWs with high probability [23] (see
Fig. 3b). One difficulty, however, is that nanoscale misalignment of a high-K dielectric region can cause a particular MW
to only partially turn off a particular NW. A similar problem will arise if the boundary of a high-K dielectric region is not
sufficiently sharp, although [23] appears to indicate that nanoscale transitions between high-K and low-K dielectric regions
are feasible.
Fig. 3. (a) A mask-based NW decoder in which regions of high-K dielectric allow each MW to control a different subset of NWs. If arbitrarily small high-K
dielectric regions could be manufactured and placed with nanoscale precision, $2\log_2 N$ MWs could be used to address each of N NWs. (b) Since this is not
the case, many randomly shifted copies of the smallest manufacturable region can be used to gain control over individual NWs.
Fig. 4. A randomized-contact decoder in which random particle deposition causes each MW to control certain NWs.
3.2.1. Known results
The number of MWs, M, needed to control N NWs is estimated to be at least six times the number required with an
encoded NW decoder [32]. Unless masks can be placed with sub-NW pitch accuracy, the number of MWs required for all N
NWs in a contact group to be addressable with probability $1 - \epsilon$ is approximately $2(N-1)\ln(2(N-1)/\epsilon)$. Even though M is
large, each NW can be addressed using a small number of MWs. As a result, the area required for ATC remains reasonable.
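The gap between the idealized and the randomly shifted mask-based decoder can be made concrete by evaluating both MW counts. This is a rough sketch, assuming the parameter in the approximate expression is an allowed failure probability (called `eps` here) and rounding the ideal $2\log_2 N$ count up to whole wires; the function names are hypothetical.

```python
import math


def ideal_mask_mw_count(N: int) -> int:
    """MWs needed if rectangles could match the NW pitch: 2 * ceil(log2 N)."""
    return 2 * (N - 1).bit_length()


def shifted_mask_mw_estimate(N: int, eps: float) -> float:
    """Approximate MW count with randomly shifted masks: 2(N-1) ln(2(N-1)/eps)."""
    return 2 * (N - 1) * math.log(2 * (N - 1) / eps)


ideal = ideal_mask_mw_count(8)
estimate = shifted_mask_mw_estimate(8, eps=0.01)
print(ideal, round(estimate, 1))
```

Even for a contact group of only 8 NWs, the estimate is on the order of a hundred MWs compared with 6 in the ideal case, which matches the remark that M is large for this decoder.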
3.3. The randomized-contact decoder
A randomized-contact decoder is any decoder in which NW/MW connections can be modeled as independent random
variables. In an RCD, a MW provides strong control over a NW with probability p, very weak or no control with probability q
and intermediate or ambiguous control with probability r = 1 − (p + q). Since this third case models a manufacturing error,
we do not assume that p + q = 1. This is a very practical generalization of the error-free model given in [5]. In Section 5 we
bound the number of MWs, M, required to tolerate a given error rate.
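The three-outcome junction model can be simulated directly. The sketch below draws each NW/MW junction independently as strong control (1), weak or no control (0), or ambiguous control ("e"); the function name, the use of Python's `random` module, and the symbol "e" for an erroneous junction are choices made here for illustration.

```python
import random


def sample_rcd_codewords(N, M, p, q, seed=None):
    """Draw an N x M table of junction outcomes for one contact group.

    Each junction is 1 with probability p, 0 with probability q,
    and "e" (ambiguous control) with probability r = 1 - (p + q).
    """
    if p < 0 or q < 0 or p + q > 1:
        raise ValueError("p and q must be non-negative with p + q <= 1")
    rng = random.Random(seed)
    codewords = []
    for _ in range(N):
        row = []
        for _ in range(M):
            u = rng.random()  # uniform in [0, 1)
            if u < p:
                row.append(1)
            elif u < p + q:
                row.append(0)
            else:
                row.append("e")
        codewords.append(row)
    return codewords


sample = sample_rcd_codewords(N=4, M=6, p=0.45, q=0.45, seed=1)
for row in sample:
    print(row)
```

Setting p + q = 1 recovers the error-free case, in which every junction is either controlling or non-controlling.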
Williams and Kuekes first proposed the randomized-contact decoder (RCD) in [33]. There are a number of ways an RCD
might be produced. One approach is to randomly deposit impurities (such as gold particles) onto undifferentiated NWs (see
Fig. 4). Another approach is to randomly deposit small regions of high-K dielectric. An RCD can also be constructed from
axially encoded NWs. If many sets of axially encoded NWs are produced with randomly placed lightly doped regions, each
NW/MW junction can be treated as an independent random variable. As a result, analysis of RCDs provides bounds that
apply to axial (and similarly radial) decoders.
There is also another interesting relationship between RCDs, encoded NW decoders, and mask-based decoders. In
a mask-based decoder, there is significant correlation regarding which NWs a given MW controls. In an encoded NW
decoder there is correlation between which MWs control a given NW. In an RCD, neither correlation should exist, although
in practice a small amount of spatial correlation between NW/MW junctions might be present.
Hogg et al. [5] have explored the conditions under which most of the N NWs in an RCD can be controlled by a set of M
MWs. They demonstrate through simulation that when M passes a threshold, which is around $4.8\log_2 N$, the probability
that all NWs are addressable grows rapidly as N increases. This is in agreement with our Corollary 5.3.
Their asymptotic analysis does not make explicit the dependence of M on N and the probability $\epsilon$ of failing to have all
NWs be addressable. It also fails to capture the impact of manufacturing errors. We develop tight bounds for both purposes
in Section 5, and do so without the independence approximation used in [5], namely that pairs of NWs can be analyzed
independently with regard to whether or not each can be controlled separately. We also give a more careful analysis of the
value of M required for at least some fixed fraction of NWs to be addressable.
Fig. 5. On the left, the crosspoint being read has a high resistance, but all other crosspoints have a low resistance. On the right, however, the crosspoint
being read has a low resistance, but all other crosspoints have a high resistance. To correctly determine the state of the crosspoint, the amount of current
flowing from one dimension of the crossbar to the other must be greater on the right than the left.
4. Decoder requirements
In Section 5, we bound the number of MWs, M, required by RCDs with N NWs per OC to control many individual NWs.
To derive these bounds, we first define the requirements that decoders must meet. The conditions we obtain in this section
apply to other types of decoders as well.
4.1. Nanowire addressing
As explained in Section 2, read/write operations are performed in a NW crossbar-based memory by employing an address
decoder in each dimension of the memory. If each decoder addresses at least D disjoint sets of NWs, they collectively control
$D^2$ disjoint sets of NW crosspoints, each of which can store a bit.
Since each of the two decoders is comprised of g contact groups, $D = \sum_{i=1}^{g} D_i$, where $D_i$ is the number of disjoint sets of
NWs that can be addressed within the ith contact group.
Let $R_i$ be the resistance of NW $n_i$. When a decoder addresses a set S of NWs within a single contact group, each NW in S
has a low resistance, while the NWs not in S have a high resistance. In a write operation, every NW in S must have a much
lower resistance than every NW not in S, that is, $\max(R_i \mid n_i \in S) \ll \min(R_i \mid n_i \notin S)$. This ensures that the bits associated
with NWs in S are written whereas those not in S are not written. A read operation requires that the combined resistance
of all NWs in S, $R_{IN}$, be much less than the combined resistance of NWs not in S, $R_{OUT}$, that is, $1/R_{OUT} \ll 1/R_{IN}$. Consider
the two extremes illustrated in Fig. 5. Since the resistance R of a set of n resistances, $R_1, \ldots, R_n$, placed in parallel satisfies
$1/R = 1/R_1 + \cdots + 1/R_n$, this is equivalent to $\sum_{n_i \notin S} 1/R_i \ll \sum_{n_i \in S} 1/R_i$.
Definition 4.1. A set, S, of NWs is addressed if and only if a) every NW not in S has a resistance that is at least $\alpha$ times that
of every NW in S and b) the combined resistance of all NWs not in S is at least $\alpha$ times that of the combined resistance of
all NWs in S, where $\alpha \gg 1$.
In general, the choice of an actual value of α is application specific. A larger value, for example, would be required to read
data from molecular devices with poor on/off ratios. A larger value would also facilitate reading data more quickly or more
reliably.
Following the above analysis, if $R_i \le R_L$ when $n_i \in S$ and $R_i \ge R_H$ when $n_i \notin S$, the condition on writing is satisfied
when $R_H \ge \alpha R_L$ and that on reading is satisfied when $R_H/(N - |S|) \ge \alpha R_L/|S|$. This read condition is hardest to meet when
$|S| = 1$, in which case $R_H \ge \alpha(N-1)R_L$. This is clearly stronger than the write condition $R_H \ge \alpha R_L$.
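The two conditions of Definition 4.1 can be checked mechanically for a given list of NW resistances. The following is a minimal sketch, assuming strictly positive (possibly infinite) resistances and identifying NWs by list index; the function name is hypothetical. The read condition is compared in terms of conductances so that infinite resistances do not cause a division by zero.

```python
def is_addressed(resistances, S, alpha):
    """Check both conditions for the set S of NW indices to be addressed.

    (a) write: every NW outside S has at least alpha times the resistance
        of every NW inside S;
    (b) read: the parallel resistance of the NWs outside S is at least
        alpha times the parallel resistance of the NWs inside S.
    """
    inside = [r for i, r in enumerate(resistances) if i in S]
    outside = [r for i, r in enumerate(resistances) if i not in S]
    if not inside:
        return False
    if not outside:
        return True
    write_ok = min(outside) >= alpha * max(inside)
    # R_out >= alpha * R_in  is equivalent to  alpha * G_out <= G_in,
    # where G is the summed conductance (1/R) of a parallel group.
    g_in = sum(1.0 / r for r in inside)
    g_out = sum(1.0 / r for r in outside)
    read_ok = alpha * g_out <= g_in
    return write_ok and read_ok


# N = 4 NWs, one addressed NW with R_L = 1, alpha = 10.
print(is_addressed([1.0, 40.0, 40.0, 40.0], {0}, alpha=10))  # R_H = 40 exceeds alpha*(N-1)*R_L = 30
print(is_addressed([1.0, 20.0, 20.0, 20.0], {0}, alpha=10))  # R_H = 20 meets only the write condition
```

The second call shows why the read condition dominates: a resistance ratio of 20 satisfies the write condition for α = 10, but three such NWs in parallel leak too much current during a read.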
4.2. Resistive and ideal models of control
A NW decoder addresses a set of NWs by applying an electric field to a subset of the MWs. These MWs are said to be
activated. The set of activated MWs is called an activation pattern. A particular activation pattern, $a$, is represented as a
binary vector where $a_j = 1$ if and only if the jth MW is activated. Each activated MW increases each NW’s resistance by
some amount (possibly 0). More formally, NWs behave as follows.
Definition 4.2. In the resistive model of NW control, each NW $n_i$ has initial resistance $\eta_i$ when no MWs are activated.
Associated with each NW is a length-M vector of reals, or a real-valued nanowire codeword, $r^i$. The jth entry of $r^i$, $r^i_j$, is
the amount by which the jth MW increases the resistance of $n_i$ when activated. When the decoder is supplied with $a$, the
resistance of NW $n_i$ is $\eta_i + a \cdot r^i$, where $a \cdot r^i$ is the inner product of activation pattern $a$ and codeword $r^i$.
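The resistive model amounts to one inner product per NW. The sketch below computes each NW's resistance under a given activation pattern; the function name and the sample numbers are illustrative assumptions. Contributions are summed only over activated MWs, so that an infinite entry paired with an inactive MW does not produce an undefined 0 × ∞ product in floating point.

```python
def nw_resistances(eta, r, a):
    """Resistance of each NW under activation pattern a.

    eta[i]  : initial resistance of NW i with no MWs activated
    r[i][j] : resistance added to NW i when MW j is activated
    a[j]    : 1 if MW j is activated, 0 otherwise
    """
    resistances = []
    for eta_i, r_i in zip(eta, r):
        added = sum(r_ij for a_j, r_ij in zip(a, r_i) if a_j == 1)
        resistances.append(eta_i + added)
    return resistances


eta = [1.0, 2.0]
r = [[0.0, 100.0],
     [float("inf"), 0.5]]
print(nw_resistances(eta, r, a=[0, 1]))  # only the second MW is activated
print(nw_resistances(eta, r, a=[1, 0]))  # only the first MW is activated
```

Activating the first MW alone leaves the first NW at its initial resistance and drives the second to infinite resistance, so the first NW is the one addressed.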
When the jth MW provides strong control over NW $n_i$, $r^i_j$ is large. $r^i_j$ is small when the jth MW provides weak control over
$n_i$. In the ideal case, each $r^i_j$ is either 0 or $\infty$ and a codeword is associated with each NW. Note that multiple NWs may have
the same codeword.
Definition 4.3. In the ideal model of NW control, each NW $n_i$ is assigned a binary codeword, $c^i$, where $c^i_j = 1$ if and only
if $r^i_j = \infty$. For a particular activation pattern, $a$, $a \cdot c^i > 0$ if and only if $a \cdot r^i = \infty$. A set S of NWs is addressed when
$a \cdot c^i = 0$ for NWs $n_i$ in S and $a \cdot c^j \ge 1$ for NWs $n_j$ not in S.
In either model of control, a set, S, of NWs is considered addressable if there is some activation pattern such that S
is addressed. Similarly, a particular NW ni is individually addressable if there is an activation pattern such that {ni} is
addressed. A codeword ci is individually addressable if each NWwith that codeword is individually addressable.
Notice that in the ideal model of NW control, if a binary codeword $c^i$ is addressable, the NWs with that codeword are addressed by the activation pattern $a = \bar{c}^i$, the bitwise complement of $c^i$, since then $a \cdot c^i = 0$. Furthermore, if $c^i$ is not addressable, there is some other codeword $c^k$ such that for each j it is not true that $c^i_j = 0$ and $c^k_j = 1$. This is the mathematical definition of implication; that is, $c^k_j$ implies $c^i_j$. When this condition holds for all values of j, we say that $c^k$ implies $c^i$, and write $c^k \Rightarrow c^i$. The following is immediate.
Lemma 4.1. In a simple NW decoder in the ideal model of control, a NW codeword $c^i$ is addressable if and only if no other codeword that is present implies $c^i$. The decoder can address D disjoint sets of NWs if and only if D distinct NW codewords are addressable.
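The implication test of Lemma 4.1 can be sketched in a few lines of Python. This is only an illustrative sketch: the function names and the three 4-bit example codewords are assumptions made here, not part of the paper, and the ideal model (binary codewords, no errors) is assumed throughout.

```python
from typing import List, Sequence, Tuple

Codeword = Tuple[int, ...]  # 1 = controlling junction, 0 = noncontrolling


def implies(ck: Codeword, ci: Codeword) -> bool:
    """True if ck => ci: no index j has ci[j] == 0 while ck[j] == 1."""
    return not any(x == 0 and y == 1 for x, y in zip(ci, ck))


def addressable_codewords(codewords: Sequence[Codeword]) -> List[bool]:
    """Lemma 4.1 test: a codeword is addressable iff no other distinct
    codeword that is present implies it."""
    distinct = set(codewords)
    return [not any(implies(ck, ci) for ck in distinct if ck != ci)
            for ci in codewords]


def complement(c: Codeword) -> Codeword:
    """Bitwise complement; used as the activation pattern for codeword c."""
    return tuple(1 - b for b in c)


def conducting(a: Codeword, codewords: Sequence[Codeword]) -> List[int]:
    """Indices of NWs with a . c == 0, i.e. NWs left on by pattern a."""
    return [i for i, c in enumerate(codewords)
            if sum(x * y for x, y in zip(a, c)) == 0]


# Hypothetical example: the second codeword implies the first.
codes = [(1, 1, 1, 0), (1, 1, 0, 0), (0, 0, 1, 1)]
print(addressable_codewords(codes))            # which codewords are addressable
print(conducting(complement(codes[1]), codes))  # NWs addressed by complement of codes[1]
```

In this example the activation pattern that would isolate the first codeword also leaves the second NW conducting, which is exactly the situation the lemma excludes.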
4.3. Modeling errors
If each $r^i_j$ takes value $r_{low} = 0$ or $r_{high} = \infty$, each real-valued codeword can be mapped to a binary codeword, which is simple to work with. When $r_{low}$ and $r_{high}$ do not hold these extreme values we map $r^i$ to $c^i$ such that:
• $c^i_j = 0$ if $r^i_j \le r_{low}$
• $c^i_j = 1$ if $r_{high} \le r^i_j$
• $c^i_j = e$ if $r_{low} < r^i_j < r_{high}$, meaning that $c^i_j$ is in error.
Our goal is to choose values for rlow and rhigh so that a set S of NWs is addressed by an activation pattern a if the following
conditions hold:
• for $n_i \in S$, $c^i_j = 0$ when $a_j = 1$,
• for $n_k \notin S$, there exists j such that $c^k_j = 1$ and $a_j = 1$.
Consider an activation pattern a that meets these two conditions. Let rbase = maxi ηi. Observe that every NW in S has
resistance at most RL = rbase + (M − 1)rlow because at most M − 1 MWs are activated. Also, note that every NW not in
S has resistance at least RH = rhigh. From Definition 4.1 and the discussion that follows it is clear that S is addressed if
RH ≥ α(N − 1)RL or rhigh ≥ α(N − 1)(rbase + (M − 1)rlow). To simplify the discussion, let rlow = crbase for some constant
c > 0. Then, S is addressed if
rhigh ≥ α(N − 1)(cM − c + 1)rbase
where α ≫ 1.
In the above model with errors we say that NW $n_i$ is addressable if for each other NW $n_k$ there is at least one index (MW) j such that $c^i_j = 0$ and $c^k_j = 1$. This ensures that $c^i$ has low resistance while $c^k$ has high resistance. When this condition fails, $c^i$ may still be addressable but this cannot be guaranteed. We say that a codeword $c^i$ fails to be addressable if there exists another codeword $c^k$ such that no index j satisfies both $c^i_j = 0$ and $c^k_j = 1$. In this case, and by analogy with the ideal model, we say that codeword $c^k$ possibly implies $c^i$, denoted $c^k \stackrel{?}{\Rightarrow} c^i$. If $c^k \stackrel{?}{\Rightarrow} c^i$, there is no guarantee that $n_i$ can be addressed separately from $n_k$.
Lemma 4.2. In a simple decoder in the model with errors, a codeword, $c^i$, is addressable if for no other codeword $c^k$ does $c^k \stackrel{?}{\Rightarrow} c^i$.
The decoder can address D disjoint sets of NWs if and only if D distinct NW codewords are addressable.
If rhigh is too low or rlow is too high to be realized using a particular manufacturing technology, NWs can still be addressed
if we set rlow = crbase and rhigh = (α/d)(N − 1)(cM − c + 1)rbase, but instead require that each NW is addressed by an
activation pattern that activates at least d high resistance junctions in each of the other NWs. This ensures that RH ≥ d rhigh.
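As a hedged numerical illustration of the threshold above, the following sketch evaluates the required value of rhigh from α, N, M, c and rbase, with the optional relaxation by d. The function name and the sample values (α = 10, c = 0.1, rbase = 1) are hypothetical and chosen only to exercise the formula.

```python
def required_r_high(alpha: float, n_nws: int, n_mws: int,
                    c: float, r_base: float, d: int = 1) -> float:
    """Smallest r_high satisfying r_high >= (alpha/d)(N-1)(cM - c + 1) r_base.

    R_L = r_base + (M - 1) r_low with r_low = c * r_base, as in the text.
    d is the number of high-resistance junctions activated in each
    unaddressed NW (d = 1 recovers the basic condition).
    """
    r_low = c * r_base
    r_l = r_base + (n_mws - 1) * r_low
    return alpha * (n_nws - 1) * r_l / d


# Hypothetical parameters, for illustration only.
print(required_r_high(10, 8, 13, 0.1, 1.0))
print(required_r_high(10, 8, 13, 0.1, 1.0, d=2))
```

Doubling d halves the resistance that a single controlling junction must provide, at the cost of requiring more MWs per NW.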
It is possible for an RCD to be realized with diodes instead of FETs. The decoder model with errors can also be used in this
case to capture diodes with imperfect behavior.
5. Analysis of the RCD
Consider a simple RCD consisting of a single contact group with N NWs and M MWs. As mentioned, we assume that NW/MW junctions are controlling (i.e. $c^i_j = 1$) with probability p, noncontrolling (i.e. $c^i_j = 0$) with probability q, and ambiguous (i.e. $c^i_j$ is in error) with probability r = 1 − p − q. We also assume that these events are statistically independent and identically distributed.
We now bound Na, the number of individually addressable NWs in each contact group, in terms of M, the number of MWs. Recall that for a NW with codeword $c^i$ to be individually addressable there must be no other codeword $c^k$ such that $c^k \stackrel{?}{\Rightarrow} c^i$ (see Lemma 4.2).
We take two approaches to deriving bounds on M. First, in Theorem 5.1 we bound the expected value of Na, E[Na]. This allows us to apply Hoeffding’s Inequality and derive a lower bound on M such that the total number of individually addressable NWs across all g contact groups is at least a fixed fraction of gN, with probability 1 − ε.
Second, in Theorem 5.3, we use the principle of inclusion–exclusion to derive upper and lower bounds on M such that all NWs in all (or almost all) contact groups are individually addressable. The first bound is used to evaluate the Take What You Get addressing strategy in Section 6. The second is used to evaluate the All Wires Addressable and Almost All Wires Addressable strategies.
5.1. Bounds using expectation
We now bound the mean number of individually addressable NWs. We use this to bound the fraction of NWs in a
compound RCD that are addressable with high probability.
Theorem 5.1. In an RCD, let $N_a$ be the number of individually addressable NWs in a contact group with N NWs and M MWs. Then
$$N(1 - (N-1)(1-pq)^M) \le E[N_a] \le N(1 - (1-pq)^M).$$
Proof. Let $x_i = 1$ if NW $n_i$ is individually addressable and 0 otherwise. Since $N_a = \sum_{i=1}^{N} x_i$, $E[N_a] = \sum_{i=1}^{N} E[x_i]$. Also, since the $\{x_i\}$ are identically distributed 0-1 random variables, $E[N_a] = N E[x_1] = N P(x_1 = 1)$.
Let $E_{k,i}$ be the event that $c^k \stackrel{?}{\Rightarrow} c^i$. Then $P(x_1 = 1) = 1 - P(x_1 = 0) = 1 - P(E_{2,1} \cup E_{3,1} \cup \cdots \cup E_{N,1})$. Since $P(E_{2,1}) \le P(E_{2,1} \cup E_{3,1} \cup \cdots \cup E_{N,1}) \le \sum_{k=2}^{N} P(E_{k,1})$ and $P(E_{2,1}) = P(E_{3,1}) = \cdots = P(E_{N,1})$, we have $1 - (N-1)P(E_{2,1}) \le P(x_1 = 1) \le 1 - P(E_{2,1})$.
$c^2 \stackrel{?}{\Rightarrow} c^1$ if for all $1 \le j \le M$ it is not the case that both $c^1_j = 0$ and $c^2_j = 1$, thus $P(E_{2,1}) = (1-pq)^M$ and $1 - (N-1)(1-pq)^M \le P(x_1 = 1) \le 1 - (1-pq)^M$.
Corollary 5.1. Let $N' = gN$ be the total number of NWs contained in the g contact groups of an RCD, and let $N'_a$ be the number of those NWs that are individually addressable. Then,
$$N'(1 - (N-1)(1-pq)^M) \le E[N'_a] \le N'(1 - (1-pq)^M).$$
Proof. $N'_a$ is the sum of the number of individually addressable NWs in each contact group. Since each contact group has N NWs, $E[N'_a] = gE[N_a]$. Substituting the bounds from Theorem 5.1 yields the desired result.
Let $S = n_1 + n_2 + \cdots + n_t$ be the sum of t independent random variables, where each $n_i$ ranges from $a_i$ to $b_i$. Hoeffding’s Inequality [34] states that
$$P(E[S] - S \ge d) \le e^{-2d^2/\sum_i c_i^2}$$
where $c_i = b_i - a_i$, and $d \ge 0$. We use this to bound the total number of individually addressable NWs with high probability.
Theorem 5.2. Let $N'_a$ be the total number of addressable NWs in an RCD with g contact groups, N NWs per contact group, and $N' = gN$ NWs in total. Then
$$P(N'_a \le E[N'_a] - N'k) \le e^{-2k^2 N'N/(N-1)^2} = e^{-2k^2 g^*}$$
for any $k \ge 0$ where $g^* = g(N/(N-1))^2$.
Proof. In Hoeffding’s Inequality, let $t = g$, $d = N'k$, $S = N'_a$ and $c_i = (N-1)$. This gives $P(E[N'_a] - N'_a \ge N'k) \le e^{-2(N'k)^2/(g(N-1)^2)} = e^{-2k^2 N'N/(N-1)^2}$. We can then rewrite $P(E[N'_a] - N'_a \ge N'k)$ as $P(N'_a \le E[N'_a] - N'k)$.
From this we obtain a corollary.
Corollary 5.2. Let $N'_a$ be the total number of addressable NWs in an RCD with g contact groups, N NWs per contact group, $N' = gN$ NWs in total and M MWs. Then
$$P(N'_a > \kappa N') \ge 1 - \epsilon$$
if $\kappa \le 1 - \sqrt{-\ln\epsilon/(2g^*)} - (N-1)(1-pq)^M$ where $g^* = g(N/(N-1))^2$.
Proof. From Theorem 5.1 we have $E[N'_a] \ge N'(1 - (N-1)(1-pq)^M)$ and by the above theorem,
$$P(N'_a \le N'(1 - (N-1)(1-pq)^M) - N'k) \le e^{-2k^2 g^*}.$$
Thus, if $k = (1 - (N-1)(1-pq)^M) - \kappa$, then
$$P(N'_a \le \kappa N') \le e^{-2g^*(1 - (N-1)(1-pq)^M - \kappa)^2}.$$
Thus, when $e^{-2g^*(1 - (N-1)(1-pq)^M - \kappa)^2} \le \epsilon$ the desired conclusion follows. This occurs when $\ln\epsilon \ge -2g^*(1 - (N-1)(1-pq)^M - \kappa)^2$ or $\sqrt{-\ln\epsilon/(2g^*)} \le (1-\kappa) - (N-1)(1-pq)^M$.
As an example, suppose p = q = 1/2, g = 175, N = 8, N′ = 1400, ε = .01, and κ = .733. When M = 13, $\kappa = .733 \le 1 - \sqrt{-\ln(.01)/(2 \cdot 175 \cdot (8/7)^2)} - 7 \cdot (3/4)^{13}$. Thus at least $\lceil .733 \cdot 1400 \rceil = 1027$ NWs are addressable with probability .99.
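The arithmetic in this example can be checked with a short script. The sketch below simply evaluates the right-hand side of the condition in Corollary 5.2; the function name kappa_bound is an assumption introduced here for illustration.

```python
import math


def kappa_bound(p: float, q: float, g: int, n: int, m: int, eps: float) -> float:
    """Largest kappa allowed by Corollary 5.2:
    1 - sqrt(-ln(eps) / (2 g*)) - (N - 1)(1 - pq)^M, with g* = g (N/(N-1))^2.
    """
    g_star = g * (n / (n - 1)) ** 2
    return (1.0
            - math.sqrt(-math.log(eps) / (2.0 * g_star))
            - (n - 1) * (1.0 - p * q) ** m)


# Parameters of the example: p = q = 1/2, g = 175, N = 8, eps = .01.
for m in (12, 13, 14):
    print(m, round(kappa_bound(0.5, 0.5, 175, 8, m, 0.01), 4))

# Number of NWs guaranteed addressable when kappa = .733 and N' = 1400.
print(math.ceil(0.733 * 1400))
```

With these parameters M = 13 is the smallest value for which the bound reaches .733; at M = 12 the bound falls short.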
If errors occur, that is, when p + q < 1, but g is held fixed, M must increase to keep κ constant. For example, if pq = .2 rather than pq = .25 as in the error-free case, M must grow by a factor of ln(4/3)/ln(5/4) = 1.29. If pq = .1, the factor is ln(4/3)/ln(10/9) = 2.73. Even for relatively high error rates, M is not prohibitively large.
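These factors follow from holding $(1-pq)^M$ constant, so that M scales as $1/(-\ln(1-pq))$. A minimal sketch of the calculation, with a hypothetical function name, is:

```python
import math


def mw_growth_factor(pq_with_errors: float, pq_error_free: float = 0.25) -> float:
    """Factor by which M grows so that (1 - pq)^M stays unchanged
    when pq drops from pq_error_free to pq_with_errors."""
    return math.log(1.0 - pq_error_free) / math.log(1.0 - pq_with_errors)


print(round(mw_growth_factor(0.2), 2))  # pq = .2
print(round(mw_growth_factor(0.1), 2))  # pq = .1
```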
5.2. Bounds using inclusion/exclusion
In this section we derive bounds on the number of MWs required for all NWs to be individually addressable with high
probability.
Theorem 5.3. In an RCD, let Γ be the probability that M MWs fail to control all N NWs in a single contact group. Γ satisfies the following bounds
$$Q(1 - Q/2) - \Delta \le \Gamma \le Q \qquad (1)$$
where $Q = N(N-1)\mu_1^M$ and $\Delta = 2N(N-1)(N-2)\left(\mu_3^M + \mu_5^M - 2\mu_1^{2M}\right)$ and $\mu_1 = (1 - pq)$, $\mu_3 = (1 - pq(p + 2q))$, and $\mu_5 = (1 - pq(2p + q))$.
Proof. See Appendix.
This theorem implies upper and lower bounds on M in terms of N and Γ. For the cases examined below when p = q and Γ is small, these bounds are tight, meaning the upper and lower bounds they give on M agree. Slightly weaker but simpler bounds are given in the following corollary, in which upper and lower bounds on M differ by ln(2)/(− ln(1 − pq)).
Corollary 5.3. In an RCD, the minimum value of M such that all N NWs in a contact group are individually addressable with probability 1 − ε satisfies the following.
$$\frac{\ln(N(N-1)/(2\epsilon))}{-\ln(1-pq)} \le M \le \frac{\ln(N(N-1)/\epsilon)}{-\ln(1-pq)}$$
where the lower bound holds when ε ≤ .05 and the actual minimum value of M is itself at least (1 − pq)/(pq min(p, q)).
Proof. The upper bound follows from (1). For the lower bound, assume Q ≤ 0.1, which implies that $M \ge \ln(10N(N-1))/(-\ln\mu_1)$. This is less than $\ln(N(N-1)/(2\epsilon))/(-\ln(1-pq))$ when ε ≤ .05. In ∆ drop the last term and replace $\mu_3^M + \mu_5^M$ by $2\max(\mu_3, \mu_5)^M$. Since $\mu_3 = \mu_1 - pq^2$ and $\mu_5 = \mu_1 - p^2q$, $\max(\mu_3, \mu_5) = \mu_1(1 - \min(pq^2, p^2q)/\mu_1)$. The lower bound on Γ becomes $\Gamma \ge Q(.95 - 4N(1 - pq\min(p, q)/\mu_1)^M)$. Using the inequality $(1-x)^n \le 1 - nx$, the lower bound is at least Q/2 if $M \ge (1 - .45/4N)(1-pq)/(pq\min(p, q))$, which is less than $(1-pq)/(pq\min(p, q))$.
In this corollary, when ε ≤ .05, the second condition associated with our lower bound always holds when p and q are fairly close to 1/2. To see why, consider the extreme case when N = 2. Here the minimum value of M such that neither NW implies the other must satisfy $(1-pq)^M \le \epsilon$, or equivalently $M \ge \ln(\epsilon)/\ln(1-pq)$. It is easy to verify numerically that ln(.05)/ln(1 − pq) ≥ (1 − pq)/(pq min(p, q)) when pq ≥ .21.
Corollary 5.4. In an RCD with N′ NWs divided into g contact groups, all NWs are individually addressable with probability (1 − ε) if
$$M \ge \ln(N'((N'/g) - 1)/\epsilon)/(-\ln(1-pq)).$$
Proof. Let δ be the probability that not all NWs in a contact group are individually addressable. Then, the probability that one or more contact groups fails to have all its NWs be individually addressable is at most gδ. If gδ ≤ ε, the probability that all N′ NWs are addressable is at least 1 − ε. We use the upper bound on M given in Corollary 5.3 when N is replaced by N′/g and ε by ε/g.
When N′ = 1024, g = 128 and M ≥ 47, all N′ NWs will be individually addressable with probability 0.99 or better. In fact, evaluating Theorem 5.3 numerically shows this threshold value of M to be exact. These parameters apply to the All Wires Addressable addressing strategy in which every NW address is used.
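The sufficient condition of Corollary 5.4 is easy to evaluate directly. The sketch below assumes p = q = 1/2 and ε = .01, consistent with the other numerical examples in this section; the function name is hypothetical.

```python
import math


def mws_for_all_addressable(n_total: int, g: int, p: float, q: float,
                            eps: float) -> int:
    """Smallest integer M satisfying the sufficient condition of Corollary 5.4:
    M >= ln(N'((N'/g) - 1)/eps) / (-ln(1 - pq))."""
    n_per_group = n_total / g
    bound = math.log(n_total * (n_per_group - 1) / eps) / -math.log(1.0 - p * q)
    return math.ceil(bound)


print(mws_for_all_addressable(1024, 128, 0.5, 0.5, 0.01))
```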
The number of MWs is reduced if we do not require that all NWs in each contact group be individually addressable.
We illustrate this with an example. Corollary 5.3 says that a failure rate of at most ε = .01 can be achieved with a simple RCD when p = q = .5 and N = 8 if M ≥ 30. (As above, this threshold value of M is exact.) The number of individually addressable NWs in each contact group is statistically independent. If all N NWs in a particular contact group are individually addressable with probability 1 − ε, the probability that f or fewer contact groups fail to have all NWs addressable is $\phi(\epsilon, f, g) = \sum_{i=0}^{f} \binom{g}{i} \epsilon^i (1-\epsilon)^{g-i}$. Let ε = .01, g = 133 and f = 5. Because φ(.01, 5, 133) ≥ .99, at least 128 of g = 133 contact groups have all NWs addressable with probability 0.99.
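The binomial tail φ can be evaluated exactly with integer binomial coefficients. The following is a small sketch of that computation; the function name phi mirrors the symbol in the text, and everything else is illustrative.

```python
from math import comb


def phi(eps: float, f: int, g: int) -> float:
    """Probability that at most f of g independent contact groups fail,
    when each group fails with probability eps."""
    return sum(comb(g, i) * eps ** i * (1.0 - eps) ** (g - i)
               for i in range(f + 1))


# Example from the text: eps = .01, f = 5 spare groups, g = 133 groups.
print(phi(0.01, 5, 133))
```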
In summary, when M = 30, g = 133, and N′ = 8 ∗ 133 = 1064, N′a = 8 ∗ 128 = 1,024 NWs are individually addressable with probability 0.99. These parameters apply to the Almost All Wires Addressable addressing strategy in which almost every NW address is used.
As discussed at the end of Section 5.1, manufacturing errors only increase the number of required MWs by a small constant factor.
6. Addressing strategies
We now use the bounds on M to estimate the total area required for a crossbar-based memory that uses RCDs. As
explained in Section 2.3, this area estimate depends not just on the number of MWs used, but also on the size of an ATC. In
this section we consider three addressing strategies, that is, ways of using an ATC to map an external binary address E of
b = |E| bits to an internal NW address consisting of a contact group σ and an activation pattern a on M MWs.
All Wires Addressable: Here we choose M so that, with probability (1 − ε), all NWs in every contact group are individually addressable. If we assume that the number of NWs in each contact group is $2^k$, we can simply use the first b − k bits of E to select σ. This fixed mapping does not depend on the particular NW codewords that are present, although the mapping of E to a does. To execute the second mapping, the ATC stores each NW codeword that is present in a lookup table. This requires $N'_a M$ bits of storage where $N'_a$ is the number of addressable NWs in the decoder.
All Wires Almost Always Addressable: Here we choose M so that with probability (1 − ε), all NWs in nearly all contact groups are addressable. Contact groups in which not all NWs are addressable are not used. Since the particular contact groups that are not used will vary from decoder to decoder, the ATC cannot use a fixed mapping from E to contact groups σ. Instead, a lookup table is used to obtain an integer to be added to the first b − k bits of E so that it corresponds to the proper contact group. Let g be the number of contact groups and g′ be the number for which all NWs are addressable. Then g − g′ is an upper bound on the values in the table. We also use a lookup table to map E to a. The two tables combined require approximately $g'\lceil\log_2(g - g')\rceil + N'_a M$ bits.
Take What You Get: Here we choose M so that a fixed fraction of the NWs are individually addressable. In this case, some contact groups may have all NWs addressable, but some may not. Since the number of addressable NWs per contact group varies, we can no longer map fixed blocks of binary memory addresses to a particular contact group. Instead, we store a value of σ and a for each addressable NW. This requires $N'_a(\lceil\log_2 g\rceil + M)$ bits.
6.1. Area estimate
To estimate the total area, $A_T$, required to produce a crossbar memory using each of the three strategies, we use the approach of [10] and write:
$$A_T \approx 2\chi\beta + 2\lambda_{meso}^2\, g\lceil\log_2 g\rceil + (\lambda_{meso}M + \lambda_{nano}N')^2.$$
Here $\lambda_{meso}$ and $\lambda_{nano}$ denote the pitch of MWs and NWs respectively, that is, the center-to-center distance between wires. Also, χ denotes the area of a mesoscale memory cell, and β denotes the number of bits stored in each dimension of the crossbar’s ATC. Thus, $2\chi\beta$ approximates the amount of programmable storage required, $2\lambda_{meso}^2\, g\lceil\log_2 g\rceil$ approximates the area required to implement a standard demultiplexer used to activate contact groups, and $(\lambda_{meso}M + \lambda_{nano}N')^2$ approximates the area occupied by the NW crossbar.
6.2. Comparison of strategies
To compare the three addressing strategies, we estimate their area when used to produce a memory with a given storage capacity. In our comparison, we fix ε, the probability of failure, and N′/g, the number of NWs per contact group. Given these values, we would ideally like to also fix N′a, the number of addressable NWs along each dimension of the crossbar, then estimate AT for all three strategies. Unfortunately, for a given strategy, it is difficult to choose M and N′ to yield an exact value for N′a, but in all three cases we show that about 1024 NWs are individually addressable along each dimension.
To compare the strategies, we consider the case when p = q = 1/2 and use the numerical results given above.
• All Wires Addressable:
Here M = 47, g = 128, and N′ = N′a = 1024 with probability at least .99. The ATC requires $\beta = N'_a M = 48{,}128$ bits. This gives
$$A_T \approx 96{,}256\chi + 1792\lambda_{meso}^2 + (47\lambda_{meso} + 1024\lambda_{nano})^2.$$
• All Wires Almost Always Addressable:
Here M = 30, g = 133, and N′ = 1064 yield N′a = 1024 and g′ = 128 with probability at least .99. The ATC requires $\beta = g'\lceil\log_2(g - g')\rceil + N'_a M = 31{,}104$ bits. This gives
$$A_T \approx 62{,}208\chi + 2128\lambda_{meso}^2 + (30\lambda_{meso} + 1064\lambda_{nano})^2.$$
• Take What You Get:
Here M = 13, g = 175 and N′ = 1400 yield N′a = 1027 with probability at least .99. The ATC requires $\beta = N'_a(\lceil\log_2 g\rceil + M) = 21{,}567$ bits. This gives
$$A_T \approx 43{,}134\chi + 2800\lambda_{meso}^2 + (13\lambda_{meso} + 1400\lambda_{nano})^2.$$
Since the parameter χ , the area of a mesoscale memory unit, will be many times λ2meso, and it is expected that λmeso ≥
10λnano, the Take What You Get addressing strategy is clearly best.
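The storage and area expressions of Section 6.1 can be evaluated mechanically from the stated parameters. The sketch below does so for the three strategies. The strategy labels, function names, and the concrete values λnano = 1, λmeso = 10 and χ = 100 λmeso² are assumptions chosen here purely for illustration; only the formulas and the values of M, g, N′, N′a and g′ come from the text.

```python
import math


def atc_bits(strategy: str, m: int, g: int, n_addr: int, g_used: int = 0) -> int:
    """Bits of ATC storage (beta) per crossbar dimension for each strategy."""
    if strategy == "all":          # N'_a * M
        return n_addr * m
    if strategy == "almost_all":   # g' * ceil(log2(g - g')) + N'_a * M
        return g_used * math.ceil(math.log2(g - g_used)) + n_addr * m
    if strategy == "take":         # N'_a * (ceil(log2 g) + M)
        return n_addr * (math.ceil(math.log2(g)) + m)
    raise ValueError(strategy)


def total_area(beta: int, m: int, g: int, n_total: int,
               chi: float, lam_meso: float, lam_nano: float) -> float:
    """A_T ~ 2 chi beta + 2 lam_meso^2 g ceil(log2 g) + (lam_meso M + lam_nano N')^2."""
    demux = 2 * lam_meso ** 2 * g * math.ceil(math.log2(g))
    crossbar = (lam_meso * m + lam_nano * n_total) ** 2
    return 2 * chi * beta + demux + crossbar


# Hypothetical technology parameters (illustration only).
LAM_NANO, LAM_MESO = 1.0, 10.0
CHI = 100 * LAM_MESO ** 2

beta_all = atc_bits("all", 47, 128, 1024)
beta_almost = atc_bits("almost_all", 30, 133, 1024, g_used=128)
beta_take = atc_bits("take", 13, 175, 1027)

area_all = total_area(beta_all, 47, 128, 1024, CHI, LAM_MESO, LAM_NANO)
area_almost = total_area(beta_almost, 30, 133, 1064, CHI, LAM_MESO, LAM_NANO)
area_take = total_area(beta_take, 13, 175, 1400, CHI, LAM_MESO, LAM_NANO)
print(beta_all, beta_almost, beta_take)
print(area_all, area_almost, area_take)
```

Under these assumed parameters the storage term dominates, so the ordering of the three areas follows the ordering of β.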
7. Codeword discovery
Although required of all NW decoders, codeword discovery has received significantly less attention than NW addressing.
In this section, we consider several codeword discovery algorithms that do not require the addition of specialized testing
circuitry. We make three assumptions: (a) the ideal decoder model applies so no NW/MW junctions are in error; (b) arbitrary subsets of MWs can be activated; and (c) for any subset of MWs the total amount of current flowing across all NWs in a contact group can be measured, but not with high precision. In Section 7.5 we re-examine (a).
Even when many or all NWs are individually addressable, their codewords (or at least some portion of their codewords)
must be discovered to properly configure the ATC. It is not feasible to individually probe each NW/MW junction.
In the ideal model if all MWs in a contact group are activated, and all NWs are controlled by at least one MW, no current
will flow. As MWs are turned off one by one, one or more NWs will become conducting. At this point, current is detected. In
theory, accurate current measurements and knowledge of NW resistances could allow a testing procedure to estimate how
many NWs are conducting, but we avoid this assumption.
In [5] it is assumed that one can distinguish between all NWs being off, one NW being addressed, and two or more NWs being addressed. We avoid this assumption and assume only the ability to distinguish between all NWs being off and at least one NW being on. Since α in Definition 4.1 is much greater than 1, this is a very reasonable assumption. It is already met, for example, by circuitry used to read data from a crossbar-based memory.
We also examine the number of tests a discovery algorithm must perform. In doing so, we point out an important flaw
in the less rigorous analysis given in [5].
7.1. Exhaustive search
The simplest codeword discovery algorithm is exhaustive search. For each contact group, we determine whether current flows for every possible MW activation pattern. The outputs of all $2^M$ tests are reviewed offline to determine which codewords are present on individually addressable NWs.
Suppose codeword $c^i$ is present on the ith NW and is individually addressable. Activation pattern $a = \bar{c}^i$, the complement of $c^i$, turns on $n_i$ and turns off all other NWs. Also, any other activation pattern, a′, that turns on a strictly larger subset of MWs turns off all NWs. For this reason we call a maximal.
An activation pattern a is maximal if and only if $c^i = \bar{a}$ is individually addressable. Once exhaustive testing is complete, the set of maximal activation patterns can be identified.
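A minimal simulation of exhaustive search in the ideal model is sketched below. It assumes only the binary test outcome described above (current flows or it does not); the function names and the small 4-MW example are hypothetical. Because any subset of a conducting pattern also conducts, it suffices to check that activating any single additional MW stops the current.

```python
from itertools import product
from typing import List, Sequence, Tuple

Pattern = Tuple[int, ...]


def current_flows(a: Pattern, codewords: Sequence[Pattern]) -> bool:
    """Binary test outcome: does at least one NW conduct under pattern a?
    A NW conducts iff no activated MW controls it (a . c == 0)."""
    return any(all(not (aj and cj) for aj, cj in zip(a, c)) for c in codewords)


def exhaustive_discovery(codewords: Sequence[Pattern]) -> List[Pattern]:
    """Run all 2^M tests, find the maximal activation patterns, and return
    their complements (the discovered codewords) in sorted order."""
    m = len(codewords[0])
    conducting = {a for a in product((0, 1), repeat=m)
                  if current_flows(a, codewords)}
    maximal = [a for a in conducting
               if all(a[j] == 1 or a[:j] + (1,) + a[j + 1:] not in conducting
                      for j in range(m))]
    return sorted(tuple(1 - x for x in a) for a in maximal)


# Hypothetical contact group: the first codeword is implied by the second,
# so only the last two codewords should be discovered.
codes = [(1, 1, 1, 0), (1, 1, 0, 0), (0, 0, 1, 1)]
print(exhaustive_discovery(codes))
```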
7.2. Parallel exhaustive search
The runtime of exhaustive search is exponential in M, but as shown in the previous section M may be relatively small. In our analysis of the ‘‘Take What You Get’’ addressing strategy, we demonstrated that M = 13 suffices. Smaller values of M are also possible if one is willing to tolerate a smaller fraction of addressable NWs.
The exponential running time of exhaustive search can be amortized across contact groups, if all contact groups can be tested in parallel. If current measurements for each contact group can be taken simultaneously, each of the algorithm’s $2^M$ tests can be performed on all contact groups at once.
In an exhaustive search, every possible activation pattern is tested. A more efficient search algorithm would be adaptive. It would use the outcome of previous tests to determine which activation pattern to apply next. For certain values of M and g, however, parallel exhaustive search is superior to any adaptive search procedure in which contact groups are tested one at a time.
Suppose all contact groups can be tested in parallel when M = 13, N = 8, g = 175 and p = q = 1/2, the conditions on the Take What You Get strategy discussed in the previous section. The number of tests per contact group is $2^M/175 < 47$. We show that more tests are required by any adaptive discovery algorithm operating on contact groups one at a time using tests with binary outcomes (e.g. the current measured is ‘‘high’’ or ‘‘low’’).
An adaptive discovery procedure must produce the codeword for each individually addressable NW in a contact group. As shown in Theorem 5.1, the expected number of addressable NWs in a contact group is at least $N(1 - (N-1)(1-pq)^M) = 8(1 - 7(3/4)^{13}) > 6.6$. This indicates that at least six NWs are addressable at least 1/2 the time. (Given that N = 8, if fewer than six NWs are addressable half the time, the average number of addressable NWs is at most (5 + 8)/2 = 6.5 because at most five NWs are addressable half the time and at most eight NWs the rest of the time.) There are $2^{MN}$ assignments of M-bit codewords to N NWs. We call these codes. Since all assignments are equally likely, at least $(1/2)2^{MN}$ of these have six or more individually addressable NWs. The codewords of these NWs will be produced by a discovery algorithm.
Let σ be the maximum number of codes containing any fixed set of six, seven or eight individually addressable codewords. When an adaptive discovery algorithm produces six or more codewords as output, one of at most σ codes is present in the contact group. If eight codewords are produced, one of 8! codes is present. If seven codewords are produced, one of at most $7! \cdot 8 \cdot 2^M$ codes is present. These codes contain all 7! permutations of the codewords and eight locations for the remaining codeword, which takes at most $2^M$ values. Finally, if six codewords are produced, the number of associated codes is at most $6!\binom{8}{2}2^{2M}$. Since the last case yields the most codes, $\sigma \le 6!\binom{8}{2}2^{2M}$.
It follows that any discovery algorithm must be able to identify at least $(1/2)2^{MN}/\sigma$ codes. Since it is assumed that tests produce binary outcomes, the number of testing steps for an adaptive algorithm must be at least $T = \log_2[(1/2)2^{MN}/\sigma] = MN - 1 - 2M - \log_2(6! \cdot 28)$. When M = 13 and N = 8, T ≈ 62.7. Thus, an adaptive algorithm that examines one contact group at a time will need to perform at least 63 tests per group, well above the fewer than 47 tests per group needed by parallel exhaustive search.
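The comparison between the two test counts can be reproduced with the short sketch below. The function names are hypothetical, and the expression for the adaptive lower bound is specific to N = 8, since it uses the bound $\sigma \le 6!\binom{8}{2}2^{2M}$ derived above.

```python
import math


def parallel_tests_per_group(m: int, g: int) -> float:
    """Amortized tests per contact group when all g groups share 2^M tests."""
    return 2 ** m / g


def adaptive_lower_bound(m: int, n: int = 8) -> float:
    """log2[(1/2) 2^{MN} / sigma] with sigma <= 6! * C(8,2) * 2^{2M}.
    Valid only for N = 8, the case analysed in the text."""
    log2_sigma = math.log2(math.factorial(6) * math.comb(8, 2)) + 2 * m
    return m * n - 1 - log2_sigma


amortized = parallel_tests_per_group(13, 175)
adaptive = adaptive_lower_bound(13)
print(amortized, adaptive)
```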
7.3. Randomized codeword discovery
For large values of M exhaustive search is prohibitively slow. In this regime adaptive algorithms need to be explored. Also, if multiple searches cannot be run in parallel, an adaptive algorithm will always be faster. In this section we consider a simple adaptive algorithm and examine its runtime. A less efficient version of this algorithm appeared in [5]. As we explain, however, its analysis was based on faulty assumptions.
The goal of our algorithm is to discover the maximal activation patterns that address codewords. The algorithm, sketched below, chooses a random permutation of MWs, π, and activates the MWs in the order specified by this permutation until no current is produced. When current is turned off, the last MW to be turned on is deactivated, and the process continues.
procedure Discover_Codewords
    π = RandomPermutation(MW1, MW2, . . . , MWM)
    for i = 1 to M do
        Activate MW π(i)
        if all NWs are turned off, then deactivate MW π(i)
After each execution of this procedure, a maximal activation pattern is identified. Its complement yields the discovered codeword. For ease of simulation, it is convenient to note that the discovered codeword is the codeword that comes first when all codewords are sorted lexicographically according to π.
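One way to simulate Discover_Codewords in the ideal model is sketched below. The Python function name, the use of random.Random as the source of the permutation, and the returned (codeword, order) pair are choices made here for illustration. The simulation tracks which NWs still conduct, which plays the role of the binary current measurement in the procedure.

```python
import random
from typing import List, Sequence, Tuple

Codeword = Tuple[int, ...]


def discover_codeword(codewords: Sequence[Codeword],
                      rng: random.Random) -> Tuple[Codeword, List[int]]:
    """One execution of the discovery procedure in the ideal model.

    Returns the discovered codeword (complement of the maximal activation
    pattern found) and the random MW order that was used.
    """
    m = len(codewords[0])
    order = list(range(m))
    rng.shuffle(order)                      # random permutation of the MWs
    active = [0] * m
    on = list(range(len(codewords)))        # NWs still conducting
    for j in order:
        still_on = [i for i in on if codewords[i][j] == 0]
        if still_on:                        # current still flows: keep MW j on
            active[j] = 1
            on = still_on
        # otherwise all NWs were turned off, so MW j is deactivated again
    return tuple(1 - a for a in active), order


def parse(bits: str) -> Codeword:
    return tuple(int(b) for b in bits)


# Four 12-bit codewords used purely as sample input.
codes = [parse(s) for s in ("111100000000", "000011110000",
                            "000000001111", "011101110111")]
results = [discover_codeword(codes, random.Random(seed)) for seed in range(200)]
```

Each result can be compared against the lexicographic characterization: sorting the codewords by their bits in the order given by the permutation should place the discovered codeword first.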
Each execution of the discovery procedure requires M tests. After each execution, some codeword is discovered. The total time required for codeword discovery thus depends on the relative likelihood of discovering each codeword. If all codewords are equally likely to be discovered, the well-known coupon collector problem, stated in Section 8, shows that approximately Na log(Na/ε) executions are required to discover all Na individually addressable codewords with probability (1 − ε).
As an optimization, we note that it is not actually necessary to activate subsets of MWs when they do not turn off all of the codewords that have already been discovered. This observation was also made in [35], which evaluates a similar codeword discovery algorithm through simulation.
The assumption that all codewords are equally likely to be discovered is the faulty assumption made in [5]. In fact, experiments indicate that for small or medium-sized values of M, some codewords will often be much less likely to be discovered than others. For example, when M is 30, all NWs in a contact group of N = 8 NWs are addressable with very high probability. If all NWs were equally likely to be discovered, N log(N/.01) = 69 executions would be required for a failure probability of .01. Our simulations reported in Section 8.2 show this value to be approximately 270. When M is 100, however, the value shrinks to 72.
The reason for this discrepancy is that, when M is small, some NWs that are addressable are much less likely to be discovered than others. For example, more than 1/10 of the time there was at least one NW that had only a 1/70 chance of being discovered on each run of the algorithm. For intuition as to why this occurs, consider the following four codewords: c1 = 111100000000, c2 = 000011110000, c3 = 000000001111, c4 = 011101110111. By symmetry, c1, c2 and c3 are equally likely to be discovered, but c4 can only be discovered if at least two of MW1, MW5 and MW9 are activated before any of the other MWs. This observation reveals that c4 is discovered with probability 3/12 ∗ 2/12 ∗ 1/4 = 1/96, whereas all other codewords are discovered with probability (1 − 1/96)/3 = 95/288. When M is small, these sorts of extreme examples are much more likely to occur.
As M increases the probability of discovering a NW address approaches 1/N. This points to an interesting trade-off not just between the number of MWs and the number of addressable NWs, but also the number of NW codewords that can be quickly discovered. When the number of MWs is in an intermediate range, adding additional MWs may actually increase the speed with which codewords are discovered.
7.4. Possible extensions
Our codeword discovery algorithm does not require specialized testing circuitry, or the ability to measure current in
individual NWs. This makes our algorithm highly practical. However, the algorithm can be improved if the testing is done in
the context of a memory. Consider testing a horizontal contact group. First activate all the NWs in one vertical contact group
to make contacts (i.e. write 1’s) at the intersections of NWs in the two groups. When testing the horizontal contact group,
measure current using the vertical contact group. After discovering a maximal activation pattern, use the corresponding
codeword to open the contacts (i.e. write 0’s) at any intersections formed from horizontal NWs with that codeword. This
ensures that the codeword will not be discovered again. Using the discovery procedure and this method of eliminating
previously discovered NWs, all NW addresses will be discovered. It will then be necessary to analyze the addresses to find the individually addressable NWs.
The main disadvantage of this modified randomized algorithm is the read/write requirement. In a circuit (as opposed
to a memory) molecular switches may not be present. Furthermore, read/write operations may be faulty and slow. If the
number of MWs is small, it may still be faster to implement the parallel exhaustive search algorithm described above.
A codeword discovery algorithm that uses read/write operations for encoded NW decoders was described in [10]. In fact,
the algorithm can be adapted to find codewords without the use of read/write operations, but the algorithm will not work
with the randomly generated codewords found in an RCD. The read/write discovery algorithm described above uses at most
M tests per NW and works for both encoded NW and RCD decoders.
7.5. Coping with errors
An even more important extension to codeword discovery is learning to cope with errors. For simplicity, we only consider the exhaustive search case. In the case of errors, it is no longer possible to describe certain activation patterns as maximal, because it is no longer reasonable to treat the output of each test as binary. An error can produce an intermediate level of current flow along a NW.
We have already shown that, for sufficiently large M, all NWs are addressable with high probability. This result holds even if errors occur. If N NWs are addressable, then for each of these NWs there is some activation pattern that causes it to conduct while the other N − 1 NWs are turned off. Furthermore, if we combine any two of these activation patterns, all N NWs will be turned off. If we have a pair of activation patterns that satisfy this property, we call them disjoint.
If N NWs are addressable and we exhaustively test all $2^M$ activation patterns, we can then identify N patterns that are all disjoint. One method for identifying these patterns from the testing data is to construct a graph G with a vertex associated with each pattern that causes at least one NW to conduct. The testing data is then used to place an edge between any two vertices that correspond to disjoint activation patterns. A clique of N vertices in G corresponds to a set of N addresses that each address a distinct NW. Exhaustive testing is currently the only known method for discovery of codewords in the presence of errors.
8. Analysis of the discovery procedure
We now bound the number of tests required by the randomized codeword discovery algorithm given in Section 7.3.
The number of runs needed to ensure that, with high probability, all codewords are discovered is modeled by the coupon collection problem with non-uniform probabilities. We now state bounds on the number of trials that are needed to collect all coupons with probability at least 1 − ε. They can be derived using established methods [32].
Lemma 8.1. Consider the collection of N coupons in which each coupon is collected with probability at least u. The expected
number of trials to collect all N coupons is at most $1 + \frac{1}{u} H_{N-1}$.
Proof. The average time to collect N coupons is $T = \sum_{i=1}^{N} x_i$ where $x_i$ is the time to collect the ith coupon. Let
$\{p_1, p_2, \ldots, p_N\}$ be the probabilities of collecting the coupons and let $j_1, j_2, \ldots, j_N$ be the order in which they are collected.
Because the first new coupon is collected on the first trial, $x_1 = 1$. For $i \ge 2$ the probability distribution for $x_i$ is geometric
with probability $1 - (p_{j_1} + p_{j_2} + \cdots + p_{j_{i-1}})$. Thus, $x_i = 1/(1 - (p_{j_1} + p_{j_2} + \cdots + p_{j_{i-1}}))$. It follows that T is maximized
by maximizing $(p_{j_1} + p_{j_2} + \cdots + p_{j_{N-1}})$. Since $p_N \ge u$, T is largest when $p_N = u$. Similarly, the remaining terms in the sum
for T are maximized by setting $p_j = u$ for $2 \le j \le N$, which provides the desired result.
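As a numerical check of Lemma 8.1, the following Python sketch compares the bound with the exact expected collection time for small examples. It rests on assumptions not stated in the lemma: $H_{N-1}$ is read as the (N − 1)st harmonic number, every trial yields exactly one coupon (the probabilities sum to 1, as the step $x_1 = 1$ in the proof suggests), and the exact value is computed with the standard inclusion–exclusion identity for the expected maximum of the waiting times. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def harmonic(n):
    """n-th harmonic number as an exact fraction."""
    return sum(Fraction(1, k) for k in range(1, n + 1))


def lemma_bound(n_coupons, u):
    """Upper bound 1 + (1/u) * H_{N-1} from Lemma 8.1."""
    return 1 + harmonic(n_coupons - 1) / u


def exact_expected_trials(probs):
    """Exact expected number of trials to see every coupon.

    Uses E[max T_i] = sum over nonempty subsets S of
    (-1)^(|S|+1) / sum(p_i for i in S). Exponential in N, so small N only.
    """
    total = Fraction(0)
    for size in range(1, len(probs) + 1):
        sign = 1 if size % 2 == 1 else -1
        for subset in combinations(probs, size):
            total += sign / sum(subset)
    return total


# Hypothetical examples: 8 equally likely coupons, and 4 skewed coupons.
uniform = [Fraction(1, 8)] * 8
skewed = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 8), Fraction(1, 8)]
```

In the uniform case with u = 1/N the bound is met with equality, which is consistent with the proof's observation that T is largest when all probabilities equal u.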
8.1. The likelihood of generating discoverable codewords
We show that the probability Q(u) of choosing a code such that each codeword can be discovered by Discover_Codewords
with probability at least u is close to one when M, the number of MWs, is proportional to $\log_2 N$, where N is the number of
NWs.
The codeword associated with the ith NW is defined in Section 5 as $c^i = \{c^i_1, \ldots, c^i_M\}$ where $c^i_j = 1$ (0) if the jth NW/MW
junction in the ith codeword is controlling (noncontrolling). If $c^i_j = e$, the control of the junction is ambiguous. A code C is a
collection of codewords. We let p, q, and $r = 1 - p - q$ be the probabilities that $c^i_j = 1$, 0 and e, respectively. In this section
we consider codeword discovery when there is no ambiguity, that is, when $r = 0$.
We consider codes containing codewords that are all about equally likely, $C_0$, and $\overline{C_0}$, the complement of that set. $C_0$ is
defined in terms of $B_i(0) = \{j \mid c^i_j = 0\}$, the indices for which $c^i$ has value 0, and $B_i(0) \cap B_k(0)$, the indices for which $c^i$ and
$c^k$ have 0s in common locations. The first condition on $C_0$, which each codeword must satisfy, is the following:
$|B_i(0)| \ge Mq - k_1$. (2)
This ensures that each codeword has approximately the average number of 1s and 0s. The second, given below, ensures
that pairs of codewords are typical, namely, that the number of 0s they have in common is approximately the average.
$|B_i(0) \cap B_k(0)| \le Mq^2 + k_2$. (3)
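Conditions (2) and (3) are easy to test for a concrete code. The Python sketch below is an illustration only: the function names are hypothetical, codewords are tuples of 0s and 1s (no ambiguous junctions, r = 0), and `random_code` draws each bit independently with P(0) = q as the text assumes.

```python
import random
from itertools import combinations


def zero_positions(codeword):
    """B_i(0): the set of indices at which the codeword is 0."""
    return {j for j, bit in enumerate(codeword) if bit == 0}


def in_typical_ensemble(code, q, k1, k2):
    """Check conditions (2) and (3) for every codeword and every pair."""
    m = len(code[0])
    zeros = [zero_positions(cw) for cw in code]
    # Condition (2): each codeword has at least Mq - k1 zeros.
    enough_zeros = all(len(b) >= m * q - k1 for b in zeros)
    # Condition (3): each pair shares at most Mq^2 + k2 zeros.
    few_common = all(
        len(bi & bk) <= m * q * q + k2 for bi, bk in combinations(zeros, 2)
    )
    return enough_zeros and few_common


def random_code(n, m, q, rng):
    """Draw N codewords of length M with i.i.d. bits, P(bit = 0) = q."""
    return [tuple(0 if rng.random() < q else 1 for _ in range(m)) for _ in range(n)]


# Hypothetical usage: estimate how often a random code satisfies both conditions.
rng = random.Random(0)
samples = [random_code(8, 30, 0.5, rng) for _ in range(200)]
fraction_typical = sum(in_typical_ensemble(c, 0.5, 10, 12) for c in samples) / len(samples)
```

The values N = 8, M = 30, k1 = 10 and k2 = 12 in the usage lines are example inputs chosen for the sketch, not a reproduction of the paper's calculations.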
Let D(C, u) be the event that each codeword in code C is discovered with probability at least u. It follows that








P(D(C, u) | E(C))P(E(C)).
Let $D_i(C)$ be the event that codeword $c^i$ in a code C is discovered and let $P(D_i(C))$ be the probability of this event. If
$P(D_i(C)) \ge u$ for all words in codes in $C_0$, then $P(D(C, u) \mid E(C)) = 1$. Below we derive such a bound, the proof of which is in
the Appendix.
Theorem 8.1. If C is a code in the ensemble C0, the probability P(Di(C)) that the ith codeword in C is discovered is given below















If M is large relative to $k_1$ and $(\ln 4N)^2/\gamma^2$, the lower bound approaches $P(D_i) \ge \frac{1}{2}(4N)^{-\frac{\ln q}{\gamma}}$. When M is also large relative to $k_2$,
$\gamma$ approaches $1/q$ and the limiting value of $P(D_i)$ becomes $\frac{1}{2}(4N)^{-q \ln q}$ or $\frac{1}{2}(4N)^{-.35}$ when $q = 1/2$.





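But $P(C_0) = 1 - P(\overline{C_0})$ where $P(\overline{C_0})$ is the probability that either $|B_i(0)| \ge Mq - k_1$ is violated for one of the N codewords
or $|B_i(0) \cap B_k(0)| \le Mq^2 + k_2$ is violated for one of the $\binom{N}{2}$ pairs of codewords. Thus, $P(\overline{C_0})$ satisfies the following.
$P(\overline{C_0}) \le N\,P(|B_i(0)| < Mq - k_1) + \binom{N}{2}\,P(|B_i(0) \cap B_k(0)| > Mq^2 + k_2)$. (6)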
Bits in codewords are i.i.d. 0-1 random variables in which 0s (1s) occur with probability q ($p = 1 - q$). A 0 occurs in a given
position in two codewords simultaneously with probability $q^2$. We use the Chernoff bound cited below to bound these
probabilities [34, p. 66].
Theorem 8.2. Let X be the sum of n independent and identically distributed random variables, with mean $\mu$. Then, the following
holds when $k \le \mu$.
$P(X \le \mu - k) \le e^{-k^2/(2\mu)}$.
Corollary 8.1. Let Y be the sum of n independent and identically distributed random variables, with mean $\mu$ and maximum value
P. Then, the following holds when $k \le P - \mu$.
$P(Y \ge \mu + k) \le e^{-k^2/(2(P - \mu))}$.
To obtain the corollary let $X = P - Y$. Then, $P(Y \ge \mu + k) = P(X \le (P - \mu) - k)$ where $P - \mu$ is the average of X.
When these bounds are applied to the events in question, the following holds.
$P(|B_i(0)| < Mq - k_1) \le e^{-k_1^2/(2Mq)}$
$P(|B_i(0) \cap B_k(0)| > Mq^2 + k_2) \le e^{-k_2^2/(2M(1 - q^2))}$.
Thus, $P(\overline{C_0})$ satisfies the following bound.
$P(\overline{C_0}) \le N e^{-k_1^2/(2Mq)} + \binom{N}{2} e^{-k_2^2/(2M(1 - q^2))}$.
Summarizing, we have the following result concerning the performance of the Discover_Codewords procedure.
Theorem 8.3. Consider RCD codes consisting of N codewords of length M, in which 0s (1s) occur independently with probability














$k_1 \ge \sqrt{2Mq \ln(2N/(1 - Q(u)))}$, and $k_2 \ge \sqrt{2M(1 - q^2) \ln(N^2/(1 - Q(u)))}$, then the probability that a code is selected for
which Discover_Codewords discovers each codeword with probability at least u is at least Q(u).
Proof. The results follow from Theorem 8.1 and (5) if $k_1$ and $k_2$ are chosen so that $N e^{-k_1^2/(2Mq)} \le (1 - Q(u))/2$ and
$N^2 e^{-k_2^2/(2M(1 - q^2))} \le (1 - Q(u))$.
When $Q(u) = .99$ and $q = 0.5$, $k_1 \ge \sqrt{M(\ln N + 5.3)}$ and $k_2 \ge \sqrt{1.5M(2 \ln N + 4.6)}$.
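The thresholds in Theorem 8.3 can be evaluated directly. The following Python sketch is a minimal illustration with hypothetical function names; it computes the smallest $k_1$ and $k_2$ allowed by the theorem and then evaluates the two quantities that the proof requires to be at most $(1 - Q(u))/2$ and $(1 - Q(u))$. The inputs M = 100 and N = 8 are example values.

```python
import math


def thresholds(m, n, q, target):
    """Smallest k1, k2 permitted by Theorem 8.3 for a target value Q(u)."""
    slack = 1.0 - target
    k1 = math.sqrt(2 * m * q * math.log(2 * n / slack))
    k2 = math.sqrt(2 * m * (1 - q * q) * math.log(n * n / slack))
    return k1, k2


def proof_terms(m, n, q, k1, k2):
    """The two quantities bounded in the proof of Theorem 8.3."""
    term1 = n * math.exp(-k1 ** 2 / (2 * m * q))
    term2 = n * n * math.exp(-k2 ** 2 / (2 * m * (1 - q * q)))
    return term1, term2


# Example inputs (assumed for illustration): M = 100 MWs, N = 8 NWs.
k1, k2 = thresholds(100, 8, 0.5, 0.99)
term1, term2 = proof_terms(100, 8, 0.5, k1, k2)
```

At the thresholds the two terms equal $(1 - Q(u))/2$ and $(1 - Q(u))$ exactly, up to floating-point rounding; larger $k_1$ and $k_2$ make them smaller.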
8.2. Experimental results
The bound on the probability $P(C_0)$ given in (5) and the bound on u in Theorem 8.3 have been calculated for a variety of
values of $k_1$ and $k_2$. Choosing $k_1 = 10$ and $k_2 = 12$ when $N = 8$ yields $Q(u) = 0.93$ when $u = .005$, that is, for 93% of the
RCD codes each codeword is discovered with probability of at least 1/2 of one percent. In practice, a much higher value of
u is achieved for a given value of Q(u).
We simulated 2000 runs in Matlab of the Discover_Codewords procedure on each of 5000 randomly generated, error-free
contact groups. Each contact group had 8 NWs. In Fig. 6 we plot the cumulative distribution of the number of runs before all
individually addressable codewords were discovered for both 30 and 100 MWs. We also plot the cumulative distribution of
the fraction of runs that discovered whichever codeword was discovered least often, that is, an empirical estimate of u.
As discussed at the end of Section 7.3, as the number of MWs increases from 30 to 100, the minimum probability
with which a codeword is discovered increases. Similarly, the number of runs to discover nearly all codewords with high
probability decreases as M increases. In fact, approximately 270 runs are needed to discover all codewords with probability
0.99 when M = 30 and approximately 72 when M = 100. The latter number is very close to the number predicted by the
coupon collector problem when all codewords are equally likely to be discovered.
9. Conclusions
We have shown analytically that stochastically assembled RCDs can control a large number of NWs using a small
number of MWs. Our results are obtained using a simple but broadly applicable model that quantifies the requirements a
decoder must meet to address sets of NWs. Our model is robust in the sense that it takes manufacturing defects into account.
By applying our model to RCDs, we obtain tight bounds on the probability that M MWs control all N NWs in all, or almost
all, contact groups. We also bound the total fraction of individually addressable NWs. Both bounds allow us to investigate
multiple addressing strategies for implementing NW crossbar-based memories. We conclude that the ‘‘Take What You Get’’
addressing strategy uses the smallest area to individually address at least 1024 NWs along each dimension of the crossbar.
What is more, only 13 MWs are required.
We have also considered the problem of codeword discovery. We have given the first formal analysis of several codeword
discovery algorithms. As explained, parallel exhaustive search may be preferable to an adaptive search algorithm that must
test only one contact group at a time. When an adaptive algorithm is used, there appears to be a tradeoff between the number
of MWs and its runtime. The specific algorithm we consider can be modeled as a coupon collection problem, where some
coupons are more likely to be collected than others.
Although RCDs have not yet been demonstrated experimentally, we believe they are a very promising NW decoder
technology. Their ability to cope with manufacturing errors, as well as to be produced using a range of manufacturing
methods, makes them practically appealing. Their highly stochastic assembly represents a significant departure from current
lithographic manufacturing techniques. They serve as an important example of how nanoscale architectures can cope with
randomness and still achieve significant gains over CMOS.
Fig. 6. Shown are empirical plots obtained by simulating 2000 runs of Discover_Codewords on 5000 randomly generated, error-free contact groups, each
of which has 8 NWs. The plots show the cumulative distribution of the number of runs before all individually addressable codewords are discovered, and
the fraction of the runs in which the least frequently discovered codeword was found.
Acknowledgements
The authors acknowledge support by the National Science Foundation under NSF Grant CCF-0403674. A preliminary
version of this paper but without the material on codeword discovery appeared in the Proceedings of ICCAD 2006.
Appendix
Theorem 5.3. In an RCD, let $\Gamma$ be the probability that M MWs fail to control all N NWs in a single contact group. $\Gamma$ satisfies the
following bounds
$Q(1 - Q/2) - \Delta \le \Gamma \le Q$
where $Q = N(N - 1)\mu_1^M$ and $\Delta = 2N(N - 1)(N - 2)(\mu_3^M + \mu_5^M - 2\mu_1^{2M})$ and $\mu_1 = (1 - pq)$, $\mu_3 = (1 - pq(p + 2q))$, and
$\mu_5 = (1 - pq(2p + q))$.
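The bounds of Theorem 5.3 are simple to evaluate numerically. The Python sketch below is an illustration only; the function name is hypothetical and the inputs N = 8, M = 30, p = q = 1/2 are example values rather than figures taken from the paper.

```python
def gamma_bounds(n, m, p, q):
    """Lower and upper bounds on Gamma from Theorem 5.3.

    n: number of NWs, m: number of MWs,
    p, q: probabilities that a junction is controlling / noncontrolling.
    """
    mu1 = 1 - p * q
    mu3 = 1 - p * q * (p + 2 * q)
    mu5 = 1 - p * q * (2 * p + q)
    big_q = n * (n - 1) * mu1 ** m
    delta = 2 * n * (n - 1) * (n - 2) * (mu3 ** m + mu5 ** m - 2 * mu1 ** (2 * m))
    lower = big_q * (1 - big_q / 2) - delta
    return lower, big_q


# Example inputs (assumed for illustration).
lower, upper = gamma_bounds(8, 30, 0.5, 0.5)
```

For these inputs the two bounds are close to each other, which illustrates why Q alone is a good estimate of $\Gamma$ once Q is small.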
Proof. The principle of inclusion–exclusion states that $P(E_1 \cup E_2 \cup \cdots \cup E_n) \le \sum_{i=1}^{n} P(E_i)$ and
$\sum_{i=1}^{n} P(E_i) - \frac{1}{2}\sum_{i \ne j} P(E_i \cap E_j) \le P(E_1 \cup E_2 \cup \cdots \cup E_n)$.
Let $E_{a,b}$ (where $a \ne b$) be the event that $c^a \Rightarrow c^b$. By Lemma 4.2, we know that all NWs are independently addressable
if no event $E_{a,b}$ occurs. The probability that not all NWs are individually addressable, $\Gamma$, satisfies $\Gamma = P(\cup_{(a,b)} E_{a,b})$. We use
inclusion–exclusion to bound $\Gamma$.
As established in the proof of Theorem 5.1, $P(E_{a,b}) = \mu_1^M$ where $\mu_1 = (1 - pq)$. Let $Q = \sum_{a \ne b} P(E_{a,b})$. Since a and b can
both take values from 1 to N, $Q = N(N - 1)\mu_1^M$. We must now bound $\sum_{(a,b) \ne (c,d)} P(E_{a,b} \cap E_{c,d})$. Here $1 \le a, b, c, d \le N$
provided that $(a, b) \ne (c, d)$, i.e., either $a \ne c$ or $b \ne d$ or both.
To compute $P(E_{a,b} \cap E_{c,d})$, we consider 3 cases:
In case (1), a, b, c and d are all different. There are $N(N - 1)(N - 2)(N - 3)$ ways of selecting them. Since $E_{a,b}$ and $E_{c,d}$
are independent, $P(E_{a,b} \cap E_{c,d}) = P(E_{a,b})P(E_{c,d}) = \mu_1^{2M}$.
In case (2), two of the four variables are equal. Here, either $a = c$, $a = d$, $b = c$ or $b = d$. As stated earlier, we do not
allow $a = b$ or $c = d$. There are $N(N - 1)(N - 2)$ ways to choose indices in each case. These cases are considered below.
In case (3), there are only two different values for a, b, c, and d. Since $(a, b) \ne (c, d)$, $a = d$ and $b = c$, which can occur in
$N(N - 1)$ ways. Here $P(E_{a,b} \cap E_{c,d}) = P(E_{a,b} \cap E_{b,a})$, which is the probability that there is no j for which $c^a_j = 0$ and $c^b_j = 1$, or $c^a_j = 1$ and
$c^b_j = 0$. So $P(E_{a,b} \cap E_{b,a}) = \mu_2^M$ where $\mu_2 = (1 - 2pq)$.
Returning to case 2, we have four subcases to consider.
Let $F_{a,b}(m)$ be the event that $c^a_m = 0$ and $c^b_m = 1$. Let $E_{a,b}(m)$ be the complement of $F_{a,b}(m)$. Since the probability of $F_{a,b}(m)$
is pq, it follows that the probability of event $E_{a,b}(m)$ is $P(E_{a,b}(m)) = 1 - pq$. Since the event $E_{a,b}$ is $\bigcap_m E_{a,b}(m)$, $P(E_{a,b}) = \mu_1^M$.
(1) $n_a = n_c$. $F_{a,b}(m) \cup F_{a,d}(m)$ occurs only if $(c^a_m, c^b_m, c^d_m)$ assumes the value (0, 1, 0), (0, 1, 1), or (0, 0, 1). Thus,
$P(F_{a,b}(m) \cup F_{a,d}(m)) = pq(p + 2q)$ and $P(E_{a,b} \cap E_{a,d}) = \mu_3^M$ where $\mu_3 = (1 - pq(p + 2q))$.
(2) $n_a = n_d$. Thus, $F_{a,b}(m) \cup F_{c,a}(m)$ occurs if $(c^a_m, c^b_m, c^c_m)$ assumes the value (0, 1, 0), (0, 1, 1), (1, 1, 0), or (1, 0, 0).
Thus, $P(F_{a,b}(m) \cup F_{c,a}(m)) = 2pq(p + q)$ and $P(E_{a,b} \cap E_{c,a}) = \mu_4^M$ where $\mu_4 = (1 - 2pq(p + q))$.
(3) $n_b = n_c$. Thus, $F_{a,b}(m) \cup F_{b,d}(m)$ occurs if $(c^a_m, c^b_m, c^d_m)$ assumes the value (0, 1, 0), (0, 1, 1), (0, 0, 1), or (1, 0, 1).
Thus, $P(F_{a,b}(m) \cup F_{b,d}(m)) = 2pq(p + q)$ and $P(E_{a,b} \cap E_{b,d}) = \mu_4^M$.
(4) $n_b = n_d$. Thus, $F_{a,b}(m) \cup F_{c,b}(m)$ occurs if $(c^a_m, c^b_m, c^c_m)$ assumes the value (0, 1, 0), (0, 1, 1), or (1, 1, 0). Thus,
$P(F_{a,b}(m) \cup F_{c,b}(m)) = pq(2p + q)$ and $P(E_{a,b} \cap E_{c,b}) = \mu_5^M$ where $\mu_5 = (1 - pq(2p + q))$.
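The four subcase probabilities can be checked by enumerating the eight values of the three junction bits involved. The Python sketch below is an illustration with a hypothetical helper; it assumes the bits are independent with P(1) = p and P(0) = q, where p + q = 1, and uses exact fractions so the comparison with the closed forms is exact.

```python
from fractions import Fraction
from itertools import product


def union_probability(p, q, first, second):
    """P(F_first(m) or F_second(m)) over three independent junction bits.

    Each argument (x, y) names the event "bit x is 0 and bit y is 1",
    where x and y index into the triple of bits.
    """
    total = Fraction(0)
    for bits in product((0, 1), repeat=3):
        if any(bits[x] == 0 and bits[y] == 1 for x, y in (first, second)):
            weight = Fraction(1)
            for bit in bits:
                weight *= p if bit == 1 else q
            total += weight
    return total


# Example probabilities (assumed for illustration), with p + q = 1.
p, q = Fraction(1, 3), Fraction(2, 3)
subcases = {
    "a = c": union_probability(p, q, (0, 1), (0, 2)),  # bits (a, b, d)
    "a = d": union_probability(p, q, (0, 1), (2, 0)),  # bits (a, b, c)
    "b = c": union_probability(p, q, (0, 1), (1, 2)),  # bits (a, b, d)
    "b = d": union_probability(p, q, (0, 1), (2, 1)),  # bits (a, b, c)
}
```

Subtracting each value from 1 gives the per-MW factors $\mu_3$, $\mu_4$, $\mu_4$ and $\mu_5$ used in the proof.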
Let $D = \sum_{(a,b) \ne (c,d)} P(E_{a,b} \cap E_{c,d})$. Then,
$D/(N(N - 1)) = (N - 2)(N - 3)\mu_1^{2M} + \mu_2^M + (N - 2)(\mu_3^M + 2\mu_4^M + \mu_5^M)$
where $\mu_1 = (1 - pq)$, $\mu_2 = (1 - 2pq)$, $\mu_3 = (1 - pq(p + 2q))$, $\mu_4 = (1 - 2pq(p + q))$, and $\mu_5 = (1 - pq(2p + q))$. The
behavior of D is dominated by the largest term $\mu_i^M$. Note that $\mu_2 \le \mu_1^2$ and $\mu_4 \le \min(\mu_3, \mu_5) \le (\mu_3 + \mu_5)/2$. It follows
that $(N - 2)(N - 3)\mu_1^{2M} + \mu_2^M \le N(N - 1)\mu_1^{2M}$ and $(\mu_3^M + 2\mu_4^M + \mu_5^M) \le 2(\mu_3^M + \mu_5^M)$. Thus, D satisfies the following
bound.
$D \le Q^2 + 2N(N - 1)(N - 2)(\mu_3^M + \mu_5^M - 2\mu_1^{2M})$.
The lower bound to $\Gamma$ follows directly from the above.
Theorem 8.1. If C is a code in the ensemble C0, the probability P(Di(C)) that the ith codeword in C is discovered is given below















If M is large relative to $k_1$ and $(\ln 4N)^2/\gamma^2$, the lower bound approaches $P(D_i) \ge \frac{1}{2}(4N)^{-\frac{\ln q}{\gamma}}$. When M is also large relative to $k_2$,
$\gamma$ approaches $1/q$ and the limiting value of $P(D_i)$ becomes $\frac{1}{2}(4N)^{-q \ln q}$ or $\frac{1}{2}(4N)^{-.35}$ when $q = 1/2$.
Proof. The event $D_i$ that codeword $c^i$ in a code C is discovered is the event that for some $1 \le \rho \le M$, after $\rho$ MWs are
activated, $c^i$ remains on and for no other codeword $c^k$ do both $c^i$ and $c^k$ remain on. Let $E(c^i, \rho)$ be the event that $c^i$ remains








E(c i, ρ) ∩ E(ck, ρ)
)
.




















$R(i, j, \rho) = P(E(c^i, \rho) \cap E(c^k, \rho))/P(E(c^i, \rho))$. (9)
Fig. 7. When f(x) is decreasing, $\sum_{x=\alpha}^{\beta} f(x) \le f(\alpha) + \int_{\alpha}^{\beta} f(x)\,dx$, as suggested in (a). Also, $\sum_{x=\alpha}^{\beta} f(x) \ge \int_{\alpha}^{\beta} f(x)\,dx + f(\beta)$, as suggested in (b).





P(E(c i, ρ)) (1− (N − 1)R0(z))
)
. (10)
Because all permutations under Discover_Codewords are equally likely, $P(E(c^i, \rho))$ is the probability that one of the
$|B_i(0)|$ 0s of $c^i$ is activated by the first MW, which occurs with probability $|B_i(0)|/M$, that one of the remaining $|B_i(0)| - 1$
0s is activated by the second MW, which occurs with probability $(|B_i(0)| - 1)/(M - 1)$, etc., giving the following expression
for $P(E(c^i, \rho))$.
$P(E(c^i, \rho)) = \prod_{0 \le t \le \rho - 1} \frac{|B_i(0)| - t}{M - t}$.
Similarly,
$P(E(c^i, \rho) \cap E(c^k, \rho)) = \prod_{0 \le t \le \rho - 1} \frac{|B_i(0) \cap B_k(0)| - t}{M - t}$.
It follows that $R(i, j, \rho)$ has the following form.
$R(i, j, \rho) = \prod_{0 \le t \le \rho - 1} \frac{|B_i(0) \cap B_k(0)| - t}{|B_i(0)| - t}$. (11)
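The product expressions above can be evaluated exactly for small cases. The Python sketch below is an illustration with hypothetical function names: `zeros` stands for $|B_i(0)|$ and `zeros_common` for $|B_i(0) \cap B_k(0)|$. It also checks the product against the equivalent ratio of binomial coefficients, which is a standard identity rather than a formula from the paper.

```python
from fractions import Fraction
from math import comb


def prob_remains_on(m, zeros, rho):
    """Probability that the first rho MWs of a random permutation all hit 0s.

    m: number of MWs, zeros: number of 0s in the codeword, rho <= m.
    """
    prob = Fraction(1)
    for t in range(rho):
        prob *= Fraction(zeros - t, m - t)
    return prob


def ratio_r(zeros_i, zeros_common, rho):
    """Product form (11) of R; requires rho <= zeros_i."""
    r = Fraction(1)
    for t in range(rho):
        r *= Fraction(zeros_common - t, zeros_i - t)
    return r


# Example values (assumed for illustration): M = 10, |B_i(0)| = 5, 3 shared 0s.
p_single = prob_remains_on(10, 5, 2)
p_both = prob_remains_on(10, 3, 2)
r_value = ratio_r(5, 3, 2)
```

Here `p_both / p_single` equals `r_value`, matching definition (9), because the factors $M - t$ cancel in the ratio.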
Fig. 7 illustrates the use of integration to obtain bounds on decreasing functions such as $\ln f(m, z)$ where $f(m, z) = \prod_{0 \le t \le z}(m - t)$.
Bounds are stated in terms of $h(\alpha, \beta, m) = \int_{\alpha}^{\beta} \ln(m - x)\,dx = (y \ln y - y)\big|_{m-\beta}^{m-\alpha}$. The following is immediate.








F(m, z) ≤ f (m, z) ≤ F(m, z)
where






To simplify these bounds, consider the function $g(x) = (1 - x)\ln(1 - x)$. Because its Taylor series expansion is $g(x) =
-x + \sum_{j \ge 2} \frac{x^j}{j(j - 1)}$, $g(x) \ge -x + x^2/2$. Also, because $\ln(1 - x) \le -x$, $g(x) \le -x + x^2$. These results imply the following
bounds on $F(m, \rho)$.
mze−(z)
2/(2m) ≤ F(m, z) ≤ mze−z2/m.





2/(2m) ≤ F(m, z) ≤ mze−z2/m.
Using the assumptions of (2) and (3) and these results provides the following upper bound on $R(i, j, \rho)$, where $z = \rho - 1$.


















In (10) R0(z) is multiplied by (N − 1). For the bound to be meaningful, (N − 1)R0(z) must be less than 1. Thus, we let






+ z ln γ − ln 4N ≥ 0
z has two solutions, one positive and one negative. The positive solution, which is shown below, is the only viable alternative.










2(γ − 1) .
Using $\sqrt{1 + x} \le 1 + x/2$, it follows that $NR_0(z) \le 1/2$ is satisfied if $z \ge z_+ = (\ln 4N)/\ln \gamma$ when $\gamma = \frac{Mq - k_1}{Mq^2 + k_2} > 1$. Under







Because $\rho \le M$, the condition $\rho \ge (\ln 4N)/\ln \gamma + 1$ implies that M must satisfy $M \ge (\ln 4N)/\ln \gamma + 1$.
To finish this analysis, we derive a lower bound to P(E(c i, ρ)).
P(E(c i, ρ)) ≥
∏
0≤t≤ρ−1
























The latter holds, because ρ− 1 ≤ (Mq− k1)/2. Because q− k1/M < 1, this lower bound decreases with increasing ρ. Thus,















When M is large relative to $(\ln 4N)^2/\gamma^2$, the lower bound approaches the following when $\gamma = (Mq - k_1)/(Mq^2 + k_2) > 1$






As $k_2/M$ approaches 0, $\gamma$ approaches $1/q$ and the limiting value of $P(D_i)$ is $\frac{1}{2}(4N)^{-q \ln q}$ or $\frac{1}{2}(4N)^{-.35}$ when $q = 1/2$.
References
[1] G.Y. Jung, S. Ganapathiappan, A.A. Ohlberg, L. Olynick, Y. Chen, William M. Tong, R. Stanley Williams, Fabrication of a 34 × 34 crossbar structure at 50
nm half-pitch by UV-based nanoimprint lithography, Nano Letters 4 (7) (2004) 1225–1229.
[2] P.J. Kuekes, R.S. Williams, J.R. Heath, Molecular wire crossbar memory, US Patent Number 6,128,214, October 3, 2000.
[3] André DeHon, Seth Copen Goldstein, Philip Kuekes, Patrick Lincoln, Nonphotolithographic nanoscale memory density prospects, IEEE Transactions
on Nanotechnology 4 (2) (2005) 215–228.
[4] André DeHon, Nanowire-based programmable architectures, Journal on Emerging Technologies in Computing Systems 1 (2) (2005) 109–162.
[5] Tad Hogg, Yong Chen, Philip J. Kuekes, Assembling nanoscale circuits with randomized connections, IEEE Transactions on Nanotechnology 5 (2) (2006)
110–122.
[6] Philip J. Kuekes, Warren Robinett, Gabriel Seroussi, R. Stanley Williams, Defect-tolerant interconnect to nanoelectronic circuits, Nanotechnology 16
(2005) 869–882.
[7] G.S. Snider, W. Robinett, Crossbar demultiplexers for nanoelectronics based on n-hot codes, IEEE Transactions on Nanotechnology 4 (2) (2005)
249–254.
[8] Eric Rachlin, John E. Savage, Nanowire addressing in the face of uncertainty, in: J. Becker, A. Herkersdorf, A. Mukherjee, A. Smailagic (Eds.), Procs. 2006
Int. Symp. on VLSI, Karlsruhe, Germany, March 2–3, 2006, pp. 225–230.
[9] A. DeHon, Deterministic addressing of nanoscale devices assembled at sublithographic pitches, IEEE Transactions on Nanotechnology 4 (6) (2005)
681–687.
[10] Benjamin Gojman, Eric Rachlin, John E. Savage, Evaluation of design strategies for stochastically assembled nanoarray memories, Journal on Emerging
Technologies in Computing Systems 1 (2) (2005) 73–108.
[11] S.Y. Chou, P.R. Krauss, P.J. Renstrom, Imprint lithography with 25-nanometer resolution, Science 272 (1996) 85–87.
[12] Yong Chen, Gun-Young Jung, Douglas A.A. Ohlberg, Xuema Li, Duncan R. Stewart, Jon O. Jeppeson, Kent A. Nielson, J. Fraser Stoddart, R. Stanley
Williams, Nanoscale molecular-switch crossbar circuits, Nanotechnology 14 (2003) 462–468.
[13] Nicholas A. Melosh, Akram Boukai, Frederic Diana, Brian Gerardot, Antonio Badolato, Pierre M. Petroff, James R. Heath, Ultrahigh-density nanowire
lattices and circuits, Science 300 (2003) 112–115.
[14] Dongmok Whang, Song Jin, Charles M. Lieber, Nanolithography using hierarchically assembled nanowire masks, Nano Letters 3 (7) (2003) 951–954.
[15] Zhaohui Zhong, Deli Wang, Yi Cui, Marc W. Bockrath, Charles M. Lieber, Nanowire crossbar arrays as address decoders for integrated nanosystems,
Science 302 (2003) 1377–1379.
[16] C.P. Collier, E.W. Wong, M. Belohradský, F.M. Raymo, J.F. Stoddart, P.J. Kuekes, R.S. Williams, J.R. Heath, Electronically configurable molecular-based
logic gates, Science 285 (1999) 391–394.
[17] Charles P. Collier, Gunter Mattersteig, Eric W. Wong, Yi Luo, Kristen Beverly, José Sampaio, Francisco Raymo, J. Fraser Stoddart, James R. Heath, A
[2]catenane-based solid state electronically reconfigurable switch, Science 290 (2000) 1172–1175.
[18] K. Gopalakrishnan, R.S. Shenoy, C. Rettner, R. King, Y. Zhang, B. Kurdi, L.D. Bozano, J.J. Welser, M.B. Rothwell, M. Jurich, M.I. Sanchez, M. Hernandez,
P.M. Rice, W.P. Risk, H.K. Wickramasinghe, The micro to nano addressing block, in: Procs. IEEE Int. Electron Devices Mtng., December 2005.
[19] M.R. Stan, P.D. Franzon, S.C. Goldstein, J.C. Lach, M.M. Ziegler, Molecular electronics: From devices and interconnect to circuits and architecture,
Proceedings of the IEEE 91 (11) (2003) 1940–1957.
[20] P.P. Sotiriadis, Information capacity of nanowire crossbar switching networks, IEEE Transactions on Information Theory 52 (7) (2006) 3019–3032.
[21] W. Robinett, G.S. Snider, D.R. Stewart, J. Straznicky, R. Williams, Demultiplexers for nanoelectronics constructed from nonlinear tunneling resistors,
IEEE Transactions on Nanotechnology 6 (3) (2007) 280–290.
[22] Eric Rachlin, John E. Savage, Nanowire addressing with randomized-contact decoders, in: Procs. ICCAD, November, 2006.
[23] Robert Beckman, Ezekiel Johnston-Halperin, Yi Luo, Jonathan E. Green, James R. Heath, Bridging dimensions: Demultiplexing ultrahigh-density
nanowire circuits, Science 310 (2005) 465–468.
[24] Eric Rachlin, John E. Savage, Analysis of mask-based nanowire decoders, IEEE Transactions on Computers 57 (2) (2008) 175–187.
[25] André DeHon, Array-based architecture for FET-based, nanoscale electronics, IEEE Transactions on Nanotechnology 2 (1) (2003) 23–32.
[26] Chen Yang, Zhaohui Zhong, Charles M. Lieber, Encoding electronic properties by synthesis of axial modulation-doped silicon nanowires, Science 310
(2005) 1304–1307.
[27] André DeHon, Patrick Lincoln, John E. Savage, Stochastic assembly of sublithographic nanoscale interfaces, IEEE Transactions on Nanotechnology 2
(3) (2003) 165–174.
[28] Benjamin Gojman, Eric Rachlin, John E. Savage, Decoding of stochastically assembled nanoarrays, in: Procs 2004 Int. Symp. on VLSI, Lafayette, LA,
February 19–20, 2004.
[29] John E. Savage, Eric Rachlin, André DeHon, Charles M. Lieber, Yue Wu, Radial addressing of nanowires, Journal on Emerging Technologies in Computing
Systems 2 (2) (2006) 129–154.
[30] Franklin Kim, Serena Kwan, Jennifer Akana, Peidong Yang, Langmuir–Blodgett nanorod assembly, Journal of the American Chemical Society 123 (18)
(2001) 4360–4361.
[31] E. Johnston-Halperin, R. Beckman, Y. Luo, N. Melosh, J. Green, J.R. Heath, Fabrication of conducting silicon nanowire arrays, Journal of Applied Physics
Letters 96 (10) (2004) 5921–5923.
[32] Eric Rachlin, John E. Savage, Benjamin Gojman, Analysis of a mask-based nanowire decoder, in: Procs 2005 Int. Symp. on VLSI, Tampa, FL, May 11–12,
2005.
[33] R.S. Williams, P.J. Kuekes, Demultiplexer for a molecular wire crossbar network, US Patent Number 6,256,767, July 3, 2001.
[34] Michael Mitzenmacher, Eli Upfal, Probability and Computing: Randomized Algorithms and Probabilistic Analysis, Cambridge University Press,
Cambridge, 2005.
[35] Jia Wang, Ming-Yang Kao, Hai Zhou, Address generation for nanowire decoders, in: GLSVLSI ’07: Proceedings of the 17th Great Lakes Symposium on
VLSI, 2007, pp. 525–528.
