Nano-crossbar arrays have emerged as a promising and viable technology to improve computing performance of electronic circuits beyond the limits of current CMOS. Arrays offer both structural efficiency with reconfiguration and prospective capability of integration with different technologies. However, certain problems need to be addressed, and the most important one is the prevailing occurrence of faults. Considering fault rate projections as high as 20%, which is much higher than those of CMOS, it is fair to expect sophisticated fault-tolerance methods. The focus of this survey article is the assessment and evaluation of these methods and related algorithms applied in logic mapping and configuration processes. As a start, we concisely explain reconfigurable nano-crossbar arrays with their fault characteristics and models. Following that, we demonstrate configuration techniques of the arrays in the presence of permanent faults and elaborate on two main fault-tolerance methodologies, namely defect-unaware and defect-aware approaches, with a short review of their advantages and disadvantages. For both methodologies, we present detailed experimental results of related algorithms regarding their strengths and weaknesses with a comprehensive yield, success rate, and runtime analysis. Next, we overview fault-tolerance approaches for transient faults. As a conclusion, we overview the proposed algorithms with future directions and upcoming challenges.
INTRODUCTION
79:2 O. Tunali and M. Altun
Since the first digital computer was developed in the 1930s, several breakthroughs have been made at both the technological and computational levels to improve the performance of computers. Researchers have aimed at finding the best way to realize the fundamental building element of computers, which is a two-terminal switch. First electromechanical systems and then, in turn, vacuum tubes, p-n junction-based diodes and transistors, and CMOS transistors were used as switches. Considering these different eras, the CMOS era is without question the longest prevailing and the most fruitful one. For more than 50 years, CMOS computing performance has increased at an almost regular pace that is often called Moore's law (Schaller 1997). However, this trend has started to slow down, and it is widely accepted that another transition to a new era is soon to occur (Dubash 2005; Conte and Gargini 2015).
At this point, a significant amount of research has been dedicated to nanoscale technologies with new materials including nanotubes, nanowires, and individual molecules being used to implement a switch (Waser 2012). Although computing performances of these individual switches are quite satisfactory, generally much better than those of CMOS, there are still problems to be solved before commercialization: (1) integration of individual switching elements to generate a fully functional computing architecture is quite costly, and (2) the final forms of architectures are not immune to unusually high fault rates. In this regard, researchers have favored nano-crossbar arrays such as nanofabric using molecular switches (programmable diodes) (Goldstein and Budiu 2001), nanoPLA using programmable diodes (Dehon 2005), CMOS-like structures using nFET or pFET transistors (Snider et al. 2005), CMOL using programmable diodes (Strukov and Likharev 2005), and NASIC using FET transistors or diodes (Wang et al. 2008; Morgul et al. 2016; Alexandrescu et al. 2016). Another prospective technology is based on memristors or memristive switches that are constructed as crossbar-like structures (Yang et al. 2013) and have the same logic mapping approach (Xie et al. 2015). Moreover, as a physical realization, three fully operational implementations of nano-crossbar arrays as nanocomputers are shown to be feasible in Yan et al. (2011), Shulaker et al. (2013), and . Even though these technologies differ at certain levels, the basic computing blocks are always crossbars with each crosspoint behaving as a switch, so logic mapping schemes are similar.
Arrays offer both structural efficiency with reconfiguration and prospective capability of integration. In addition, the inherent redundancy present in nano-crossbars provides flexibility for fault tolerance, which is much needed given that fault tolerance is the main challenge to be resolved. Fault rate projections of nano-crossbars are as high as 20%, which is much higher than those of CMOS. This is mainly due to the use of bottom-up fabrication techniques with self-assembly, which is stochastic in nature, as opposed to conventional top-down fabrication techniques with directed assembly (Wu et al. 2005; Huang et al. 2001; Chen et al. 2003; Haselman and Hauck 2010). Therefore, it is fair to expect sophisticated fault-tolerance methods for nano-crossbars. The focus of this survey article is the assessment and evaluation of fault-tolerance methods and related algorithms applied in logic mapping and configuration processes of nano-crossbar arrays.
Prior to examining fault-tolerance techniques in the literature, we give a coarse description of a common crossbar structure and its computing fundamentals. Nano-crossbar arrays are formed by placing a group of parallel lines/wires orthogonally on top of another group of parallel lines/wires. Vertical and horizontal lines are used as inputs and outputs, respectively. This is illustrated in Figure 1(a). Boolean literals are applied to input lines, and each output line corresponds to a product, that is, an AND of literals. Therefore, a given function in sum-of-products form can be directly implemented by assigning each product to an output line and activating or deactivating the relevant crosspoints. Note that other forms based on factored Boolean expressions or binary decision diagrams cannot be used, since these forms require certain wirings/connections between lines and crosspoints that are not applicable to nano arrays (Dehon 2005; Snider et al. 2005; Alexandrescu et al. 2016). While a deactivated crosspoint behaves as an open circuit between the crossed lines, an activated crosspoint is a two-terminal switch that can behave as a diode (Ziegler and Stan 2003) or a FET (Zhong et al. 2003). This is shown in Figure 1(b). If the components are ON (OFF), then their terminals are shorted (open). Note that the distinction between the components is that their ON (OFF) states connect (disconnect) terminals in different lines.
Fault Tolerance in Programmable Logic Arrays
In a historical context, nano-crossbar structures are very similar to programmable logic arrays (PLA's) introduced in Fleisher and Maissel (1975), in terms of circuit structures, programming features, and utilization. Therefore, examining the progression of fault-tolerance studies regarding PLA's can be insightful. Particular aspects are as follows: (1) test generation and fault detection (Ostapko and Hong 1979; Smith 1979), which basically deals with producing comprehensive test vectors; (2) yield analysis and redundancy employment (Wey et al. 1987; Wey 1988), which aims to maximize yield and allocate redundant elements; and (3) fault modeling with simulations (Ligthart and Stans 1991), which formalizes the types of faults for stuck-at, bridging, missing, and broken crosspoints, a classification that is also adopted in nano-crossbar terminology. Additionally, different from these studies, in Demjanenko and Upadhyaya (1990) fault tolerance is achieved with reconfiguration of a PLA by using a bipartite graph model that can be considered the archetype of the techniques used in nano-crossbars. In that study, exaggeratedly high fault rates are considered, similar to the treatment in nano-crossbars. Indeed, being developed with a well-established CMOS technology, PLA's have considerably low fault rates. Therefore, simple configuration approaches are adequate for fault tolerance, which explains the lack of related studies in the literature after the 1990s.
Fault Tolerance in Nano-Crossbar Arrays
Examining the fault-tolerance techniques in the literature, we see a common tendency of considering only faults that cause phase shifts between activated and deactivated crosspoints (Tahoori 2006; Al-Yamani et al. 2007; Zheng and Huang 2009; Gören et al. 2011; Su and Rao 2014; Tunali and Altun 2017). Only a few studies consider faults affecting the functionality of electrical components in crosspoints, which cause phase shifts between ON and OFF states of the components (Bhaduri et al. 2004; Gil et al. 2008; Zamani et al. 2013). Indeed, component-based fault modeling is more appropriate for failures seen in the field, and it is not readily applicable to emerging technologies, including nano-crossbar arrays, because they have very limited field data. Another reason favoring activated/deactivated crosspoint-based fault modeling is its capability to deal with faults occurring in input and output lines, such as broken and bridging faults. For example, all crosspoints of a broken line can be modeled as deactivated. Another observation is that distinct approaches are proposed to tolerate permanent and transient faults because of their different natures, as summarized in Table 1. While permanent faults, also called defects, are handled during the post-fabrication configuration of nano-crossbars, transient faults occurring in the field are tolerated using either redundancies or detection-reconfiguration cycles.
Permanent Fault Tolerance.
In the presence of permanent faults, also called defects, tolerance is achieved by mapping Boolean logic functions on a defective crossbar using crossbar row and column permutations. This is an NP-complete problem (Shrestha et al. 2009). For the worst-case scenario, implementing a given function with an N × M crossbar requires N!M! permutations; computing time quickly grows to intractable levels with the crossbar size. Additionally, high fault rates complicate the mapping process by constraining the possible valid choices. Nevertheless, the seminal Teramac experiment in Amerson et al. (1995) shows that it is possible to produce reliable computing using components with a large number of faulty parts. As specified in Heath et al. (1998), as long as adequate connectivity and efficient configuration algorithms are present, it is feasible to use a reconfigurable nano-crossbar to obtain reliable computing structures. In the literature, proposed defect-tolerant logic mapping algorithms for nano-crossbars can be categorized under two main methodologies: defect-unaware and defect-aware. However, it should be noted that both methods use a defect map that shows the location of faults in a nano-crossbar, so a more intuitive naming would be "defect-avoiding" and "defect-employing." Nevertheless, we follow the terminology prevalent in the literature.
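To give a feel for how quickly the worst-case search space grows, the short sketch below evaluates N!M! for a few square crossbar sizes. The function name `worst_case_permutations` is introduced here purely for illustration and does not come from the surveyed works.

```python
import math


def worst_case_permutations(n_rows: int, n_cols: int) -> int:
    """Number of row/column permutation pairs for an N x M crossbar (N! * M!)."""
    return math.factorial(n_rows) * math.factorial(n_cols)


if __name__ == "__main__":
    for size in (4, 8, 16, 32):
        count = worst_case_permutations(size, size)
        # Report only the order of magnitude; the full integer is very long.
        print(f"{size} x {size}: about 10^{len(str(count)) - 1} permutations")
```

Even for modest sizes the count is far beyond exhaustive enumeration, which is why the surveyed algorithms rely on heuristics.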
Defect-unaware methods determine the size of an n × n nano-crossbar needed to obtain a k × k defect-free sub-crossbar, so it is possible to know the required size of a crossbar in advance to implement a given logic function. Using the defect-free sub-crossbar, a straightforward mapping process can be applied. However, the number of studies in this field is limited due to the inefficient area yield. There is a common shortcoming, especially for high fault rates: the obtained k values are much smaller than n. When n = 250 and the fault rate is 15%, which is a reasonable value for nano arrays, the proposed algorithms find k values only as high as 30. This means that only about 1% of the crossbar can be used. Proposed algorithms use graph-based models and heuristics to solve the maximum independent set problem in a complement graph (Tahoori 2006; Al-Yamani et al. 2007; Yuan and Li 2011, 2014).
Defect-aware logic mapping methods employ defective elements in the mapping process, which results in much better area yields. However, the mapping process is more complicated. Studies in this field are abundant because the problem can be formalized in many ways. Earlier works utilize graph-based models, including Naeimi and DeHon (2004) solving the bipartite matching problem with a greedy approach and solving the graph embedding problem with a recursive approach. Additionally, Yellambalase and Choi (2008) examine the effect of clustered defects with a matrix-based algorithm, and Zheng and Huang (2009) use a satisfiability approach for the mapping process. Another greedy algorithm is proposed in Simsir et al. (2009) using partial graph construction. Apart from the graph-based models, an ILP model is used in Yang and Datta (2011) and Zamani et al. (2013) by introducing constraints related to nano-crossbar defects. Furthermore, a novel approach benefiting from graph canonization with sorting is used in Gören et al. (2011). To handle scalability more efficiently compared to the above methods, Naeimi and DeHon (2004) uses a greedy approach; uses a graph-based approach with memetic fitness approximation; and Tunali and Altun (2017) implements matrix sorting supported by greedy backtracking.
Transient Fault Tolerance.
Another aspect of fault tolerance in nano-crossbars is the transient faults occurring in the field. Similarly to conventional technologies targeting transient faults, hardware-redundant solutions are proposed. In Rao et al. (2007), two approaches, one using an online test with reconfiguration and the other a fault masking scheme, are investigated. Comparing the approaches, fault masking offers smaller hardware overhead at the cost of having very limited capability of tolerating multiple faults. In , another fault masking method with a focus on missing devices (denoted with stuck-at deactivated faults in this article) is proposed, utilizing logic tautologies. In Garcia and Orailoglu (2008), an alternative reconfiguration-based fault-tolerance scheme is proposed with novel online testing using test vector compaction. It is possible to accomplish input/output level diagnoses with reduced runtime by means of the proposed test vectors. A novel error detection method, which does not include a fault-tolerance mechanism, is proposed in Farazmand and Tahoori (2009). Logic implementation is realized with a dual-rail structure having both the function and its negation as outputs. By comparing the outputs, it is possible to detect faults conforming to certain assumptions on fault characteristics. The articles mentioned so far are related to the logic synthesis level of fault tolerance. In He et al. (2005) and He and Jacome (2007), a high-level synthesis paradigm with reconfiguration is proposed, which chooses certain mapping units among many solutions through optimization.
It should be noted that transient fault tolerance of nano-crossbar arrays is in an exploratory phase, and only a small fraction of the aforementioned articles fully target nano-crossbar arrays. Mostly, conventional PLA-based architectures are used. Additionally, as for all emerging technologies, nano-crossbar arrays have very limited field data, which is needed for accurate modeling of transient faults.
Variation Tolerance in Nano-Crossbar Arrays
Considering that similar techniques are used in fault and variation tolerance, it is worth mentioning studies that consider variations in crosspoint delay values for performance optimization of nano-crossbar arrays. As an earlier example, Gojman and DeHon (2009) used fan-out matching to minimize the path delays and extrapolated delay values from Committee et al. (2008). In Ghavami et al. (2010) and Zamani et al. (2013), the authors focus on minimizing the maximum variation of the overall crossbar. A bipartite graph model and an integer linear programming model are used, respectively. In Tunc and Tahoori (2010) and Tahoori (2010), two objectives are employed: minimizing the maximum delay and minimizing the output variations. On the algorithmic side, they use a simulated annealing approach. In , the problem is formulated as a multiobjective optimization problem, and an evolutionary algorithm is utilized. In Zhong et al. (2016), a hybrid evolutionary algorithm is used and the problem is formulated as a bilevel multiobjective optimization.
Overview and Organization
A selection of key fault-tolerance studies is given in Table 2. We select these articles based on their novel contribution to the state of the art. Since configuration is the main source of fault tolerance in nano-crossbars, the list starts with some early efforts exploiting configurability for tolerance (not necessarily for nano-crossbars). The subsequent studies in the list show relatively recent trends and developments specifically for nano-crossbars. Moreover, we give a concise lifecycle of a nano-crossbar with its fault-tolerance steps in Figure 2. This high-level integration picture demonstrates when certain fault-tolerance mechanisms come into play. The step "function and crossbar models" in the figure corresponds to each algorithm's individual problem formalization, so it is approach dependent. The step "connectivity checks between crossbars/planes" is performed after achieving configuration for each nano-crossbar to form a fully functional architecture. Fault-tolerance algorithms, which constitute a considerable portion of this survey, are used in post-fabrication configuration and are also employed during reconfiguration to tolerate in-field faults. Another aspect of in-field fault tolerance is hardware redundancy based on fault masking.
The rest of the article is organized as follows. In Section 2, we present fault characteristics and models. In Section 3, we demonstrate configuration of nano-crossbars for logic mapping. In the following two sections, we focus on permanent faults. In Sections 4 and 5, we explain defect-unaware and defect-aware logic mapping algorithms, respectively, and we present experimental results of the algorithms by comparing yield, success rate, and runtime parameters. In Section 6, we examine fault-tolerance techniques for transient faults. In Section 7, we discuss future directions of the methods with upcoming challenges.
FAULT CHARACTERISTICS
In this section, we first define fault models of nano-crossbars at the logic level. Then, we elaborate on the distinction between permanent and transient faults with an overview of literature tendencies towards the adoption of different fault models. As a note, we emphasize that there is no consistent modeling preference for faults in the literature. Most of the works consider only certain types of crosspoint faults. In the experimental results, we explore the effects of using different fault models in depth.
Fault Models
We use fault as a generic term for problems that might cause an error in computing. Faults in nano-crossbars can be considered under two categories: (1) faults affecting the configuration of crosspoints, which cause phase shifts between activated and deactivated phases, and (2) faults affecting the functionality of electrical components in crosspoints, which cause phase shifts between ON and OFF states of the components. In the first category, configuring a crosspoint switch includes activating and deactivating processes. When a crosspoint is activated, there is an electrical component at the crosspoint and its functionality is intact, so phase shifts between ON and OFF states are possible. When a crosspoint is deactivated, it behaves as if no component is present at the crosspoint. Configuration level faults are defined as follows:
- Stuck-at deactivated fault makes the corresponding crosspoint switch always deactivated, so it cannot be used as a functional component any more; and
- Stuck-at activated fault makes the corresponding crosspoint switch always activated, so there is always a functional component.
Configuration level faults and their effects are represented in Figure 3. As can be seen from the figure, faults of this type affect only the corresponding crosspoint itself. In addition to the faults directly affecting crosspoint switches, broken and bridging faults might occur on input and output lines. They can also be modeled using crosspoint faults such that all crosspoints of broken or adjacent lines are considered stuck-at deactivated.
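The modeling of line faults through crosspoint faults described above can be sketched as a small defect-map helper. The map encoding (`"."`, `"D"`, `"A"`) and all function names are assumptions made for this illustration; rows stand for output lines and columns for input lines.

```python
FUNCTIONAL = "."         # crosspoint can be activated or deactivated
STUCK_DEACTIVATED = "D"  # crosspoint is always deactivated
STUCK_ACTIVATED = "A"    # crosspoint is always activated


def empty_defect_map(n_outputs, n_inputs):
    """Defect map of a fault-free crossbar (rows: outputs, columns: inputs)."""
    return [[FUNCTIONAL] * n_inputs for _ in range(n_outputs)]


def mark_broken_output_line(defect_map, row):
    """Model a broken output line: all of its crosspoints become stuck-at deactivated."""
    for col in range(len(defect_map[row])):
        defect_map[row][col] = STUCK_DEACTIVATED


def mark_broken_input_line(defect_map, col):
    """Model a broken input line in the same way, column-wise."""
    for row in defect_map:
        row[col] = STUCK_DEACTIVATED


def mark_bridged_input_lines(defect_map, col_a, col_b):
    """Model a bridging fault: both adjacent lines are treated as unusable."""
    mark_broken_input_line(defect_map, col_a)
    mark_broken_input_line(defect_map, col_b)
```

With this encoding, a mapping algorithm only needs to reason about crosspoint states, regardless of whether the underlying physical fault was on a switch or on a line.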
For the second category of faults, we consider the functionality of electrical components in crosspoints. These faults are defined as follows:
- Stuck-at OFF fault makes the corresponding crosspoint component incapable of conducting current, so the component ideally has an infinite resistance; and
- Stuck-at ON fault makes the corresponding crosspoint component constantly conduct current, so the component ideally has zero resistance.
These types of faults affect the other switches of the crossbar depending on whether the technology has diode-based or FET-based crosspoints. For a diode-based crosspoint, a stuck-at OFF fault means no connection between the two terminals of the diode, placed in the crossed input and output lines, so it only affects the faulty crosspoint. On the other hand, a stuck-at ON fault means a constant connection between the terminals, so all crosspoints in the corresponding output line are discarded.
For a FET-based crosspoint, a stuck-at OFF fault breaks the connection between the two terminals, both placed in the output line, so all crosspoints in the corresponding output line are discarded. On the other hand, a stuck-at ON fault means a constant connection between the terminals, so it only affects the corresponding crosspoint. Representations of faults for diode-based and FET-based arrays are shown in Figures 4(a) and (b), respectively.
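The four cases described above (diode or FET, stuck-at OFF or stuck-at ON) can be collected in a small lookup. This is only an illustrative sketch; the function name and the string labels are assumptions, with rows standing for output lines and columns for input lines.

```python
def discarded_crosspoints(technology, fault, row, col, n_inputs):
    """Return the set of (row, col) crosspoints lost to one component-level fault.

    technology: "diode" or "fet"; fault: "stuck_off" or "stuck_on".
    """
    whole_output_line = {(row, c) for c in range(n_inputs)}
    only_this_crosspoint = {(row, col)}
    effects = {
        ("diode", "stuck_off"): only_this_crosspoint,  # open diode: local effect
        ("diode", "stuck_on"): whole_output_line,      # shorted diode: line is lost
        ("fet", "stuck_off"): whole_output_line,       # open FET breaks the output line
        ("fet", "stuck_on"): only_this_crosspoint,     # shorted FET: local effect
    }
    return effects[(technology, fault)]
```

The symmetry is visible in the table: the fault that is local for a diode is line-wide for a FET, and vice versa.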
Permanent and Transient Faults
Permanent and transient fault concepts are more related to the lifecycle of a nano-crossbar than the physical characteristics of faults.
- Permanent faults or defects occur during the fabrication process due to physical problems or variations; and
- Transient faults occur in the field during the operation of a product.
Considering the two categories of faults presented in the previous subsection, related to configuration of crosspoints and functionality of components, we can say that both categories are applicable to permanent and transient faults. We can also claim that permanent faults mostly consist of configuration level faults, since it is unlikely to see a case after fabrication in which a crosspoint can be perfectly activated/deactivated but its electrical component is not operating properly. Therefore, faults related to the functionality of components are more likely to be transient, which is also supported by the fact that degradation and aging phenomena of components show transient characteristics.
LOGIC MAPPING PROBLEM
Nano-crossbar-based architectures are generally composed of specific crossbars/planes, each of which implements a two-level logic such as AND-OR as illustrated in Dehon (2005), Strukov and Likharev (2005), Strukov and Likharev (2007), , and . Therefore, realization of a target logic function, whether two-level or multi-level, is closely related to the architecture. However, if we focus on a single plane and use an abstraction (independent of plane character), the logic mapping process (whether two-level or multi-level) can be applied step by step to every connecting/succeeding plane to accomplish the desired result. This is a common practice in the literature, with AND planes being used for benchmark simulations. The reason for using AND planes is that they are generally much larger than OR planes; using reconfigurability, even a single line/wire serving as an OR plane is sufficient to obtain every output, one at a time. Note that with defect-free OR planes, one can make connections between planes without any constraint on the orderings of input and output lines. For this purpose, defect-unaware logic mapping techniques can be preferred. Logic mapping and connectivity checks of planes are previously illustrated in Figure 2 with an integrated high-level view.
Logic mapping is the configuration of crosspoint switches of a nano-crossbar to implement a given Boolean logic function given in sum-of-products form such as $f = P_1 + \cdots + P_k$. The main goal is finding a valid mapping, namely a correct assignment of literals and products of the function to inputs and outputs of a given crossbar. Input and output assignments can be represented with an input array $I = [I_1, \ldots]$ and an output array. In case of having a fault-free crossbar, every assignment produces a valid mapping, so the configuration process is simple and straightforward. Defect-unaware approaches benefit from this feature by finding a defect-free sub-crossbar, so physical design is not troubled with the locations of defects. Figure 5(a) shows an example.
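For a fault-free crossbar, the configuration step can be sketched in a few lines. The representation below is an assumption made for illustration: literals are strings such as `"a"` and `"~a"`, each product is a set of literals, the input array lists the literal applied to each input line, and the output array lists the index of the product assigned to each output line.

```python
def configure(products, input_array, output_array):
    """Activation matrix: entry [r][c] is True when the crosspoint between
    output line r and input line c must be activated."""
    return [[literal in products[p] for literal in input_array]
            for p in output_array]


def literal_value(literal, assignment):
    """Evaluate a literal such as "a" or "~a" under a variable assignment."""
    if literal.startswith("~"):
        return not assignment[literal[1:]]
    return assignment[literal]


def evaluate(config, input_array, assignment):
    """Each output line ANDs its activated literals; the lines are then ORed.

    Assumes every output line has at least one activated crosspoint.
    """
    line_values = [
        all(literal_value(lit, assignment)
            for lit, active in zip(input_array, row) if active)
        for row in config
    ]
    return any(line_values)
```

For example, `f = ab + ~a c` with input array `["a", "~a", "b", "c"]` and output array `[0, 1]` activates two crosspoints on each of the two output lines. Any permutation of either array gives another valid configuration when no faults are present.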
In case of having faults, it is not guaranteed that an assignment produces a valid mapping. Figure 5(b) shows an example. Configuration with the same input and output arrays as used for a fault-free crossbar produces a different logic function, since certain switches cannot be activated or deactivated. However, using a defect-aware method, one can implement the given function with a valid assignment. Figure 5(c) shows an example.
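The validity condition described above can be illustrated with a small checker and an exhaustive search. This is a hedged sketch rather than any of the surveyed algorithms: it assumes a crossbar with exactly as many output lines as products and as many input lines as literals, a defect map encoded with `"."` (functional), `"D"` (stuck-at deactivated), and `"A"` (stuck-at activated), and it assumes that a stuck-at activated crosspoint is usable wherever an activated switch is required.

```python
from itertools import permutations


def is_valid_mapping(products, input_array, output_array, defect_map):
    """Check one assignment of literals and products against a defect map."""
    for row, product_index in enumerate(output_array):
        for col, literal in enumerate(input_array):
            needed = literal in products[product_index]
            state = defect_map[row][col]
            if state == "D" and needed:
                return False  # switch must be activated but cannot be
            if state == "A" and not needed:
                return False  # switch must be deactivated but cannot be
    return True


def find_valid_mapping(products, literals, defect_map):
    """Exhaustively try input and output permutations (small crossbars only)."""
    for input_array in permutations(literals):
        for output_array in permutations(range(len(products))):
            if is_valid_mapping(products, input_array, output_array, defect_map):
                return list(input_array), list(output_array)
    return None
```

The exhaustive loop visits up to N!M! assignments, so it is only meant to make the definition concrete on toy examples; practical defect-aware algorithms replace it with heuristics.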
In the following two sections, we consider defect-unaware and defect-aware algorithms targeting permanent faults or defects. In short, defect-unaware approaches aim to find a defect-free sub-crossbar, so the follow-up assignment procedure is straightforward and trouble free. Defect-aware approaches aim to find valid input and output assignments using the full-size crossbar by considering every defect in a crossbar.
DEFECT-UNAWARE LOGIC MAPPING
Defect-unaware logic mapping methods search for a defect-free k × k sub-crossbar in an n × n nano-crossbar using graph-based algorithms. The main goal is to maximize the yield, $(k/n)^2$. Since the obtained k × k sub-crossbar is defect free, the logic mapping process is straightforward afterwards. Let us first give the common concepts employed in the algorithms.
Definitions
(1) Nano-crossbar has vertical lines as inputs and horizontal lines as outputs. There is a configurable switch in every functional crosspoint. An example is shown in Figure 6(a).
(2) Bipartite graph representation has two disjoint node sets; no edge exists between nodes in the same set. Elements of the node sets are represented by V and U showing output and input lines, respectively. A configurable switch in a crosspoint is shown with an edge connecting nodes from V and U. Nodes connected via edges are called adjacent, and the degree of a node is the number of edges connected to the node. A stuck-at activated defect results in an erasure of the corresponding nodes from V and U as well as all edges connected to these nodes. A stuck-at deactivated defect results in an erasure of the corresponding edge.
(4) Independent set consists of nodes such that no node pair in the set has an edge connecting the nodes.
(5) Biclique is a subgraph of a bipartite graph such that every node has the maximum possible degree. Note that a defect-free sub-crossbar can be denoted with a biclique.
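The bipartite graph representation and the biclique condition in the definitions above can be sketched as follows. The defect-map encoding (`"."` functional, `"D"` stuck-at deactivated, `"A"` stuck-at activated) and the function names are assumptions introduced for this illustration.

```python
def bipartite_graph(defect_map):
    """Build (V, U, edges) from a defect map.

    V holds output lines (rows) and U holds input lines (columns). A
    stuck-at activated defect erases both of its nodes; a stuck-at
    deactivated defect erases only the corresponding edge.
    """
    n_rows, n_cols = len(defect_map), len(defect_map[0])
    activated = [(r, c) for r in range(n_rows) for c in range(n_cols)
                 if defect_map[r][c] == "A"]
    V = set(range(n_rows)) - {r for r, _ in activated}
    U = set(range(n_cols)) - {c for _, c in activated}
    edges = {(v, u) for v in V for u in U if defect_map[v][u] != "D"}
    return V, U, edges


def is_biclique(rows, cols, edges):
    """True when every chosen output line is connected to every chosen input line."""
    return all((v, u) in edges for v in rows for u in cols)
```

A pair of line subsets that passes `is_biclique` corresponds to a defect-free sub-crossbar in the sense of definition (5).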
Algorithms
Finding a k × k sub-crossbar is equivalent to determining a balanced maximum biclique in a bipartite graph. The condition of being fully balanced ensures that the dimensions (k inputs and k outputs) are equal to each other. The problem of obtaining a maximum biclique in a bipartite graph is shown to be NP-hard in Garey and Johnson (2002). For this reason, heuristic algorithms are proposed to find sub-optimal solutions under reasonable time constraints. The proposed algorithms formulate the problem as finding the maximum independent set in the complementary graph. Since only stuck-at deactivated defects in crosspoints are denoted with edges, the degree of a node in the complementary bipartite graph is equal to the number of defects present in the corresponding line. If a node has a degree of zero, that is, no edges with other nodes, then it can be included in the independent node set immediately. Proposed algorithms focus on deciding which node to remove to obtain zero-degree nodes efficiently. An outline of the algorithms in a modular composition is given in Algorithm 1.
Formulation of the problem as finding the maximum independent set in a complementary bipartite graph was first proposed by Tahoori (2006). It is observed that by removing nodes having maximum degrees, it is possible to discard lines having the maximum number of defects. After the removal, degrees of the remaining nodes are updated and the search process is started again. Iterations are performed until either V or U becomes empty. A pseudocode of the algorithm is given in Heuristic 1. As can be seen from the code, the algorithm alternates between V and U by using the flag value to ensure a balanced condition. This way, a sub-crossbar is guaranteed to be balanced by achieving a difference of 1 between its dimensions: $|U_b| - |V_b| = \pm 1$. Nevertheless, depending on the number of variables and products of the logic functions to be realized, this restriction can be relaxed.
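The maximum-degree removal rule with alternation between the two node sets can be sketched as below. This is a simplified illustration and not the pseudocode of Heuristic 1: it works directly on the set of stuck-at deactivated crosspoints (the edges of the complementary graph), stops as soon as no defect remains among the surviving lines, and uses hypothetical names throughout.

```python
def greedy_max_degree(n, defects):
    """Return (rows, cols) of a defect-free sub-crossbar of an n x n crossbar.

    defects: set of (output_line, input_line) stuck-at deactivated crosspoints.
    """
    rows, cols = set(range(n)), set(range(n))
    remaining = set(defects)  # edges of the complementary graph
    remove_row = True         # flag that alternates between the node sets
    while remaining:
        side = 0 if remove_row else 1
        degree = {}
        for edge in remaining:
            degree[edge[side]] = degree.get(edge[side], 0) + 1
        # Discard the line with the most remaining defects (ties: lowest index).
        worst = max(sorted(degree), key=degree.get)
        (rows if remove_row else cols).discard(worst)
        remaining = {edge for edge in remaining if edge[side] != worst}
        remove_row = not remove_row
    return rows, cols
```

Because lines are removed alternately from the two sets starting from a square crossbar, the surviving dimensions differ by at most one in this sketch.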
The same problem formalization is also used in Al-Yamani et al. (2007) . As a first step, the algorithm checks the nodes having minimum degrees. In the second step, the adjacent nodes to the checked ones are determined as candidates. As a final step, the candidates adjacent to most checked nodes are removed. The removal process increases the probability of obtaining zero-degree nodes. The pseudocode is given in Heuristic 2. This algorithm produces better results in terms of yield compared to the first algorithm in Heuristic 1. However, finding adjacent nodes increases the computational load of the algorithm.
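The three steps described above (check minimum-degree nodes, collect their adjacent nodes as candidates, remove the candidate adjacent to the most checked nodes) can be sketched as follows. This is an interpretation of the prose rather than a transcription of Heuristic 2; the node tagging scheme and the function name are assumptions, and no balance condition is enforced.

```python
def min_degree_neighbor_removal(n, defects):
    """Return (rows, cols) of a defect-free sub-crossbar of an n x n crossbar.

    defects: set of (output_line, input_line) stuck-at deactivated crosspoints.
    """
    rows, cols = set(range(n)), set(range(n))
    remaining = set(defects)  # edges of the complementary graph
    while remaining:
        # Adjacency lists; nodes are tagged ("row", i) or ("col", j).
        neighbors = {}
        for r, c in remaining:
            neighbors.setdefault(("row", r), set()).add(("col", c))
            neighbors.setdefault(("col", c), set()).add(("row", r))
        # Step 1: check the nodes with the minimum (non-zero) degree.
        min_degree = min(len(adj) for adj in neighbors.values())
        checked = [v for v in sorted(neighbors) if len(neighbors[v]) == min_degree]
        # Step 2: nodes adjacent to the checked ones become candidates.
        votes = {}
        for node in checked:
            for candidate in neighbors[node]:
                votes[candidate] = votes.get(candidate, 0) + 1
        # Step 3: remove the candidate adjacent to the most checked nodes.
        kind, index = max(sorted(votes), key=votes.get)
        (rows if kind == "row" else cols).discard(index)
        remaining = {(r, c) for r, c in remaining
                     if (r if kind == "row" else c) != index}
    return rows, cols
```

Building the adjacency lists in every iteration is the extra computational load mentioned above; removing one well-chosen candidate tends to turn several minimum-degree nodes into zero-degree nodes at once.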
Building on Al-Yamani's approach, Yuan proposes two algorithms. In the first one, the node with the minimum degree is checked and the adjacent nodes are determined as candidates. At the end, the one with the maximum degree is chosen to be removed. The pseudocode is given in Heuristic 3. In the second study, analyzing the major loop iterations, it is concluded that removing all of the adjacent nodes determined as candidates cuts down a considerable number of iterations, decreases the runtime, and improves the yield marginally (Yuan and Li 2014). The pseudocode is given in Heuristic 4. The only disadvantage of the approach is that the balance condition is not guaranteed for the resulting biclique. In simulation results, they show that
A summary of all four heuristics and their node removal preferences is shown in Figure 7.
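The node removal preferences described for Heuristics 3 and 4 can be sketched with a single function and a switch. This is an illustrative reading of the prose, not the published pseudocode: names are hypothetical, ties are broken by index, and neither variant enforces the balance condition here.

```python
def min_degree_candidate_removal(n, defects, remove_all_candidates=False):
    """Return (rows, cols) of a defect-free sub-crossbar of an n x n crossbar.

    With remove_all_candidates=False, only the maximum-degree neighbor of the
    checked node is removed; with True, all of its neighbors are removed.
    """
    rows, cols = set(range(n)), set(range(n))
    remaining = set(defects)  # edges of the complementary graph
    while remaining:
        neighbors = {}
        for r, c in remaining:
            neighbors.setdefault(("row", r), set()).add(("col", c))
            neighbors.setdefault(("col", c), set()).add(("row", r))
        # Check one node with the minimum (non-zero) degree.
        checked = min(sorted(neighbors), key=lambda v: len(neighbors[v]))
        candidates = sorted(neighbors[checked])
        if remove_all_candidates:
            victims = candidates
        else:
            victims = [max(candidates, key=lambda v: len(neighbors[v]))]
        for kind, index in victims:
            (rows if kind == "row" else cols).discard(index)
        removed = set(victims)
        remaining = {(r, c) for r, c in remaining
                     if ("row", r) not in removed and ("col", c) not in removed}
    return rows, cols
```

Removing every candidate at once clears all defects of the checked node in one pass, which is the source of the reduced iteration count, at the price of possibly uneven dimensions.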
Evaluation of Algorithms
The presented algorithms directly use a graph representation of a defective nano-crossbar as an input. They do not deal with defect types and their representations in graphs. Indeed, one can directly find a graph representation of a crossbar by erasing edges for stuck-at deactivated defects and nodes for stuck-at activated defects as previously explained in Section 4.1.
A real concern of defect-unaware methods is their considerably low yield, $(k/n)^2$, especially for high defect rates. When n = 250 and the fault rate is 15%, which is a reasonable value for nano arrays (Chen et al. 2003), the best algorithm finds k values only as high as 30. This means that only about 1% of the crossbar can be used. This phenomenon undermines the basic attraction of nano-crossbars, namely their superior density.
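The yield for n = 250 and k = 30 can be worked out directly from the definition; the result is on the order of 1%:

```latex
\left(\frac{k}{n}\right)^{2} = \left(\frac{30}{250}\right)^{2} = 0.12^{2} = 0.0144 \approx 1.4\%
```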
Another concern is that all of the mentioned articles except Tahoori (2006) study only stuck-at deactivated defects. Tahoori shows that in the presence of stuck-at activated defects, which result in removal of all corresponding nodes, yield is impractically low.
All of the examined heuristic algorithms have polynomial runtime complexities. As long as the crossbar size is under 1000 × 1000 and defect rates are smaller than 15%, they run in the microsecond range, which is quite satisfactory. Another interesting observation is that the algorithms are largely insensitive to variations in the fault rate. In the next subsection, we conduct detailed simulations regarding their runtime and yield.
As a final note, defect-unaware methods in principle try to solve the problem of finding the maximum biclique in a bipartite graph. For this reason, similar problem formulations used in different fields such as those proposed in Mubayi and Turán (2010) and Yuan et al. (2015) can be directly applied to the defect-unaware methods especially for the improvement of yield.
Simulation Results of Algorithms
In this section, we present experimental results of the examined defect-unaware algorithms. We generate defective nano-crossbars by assigning an independent defect probability (rate) to each crosspoint, so that defects are uniformly distributed over the array. All defects are modeled as configuration-level defects, either stuck-at activated or stuck-at deactivated, which is the common tendency in the literature. Monte Carlo simulations are performed for assessment with a sample size of 200; we observe that fluctuations in the parameter values nearly stabilize beyond this sample size. All algorithms are implemented in MATLAB. All experiments run on a 3.30GHz Intel Core i5 CPU (only a single core is used) with 4GB of memory.
To evaluate the performance of the algorithms, two parameters are used: runtime and area yield. Area yield is the ratio of the defect-free sub-crossbar size to the initial defective crossbar size. In simulations, two defect settings are used to evaluate the four heuristics. In the first setting, we only consider stuck-at deactivated defects with a corresponding defect rate P_D of 5%, 10%, and 15%. In the second setting, stuck-at activated defects are also included with a corresponding defect rate P_A = 1% and P_D = 10%. As a reminder, we define yield as (k/n)^2, where k^2 is the size of a defect-free sub-crossbar and n^2 is the size of the initial defective crossbar.
Results of the first setting are given in Table 3. Here, we select crossbar sizes up to 200 × 200, beyond which yield values become impractically low. The table shows that area yield values differ by at most 6% and that Heuristic 4 is the most efficient one in terms of runtime. However, yield values are still inadequate, which negates the main advantage of nano-crossbars, namely their high area density. In the best-case scenario, only 33% of a nano-crossbar can be utilized, and this share decreases significantly for larger crossbar sizes.
Results of the second setting are given in Figure 8. We see that defect-unaware methods are very vulnerable to stuck-at activated defects, which eliminate both input and output lines as previously explained in Section 4. Although we select a relatively low value of 1% for P_A, area yield values are not satisfactory. Due to the very low yield values, we pursue no further experiments with stuck-at activated defect rates higher than 1%. Table 4 summarizes the general features of the algorithms using four levels: poor, moderate, good, and excellent. Defect-unaware algorithms produce poor area yield results. Also, stuck-at activated defects severely decrease the already very low yield values. Their runtime results are satisfactory, but it is generally not acceptable to sacrifice all other features for a straightforward mapping process. Obviously, this inference is based on the existing methods; a further increase of the yield would clearly make defect-unaware approaches much more attractive.
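The sampling procedure described above can be sketched as follows. This is only an illustrative Python outline of the Monte Carlo setup (the experiments themselves are implemented in MATLAB); the state encoding, the function names, and the example metric are assumptions introduced for this sketch.

```python
import random

# Assumed encoding: 0 = functional, 1 = stuck-at deactivated, 2 = stuck-at activated.
FUNCTIONAL, STUCK_DEACTIVATED, STUCK_ACTIVATED = 0, 1, 2


def random_crossbar(n, p_d, p_a, rng):
    """Draw an n x n defect map with independent per-crosspoint defects."""
    def draw():
        u = rng.random()
        if u < p_a:
            return STUCK_ACTIVATED
        if u < p_a + p_d:
            return STUCK_DEACTIVATED
        return FUNCTIONAL
    return [[draw() for _ in range(n)] for _ in range(n)]


def monte_carlo(metric, n, p_d, p_a=0.0, samples=200, seed=0):
    """Average a user-supplied metric over independently drawn crossbars."""
    rng = random.Random(seed)
    values = [metric(random_crossbar(n, p_d, p_a, rng)) for _ in range(samples)]
    return sum(values) / len(values)


def defect_fraction(crossbar):
    """Example metric: fraction of defective crosspoints in one sample."""
    cells = [cell for row in crossbar for cell in row]
    return sum(cell != FUNCTIONAL for cell in cells) / len(cells)


estimate = monte_carlo(defect_fraction, n=50, p_d=0.10, p_a=0.01)
```

In an actual experiment, the `metric` argument would be replaced by a function that runs one of the heuristics on the sampled crossbar and returns its area yield or runtime.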
DEFECT-AWARE LOGIC MAPPING
Defect-aware approaches take into account the defects present in a nano-crossbar during the configuration process to map a given logic function. Mapping a logic function on a defective nano-crossbar array has been shown to be an NP-complete problem (Shrestha et al. 2009). The problem is directly equivalent to finding a subgraph isomorphism between graph representations of a logic function and a nano-crossbar. However, in the formalization of the problem, a variety of approaches are adopted to speed up the process, especially for high defect rates. As a common practice, maximum bipartite matching and graph-based heuristics are used to lighten the computational load (Naeimi and DeHon 2004; Simsir et al. 2009). A different approach uses matrix representations of a given logic function and a defective nano-crossbar with row-by-row matching (Gören et al. 2011; Tunali and Altun 2017). Another method is integer linear programming, in which defects are transformed into constraints (Yang and Datta 2011; Zamani et al. 2013). A comprehensive and chronological list of the main articles and their contributions to the defect-tolerant logic mapping problem is shown in Table 5. Concerning the different types of defects, we see an overwhelming tendency in the literature to consider only stuck-at deactivated types (different names used in the literature are nonprogrammable, stuck-at 0, and stuck-open), with exceptions such as Su and Rao (2014), which analyze the occurrence of multiple-type defects. In the following subsections, we explain the main approaches separately, followed by their evaluations.
Maximum Bipartite Matching
In a defective nano-crossbar, an arbitrary assignment of the products of the given function to the crossbar outputs might result in an error, as previously visualized in Figure 5. To determine the erroneous cases, one can use a bipartite graph having product (P) and output (O) nodes. If there is an edge between a product and an output, then the product can be mapped to the output. Thus, all possible mapping configurations can be represented by edges. An example is given in Figure 9. After the graph construction, finding a maximum or perfect matching, that is, a set of edges such that every node is incident to exactly one edge, yields a valid mapping. For this purpose, exact algorithms can be used, including the Ford-Fulkerson maximum flow method (Ford Jr and Fulkerson 1955) and the Hopcroft-Karp algorithm (Hopcroft and Karp 1973). However, constructing a bipartite graph is a costly process, especially for the high defect rates seen in nano-crossbars, so certain heuristics are proposed.
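To make the matching step concrete, the sketch below finds a maximum bipartite matching with a plain augmenting-path search. This is the simple textbook method, not the Hopcroft-Karp algorithm and not any of the surveyed heuristics; the edge list is a hypothetical example, and all names are chosen for illustration.

```python
def maximum_matching(edges, n_products, n_outputs):
    """Maximum bipartite matching by repeated augmenting-path search."""
    adjacency = [[] for _ in range(n_products)]
    for p, o in edges:
        adjacency[p].append(o)
    match_of_output = [None] * n_outputs  # output index -> product index

    def try_assign(p, visited):
        for o in adjacency[p]:
            if o in visited:
                continue
            visited.add(o)
            # Take a free output, or re-route the product that holds it.
            if match_of_output[o] is None or try_assign(match_of_output[o], visited):
                match_of_output[o] = p
                return True
        return False

    for p in range(n_products):
        try_assign(p, set())
    return {p: o for o, p in enumerate(match_of_output) if p is not None}


# Hypothetical example: 3 products, 4 outputs; an edge (p, o) means that
# product p can be placed on output line o without hitting a defect.
edges = [(0, 0), (0, 1), (1, 0), (2, 0), (2, 2)]
mapping = maximum_matching(edges, n_products=3, n_outputs=4)
is_valid_mapping = len(mapping) == 3  # every product received an output
```

A mapping is valid when every product is matched; outputs may remain unused when the crossbar has more output lines than the function has products.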
Naeimi proposes a greedy heuristic algorithm that does not construct a bipartite graph; instead, it uses expected values of node degrees (Naeimi and DeHon 2004). Since a product with a high number of variables (fan-in) is harder to map, its node degree is smaller. The algorithm matches products in decreasing order of fan-in, choosing output nodes at random. Naeimi's approach is fairly competent in terms of scalability. Pseudocode of the algorithm is given in Algorithm 2.
Simsir also uses a heuristic algorithm, which partially constructs the bipartite graph (Simsir et al. 2009). First, variables ordered from most common to least common (by fan-out values) are assigned to input lines ordered from least defective to most defective, respectively. This process is called pin assignment. Second, a distinct edge is found for every product node, and then an exact algorithm is run to find a maximum matching. If no matching is found, an extra edge is searched for the product nodes of the bipartite graph and the exact algorithm is run again. This process continues until a valid matching is found; if none is found, another pin assignment is tried. Pseudocode of the algorithm is given in Algorithm 3.
Yuan proposes a memetic algorithm with fitness approximation. Unlike Simsir's approach, an initial random input assignment is made and the fitness of the assignment is evaluated with an objective function f. Ford-Fulkerson's maximum flow method is mainly used for finding a maximum bipartite matching. Furthermore, while searching for a matching, a greedy reassignment is performed by changing the input assignment for better fitness. The greediness factor (λ) of the method determines the number of input assignments to be changed. In addition, for every 10 trials (chosen as the exact evaluation gap Δ in the article), an approximate matching algorithm that is very similar to Naeimi's greedy method is used for the first 9, and an exact matching algorithm is used only for the last one. Pseudocode of the algorithm is given in Algorithm 4.
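A greedy pass of this kind can be sketched as follows. This is a simplified illustration in the spirit of the description above, not a transcription of Algorithm 2: only stuck-at deactivated defects are modeled, and the retry limit `max_tries`, the data layout, and all names are assumptions made for the example.

```python
import random


def greedy_map(products, crossbar_ok, rng, max_tries=50):
    """Greedy product-to-output mapping without building a bipartite graph.

    products    : list of sets; each set holds the input columns a product uses.
    crossbar_ok : crossbar_ok[o][c] is True if the crosspoint at output row o
                  and input column c is programmable (not stuck-at deactivated).
    Returns {product index: output row}, or None if some product is left unplaced.
    """
    free_outputs = list(range(len(crossbar_ok)))
    # Hardest products first: decreasing order of fan-in.
    order = sorted(range(len(products)), key=lambda i: len(products[i]), reverse=True)
    mapping = {}
    for i in order:
        for _ in range(max_tries):
            if not free_outputs:
                return None
            o = rng.choice(free_outputs)  # random candidate output line
            if all(crossbar_ok[o][c] for c in products[i]):
                mapping[i] = o
                free_outputs.remove(o)
                break
        else:
            return None  # retry budget exhausted for this product
    return mapping


# Hypothetical example: 3 products on a 4 x 3 crossbar with three defects.
products = [{0, 1, 2}, {0}, {1, 2}]
crossbar_ok = [
    [True, True, True],
    [True, False, True],
    [False, True, True],
    [True, True, False],
]
mapping = greedy_map(products, crossbar_ok, random.Random(0))
```

Placing high fan-in products first matters here: the product using all three columns fits only on the defect-free row, so it should claim that row before an easier product does.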
Matrix Matching
A logic function and a defective nano-crossbar can both be denoted with matrices similar to the incidence matrices of graphs. Figures 10(a) and (b), respectively, show a function matrix (FM) and a crossbar matrix (CM) of the logic function and the defective nano-crossbar previously used in Figure 9. If we define which elements of the function and crossbar matrices can be matched, then it is possible to decide on a valid mapping between a product and an output by checking the corresponding rows. Figure 10(c) shows a compatibility table for matching. The key idea behind matrix matching is to make the two matrices easily matchable by assigning proper elements for defects and variables. In the following, we examine two studies.
In the first study, Gören assigns k-neighbor values to all rows and columns individually (Gören et al. 2011). After the values are determined, rows and columns are sorted in ascending order according to their k-neighbor values. A 1-neighbor value of a row or a column of an FM (CM) is the number of +1's (0's and +1's) in the corresponding row or column. A 2-neighbor value of a row of an FM (CM) is the sum of the 1-neighbor values of the columns that intersect the row at a +1 (a 0 or a +1). Following the same logic, to find a k-neighbor value of a row of an FM (CM), we add the (k-1)-neighbor values of the columns whose intersection with the row is +1 (0 or +1). The same procedure is applied for columns. An example of finding 2-neighbor values is shown in Figure 11(a).
After the initial operations, a two-dimensional sorting is applied to the rows and columns of the matrices. Every row/column is regarded as a k-ary number (radix or base equal to k) with the most to least significant bits (MSB and LSB) arranged from left to right and from top to bottom. Figures 11(b) and (c) show the manipulation of the FM and CM according to the radix values. To sort rows and columns, the radix sort algorithm is used in ascending order. Starting with the rows, radix sort begins with the LSB, sorts the rows, and moves to the next bit (the next column in our context) until it reaches the MSB. The same process is applied to the columns as well. This interleaved sorting continues until a stable order is obtained. An important point is that, since an FM and a CM have two and three different elements, respectively, radices of 2 and 3 are applied.
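The recursive definition of k-neighbor values can be written compactly as below. This is a minimal sketch of the definition as stated in the text; the function name, the `counted` parameter (which elements count, e.g. {1} for an FM and {0, 1} for a CM), and the 0/1 example matrix are assumptions for illustration.

```python
def k_neighbor_values(matrix, counted, k):
    """Return (row_values, column_values) holding the k-neighbor values.

    matrix  : list of rows.
    counted : set of element values that count toward the neighbor values.
    """
    mask = [[1 if x in counted else 0 for x in row] for row in matrix]
    n_rows, n_cols = len(mask), len(mask[0])
    # 1-neighbor values: number of counted elements per row and per column.
    rows = [sum(r) for r in mask]
    cols = [sum(mask[i][j] for i in range(n_rows)) for j in range(n_cols)]
    for _ in range(k - 1):
        # k-neighbor of a row: sum of the (k-1)-neighbor values of the columns
        # it meets at a counted element; symmetrically for columns.
        new_rows = [sum(cols[j] for j in range(n_cols) if mask[i][j])
                    for i in range(n_rows)]
        new_cols = [sum(rows[i] for i in range(n_rows) if mask[i][j])
                    for j in range(n_cols)]
        rows, cols = new_rows, new_cols
    return rows, cols


# Hypothetical function matrix with elements 0 and 1.
fm = [
    [1, 0, 1],
    [0, 1, 1],
    [1, 1, 1],
]
one_neighbor = k_neighbor_values(fm, {1}, 1)   # ([2, 2, 3], [2, 2, 3])
two_neighbor = k_neighbor_values(fm, {1}, 2)   # ([5, 5, 7], [5, 5, 7])
```

For instance, the first row meets the first and third columns at a 1, so its 2-neighbor value is 2 + 3 = 5.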
After all of these sorting processes are finalized, the matrices are matched row by row using the element compatibility as previously given in Figure 10 (c). The pseudocode is given in Algorithm 5.
In the second study, Tunali first sorts the matrix columns according to the number of compatible elements, and sorts the rows by placing the most defective rows at the top of the crossbar matrix (Tunali and Altun 2017). Then a greedy row-by-row matching using Hadamard multiplication is applied, with backtracking that excludes previously matched rows. Since the most defective rows are at the top, the harder matchings are dealt with at the beginning of the process. If no matching is found for a row even with backtracking, then the column permutation is altered and the row-by-row matching is restarted. It should be noted that greedy row-by-row matching avoids the overhead of the bipartite graph construction. The pseudocode is given in Algorithm 6.
Graph Embedding
Rao proposes a recursive algorithm with heuristics to prune impossible mappings by denoting both a given logic function and a crossbar with bipartite graphs. The algorithm, called Embed, explores the solution space using fanout, fanout-fanout, and fanout-chain heuristics. Since a graph model is used, the fanout of the logic function is represented by node degrees; graphs G_1(N_var, N_prod, E_1) for the function and G_2(N_vert, N_hor, E_2) for the crossbar are defined. The Embed algorithm finds a matching between these graphs. The fanout heuristic compares the degrees of the nodes to eliminate impossible mappings. The fanout-fanout heuristic checks the degrees of the separate and connected nodes to discard impossible mappings. The fanout-chain heuristic constructs all possible node-pair connections to conclude whether there is a matching. Using backtracking, nodes with insufficient degrees are eliminated. All these heuristics are implemented in the 12th step of the pseudocode shown in Algorithm 7.
ILP-Based Algorithms
Yang and Zamani propose to transform the logic mapping problem into a constrained integer linear programming (ILP) problem (Yang and Datta 2011; Zamani et al. 2013). To obtain a valid mapping, constraint equations are derived. Parameters X and Y show the matching status of the rows (function products and crossbar outputs) and of the columns (function variables and crossbar inputs), respectively. If a product P_1 can be matched with an output O_1, then X_{P1_O1} becomes 1; otherwise, X_{P1_O1} = 0. The same condition is defined for Y for column matching.
To map a k_1 × k_2 function matrix to an n_1 × n_2 crossbar matrix, the following constraints need to be met. If the ith row (product P) of a function matrix can be matched with the jth row (output O) of a crossbar matrix, then X_{i_j} = 1; otherwise, X_{i_j} = 0. For a valid row mapping, two conditions are imposed: (1) each row in the function matrix should be matched with only a single row in the crossbar matrix, and (2) each row in the crossbar matrix should be matched with at most one row in the function matrix. Constraint (1) formalizes these conditions. The same procedure is applied for column matching, as given in constraint (2).
One more parameter, Z_{i_j,i'_j'}, is introduced such that it takes the value of 1 if X_{i_j} = Y_{i'_j'} = 1; otherwise, Z = 0. To find a valid mapping, every element of a function matrix requires a valid mapping. Constraint (3) ensures this condition.
By using these three constraints, a valid mapping can be found. An important point is that the proposed algorithm helps to prune the solution space and decrease the computation time by identifying impossible matchings, which correspond to zero-valued X's and Y's.
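To illustrate what the three constraints require of a candidate solution, the sketch below checks a given 0/1 assignment of X and Y against them. It is a feasibility checker, not an ILP solver, and the way constraint (3) is expressed here (every used literal must land on a usable crosspoint, with only stuck-at deactivated defects modeled) is our reading of the verbal description, not the formulation from the cited articles.

```python
def satisfies_constraints(fm, cm_ok, X, Y):
    """Check a candidate row assignment X and column assignment Y.

    fm[i][c]    : 1 if product i uses literal column c, else 0.
    cm_ok[j][d] : True if crossbar crosspoint (row j, column d) is usable.
    X[i][j]     : 1 if function row i is placed on crossbar row j.
    Y[c][d]     : 1 if function column c is placed on crossbar column d.
    """
    k1, k2 = len(fm), len(fm[0])
    n1, n2 = len(cm_ok), len(cm_ok[0])

    def one_to_one(V, k, n):
        each_source_once = all(sum(V[i]) == 1 for i in range(k))
        each_target_at_most_once = all(
            sum(V[i][j] for i in range(k)) <= 1 for j in range(n))
        return each_source_once and each_target_at_most_once

    if not one_to_one(X, k1, n1):  # constraint (1): row matching
        return False
    if not one_to_one(Y, k2, n2):  # constraint (2): column matching
        return False
    # Constraint (3), as interpreted here: Z = X * Y selects the crosspoint
    # that each used function element lands on, and it must be usable.
    for i in range(k1):
        for c in range(k2):
            if fm[i][c] != 1:
                continue
            for j in range(n1):
                for d in range(n2):
                    z = X[i][j] * Y[c][d]
                    if z == 1 and not cm_ok[j][d]:
                        return False
    return True


# Hypothetical 2 x 2 function on a 3 x 3 crossbar with one defect at (1, 1).
fm = [[1, 0], [1, 1]]
cm_ok = [[True, True, True], [True, False, True], [True, True, True]]
Y = [[1, 0, 0], [0, 1, 0]]
good_X = [[0, 1, 0], [1, 0, 0]]   # product 1 avoids the defective row
bad_X = [[1, 0, 0], [0, 1, 0]]    # product 1 hits the defect at (1, 1)
clash_X = [[1, 0, 0], [1, 0, 0]]  # two products on the same crossbar row
```

An ILP solver would search over X and Y directly; the checker only shows which assignments the constraints accept or reject.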
Evaluation of Algorithms
It is reasonable to assume that defect rates of nano-crossbars are high, up to 20%, which dramatically increases the computational load of the algorithms. Because of that, the runtime of the algorithms can be considered the major factor in evaluations. Generally, ILP and graph embedding approaches yield poor runtime results. In particular, the recursive nature of the embedding algorithm causes impractical runtimes for larger crossbar sizes. On the other hand, the heuristic nature of maximum bipartite matching approaches gives them the upper hand in terms of runtime. Nevertheless, constructing the bipartite graph is very time-consuming. Also, if the graph is rather sparse, meaning that only a few edges are present for possible matchings, exact algorithms need to be applied, which is also time-consuming. As for matrix matching approaches, runtime results are fairly satisfactory even for larger logic functions. However, since matching advances through one dimension (rows), an increase in the column size drastically worsens the success rate of the algorithms.
Considering the yield, maximum-bipartite-matching-based and matrix-matching-based algorithms produce the best results. However, a prevalent trend in the literature is to realize a logic function with a crossbar larger than the optimal size, generally 1.5 times larger. Therefore, yield analysis is not conducted extensively in most of the studies.
Another important point is the algorithms' capability of handling multiple-type defects. As previously mentioned, only Gören et al. (2011) fine-tune the algorithm according to defect types. The rest of the articles only consider stuck-at deactivated types, so their proposed heuristics perform under this restriction. If multiple-type defects were to be considered, then the bipartite graphs would be sparser (lower possibility of matching between products and outputs), which makes it harder to find a maximum bipartite matching. This problem also applies to graph embedding and ILP-based methods.
As a final note, defect-aware methods use a wide range of different problem formalizations not necessarily proposed for nano-crossbar arrays. Therefore, studies related to subgraph isomorphism (Cordella et al. 2004) and assignment problems (Chu and Beasley 1997) can be directly applied to the defect-aware methods.
Simulation Results of Algorithms
The same system and simulation settings used for defect-unaware methods are also used in this section. Unlike defect-unaware algorithms, whose performance is independent of the implemented functions, defect-aware algorithms give different results for different functions. Therefore, we use standard benchmark circuits presented in McElvain (1993).
To evaluate the performance of the algorithms, four different parameters are used: success rate, runtime and its standard deviation, and area yield. Area yield is defined as the ratio of the optimal size defect-free array to the defective crossbar size adequate to realize a given benchmark function. For example, a logic function with 4 variables and 10 products would require an optimal crossbar size of 10 × 8 (10 for outputs and 8 for inputs with 4 variables and 4 negated forms).
We also introduce a new parameter, the logic inclusion ratio (IR), to help form a more intuitive understanding of the mapping problem. IR denotes, as a percentage of the optimal crossbar size, the number of switching crosspoints needed to realize a logic function. For example, if the optimal crossbar size to implement a logic function is 10 × 10 and IR = 40%, then the logic function has a literal count of 40 and we need 40 switching crosspoints to implement it. Since we can use stuck-at activated defects as switching crosspoints, higher IR values ease the tolerance of stuck-at activated defects. Conversely, lower IR values are preferred for the tolerance of stuck-at deactivated defects. Therefore, it is possible to categorize the difficulty of the tolerance or mapping problem by considering the values of IR/P_A and (1-IR)/P_D for stuck-at activated and deactivated defects, respectively. As an empirical measure, if these values are below three, then the sample space shrinks significantly, which makes the problem very hard. We state this threshold phenomenon as an experimental tendency based on our observations during the simulations of mapping trials rather than as a strict rule. Different defect distributions, such as clustered ones, or a different set of benchmark functions might certainly produce a different threshold. Therefore, when we say easier or harder to solve, the reader should understand the categorization in the context of our experimental settings.
We use three defect settings. For the first and the second ones, benchmark functions are mapped to optimal-size (n/n = 100% yield) and 1.5 times larger ((n/(1.5 × n))^2 ≈ 44% yield) crossbars, respectively, with P_D = 15%. For the third setting, 1.5 times larger crossbars are used with P_D = 10% and P_A = 5%. We choose these three settings to evaluate the algorithms' response to stricter area conditions and multiple-type defects.
In selecting among the approaches previously given in the subsections of Section 5, we have excluded the satisfiability, graph embedding, and integer linear programming based ones from the simulations. The reason is that the SAT approach has already been shown to be inferior to the matrix model in Gören et al. (2011). In addition, the graph embedding algorithm is recursive and has also been shown to be inferior to maximum bipartite matching. Finally, ILP needs to cope with a very large number of constraint equations, which requires a drastic amount of computation.
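The IR measure and the two difficulty ratios can be computed as in the following sketch. The function names and the example matrix are assumptions for illustration, and the threshold of three is the empirical tendency reported above rather than a strict rule.

```python
def inclusion_ratio(function_matrix):
    """Fraction of crosspoints that must act as switches (IR)."""
    cells = [x for row in function_matrix for x in row]
    return sum(cells) / len(cells)


def hardness(ir, p_a, p_d, threshold=3.0):
    """Return the ratios IR/P_A and (1-IR)/P_D and a 'hard' flag.

    A ratio is only reported when the corresponding defect rate is nonzero.
    """
    ratios = {}
    if p_a > 0:
        ratios["activated"] = ir / p_a
    if p_d > 0:
        ratios["deactivated"] = (1 - ir) / p_d
    hard = any(value < threshold for value in ratios.values())
    return ratios, hard


# Example from the text: a 10 x 10 optimal crossbar with a literal count of 40.
fm = [[1, 1, 1, 1, 0, 0, 0, 0, 0, 0] for _ in range(10)]
ir = inclusion_ratio(fm)

# With P_D = 10% and P_A = 5%, both ratios are well above three.
ratios, hard = hardness(ir, p_a=0.05, p_d=0.10)
```

With IR = 40%, the two ratios are about 8 and 6, so this hypothetical case would fall on the easier side of the empirical threshold.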
Graph-Based Approaches.
We start our experiments with optimal-size crossbars, which result in 100% area yield, and P_D = 15%. In terms of average runtime and its deviation, Naeimi's algorithm always produces the best results. The other two methods sometimes show runtime deviations higher than their average runtimes. Considering the success rate, Yuan's and Simsir's algorithms are clearly superior. The random nature of Naeimi's approach becomes an issue when the constraints prune the solution space severely. However, Yuan's and Simsir's algorithms are not able to find a valid mapping for larger examples, so scalability is an issue. Results are given in Table 6.
In our second experiment, we loosen the area restrictions and use 1.5 times larger crossbars, which results in 44% area yield. Here, Naeimi's algorithm is the clear winner for all of the performance parameters. The only exception is that Yuan's approach produces better success rates for the benchmarks "inc" and "misex2." In addition, its runtime deviation is relatively stabilized. Results are given in Table 7. In our last experiment, we again use 1.5 times larger crossbars, with P_D = 10% and P_A = 5%. Here, Naeimi's algorithm is again the clear winner for all of the performance parameters. The only exception is that Yuan's approach is the only one able to find a valid mapping for the benchmark "misex2." Results are given in Table 8.
Matrix-Based Approaches.
Similar to graph-based approaches, we start our experiments with optimal-size crossbars and P_D = 15%. Here, Tunali's algorithm is the clear winner in terms of all parameters. However, it is not able to find a valid mapping for larger examples, with the only exception of the benchmark "alu4." Therefore, scalability is still an important issue for matrix-based algorithms. Results are given in Table 9. In our last two experiments, Tunali's approach also produces superior results. It is clear that Gören's approach is not scalable, mainly due to its fairly complicated sorting process. However, it should be noted that Tunali's success rate for a harder case like "misex2" is inferior to that of Yuan's approach when stuck-at activated defects are introduced to the problem. The memetic nature of Yuan's approach fine-tunes the mapping process better than Tunali's sorting approach. Results are given in Tables 10 and 11.
Comparisons of All Algorithms.
In Table 12, general features of the algorithms are evaluated using four levels: poor, moderate, good, and excellent. Features of the satisfiability, ILP, and graph-embedding-based algorithms are interpreted through the comparisons performed in the corresponding articles (Gören et al. 2011; Zamani et al. 2013). Examining the results, we see that certain attributes need to be considered before choosing a suitable method. When area yield is an important factor, for easier cases Tunali's approach and for harder cases Yuan's
TRANSIENT FAULT TOLERANCE
In the previous two sections, we have focused on permanent faults (defects), which occur during the fabrication process, are deterministic in nature, and are known in advance. Transient faults, however, appear in the field, and as for all emerging technologies, nano-crossbar arrays have very limited field data of the kind needed for accurate modeling of transient faults. Because of that, the current literature is limited. Moreover, transient fault-tolerance schemes are closely related to the architecture, so certain major assumptions are necessary. Another point is that, even though the architecture in question is assumed to be based on nano-crossbar arrays, investigative assumptions are based on existing PLAs and their presumable responses to the occurrence of faults. To detect and correct errors due to transient faults, hardware-redundant solutions are proposed. The main methods are fault masking and reconfiguration with online testing. Fault masking methods realize a logic function using more than one AND-OR plane. In reconfiguration-based approaches, faults are first detected with online testing and then the crossbar is reconfigured. In regard to diagnostic capabilities, a different degree of integration is favored according to fault occurrence and the granularity of access to input and output lines. A generic scheme of transient fault-tolerance preference is shown in Figure 12. Hardware overhead takes the form of multiple AND and OR planes for fault masking in (a) and of online diagnostic and reconfiguration tools in (b).
Considering the two categories of faults previously explained in Section 2, related to the configuration of crosspoints and the functionality of components, we can say that both categories are applicable to transient faults. However, since faults related to the functionality of components require field data for justification, and such data is very limited, the current literature only adopts configuration-level faults classified as stuck-at activated and stuck-at deactivated.
Stuck-at deactivated faults cause a missing device in the AND and OR planes, so that a variable or a product is erased from the logic function; these faults are named G and D, respectively. While G-type faults produce a 0 → 1 error, D-type faults produce a 1 → 0 error. Stuck-at activated faults cause an extra device in the AND and OR planes, so that a variable or a product is added to the logic function; these faults are named S and A, respectively. While S-type faults produce a 1 → 0 error, A-type faults produce a 0 → 1 error. Table 13 shows the complete list of causes and effects of faults with a logic function example.
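The error directions of the four fault types can be verified on a small example. The sketch below uses a hypothetical function f = x0 x1 + x2 with uncomplemented literals only; the function and the particular faults injected are chosen for illustration and do not reproduce the example of Table 13.

```python
from itertools import product as all_inputs


def evaluate(products, inputs):
    """Two-level AND-OR logic: OR over products, AND over each product's variables."""
    return int(any(all(inputs[v] for v in p) for p in products))


def error_directions(correct, faulty, n_inputs):
    """Set of (correct value, faulty value) pairs over all inputs where they differ."""
    pairs = set()
    for x in all_inputs((0, 1), repeat=n_inputs):
        a, b = evaluate(correct, x), evaluate(faulty, x)
        if a != b:
            pairs.add((a, b))
    return pairs


f = [{0, 1}, {2}]  # hypothetical example: f = x0 x1 + x2
faulty_versions = {
    "G": [{0}, {2}],          # missing AND-plane device: x1 erased from x0 x1
    "D": [{0, 1}],            # missing OR-plane device: product x2 erased
    "S": [{0, 1}, {0, 2}],    # extra AND-plane device: x0 added to product x2
    "A": [{0, 1}, {2}, {1}],  # extra OR-plane device: foreign product x1 added
}
directions = {name: error_directions(f, g, 3) for name, g in faulty_versions.items()}
```

For this example, the G and A faults only ever turn a correct 0 into a 1, and the D and S faults only ever turn a correct 1 into a 0, in line with the classification above.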
Fault Masking
The fault masking approach is first introduced, in comparison with conventional methods, in Rao et al. (2007), and a detailed examination of the approach considering stuck-at deactivated faults has also been presented. Fault masking uses two tautological forms of a Boolean function. While the first tautology masks G- and A-type faults, the second one masks S- and D-type faults. To determine the area overhead of a PLA structure in terms of the functional devices and connecting wires used, we define the number of inputs/variables as I, the number of products as P, and the number of outputs (implemented functions) as O. Table 14 gives the formulations of the area used by the implementations. By analyzing the table and the given logic functions, we can fine-tune the area overhead. For example, for a logic function with many products but only a few outputs, using A-O-O is more reasonable than using A-O. It should be noted that this scheme addresses area optimization only. For other performance parameters such as power and delay, a much more detailed analysis and dedicated optimization techniques are needed, which can be considered as future work.
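The masking principle can be illustrated with a small sketch. It assumes that the two tautologies are f = f · f and f = f + f, which is our assumption here since the exact forms are not reproduced in the text, and it assumes that only one of the two copies is faulty. The example function and faults are hypothetical.

```python
from itertools import product as all_inputs


def evaluate(products, inputs):
    """Two-level AND-OR logic: OR over products, AND over each product's variables."""
    return int(any(all(inputs[v] for v in p) for p in products))


def and_masked(copy_a, copy_b, inputs):
    """Combine two copies with AND, assuming the form f = f . f."""
    return evaluate(copy_a, inputs) & evaluate(copy_b, inputs)


def or_masked(copy_a, copy_b, inputs):
    """Combine two copies with OR, assuming the form f = f + f."""
    return evaluate(copy_a, inputs) | evaluate(copy_b, inputs)


f = [{0, 1}, {2}]          # hypothetical function f = x0 x1 + x2
g_faulty = [{0}, {2}]      # G-type fault: produces 0 -> 1 errors
d_faulty = [{0, 1}]        # D-type fault: produces 1 -> 0 errors


def always_correct(combine, faulty_copy):
    """True if combining a faulty copy with a fault-free copy always equals f."""
    return all(combine(faulty_copy, f, x) == evaluate(f, x)
               for x in all_inputs((0, 1), repeat=3))


masks_g_with_and = always_correct(and_masked, g_faulty)   # True
masks_d_with_or = always_correct(or_masked, d_faulty)     # True
masks_g_with_or = always_correct(or_masked, g_faulty)     # False
masks_d_with_and = always_correct(and_masked, d_faulty)   # False
```

The AND combination suppresses spurious 1s (G- and A-type faults) and the OR combination suppresses spurious 0s (S- and D-type faults), which is why each form covers only the fault types with the matching error direction.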
Online Testing and Reconfiguration
Built mainly on conventional fault detection and correction techniques, fault-tolerance schemes using testing and reconfiguration are proposed in Rao et al. (2007), Garcia and Orailoglu (2008), and Farazmand and Tahoori (2009). In Rao et al. (2007), a straightforward diagnostic technique is introduced with pattern RAMs that hold the correct output values of the mapped logic function. It is therefore an easy task to determine erroneous results by checking input, product, and output vectors. One RAM for each AND and OR plane is integrated into a nano-crossbar. After the faults are located, the reconfiguration process is assumed to be performed.
In Garcia and Orailoglu (2008), a checkpoint-based fault-tolerance scheme offering reconfigurability is proposed. An online test is performed on a group of PLA blocks, two of which are chosen as surrogates. Every block is checked in a round-table manner, and if no fault is detected, a safe checkpoint is identified. To determine faults, row- and column-based diagnostic test vectors are designed. Starting with row-based diagnostics, S-type faults (previously explained in Table 13) can be found by setting all of the inputs or variables of the corresponding product to 1, and the rest of the inputs to 0, so that all of the products except the corresponding one are 0. If a 1 → 0 error is observed, we can conclude that an S-type fault has occurred in the product. Since D-type faults show the same 1 → 0 error characteristics, and the same test is also applicable to detecting A-type faults by duality, we can say that row-based diagnostics cover all types of faults except G types. Column-based diagnostics are proposed to locate G-type faults. However, it is not possible to locate all G-type faults with a single row test vector. For this reason, a compaction algorithm is used to optimize the number of test vectors by finding the products that share the most variables.
In Farazmand and Tahoori (2009), a dual-rail implementation is proposed as an online detection test. Both a logic function f and its negation ¬f are realized in an AND plane, and the outputs are obtained using an OR plane. Since f and ¬f always have opposite values, it is possible to detect faults by comparing the two values. The area overhead is compared favorably with 3-modular redundancy, 5-modular redundancy, parity, and duplication methods in terms of the I, O, and P parameters defined in the previous subsection. Additionally, fault coverage ratios are given. Although the results are overwhelmingly better for the proposed dual-rail technique, the underlying assumptions are quite weak, covering only very specific fault characteristics.
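The dual-rail comparison can be illustrated as follows. This is a minimal sketch of the detection principle only, with a hypothetical function and a single injected D-type fault; it does not model the crossbar layout or the area overhead analysis of the cited work, and all names are chosen for the example.

```python
from itertools import product as all_inputs


def evaluate(products, inputs):
    """OR over products; each product is a list of (variable, required value)."""
    return int(any(all(inputs[v] == val for v, val in p) for p in products))


def detected_inputs(f_rail, not_f_rail, n_inputs):
    """Inputs on which the two rails agree, which signals a fault."""
    return [x for x in all_inputs((0, 1), repeat=n_inputs)
            if evaluate(f_rail, x) == evaluate(not_f_rail, x)]


f = [[(0, 1), (1, 1)], [(2, 1)]]               # f = x0 x1 + x2
not_f = [[(0, 0), (2, 0)], [(1, 0), (2, 0)]]   # complement in sum-of-products form
f_with_d_fault = [[(0, 1), (1, 1)]]            # product x2 erased from the f rail

fault_free = detected_inputs(f, not_f, 3)              # []
faulty = detected_inputs(f_with_d_fault, not_f, 3)
# faulty == [(0, 0, 1), (0, 1, 1), (1, 0, 1)]
```

The fault is flagged only on inputs that actually exercise it, so detection is online in the sense that it depends on the inputs applied during normal operation.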
DISCUSSIONS
In this study, we survey fault-tolerance algorithms of reconfigurable nano-crossbar arrays applicable in logic mapping and configuration processes. Both permanent and transient faults with different fault modeling approaches are covered. We conduct comprehensive simulations to evaluate the proposed algorithms using different fault rates on industrial benchmark functions. In addition, different area yield values and multiple-type defect occurrences are considered.
To make concrete discussions with future directions, a brief account of the historical development of fault tolerance in nano-crossbar arrays is given as follows. Nano-crossbar arrays were first proposed in the 1990s to overcome the upcoming challenges of integrated circuit miniaturization. Their configurable/reconfigurable attributes as well as inherent fault-tolerance capabilities attracted numerous researchers. However, as expected, this new technology comes with some challenges, and fault tolerance is one of the significant ones. Fault rates are much higher for nano-crossbars than for conventional CMOS circuits. Therefore, developing efficient fault-tolerance techniques for nano-crossbars is a must.
At first, defect-unaware methods were proposed, motivated by the concern that configuring defective crossbars would be time-consuming and impractical. However, the area yields of defect-unaware approaches have proven to be less than ideal. In addition, we show that stuck-at activated defects severely decrease the already low area yield values. As a result, although the number of studies in this field is limited, improving the yield remains a strong motivation for future studies, given that obtaining defect-free sub-crossbars enables us to use existing and well-studied tools.
The line of research with the most abundant studies is defect-aware methods. Although research on defect-aware approaches can be considered mature, there are still important problems waiting to be solved, including the need for specific algorithms that fine-tune the mapping problem according to multiple-type defect occurrences and different defect distributions. Additionally, current methods such as fitness approximation and matrix sorting are only able to cope with low defect rates; this issue should be solved. Another important research direction is developing techniques to store defect maps during the configuration process. A similar attempt using a probabilistic data structure, namely Bloom filters, is presented in Wang et al. (2006). Last but not least, in terms of area yield, current methods are not fully equipped to produce optimum results for the realization of given logic functions. Inspiring studies considering both fault tolerance and yield analysis through the manipulation of the logic function are presented in Angiolini et al. (2007) and Hogg and Snider (2006). New methods producing better results under stricter area yield constraints would be a good line of investigation. Additionally, the presented approaches can be applicable to memory structures due to similar problem formalizations (Huang et al. 2004; Feng et al. 2013), as well as to variance tolerance of the crossbars (Tunc and Tahoori 2010; Ghavami 2016; Zhong et al. 2016).
Another trend is to develop transient fault-tolerance techniques. Fault masking and reconfiguration with online testing have been proposed. Even though the presented methods are competent, it is hard to justify the results without field data. In addition, only configuration-level faults are considered in the literature; component-level faults (those regarding functionality) are open to further investigation. Also, the physical realization of the architectures is still in its infancy, so this line of inquiry will become more reasonable with the robust development and wide fabrication of nano-crossbars.
As a summary, we can list the future directions for fault-tolerance techniques for nano-crossbar arrays as follows:
- Fine-tuning for multiple-type faults and different fault distributions;
- Compression and representation of defect maps;
- Improvement of area yield to increase density;
- Decomposition of given logic functions for area optimization;
- Developing variance tolerance techniques;
- Developing fault-tolerance techniques for nano-crossbar-based memory structures;
- Transient fault-tolerance covering component level faults;
- Reliability forecasting for nano-crossbar arrays;
- Developing architectural level transient fault-tolerance techniques; and
- Developing fault-tolerance techniques for new technologies based on crossbar arrays including resistive/memristive networks.
