Abstract: Resistive Random Access Memories (RRAMs) have gained considerable attention for a variety of promising applications, especially the design of non-volatile in-memory computing devices. In this paper, we present an approach for the synthesis of RRAM-based logic circuits using the recently proposed Majority-Inverter Graphs (MIGs). We propose a bi-objective algorithm to optimize MIGs with respect to the number of required RRAMs and computational steps in both MAJ-based and IMP-based realizations. Since the number of computational steps is recognized as the main drawback of RRAM-based logic, we also present an effective algorithm to reduce the number of required steps. Experimental results show that the proposed algorithms achieve higher efficiency than general-purpose MIG optimization algorithms, either in finding a good trade-off between both cost metrics or in reducing the number of steps. In comparison with RRAM-based circuits implemented by state-of-the-art approaches using other well-known data structures, the number of computational steps obtained by our proposed MIG-oriented synthesis approach for large benchmark circuits is reduced by up to a factor of 26. This strong gain comes from the use of MIGs, which provide an efficient and intrinsic representation for RRAM-based computing (particularly in MAJ-based realizations), and from the proposed optimization techniques.
I. INTRODUCTION
While the resistive switching phenomenon has been known since the 1960s [1], it did not gain much attention until the late 1990s. Since then, devices based on various metal oxides that switch between a high and a low resistance value have been fabricated; these are called Resistive Random Access Memories (RRAMs) [1]. RRAMs are of high interest due to their promising applications in non-volatile memory design [2], [3], digital and analog programmable systems [4], [5], [6], and neuromorphic computing [7]. In 1971, Chua [8] derived from symmetry arguments the equations describing a fourth passive circuit element, which he called the memristor, short for memory resistor. Although some researchers claim differences between RRAMs and memristors, the resistive switching property that is used in this work is shared by both devices [9]. Since different RRAMs have already been fabricated and their functionality is proven, here we prefer to use the term RRAM.
Material Implication (IMP) can be executed by RRAMs to synthesize Boolean functions. This enables designing memories with computing capability; however, the number of computational steps is a serious drawback of implication logic [10]. Data structures such as Binary Decision Diagrams (BDDs) [11] and And-Inverter Graphs (AIGs) [12] have previously been proposed for the optimization of RRAM-based circuits. However, both approaches presented in [11], [12] require a high number of computational steps.
A novel homogeneous logic representation structure, the Majority-Inverter Graph (MIG), was proposed in [13]. It uses the majority function M(x, y, z) = x · y + x · z + y · z together with negation as the only logic operations. MIGs have a high flexibility in depth optimization that enables the design of high-speed logic circuits and FPGA implementations [14]. In comparison with the well-known data structures BDDs and AIGs, MIGs have experimentally shown better results in logic optimization, especially in propagation delay [13]. In particular, MIGs are highly qualified for logic synthesis of RRAM-based circuits since they can efficiently execute the built-in resistive majority operation in RRAMs [15].
The work has been partly supported by the University of Bremen's graduate school SyDe, funded by the German Excellence Initiative and by the Swiss National Science Foundation project number 200021 146600.
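As a quick illustration of the majority function defined above, the following minimal Python sketch checks that the threshold view of M agrees with the sum-of-products form x · y + x · z + y · z. It also checks the well-known special cases M(x, y, 0) = x · y and M(x, y, 1) = x + y. The function names are illustrative and not taken from [13].

```python
from itertools import product


def maj(x: int, y: int, z: int) -> int:
    """Three-input majority: 1 when at least two of the inputs are 1."""
    return int(x + y + z >= 2)


def maj_sop(x: int, y: int, z: int) -> int:
    """Sum-of-products form x*y + x*z + y*z of the same function."""
    return (x & y) | (x & z) | (y & z)


# The two definitions agree on all eight input combinations.
for x, y, z in product((0, 1), repeat=3):
    assert maj(x, y, z) == maj_sop(x, y, z)

# Fixing one input to a constant yields AND and OR.
for x, y in product((0, 1), repeat=2):
    assert maj(x, y, 0) == (x & y)
    assert maj(x, y, 1) == (x | y)
```

Together with negation, these special cases show why majority can express any AND/OR/inverter network.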
In this paper, we propose an approach to implement fast circuits with RRAMs using MIG-based logic synthesis. In order to map MIGs to the equivalent RRAM-based circuits, we first present two realizations for majority gate: (i) a realization based on IMP that is also used by previous works using BDDs [11] and AIGs [12] , and (ii) a realization that exploits the built-in resistive majority property of RRAMs denoted by MAJ. Then, we propose two MIG optimization algorithms for synthesis of RRAM-based logic circuits: (i) a multi-objective optimization algorithm to reduce the number of required RRAMs and computational steps representing area and delay of the resulting circuits, respectively, and (ii) an optimization algorithm tailored to reduce the number of computational steps as the main concern of RRAM-based logic design.
The proposed optimization algorithms for RRAM-based logic design gain higher efficiency in comparison to the conventional MIG optimization techniques. Experiments confirm the superiority of MIGs to BDDs and AIGs in RRAM-based circuit design and the effectiveness of the proposed optimization algorithms. According to the results, the circuits represented and optimized by the proposed MIG-based synthesis approach for large benchmark functions are about 26 times faster than the RRAM-based circuits implemented by the BDD-based and AIG-based synthesis approaches presented in [11] , [12] .
II. BACKGROUND
A. Logic operations for RRAM-based circuit design
We present two logic operations, IMP and MAJ, for synthesis of RRAM-based circuits.
1) Material implication (IMP):
Material Implication (IMP) and the FALSE operation, i.e., assigning the output to logic 0, are sufficient to express any Boolean function [16]. Fig. 1 shows the implementation of an IMP gate with two RRAMs that are represented by the symbols of memristors as proposed in [16]. P and Q designate two resistive switches connected to a load resistor R_G. Three voltage levels V_SET, V_COND, and V_CLEAR are applied to the RRAMs to execute the IMP and FALSE operations by switching between the low-resistance (logic 1) and high-resistance (logic 0) states.
The FALSE operation can be performed by applying V_CLEAR to the RRAM. The RRAM can be switched to logic 1 by applying a voltage larger than a threshold V_SET to its voltage driver [16]. To execute IMP, the two voltage levels V_SET and V_COND are applied to the switches P and Q simultaneously. The magnitude of V_COND is smaller than the threshold required to change the state of the switch. However, the interaction of V_SET and V_COND executes IMP according to the current states of the switches, such that switch Q is set to 1 if P = 0 and retains its current state if P = 1 [16].
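The logical effect of the two operations can be sketched in a few lines of Python. This is a purely behavioral model that ignores voltages and device physics. The helper names are hypothetical, and the NOT and NAND constructions are standard textbook compositions of IMP and FALSE rather than step sequences taken from this paper.

```python
def false_op() -> int:
    """FALSE: clear a switch to logic 0."""
    return 0


def imp(p: int, q: int) -> int:
    """State of switch Q after 'P IMP Q': set to 1 if P = 0, otherwise unchanged."""
    return 1 if p == 0 else q


def not_via_imp(p: int) -> int:
    """NOT p, computed in a cleared work switch S."""
    s = false_op()
    return imp(p, s)          # S = p IMP 0 = NOT p


def nand_via_imp(p: int, q: int) -> int:
    """NAND(p, q) using one work switch and two implications."""
    s = false_op()
    s = imp(p, s)             # S = NOT p
    s = imp(q, s)             # S = (NOT q) OR (NOT p)
    return s
```

Since NAND is functionally complete, this small model is consistent with the statement that IMP and FALSE suffice to express any Boolean function.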
2) Built-in majority operation (MAJ): RRAMs are two-terminal devices whose internal resistance R can be switched between two logic states 0 and 1, designating high and low resistance values, respectively. Denoting the top and bottom terminals by P and Q, the device can be switched with a negative or positive voltage V_PQ based on the polarity and location of dopants. Assuming the voltage levels V_SET, V_CLEAR, and V_COND respectively correspond to the logic statements (P = 1, Q = 0), (P = 0, Q = 1), and (P = Q), the truth tables shown in Fig. 2 explain the next state of the switch (R') based on P, Q, and the current state (R). The built-in majority operation described in Fig. 2 can be formally expressed as follows [15]:
R' = M(P, Q̄, R)
Therefore, using MAJ, a majority gate for three variables x, y, and z can be executed simply by preloading the RRAMs and negating y.
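The switching behavior described above can be checked against the majority function with a small behavioral sketch. It assumes only what the text states: the switch is set for (P = 1, Q = 0), cleared for (P = 0, Q = 1), and keeps its state for P = Q. The function names are illustrative.

```python
from itertools import product


def maj(x: int, y: int, z: int) -> int:
    """Three-input majority."""
    return int(x + y + z >= 2)


def next_state(p: int, q: int, r: int) -> int:
    """Next state R' of a switch with terminals P, Q and current state R."""
    if p == 1 and q == 0:
        return 1      # V_SET: switch to logic 1
    if p == 0 and q == 1:
        return 0      # V_CLEAR: switch to logic 0
    return r          # V_COND (P = Q): state is retained


def majority_gate(x: int, y: int, z: int) -> int:
    """M(x, y, z) with z preloaded in the switch and NOT y on terminal Q."""
    return next_state(x, 1 - y, z)


# The next-state table coincides with M(P, NOT Q, R) on all inputs.
for p, q, r in product((0, 1), repeat=3):
    assert next_state(p, q, r) == maj(p, 1 - q, r)

# Driving Q with NOT y therefore evaluates the plain majority of x, y, z.
for x, y, z in product((0, 1), repeat=3):
    assert majority_gate(x, y, z) == maj(x, y, z)
```

This is why y must be negated beforehand: the bottom terminal enters the built-in operation in complemented form.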
B. Majority-inverter graphs
The Boolean algebra of MIGs was proposed in [13] . The following set (Ω) includes the primitive transformations.
It was proven in [13] that any MIG can be transformed into another logically equivalent MIG using only the axioms in Ω. This means that a desired MIG, optimized with respect to the considered cost metric, can be reached by applying Ω; however, the length of the transformation sequence might be impractical. To solve this problem, a more advanced set of transformations derived from the basic rules in Ω was proposed in [13], denoted by Ψ. The following set includes those axioms of Ψ that are used in this work.
where z_{x/ȳ} means replacing x with ȳ in z.
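For reference, the axioms used in the remainder of this paper can be verified exhaustively. The sketch below encodes the rules as they are commonly stated in the MIG literature [13]: commutativity (Ω.C), majority (Ω.M), associativity (Ω.A), distributivity (Ω.D), inverter propagation (Ω.I), complementary associativity (Ψ.C), and relevance (Ψ.R). These forms are quoted from memory of [13], not from the listing in this paper, so treat them as an assumption to be checked against that reference.

```python
from itertools import product


def maj(x: int, y: int, z: int) -> int:
    """Three-input majority."""
    return int(x + y + z >= 2)


def inv(x: int) -> int:
    """Logical negation on 0/1 values."""
    return 1 - x


def check_axioms() -> bool:
    for x, y, z, u, v in product((0, 1), repeat=5):
        # Omega.C: commutativity
        assert maj(x, y, z) == maj(y, x, z) == maj(z, y, x)
        # Omega.M: majority
        assert maj(x, x, z) == x
        assert maj(x, inv(x), z) == z
        # Omega.A: associativity
        assert maj(x, u, maj(y, u, z)) == maj(z, u, maj(y, u, x))
        # Omega.D: distributivity (left-hand form has 2 nodes, right-hand form 3)
        assert maj(x, y, maj(u, v, z)) == maj(maj(x, y, u), maj(x, y, v), z)
        # Omega.I: inverter propagation
        assert inv(maj(x, y, z)) == maj(inv(x), inv(y), inv(z))
        # Psi.C: complementary associativity
        assert maj(x, u, maj(y, inv(u), z)) == maj(x, u, maj(y, x, z))
    # Psi.R: relevance, M(x, y, z) = M(x, y, z with x replaced by NOT y).
    # z is modeled as an arbitrary two-input function g(x, w), given by its truth table.
    for table in range(16):
        def g(a: int, w: int, t: int = table) -> int:
            return (t >> (2 * a + w)) & 1
        for x, y, w in product((0, 1), repeat=3):
            assert maj(x, y, g(x, w)) == maj(x, y, g(inv(y), w))
    return True
```

Calling `check_axioms()` returns `True` only if every identity holds on all input assignments.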
III. RRAM-BASED IN-MEMORY COMPUTING DESIGN WITH MIGS
This section presents the proposed realizations for the majority gate with RRAMs and their corresponding MIG mapping methodology. Then, we propose MIG optimization algorithms with respect to area, i.e., the number of MIG nodes, depth, i.e., the number of levels in the graph, and the number of RRAMs and computational steps.
A. Realization of majority gate using RRAMs
1) IMP-based realization:
The proposed IMP-based realization of a majority gate is shown in Fig. 3 . It requires six RRAMs and ten sequential steps. RRAMs shown by X, Y , and Z are loaded by input variables and the remaining three RRAMs A, B, and C are required for retaining the intermediate results and the final output. The corresponding steps for executing the majority function are as follows:
During the steps shown above, the initial values stored in two out of the six RRAMs remain unchanged, while the others are either cleared or used to save the outputs of the implications. In the first step, the input variables are loaded and the other RRAMs are assigned FALSE for the next operations. Another FALSE operation is performed in step 8 to clear an RRAM that is no longer required, so that it can be used for inverting an intermediate result. Finally, the Boolean function representing a majority gate is executed by implying the results of the seventh and ninth steps.
2) MAJ-based realization: The MAJ-based majority gate can be realized with a smaller number of RRAMs and computational steps because it benefits from the built-in majority property discussed above. Using MAJ, the majority gate requires only four RRAMs placed in the same structure shown in Fig. 3, such that the bottom electrodes of the switches are electrically connected via a horizontal nanowire and the switching can be done by applying the three discussed voltage levels to the top electrodes. Furthermore, the majority function can be executed within only three steps carrying out simple operations. The MAJ-based computational steps for the proposed RRAM-based realization are:
In the first step, the initial values of the input variables as well as an additional RRAM are loaded by applying V_SET or V_CLEAR to their voltage drivers. Step 2 executes the required NOT operation in RRAM A. This can be done by applying the appropriate voltage level, V_SET or V_COND, to switch A for the cases y = 0 and y = 1, respectively. In the last step, the majority function is executed by use of MAJ at RRAM Z by applying whichever of the three voltage levels corresponds to the difference between the logic states of x and ȳ.
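The three steps described above can be traced with a behavioral sketch. The RRAM names X, Y, Z, and A follow the text. The exact terminal assignments in steps 2 and 3 are an assumption, chosen so that the voltage cases match the description (V_SET for y = 0, V_COND for y = 1). The helper `next_state` is a hypothetical model of the built-in operation.

```python
from itertools import product


def next_state(p: int, q: int, r: int) -> int:
    """Built-in operation: set for (1, 0), clear for (0, 1), hold for P = Q."""
    if p == 1 and q == 0:
        return 1
    if p == 0 and q == 1:
        return 0
    return r


def maj_based_majority(x: int, y: int, z: int) -> int:
    """Evaluate M(x, y, z) in three steps using four RRAMs (X, Y, Z, A)."""
    # Step 1: load the inputs and clear the auxiliary RRAM A.
    X, Y, Z, A = x, y, z, 0
    # Step 2 (assumed terminals P_A = 1, Q_A = y): A becomes NOT y.
    #   y = 0 -> (P, Q) = (1, 0): V_SET, A is set to 1
    #   y = 1 -> (P, Q) = (1, 1): V_COND, A keeps 0
    A = next_state(1, Y, A)
    # Step 3 (assumed terminals P_Z = x, Q_Z = A): Z becomes M(x, NOT A, z).
    Z = next_state(X, A, Z)
    return Z


# The result matches the majority of the three inputs for every assignment.
for x, y, z in product((0, 1), repeat=3):
    assert maj_based_majority(x, y, z) == int(x + y + z >= 2)
```

The result overwrites Z, which is consistent with the statement that the majority function is executed at RRAM Z.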
B. Design methodology
Although both of the proposed realizations impose sequential circuit implementations, they allow a reduction in area by reusing RRAMs released from previous computations. In our proposed synthesis approach, we consider only one MIG level at a time, such that the RRAMs employed to evaluate a level can be reused for the next levels. Starting from the inputs of the graph, the RRAMs in a level are released when all the required computational steps are done. Then, the RRAMs are reused for the upper level, and this procedure continues until the target function is evaluated. Such an implementation requires as many majority gates as the maximum number of nodes in any level of the MIG. Hence, depending on the use of IMP or MAJ in the realization, the number of RRAMs for synthesizing the MIG is six or four times the number of required majority gates, and the number of steps is ten or three times the number of levels, respectively. However, some additional RRAMs are still needed in the presence of complemented edges. Table I shows the number of RRAMs and computational steps of the resulting RRAM-based circuits.
For every complemented edge in the graph a NOT gate is required. The negation can be executed by either an IMP or a MAJ operation with logic 0, as shown in the second step of both realizations. This requires one extra RRAM loaded with 0, which can be done in parallel with the data loading step, and an additional step for executing the negation. Since the implementation starts from the inputs of the MIG, the ingoing complemented edges of any level must be inverted first for a correct evaluation. The required negations for all complemented edges in a level can be executed simultaneously. In other words, the number of additional steps required for complemented edges is equal to the number of MIG levels with ingoing complemented edges.
Similarly, the total number of RRAMs required for the synthesis of the whole graph is equal to the maximum of six (IMP) or four (MAJ) times the number of nodes in the level plus the number of ingoing complemented edges over all MIG levels.
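The cost model described in this subsection can be written down as a short sketch. The `Level` record and the function name are hypothetical, and the example MIG profile is made up for illustration only. The constants (six RRAMs and ten steps per gate for IMP, four RRAMs and three steps for MAJ, plus one RRAM per complemented edge and one step per level with complemented edges) are taken from the text.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Level:
    nodes: int             # number of majority nodes in this MIG level
    complemented_in: int   # number of ingoing complemented edges of this level


def rram_costs(levels: List[Level], realization: str = "MAJ") -> Tuple[int, int]:
    """Return (number of RRAMs, number of computational steps) for an MIG."""
    rrams_per_gate, steps_per_level = {"IMP": (6, 10), "MAJ": (4, 3)}[realization]
    # RRAMs are reused level by level, so the largest level determines the count.
    rrams = max(rrams_per_gate * lv.nodes + lv.complemented_in for lv in levels)
    # Every level costs a fixed number of steps, plus one extra step
    # if it has at least one ingoing complemented edge.
    extra = sum(1 for lv in levels if lv.complemented_in > 0)
    steps = steps_per_level * len(levels) + extra
    return rrams, steps


# Illustrative three-level MIG (made-up numbers, not a benchmark result).
example = [Level(nodes=3, complemented_in=2),
           Level(nodes=2, complemented_in=0),
           Level(nodes=1, complemented_in=1)]
maj_cost = rram_costs(example, "MAJ")   # (14, 11)
imp_cost = rram_costs(example, "IMP")   # (20, 32)
```

The model makes the two optimization targets explicit: the RRAM count depends on the largest level including its complemented edges, while the step count depends on the depth and on how many levels carry complemented edges.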
C. MIG optimization for RRAM-based logic circuits
In general, MIG optimization is performed by applying a set of valid transformations to an existing MIG to find an equivalent MIG that is more efficient with respect to the considered cost metrics. MIG optimization in terms of area and delay aims at finding the best trade-off between the depth and the size of the graph, i.e., the number of nodes. When RRAMs are used for implementation, the metrics determining area and delay depend on a combination of MIG features, some of which are not targeted by conventional area and depth optimization. Nevertheless, a reduction in area and especially in depth might lower the costs of an RRAM-based implementation. Thus, specific optimization techniques are required to find an optimum MIG with respect to the number of RRAMs and computational steps. In this section, we first present conventional area and depth optimization algorithms for standard implementation of MIGs to show why different optimization techniques are required for RRAM-based implementation. Then, we present the two proposed MIG optimization algorithms tackling the cost metrics of logic synthesis with RRAMs. The first proposed algorithm optimizes MIGs with respect to both objectives simultaneously, while the other one aims at reducing the number of computational steps, which is often regarded as more important than the number of RRAMs.
1) Area optimization:
The framework for area optimization given in Alg. 1 is based on the conventional MIG area optimization algorithm proposed in [13]. Using eliminate (Ω.M; Ω.D_{R→L}), some of the MIG nodes can be removed by repeatedly applying the majority rule (Ω.M) and distributivity from right to left (Ω.D_{R→L}) to the entire MIG. Assuming x, y, z, u, and v as input variables, Ω.D_{R→L} transforms M(M(x, y, u), M(x, y, v), z) into M(x, y, M(u, v, z)), which means the total number of nodes has decreased from three to two. In order to enable further reduction in the number of nodes, the MIG is reshaped by use of the associativity axioms Ω.A and Ψ.C, which allow variables to be moved between adjacent levels. Then, eliminate is applied again to optimize the size of the newly arranged MIG. The area optimization algorithm can be iterated for a maximum number of cycles called effort. From the standpoint of area in an RRAM-based circuit, although Alg. 1 can reduce the number of physical RRAMs by removing unnecessary nodes, it does not address the issue of complemented edges, which are important in both aforementioned cost metrics.
Alg. 1 Conventional MIG area optimization (based on [13])
for (cycles = 0; cycles < effort; cycles++) do
    Ω.M; Ω.D_{R→L};    (eliminate)
    Ω.A; Ψ.C;          (reshape)
    Ω.M; Ω.D_{R→L};    (eliminate)
end for
2) Depth optimization: In general, the depth of the graph is of high importance in MIG optimization to lower the latency of the resulting circuits. Alg. 2 is structurally similar to the MIG depth optimization procedure proposed in [13], with slightly shorter iterations. The depth of the MIG can be reduced by pushing the critical variable with the longest arrival time to upper levels. This is done by the process push-up shown in Alg. 2. Push-up includes the majority, distributivity, and associativity axioms. The majority rule may reduce depth by removing unnecessary nodes. For distributivity, consider the MIG M(x, y, M(u, v, z)). If z is the critical variable, applying Ω.D_{L→R} yields M(M(x, y, u), M(x, y, v), z) and reduces the depth of the MIG by pushing z one level up. In the cases where the associativity rules (Ω.A, Ψ.C) are applicable, the depth can be reduced by one if the axioms move the critical variable to the upper level. After performing push-up, the relevance axiom (Ψ.R) is applied to replace reconvergent variables, which might provide further possibilities of depth reduction for another push-up.
Although Alg. 2 decreases the number of computational steps in an RRAM-based circuit, it does not address the issue of complemented edges. Moreover, the depth reduction by Alg. 2 comes at a cost in area. Ω.D_{L→R} adds one extra node to the graph. This may increase the area of the resulting RRAM-based circuit if the size of the critical level, i.e., the level with the maximum number of required RRAMs, is increased. Ω.A and Ψ.C can have a similar effect on the maximum level size by moving one node to the critical level. A simple example is applying Ω.A to M(x, u, M(y, u, M(p, q, r))), which has a depth of three and one node in each level. The transformation results in M(M(p, q, r), u, M(y, u, x)), which has a depth of two and two nodes in the lower level. Although the late-arriving variable M(p, q, r) is pushed up, the number of nodes in one level, which might be the critical level, has increased from one to two. This effect is undesirable for an RRAM-based implementation of MIGs; however, using Ψ.C might have a positive effect in this case because of the possibility of reducing the number of complemented edges.
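The associativity example above can be reproduced with a small sketch that represents an MIG as nested tuples. This tree representation is a simplification chosen for illustration: it has no complemented edges and counts shared nodes once per occurrence, which does not matter here because the example has no sharing. All names are hypothetical.

```python
from itertools import product


def evaluate(node, env):
    """Evaluate an MIG given as nested ('M', a, b, c) tuples with string leaves."""
    if isinstance(node, str):
        return env[node]
    a, b, c = (evaluate(child, env) for child in node[1:])
    return int(a + b + c >= 2)


def level_of(node) -> int:
    """Level of a node: inputs are level 0, a gate is one above its highest child."""
    if isinstance(node, str):
        return 0
    return 1 + max(level_of(child) for child in node[1:])


def level_sizes(node, sizes=None) -> dict:
    """Number of majority nodes per level (tree view, no sharing assumed)."""
    if sizes is None:
        sizes = {}
    if isinstance(node, str):
        return sizes
    lv = level_of(node)
    sizes[lv] = sizes.get(lv, 0) + 1
    for child in node[1:]:
        level_sizes(child, sizes)
    return sizes


inner = ("M", "p", "q", "r")
before = ("M", "x", "u", ("M", "y", "u", inner))   # M(x, u, M(y, u, M(p, q, r)))
after = ("M", inner, "u", ("M", "y", "u", "x"))    # result of applying Omega.A

# Both graphs compute the same function on all 64 input assignments.
names = ("x", "u", "y", "p", "q", "r")
for bits in product((0, 1), repeat=len(names)):
    env = dict(zip(names, bits))
    assert evaluate(before, env) == evaluate(after, env)

depth_before, depth_after = level_of(before), level_of(after)   # 3 and 2
sizes_before, sizes_after = level_sizes(before), level_sizes(after)
```

Here `sizes_before` has one node in each of the three levels, while `sizes_after` has two nodes in level 1, which is the growth in level size discussed in the text.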
Alg. 2 Conventional MIG depth optimization (based on [13]): iterated push-up (Ω.M, Ω.D_{L→R}, Ω.A, Ψ.C), with the relevance axiom Ψ.R applied after push-up.
3) Multi-objective optimization: None of the algorithms explained above offers a solution for the issue of complemented edges, which account for an important part of both cost metrics in RRAM-based circuits. Moreover, a single-objective MIG optimization algorithm considers either area or delay, which leads to circuits that are worse with respect to the other objective. Hence, we propose a multi-objective MIG optimization algorithm to obtain efficient RRAM-based logic circuits with a good trade-off between both objectives. The proposed multi-objective MIG optimization algorithm for RRAM-based logic design combines the conventional area and depth optimization algorithms with techniques tackling complemented edges from both the area and the delay perspective. The algorithm starts by applying push-up to obtain a smaller depth. Then, the complemented edges are targeted by applying an extension of the inverter propagation axiom from right to left (Ω.I_{R→L}) under the condition that the considered node has at least two outgoing complemented edges. The three cases satisfying this condition and their equivalent majority gates are shown below and discussed in the following, considering their effect on both cost metrics.
In the first case, the number of ingoing complemented edges of the gate decreases from three to zero, while one complement attribute is moved to the upper level, i.e., the level including the output of the gate. Assuming that the current level, i.e., the level including the ingoing edges, is the critical level with the maximum number of required RRAMs, this case is favorable for area optimization. If instead the upper level is the critical level, the number of required RRAMs increases by only one. Similar scenarios exist for the two other cases, although the last case might be less interesting because the number of complemented edges in both levels changes equally by one. That is, a penalty of one is possible as the cost of a reduction of one, while transformations (1) and (2) may result in RRAM reductions of three and two, respectively.
To reduce the number of computational steps, the number of levels possessing complemented edges should be reduced. Depending on the presence of complemented edges at other gates in both levels, the first two transformations given above might reduce or increase the number of steps, or leave it unchanged. Case (1) is beneficial if the upper level already has complemented edges and the transformation removes all complemented edges from the current level. It is neutral if neither level becomes complement-free. The worst case occurs when moving the complement attribute to the upper level increments the number of levels with complemented edges. Similar arguments can be made for the remaining cases. However, case (2) is more favorable because it never adds a level with complemented edges, and case (3) cannot be advantageous because it can never release a level from complemented edges. Fig. 4 shows a simple MIG to which transformation (2) (Ω.I_{R→L(2)}) is applicable.
The transformation has released one level of the MIG from the complement attribute (black dot), which results in a smaller number of computational steps. Furthermore, as a result of removing one complemented edge from the critical level, the required number of RRAMs is decreased by one. By applying inverter propagation (Ω.I_{R→L(1−3)}), the MIG is also reshaped and more chances for reducing the depth might be created. Thus, push-up is applied to the entire MIG again to reduce the number of steps as much as possible. In the last step, the number of RRAMs is reduced to make a trade-off between both objectives. By applying Ω.A, some of the changes made by push-up that have increased the maximum level size can be undone. Finally, distributivity from right to left (Ω.D_{R→L}) is applied to the graph to reduce the number of nodes in the levels.
D. Step optimization
Due to the importance of latency in logic synthesis and the issue of sequential implementation in RRAM-based circuits, we propose an MIG optimization algorithm for reducing the number of computational steps. In the proposed step optimization algorithm, two axioms of inverter propagation are applied to the MIG after push-up. First, only the axiom presented by case (1), i.e., the base rule of inverter propagation from right to left (Ω.I_{R→L}), is applied to the entire MIG to lower the number of levels with complemented edges. Since the transformation moves one complement attribute to the upper level, it might create new inverter propagation candidates for all three discussed cases if the upper level already has one or two ingoing complemented edges. Hence, we apply Ω.I_{R→L(1−3)} again to ensure maximum coverage of complemented edges. Although case (3) cannot reduce the number of steps, it is not excluded from Ω.I_{R→L(1−3)} due to its effect on balancing the level sizes. Finally, push-up is applied to the MIG to reduce the depth further if new opportunities are generated. It should be noted that the number of computational steps is mainly determined by the MIG depth. In fact, in the worst case caused by complemented edges, the total number of steps would be equal to seven times the number of levels, i.e., the MIG depth. Nonetheless, we show the efficiency of our proposed step optimization algorithm in the following section.
Alg. 4 Step optimization
for (cycles = 0; cycles < effort; cycles++) do
IV. EXPERIMENTAL RESULTS

A. Experimental setup
To obtain a comprehensive performance assessment and comparison, experiments are carried out over a benchmark set including 25 Boolean functions from ISCAS89 [17] and LGsynth91 [18] with the number of input variables ranging from 7 to 135. The number of cycles (effort) is set to 40 in all experiments. The run-time of each proposed algorithm for the whole benchmark set is less than 3 seconds.
B. Optimization results
The experimental results of the presented algorithms are shown in Table II. Due to the lack of space, only the results of the proposed algorithms are given for both realizations. As expected, the smallest values for the number of RRAMs and computational steps belong to the MAJ-based realization. The step optimization for the MAJ-based realization has resulted in MIGs with the smallest number of steps, which is almost one fourth of the value obtained by the depth optimization algorithm performed on the IMP-based realization. Even for the step optimization on the IMP-based realization, the reduction is evident in comparison with the results of conventional depth optimization. This indicates that the techniques employed to reduce the complemented edges have been effective and that the proposed step optimization algorithm satisfies the requirements of fast RRAM-based circuit implementations.
The proposed algorithm for RRAM cost optimization performed on the MIGs using the IMP-based realization has reduced the sum of the number of steps by 35.39%, i.e., the major drawback of sequential implementation has been effectively lowered. Furthermore, the proposed multi-objective optimization algorithm achieves a 30.43% smaller number of steps compared to the conventional depth optimization. The proposed multi-objective algorithm for the MAJ-based realization achieves the smallest number of required RRAMs among all algorithms while maintaining a quite small number of computational steps. The sum of the RRAM counts obtained by the proposed multi-objective algorithm is 19.78% lower than the same value obtained by the proposed step optimization algorithm, at a cost of a 21.09% increase in the sum of the number of steps, which confirms the good trade-off and high efficiency of the resulting circuits.
C. Comparison with existing approaches using BDD and AIG
Table III compares the results of the proposed multi-objective MIG optimization algorithm for RRAM-based logic circuits, obtained with both proposed realizations, with the results of two previous works using BDD-based [11] and AIG-based [12] synthesis. Both works exploit optimization to lower the number of RRAMs and computational steps. According to Table III, the sum of the number of computational steps obtained by our proposed MIG-based synthesis approach for the MAJ-based realization is almost 8 times smaller than the corresponding value obtained by BDD-based synthesis [11], at a fair cost of a 57.42% increase in the total number of RRAMs. This is mostly due to the fact that MIGs benefit from the built-in majority property of RRAMs. Although the ratio of the number of steps between the BDD-based approach in [11] and the proposed MIG-based approach drops to 4.5 for the IMP-based realization, it can still be regarded as a noticeable superiority of MIGs in the synthesis of RRAM-based circuits. This is especially evident for larger functions.
For example, the numbers of steps for the largest functions in the benchmark set, apex6 and x3 with 135 inputs, obtained by the proposed MIG optimization algorithm are equal to 121 and 99 for the IMP-based realization and 44 and 44 for the MAJ-based realization, whereas the same values obtained by BDD-based synthesis exceed 1000 steps. More precisely, the number of required steps obtained by the MAJ-based realization for both functions is 26.5 times smaller than the corresponding result of the BDD-based approach, at a low cost of a 32.2% increase in the number of RRAMs. In other words, synthesis of RRAM-based circuits with BDDs for large Boolean functions might be too costly or even impractical due to the high number of computational steps, whereas the circuits resulting from MIG-based synthesis remain efficient and quite fast.
The results of AIG-based synthesis [12] are given for a different set of Boolean functions including smaller circuits with 3 to 16 input variables. Since the number of required RRAMs for the benchmark set is not given in [12], here we can only compare with respect to the number of computational steps. The total numbers of steps obtained by the proposed MIG optimization algorithm for the MAJ-based and IMP-based realizations are respectively 7.1 and 2.57 times smaller than the same value obtained in [12]. Furthermore, the AIG-based synthesis approach proposed in [12] fails to keep the number of computational steps at a reasonable value when the number of inputs increases. As shown in Table III, the approach proposed in [12] requires 1172 and 1564 computational steps, respectively, for functions sym10 d and t481 d with 10 and 16 input variables. In contrast, using our MIG optimization algorithm, both functions can be synthesized with only 72 or 187 steps for the MAJ-based or IMP-based realizations, respectively.
V. CONCLUSION
We presented an approach for MIG-based synthesis of Boolean functions implemented with RRAMs using two different realizations. We proposed MIG optimization algorithms to reduce the number of RRAMs and computational steps, addressing the area and delay of the resulting circuits, respectively. Experimental results show that the proposed algorithms fulfill the aims of optimization, which are either finding a trade-off between both objectives or minimizing the number of computational steps. In particular, our proposed approach performs well with respect to the number of steps, which is known to be the major cost metric in RRAM-based circuit design.
