ABSTRACT Resistive switching behaviors of oxide-based resistive random access memory (RRAM) and their applications in data storage and computing systems have been widely studied. In this paper, the critical issues related to the applications of oxide-based RRAM are addressed. The physical mechanisms and models of oxide-based RRAM are introduced first. Then the methodology for optimizing devices and operation schemes is discussed. Finally, a novel architecture that unifies memory and computing is presented.
I. INTRODUCTION
Electrically induced resistive switching (RS) has been demonstrated in various dielectric materials, especially metal oxides [1]-[3]. Based on RS, resistive random access memory (RRAM) devices can be constructed for memory applications [4]. Oxide-based RRAM devices have shown excellent performance, including scalability below 10 nm, good compatibility with the CMOS process, fast switching speed (<1 ns), low operation voltage (<1.5 V), and high endurance and retention [5]-[10]. High-density 32 Gb and 16 Gb test chips have been demonstrated [11], [12]. Meanwhile, new functional applications of RRAM in areas such as nonvolatile logic, bioinspired computing, and memory/computing systems are also being explored [13]-[18]. Practical applications require an understanding of the physics underlying the switching behaviors of RS and RRAM devices. Much effort has been devoted to this question, and various physical mechanisms have been proposed, such as the redox effect, the generation and recombination (G-R) effect of oxygen vacancies with oxygen ions, the phase transition (P-T) effect, the Joule heating (J-H) effect, and the interface barrier (I-B) effect [1], [19]-[23]. Based on a deep physical understanding of resistive switching in RRAM, the device structures and operation schemes can be optimized [24]-[27]. In this paper, we review our research on oxide-based RRAM, including the physical models, the optimization of devices and operation schemes, and the application to a novel unified memory/computing architecture. Section II introduces the physical mechanism and models of unipolar and bipolar switching. Section III presents the optimization of device structures and operation schemes. Section IV discusses a novel RRAM-based "iMemComp" architecture that differs from the traditional von Neumann architecture. Section V concludes the paper.
II. PHYSICAL MODEL
2168-6734 © 2016 IEEE. Translations and content mining are permitted for academic research only. Personal use is also permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information. VOLUME 4, NO. 5, SEPTEMBER 2016
A typical structure of RRAM is schematically shown in Fig. 1(a). The RS layer is sandwiched between the top and bottom electrodes. The device can be switched between the high resistance state (HRS) and the low resistance state (LRS): a SET process switches it from HRS to LRS, and a RESET process switches it back from LRS to HRS. For a fresh device, a forming process is usually required to trigger resistive switching in the subsequent cycles. Switching is broadly classified into two modes, unipolar and bipolar, as shown in Fig. 1(b) and (c), respectively. In unipolar switching, biases of the same polarity can be used to SET and RESET the device. In bipolar switching, the RESET bias must have the opposite polarity to the SET bias, whichever polarity is used for SET. For oxide-based RRAM, the filament effect, that is, the formation and rupture of oxygen vacancy (V_O) conductive filaments (CF) in the switching layer, has been widely accepted [1], [2]. To explain the underlying physics of the switching processes, we proposed a unified physical mechanism based on the G-R effect of oxygen vacancies (V_O) with oxygen ions (O2−) [19], [28], as shown in Fig. 2. In this mechanism, the formation and rupture of V_O filaments are correlated with the generation and recombination of V_O. In unipolar switching, the movable O2− ions are usually produced by the nonpolar Joule heating effect. In bipolar switching, they are produced mainly by the electric-field effect. Based on this understanding, physical models of unipolar switching [19], [29] and bipolar switching [30]-[32] can be developed.
Based on the bipolar switching model, the resistive switching behaviors can be quantified [30]-[32]. For example, the voltage dependence of t_RESET can be modeled. According to the model, t_RESET depends on two factors: 1) the transport time (t_m) of O2− from the interface to the bulk due to diffusion and drift, and 2) the recombination time (t_r) of O2− with V_O:

$$t_{\mathrm{RESET}} = \min\,(t_m + t_r)$$

that is, t_RESET is the minimum of t_m + t_r. In this way t_RESET can be calculated, and the model predicts a turning point, as shown in Fig. 3 (left). The voltage dependence of t_RESET predicted by the model was verified experimentally in ZnO-based RRAM devices, as presented in Fig. 3 (right) [31].
Generally, oxide-based RRAM shows better performance in bipolar switching mode than in unipolar switching mode. Therefore, most current research focuses on bipolar switching devices.
To bridge device behaviors and device-circuit-system co-design, compact models are necessary. Based on a deep understanding of the microscopic evolution of the V_O CF during switching, physics-based compact models have been developed to capture the essential characteristics of oxide-based bipolar RRAM [33], [34]. The model is schematically described in Fig. 4: the CF is simplified as a cylinder, and two key physical variables, the CF radius (w) and the ruptured gap distance (x), track the CF evolution during SET and RESET operations and thereby describe the resistive switching behaviors of oxide-based RRAM [28], [33]. Fig. 5 compares the electrical characteristics calculated by the compact model in both DC and AC operation modes with data measured in HfO_x-based RRAM [28], [33]. The model results agree well with the measured data. This indicates that, using model parameters extracted from the fabricated RRAM devices, the compact model can accurately capture the device behaviors and meet the requirements of device-circuit-system co-design.
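The role of the two state variables can be illustrated with a minimal Python sketch. It is not the compact model of [33], [34]: it assumes, purely for illustration, that the connected part of the filament and the ruptured gap behave as two ohmic cylinders of the same radius in series. All names and parameter values (FilamentState, OXIDE_THICKNESS_M, RHO_CF_OHM_M, RHO_GAP_OHM_M) are hypothetical.

```python
import math
from dataclasses import dataclass

# Hypothetical illustration values, not extracted from any measured device.
OXIDE_THICKNESS_M = 5e-9   # thickness of the switching layer
RHO_CF_OHM_M = 1e-5        # resistivity assumed for the conductive filament
RHO_GAP_OHM_M = 1e-1       # resistivity assumed for the ruptured gap region


@dataclass(frozen=True)
class FilamentState:
    w: float  # CF radius in metres
    x: float  # ruptured gap distance in metres (0 means a connected filament)


def resistance(state: FilamentState) -> float:
    """Series resistance of the filament and the gap, both treated as cylinders."""
    if state.w <= 0:
        raise ValueError("CF radius w must be positive")
    if not 0 <= state.x <= OXIDE_THICKNESS_M:
        raise ValueError("gap distance x must lie within the oxide thickness")
    area = math.pi * state.w ** 2
    r_filament = RHO_CF_OHM_M * (OXIDE_THICKNESS_M - state.x) / area
    r_gap = RHO_GAP_OHM_M * state.x / area
    return r_filament + r_gap


lrs = FilamentState(w=2e-9, x=0.0)    # connected filament
hrs = FilamentState(w=2e-9, x=1e-9)   # filament ruptured over 1 nm
print(f"LRS-like: {resistance(lrs):.3e} ohm, HRS-like: {resistance(hrs):.3e} ohm")
```

In this simplified picture, a growing gap x raises the resistance and a growing radius w lowers it. The conduction and evolution equations of the compact model itself are given in [33], [34] and are not reproduced here.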
III. DESIGN AND OPTIMIZATION OF DEVICES AND ARRAY
The desired device performance must be realized for each targeted application. Based on the physical understanding of RS behaviors and their correlation with the physical distribution of V_O during switching, a defect-engineering-based methodology has been developed to achieve controlled switching behaviors [24]-[26]. Fig. 6 schematically shows the proposed methodology [25], [26]. Innovative device designs and operation schemes are developed to control the oxygen vacancy distribution and obtain the desired switching behaviors. According to crystal defect theory, the generation probability of V_O depends strongly on the activation energy ε_a of V_O and the local electric field E_loc. Device design includes choosing proper dopants and doping concentrations to form a uniform distribution of oxygen vacancies in the oxide switching layer. First-principles calculations indicate that doping HfO2 or ZrO2 with trivalent elements such as Al, La, or Ga lowers ε_a near the dopant sites, which can effectively suppress the avalanche process of V_O generation induced by a large local electric field and thus gives more controllable resistive switching [24]. Operation scheme design uses input voltage pulses of various heights and widths to control the distribution of the oxygen vacancies generated in the oxide switching layer [26]. Following this device design approach, doped HfO2-based RRAM devices were designed and fabricated. Gradual resistance changes are measured in both SET and RESET processes in doped HfO2-based RRAM, as shown in Fig. 7. In contrast, a sharp resistance change is measured in the SET process of undoped HfO2-based RRAM [26].
By using different operation schemes, various resistance changes in response to input voltage pulses can be observed. Fig. 8 shows the resistance change with the number of input pulses during SET and RESET processes. During the SET process (left), the resistance of the device decreases gradually as positive voltage pulses are applied in succession, and the device is eventually SET to the lowest resistance state (LRS) with a significant resistance drop. Meanwhile, the measured RESET behavior (right) is similar to a multi-state cognitive process. These responses of doped HfO2-based devices to voltage pulses are similar to the response of a neuron under external stimulus, which indicates their capability as synaptic devices for neuromorphic systems [14]. Moreover, with the designed operation schemes, four stable resistance states can be achieved in the doped HfO2-based devices by applying proper voltage pulses, as shown in Fig. 9 [26]. These states can be used for multilevel data storage or computing applications [26]. The above results indicate that the proposed methodology is effective in achieving the targeted device performance by optimizing devices and operation schemes according to the requirements of the practical application.
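As an illustration of how four stable resistance states could hold two bits per cell, the following Python sketch maps a read-out resistance to the nearest of four nominal levels on a logarithmic scale. The nominal resistances in LEVELS_OHM, the symbol numbering, and the function names are assumptions made for this example and are not taken from the measurements in Fig. 9.

```python
import math

# Hypothetical nominal resistances of the four states, from LRS up to HRS.
LEVELS_OHM = (1e4, 1e5, 1e6, 1e7)


def encode(symbol: int) -> float:
    """Target resistance for a 2-bit symbol (0..3)."""
    if symbol not in range(len(LEVELS_OHM)):
        raise ValueError("symbol must be 0, 1, 2, or 3")
    return LEVELS_OHM[symbol]


def decode(resistance_ohm: float) -> int:
    """Symbol whose nominal level is closest to the read resistance (log scale)."""
    if resistance_ohm <= 0:
        raise ValueError("resistance must be positive")
    log_r = math.log10(resistance_ohm)
    return min(
        range(len(LEVELS_OHM)),
        key=lambda i: abs(log_r - math.log10(LEVELS_OHM[i])),
    )


def to_bits(symbol: int) -> str:
    """Two-bit string for a symbol, e.g. 2 -> '10'."""
    return format(symbol, "02b")


# A cell programmed to level 2 and read back with some variation still decodes.
read_back = encode(2) * 1.8
print(to_bits(decode(read_back)))
```

Comparing on a logarithmic scale is a design choice of this sketch: it keeps the decision margins equal when the levels are spaced by decades, as assumed here.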
IV. MEM-COMPUTING SYSTEM APPLICATION
Modern data-centric information processing systems are usually based on the well-known von Neumann computer architecture, in which the central processing unit (CPU) and the memory are separated and connected via buses, as shown in Fig. 10 (left). With the rapid growth of new technologies such as big data processing and the Internet of Things, the need for energy-efficient information systems is as great as ever. In this context, the traditional von Neumann architecture suffers from energy-hungry data movement between the CPU and the memory, known as the "von Neumann bottleneck". Various alternative approaches such as logic-in-memory have been proposed to mitigate this bottleneck [16], [35], [36]. Aiming to break the bottleneck at both the device and architecture levels, we have developed a novel RRAM-based architecture named "iMemComp" [17], [18], as shown in Fig. 10 (right). In "iMemComp", memory and computing functions are unified in the same device, which implies that the energy-hungry data movement can be greatly reduced. The physical state variable of "iMemComp" is resistance instead of voltage or charge: LRS is defined as logical '1' and HRS as logical '0'.
The inverter (INV) logic gate in "iMemComp" is constructed from two RRAM devices, as shown in Fig. 11. To execute the INV operation, voltage pulses of VDD/2 and VDD are applied to the top electrodes of the input (IN) and output (OUT) devices, respectively. Initially, the OUT cell is in HRS. If the IN cell is in LRS, the voltage division between IN and the resistor R_Load raises the electrical potential of the common bit line, which reduces the voltage drop across the OUT cell below the SET threshold voltage. Therefore the OUT cell remains in HRS. If the IN cell is in HRS, the potential of the bit line remains close to ground. The voltage drop across the OUT cell is then approximately VDD, which is larger than the SET threshold, so the OUT cell is SET and switches to LRS. An inverter function is thus realized by RRAM under the stimulation of an external pulse train.
Another basic logic operation, AND, is constructed from four RRAM devices to obtain a functionally complete set, as shown in Fig. 12. The AND operation takes three clock cycles. Before the logic operation, the output device and the assist (Asst) device are RESET to HRS. During clock cycles 1 and 2, if any input device is in HRS, the Asst device is SET to LRS; otherwise it remains in HRS. During clock cycle 3, an INV operation is executed with the Asst device as input and the output device as output. An AND operation is thus performed, which can be expressed as 'OUT=AB'. Fig. 13 shows the measured results of AND logic operations. The resistance states of the output devices are read out with a small voltage pulse, as shown by the gray-scale maps. Despite the variations around '0' due to the variability of HRS, the distinct computing results verify the feasibility of nonvolatile logic operations in "iMemComp". This functionally complete logic set can serve as the building block for more complex computing tasks, as discussed in [36].
The logic gates in "iMemComp" are nonvolatile because the RRAM devices are nonvolatile. Consequently, a time-consuming transfer of the output to a cache memory is not required. Furthermore, the different stages of a computation do not need to be synchronized to a single clock. Low-power, parallel computation can therefore be achieved by performing the logic operations asynchronously.
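The INV and AND operations described above can be checked with a small behavioral simulation in Python. This is a sketch under stated assumptions, not a model of the fabricated circuit: the resistance values, VDD, the SET threshold, and the load resistance are hypothetical, each cell is treated as an ideal two-state resistor, and clock cycles 1 and 2 of the AND operation are assumed to be one INV pulse from each input onto the shared Asst device.

```python
from dataclasses import dataclass

# Hypothetical device and circuit values chosen only to make the example work.
R_LRS = 1e4     # ohm, low resistance state (logical '1')
R_HRS = 1e6     # ohm, high resistance state (logical '0')
R_LOAD = 1e5    # ohm, load resistor between the bit line and ground
VDD = 2.0       # V
V_SET = 1.5     # V, SET threshold of a cell


@dataclass
class Cell:
    resistance: float = R_HRS

    @property
    def bit(self) -> int:
        # Read-out: below the geometric mean of LRS and HRS counts as '1'.
        return 1 if self.resistance < (R_LRS * R_HRS) ** 0.5 else 0


def cell(bit: int) -> Cell:
    return Cell(R_LRS if bit else R_HRS)


def bit_line_voltage(r_in: float, r_out: float) -> float:
    """Node equation for the bit line: IN at VDD/2, OUT at VDD, R_LOAD to ground."""
    g_in, g_out, g_load = 1 / r_in, 1 / r_out, 1 / R_LOAD
    return (0.5 * VDD * g_in + VDD * g_out) / (g_in + g_out + g_load)


def inv(inp: Cell, out: Cell) -> None:
    """One INV pulse: OUT is SET only if the drop across it exceeds V_SET."""
    v_bl = bit_line_voltage(inp.resistance, out.resistance)
    if VDD - v_bl > V_SET:
        out.resistance = R_LRS  # SET: HRS -> LRS; disturbance of IN is ignored


def and_gate(a: int, b: int) -> int:
    in_a, in_b = cell(a), cell(b)
    asst, out = Cell(), Cell()   # both RESET to HRS before the operation
    inv(in_a, asst)              # clock 1: Asst is SET if A is in HRS
    inv(in_b, asst)              # clock 2: Asst is SET if B is in HRS
    inv(asst, out)               # clock 3: OUT = NOT Asst = A AND B
    return out.bit


for a in (0, 1):
    for b in (0, 1):
        print(f"A={a} B={b} -> OUT={and_gate(a, b)}")
```

With these assumed values the bit line sits near VDD/2 when the input is in LRS and near ground when it is in HRS, so the drop across the OUT cell falls on either side of the SET threshold as described in the text.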
It should be noted that the demonstrated features of RRAM, such as simple structure and fabrication process, high switching speed, low switching energy, and low cost, are also beneficial for nonvolatile logic applications. The speed-power-area figures of merit and the HRS/LRS ratio are suitable for logic operations. Although the limited endurance may not be sufficient for logic computing on its own, the nonvolatility of RRAM cells in a computing system can significantly relax the endurance requirement, since computing operations are not as frequent as they are in CMOS-based processors.
In the "iMemComp" architecture built upon crossbar RRAM arrays, a single row represents an independent processor, as shown in Fig. 14. The architecture can therefore offer new features such as parallel computing and logic learning. "iMemComp" is equipped with the "logic learning" capability owing to the nonvolatile nature of RRAM. Various user-defined functions can train the arrays at the time of computation, leaving the answers memorized in the RRAM cells. These learned logic functions, which are easy to read out from the crossbar arrays through decoding, can be reused for multi-bit logic and large-scale repeated tasks. Owing to these features of the "iMemComp" architecture, device-level and circuit-level parallel computing and reconfigurable logic learning can be achieved together with superior performance in power and speed. These results demonstrate the feasibility of high-density, massively parallel, ultra-low-power information processing systems based on "iMemComp".
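One possible reading of "logic learning" is a truth table that is computed once, left in nonvolatile cells, and later read back by address decoding. The Python sketch below illustrates only that idea. The class LearnedLogicRow, its methods, and the one-cell-per-input-combination layout are hypothetical and do not describe the actual array organization of Fig. 14.

```python
from itertools import product
from typing import Callable, List


class LearnedLogicRow:
    """One row of cells holding the memorized outputs of a logic function."""

    def __init__(self, n_inputs: int) -> None:
        self.n_inputs = n_inputs
        # One cell per input combination; 0 stands for HRS and 1 for LRS.
        self.cells: List[int] = [0] * (2 ** n_inputs)

    def _address(self, bits: tuple) -> int:
        if len(bits) != self.n_inputs:
            raise ValueError(f"expected {self.n_inputs} input bits")
        addr = 0
        for b in bits:
            addr = (addr << 1) | (1 if b else 0)
        return addr

    def train(self, fn: Callable[..., int]) -> None:
        """Evaluate fn for every input combination and memorize the answers."""
        for bits in product((0, 1), repeat=self.n_inputs):
            self.cells[self._address(bits)] = 1 if fn(*bits) else 0

    def read(self, *bits: int) -> int:
        """Reuse a learned function by decoding the inputs into a cell address."""
        return self.cells[self._address(bits)]


# Rows are independent, so each can hold a different user-defined function.
rows = {"and": LearnedLogicRow(2), "xor": LearnedLogicRow(2)}
rows["and"].train(lambda a, b: a & b)
rows["xor"].train(lambda a, b: a ^ b)
print(rows["and"].read(1, 1), rows["xor"].read(1, 1))
```

Because the answers stay in the cells after training, repeated evaluations in this sketch cost only a read, which mirrors the reuse for large-scale repeated tasks mentioned above.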
V. CONCLUSION
To meet the application requirements of oxide-based RRAM in data processing systems, physical models and an optimization design methodology for oxide-based RRAM devices have been developed to quantify device performance and to design the devices and operation schemes. Based on these device-level achievements, a novel RRAM-based "iMemComp" architecture is proposed and demonstrated to realize low-power, high-efficiency computing and memory performance in data-centric information processing systems.
