Abstract: With the fundamental limitations on transistor scaling and energy efficiency, nano-electromechanical (NEM) relays have emerged as a promising alternative for ultra-low power integrated circuits due to their zero leakage and steep subthreshold properties. In this paper, we explore the implementation of the I/O interface circuits, namely the analog-to-digital converter (ADC) and the digital-to-analog converter (DAC), with back-end-of-line (BEOL) NEM relays. The proposed design for the ADC comparators eliminates the need for reference generation by utilizing a bank of NEM relays with different pull-in voltages. The decoder and encoder of both data converters are optimized by using a minimum number of devices and exclusive, non-leaking paths for each input and output code. As a result, while the sampling rate and operation frequency of the relay-based converters are inherently lower than those of their CMOS counterparts due to the mechanical nature of their operation, the proposed data converters can achieve at least one order of magnitude lower energy consumption, which makes them appealing alternatives for ultra-low power VLSI and Internet of Things (IoT) applications.
I. INTRODUCTION
NEM relays have been studied and demonstrated extensively in diverse VLSI applications such as logic gates, arithmetic blocks, memory cells and power management [1] [2] [3] [4] [5]. The potential of NEM relay technology for low power I/O units and data converters has also been investigated [1] [2] [3]. However, no complete circuit implementation of data converters can be found in the previous works, and the proposed schemes suffer from significant static power dissipation.
Recently, a CMOS-compatible, vertical NEM relay that utilizes a BEOL process and airgap technology has been proposed [6, 7]. In this work, we use a similar structure and customize it to demonstrate a novel implementation of all-NEM-relay 4-bit data converters. By minimizing the number of devices for each block, removing the current leakage paths and optimizing the operational voltage of the relays, the proposed design significantly reduces the energy cost per operation. We use Coventor MEMS+ [8] to investigate the electromechanical behavior of the relays, and Cadence Virtuoso Spectre for the circuit implementation and simulation, with a Verilog-A behavioral model that we have developed for the BEOL relay.
The rest of this paper is organized as follows. The device structure, operation principles and methods for tuning and customizing the device behavior are presented in section II. The architecture and simulation results of the proposed ADC and DAC, and their corresponding decoder/encoders, are discussed in sections III and IV, respectively. A short discussion on the results and main conclusions are presented in section V.
II. DEVICE STRUCTURE, OPERATION AND TUNING

A. Device Structure and Operation
Similar to the CMOS transistor, the NEM relay, depicted in Fig. 1(a), has four terminals: a moving gate beam (G), and fixed source (S), drain (D) and body (B) electrodes, all made of aluminum. The gate spans the M1 to M4 layers and the channel is on the M5 layer. M1 is fixed as an anchor. The VIA between M4 and M5 is an inter-metal dielectric made of silicon dioxide and prevents the flow of current from the gate to the channel. The technological specifications are listed in Fig. 1(b).
When a voltage difference is applied between the gate (the free-moving beam) and body electrodes, the beam bends towards the body due to the electrostatic force generated between them. If this force is strong enough to overcome the spring restoring force, the beam "pulls in," and the channel connects the source and the drain. If the applied voltage is then reduced, at a certain point the spring restoring force becomes larger than the electrostatic force and the surface contact forces (including the van der Waals force between atoms and the capillary force due to humidity in the air [9]), and the beam "pulls out". As a consequence, the operation is inherently hysteretic, as reflected by the 0.17 V difference between the pull-in voltage (VPI) and the pull-out voltage (VPO) illustrated in Fig. 1(c).
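The hysteretic switching described above can be captured in a few lines of behavioral code. The following is a minimal sketch and not the Verilog-A model used in this work: the class name `HystereticRelay` is hypothetical, contact dynamics and mechanical delay are ignored, and the voltages are illustrative values chosen only so that VPI - VPO = 0.17 V.

```python
from dataclasses import dataclass


@dataclass
class HystereticRelay:
    """Idealized relay: closes at or above v_pi, opens at or below v_po."""

    v_pi: float  # pull-in voltage (V)
    v_po: float  # pull-out voltage (V), lower than v_pi
    closed: bool = False

    def apply(self, v_gb: float) -> bool:
        """Update and return the contact state for a gate-to-body voltage."""
        if v_gb >= self.v_pi:
            self.closed = True
        elif v_gb <= self.v_po:
            self.closed = False
        # Between v_po and v_pi the previous state is kept (hysteresis).
        return self.closed


# Hypothetical voltages; only their 0.17 V difference comes from the text.
relay = HystereticRelay(v_pi=1.00, v_po=0.83)
sweep_up = [relay.apply(v) for v in (0.0, 0.9, 1.0)]
sweep_down = [relay.apply(v) for v in (0.9, 0.85, 0.8)]
```

On the upward sweep the relay stays open at 0.9 V, while on the downward sweep it stays closed at the same 0.9 V, which is the hysteresis window.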
B. Tuning the Pull-in Voltage
The design of the ADC comparators, explained in the next section, requires relays with different VPI. There are a few methods for tuning the spring constant of the beam and thus the VPI. The simplest way to do this without changing the dimensions and area of the device is to adjust the inter-metal VIA locations, which tunes the VPI by changing the effective length of the spring (Leff). The serpentine-shaped gate in Fig. 1(a) represents the largest Leff and hence provides the minimum stiffness and spring constant. Therefore, the minimum VPI is obtained with this structure. In this work, we achieve the VPI tuning by varying the locations of VIA1 to VIA3 along the beam. We restrict the VIA locations to five points along the metal beam, including both ends, to observe distinct shifts in the VPI. We label the possible locations from far left to far right as positions "1" to "5", as annotated in Fig. 2(a), and the full arrangement is represented by a three-digit number giving the positions of VIA1 to VIA3. As an example, the structure in Fig. 1(a) has the VIA arrangement "151".
As the comparator bank of the 4-bit flash ADC in this paper requires 15 devices with uniform steps in VPI, the tuning method stated above is applied to the structure. Fig. 2 displays four examples of the device variations and the resulting VPI. TABLE I summarizes the simulated VPI for all desired devices.
C. Tuning the Operating Voltage with Body Biasing
The NEM relay is actuated by the electrostatic force generated between the moving gate and the fixed body. The conventional way of implementing this is to apply VPI to the gate while the body is grounded. However, biasing the body with a negative voltage while keeping the gate-to-body voltage difference identical lowers the required input voltage on the gate. When the body is biased at -VPO, the relay can be switched on with a gate voltage of only (VPI - VPO). If this method is applied to the structure in Fig. 1(a), it results in an effective VPI of 0.17 V and VPO of 0 V at the gate, which is the minimum operating voltage achievable, as described in Fig. 3(a).
In addition to giving more freedom in the circuit design, this method reduces the dynamic energy by more than one order of magnitude in most cases, as a direct result of operating in a lower voltage domain. A practical operating point that ensures safe functioning is suggested in Fig. 3(b), where the simulated results from Cadence Virtuoso show an abrupt switching behavior in the I-V curve.
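The effect of body biasing on the gate thresholds and on the dynamic energy can be illustrated with simple arithmetic. The sketch below rests on two assumptions that are not taken from the simulations in this paper: actuation depends only on the gate-to-body voltage difference, and dynamic energy scales with the square of the voltage swing (the usual CV^2 estimate). The function names and the numeric VPI/VPO values are hypothetical.

```python
from typing import Tuple


def gate_thresholds(v_pi: float, v_po: float, v_body: float) -> Tuple[float, float]:
    """Gate voltages at which the relay pulls in and pulls out for a body bias.

    Assumes actuation depends only on the gate-to-body difference, so a
    body bias of v_body shifts both thresholds by the same amount.
    """
    return v_pi + v_body, v_po + v_body


def dynamic_energy_ratio(swing_before: float, swing_after: float) -> float:
    """Ratio of dynamic energies under a C*V^2 assumption with equal capacitance."""
    return (swing_before / swing_after) ** 2


# Hypothetical device with VPI = 1.00 V and VPO = 0.83 V, body biased at -VPO.
v_on, v_off = gate_thresholds(v_pi=1.00, v_po=0.83, v_body=-0.83)
ratio = dynamic_energy_ratio(swing_before=1.00, swing_after=v_on)
```

For these illustrative numbers the gate thresholds become about 0.17 V and 0 V, and the estimated dynamic energy drops by a factor of roughly 35.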
III. BEOL NEM RELAY FLASH ADC
An overview of the ADC design is illustrated in Fig. 4. The analog input (VA) goes through a standard sample and hold (S/H) sub-circuit [1] and is then picked up by the comparators. The outputs of the comparators are then converted from thermometer code (TC) to standard binary output. The comparators utilize the variety of relay structures listed in TABLE I. The details of the design are discussed in this section.
A. Design Methodology for ADC Core
The schematic block diagram in Fig. 4 exhibits the architecture of the all-relay flash ADC. The body of the comparators and the decoder is biased with a negative voltage, VNEG, to enable the operation of the circuits with an independent low voltage supply (VLOW) thus reducing the total energy consumption. The control signals used in the circuit include the sample signal (SAMP) and the hold signal (HOLD) to guarantee a valid and fixed analog input for the comparators, VHOLD, during each conversion. VHOLD, the shared input among all comparators, is effectively compared with the VPI of various relays simultaneously, deciding the switching behavior of those relays. Each comparator's output (OUTn) is first pre-discharged to GND when the evaluation signal (EVAL) is 0. When EVAL becomes 1, OUTn is evaluated (charged) to VLOW only if VHOLD is large enough to turn on that specific comparator relay (with a unique pull-in voltage of VPI,n), otherwise OUTn remains 0.
Note that only the comparators utilize relay structures with various VPI. All other relays have the same structure ("151") which exhibits the minimum VPI and optimal energy and speed.
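The pre-discharge and evaluate behavior of the comparator bank can be summarized behaviorally as follows. This is a minimal sketch under stated assumptions: the 15 thresholds are hypothetical uniformly spaced values rather than the simulated VPI of TABLE I, each comparison is treated as ideal (no hysteresis or mechanical delay), and the function name `comparator_bank` is introduced only for illustration.

```python
from typing import List, Sequence


def comparator_bank(v_hold: float, pull_in_voltages: Sequence[float],
                    eval_signal: bool = True) -> List[int]:
    """Return OUT1..OUTn for a held input voltage.

    While EVAL is 0 every output is pre-discharged to 0. When EVAL is 1,
    OUTn becomes 1 only if v_hold reaches the pull-in voltage of relay n.
    """
    if not eval_signal:
        return [0] * len(pull_in_voltages)
    return [1 if v_hold >= v_pi else 0 for v_pi in pull_in_voltages]


# Hypothetical thresholds: 15 uniform steps of 0.1 V (not the TABLE I values).
V_PI = [n / 10 for n in range(1, 16)]

tc_idle = comparator_bank(0.55, V_PI, eval_signal=False)
tc_eval = comparator_bank(0.55, V_PI, eval_signal=True)
```

With a held input of 0.55 V, the five relays with thresholds of 0.1 V to 0.5 V switch on and the remaining ten stay off, giving a thermometer code with five leading ones.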
B. 15-line to 4-line TC to Binary Relay Decoder
The 15-bit TC output needs to be converted to its standard 4-bit binary equivalent. We designed a relay-based decoder that generates the final 4-bit binary output based on the truth table in Fig. 5(a). The TC output, OUT in Fig. 5(a), consists of a run of consecutive 1's followed by a run of consecutive 0's. This property can be used to simplify the decoder with the methodology shown in Fig. 5(b). The implementations are demonstrated in Fig. 6. The circuit is optimized to implement the decoder with as few as 68 NEM relays and incurs only one mechanical delay (~20 ns). The total delay of the ADC is approximately two mechanical time constants. Fig. 7 shows the Spectre simulation results of the NEM relay ADC with a sampling rate of 2.5 MHz. The TC output plots are replaced with binary blocks for better visualization.
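The consecutive-ones property means each binary bit can be read from the position of the single 1-to-0 boundary in the thermometer code. The sketch below uses a standard textbook formulation of that idea (for example, bit 3 equals OUT8, and bit 2 equals OUT4 AND NOT OUT8, OR OUT12). It is a behavioral illustration only, not necessarily the methodology of Fig. 5(b) or the relay network of Fig. 6, and the function name `tc_to_binary` is hypothetical.

```python
from typing import List, Sequence


def tc_to_binary(tc: Sequence[int]) -> List[int]:
    """Decode a 15-bit thermometer code into [b3, b2, b1, b0].

    tc[0] is OUT1 (lowest threshold) and tc[14] is OUT15.
    """
    if len(tc) != 15:
        raise ValueError("expected 15 thermometer-code bits")
    # Sentinels: t[0] = 1 and t[16] = 0, so that t[n] corresponds to OUTn.
    t = [1] + list(tc) + [0]
    bits = []
    for k in (3, 2, 1, 0):
        step = 1 << k
        bit = 0
        # Bit k is 1 when the 1-to-0 boundary lies in a window that starts
        # at an odd multiple of 2**k and spans 2**k thresholds.
        for start in range(step, 16, 2 * step):
            bit |= t[start] & (1 - t[start + step])
        bits.append(bit)
    return bits


example = tc_to_binary([1] * 5 + [0] * 10)  # five comparators switched on
```

Here the five leading ones decode to the binary word 0101.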
The total number of relays used to implement the ADC is 102, and the energy consumed during each conversion step is 0.74 fJ on average. This low energy consumption is attributed to the absence of static direct current (e.g., the reference voltage levels provided by a resistor string in traditional flash ADCs) and to the low swing of the operational voltage (VLOW). The energy per operation consumed in this circuit shows a reduction of more than 96.6% compared to the flash relay ADC proposed in [1]. The comparison between this work, the state-of-the-art relay ADC and a few CMOS ADCs is listed in TABLE II. The figure of merit used in this paper is the Walden FOM [13], FOM = Power / (2^n × Sampling Rate), where n is the resolution of the ADC.
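The Walden figure of merit defined above is straightforward to evaluate numerically. The following is a small sketch in which the function name `walden_fom` and the power value are hypothetical; only the 4-bit resolution and the 2.5 MHz sampling rate are taken from the text.

```python
def walden_fom(power_w: float, n_bits: int, sample_rate_hz: float) -> float:
    """Walden FOM in joules per conversion step: P / (2^n * f_s)."""
    return power_w / (2 ** n_bits * sample_rate_hz)


# Hypothetical power of 1 nW for a 4-bit converter sampling at 2.5 MHz.
fom_joules = walden_fom(power_w=1e-9, n_bits=4, sample_rate_hz=2.5e6)
fom_femtojoules = fom_joules * 1e15
```

A lower value indicates a more energy-efficient converter for a given resolution and sampling rate.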
IV. BEOL NEM RELAY DAC
In this section, we describe the implementation of the all-relay DAC. Fig. 8(a) illustrates the schematic of the proposed DAC. Here we follow a design flow that is the reverse of the ADC: we encode the 4-bit input into a 15-bit TC output via a dedicated relay encoder and then convert it to the final analog output using a bank of relay buffers. Each buffer is driven by one of the TC input lines and followed by a terminating resistor. This topology can generate the output voltage with a fixed step increment of VLOW/N, where N=2
We use the reverse of the truth table in Fig. 5(a) to extract the expressions for the binary-to-TC encoder. The algorithm is summarized in Fig. 8(b). Since many of the TC bits share similar expressions, we share some relays in the main path among multiple outputs. The constructed circuit is shown in Fig. 8(c)
optimized to remove recurring current paths and achieve the minimum number of devices for each function. The operational voltage of the relays has been reduced to a minimum, which is enforced by the inherent hysteresis between VPI and VPO. The energy efficiency is significantly improved for both the DAC and the ADC as a result of these optimizations. While the conversion speed of relay-based data converters is considerably lower than that of their CMOS counterparts due to the mechanical nature of their operation, the superior energy per conversion proves the feasibility of this technology for ultra-low power applications with modest performance requirements, such as IoT and biomedical applications.
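As a behavioral illustration of the binary-to-TC encoding and the fixed-step output described for the DAC in Section IV, the following minimal sketch may help. It is not the relay circuit of Fig. 8(c): the function names are hypothetical, the buffers and terminating resistors are idealized as an exact step of VLOW/N per active TC line, and N is passed in explicitly rather than assumed. The values VLOW = 0.2 V and N = 16 in the example are illustrative only.

```python
from typing import List


def binary_to_tc(code: int) -> List[int]:
    """Encode a 4-bit input (0..15) as a 15-bit thermometer code.

    TC line n (1..15) is 1 exactly when code >= n, which is the reverse
    of the thermometer-to-binary truth table.
    """
    if not 0 <= code <= 15:
        raise ValueError("code must be a 4-bit value")
    return [1 if code >= n else 0 for n in range(1, 16)]


def dac_output(code: int, v_low: float, n: int) -> float:
    """Idealized output: one step of v_low / n per active TC line."""
    return sum(binary_to_tc(code)) * v_low / n


tc_word = binary_to_tc(5)
v_out = dac_output(8, v_low=0.2, n=16)  # hypothetical VLOW and N
```

Because exactly `code` TC lines are active, the output rises monotonically with the input code, one fixed step at a time.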
