I. INTRODUCTION
THE CRITICAL need for high power efficiency and bandwidth in transceiver designs has increased as mobile devices, such as smartphones, laptops, and tablets, continue to advance in media and graphic processing capabilities. However, the current mobile interface technologies that support CPU-to-memory communication have critical limitations, particularly super-linear energy consumption, limited bandwidth, and nonreconfigurable data access.
Typical memory interfaces operate at 6 and 2.15 Gb/s/pin with power efficiencies of 15.8 and 6.6 pJ/b/pin, respectively [1], [2]. A current mobile DDR memory I/O featuring a differential dual-band interconnect (DBI) achieves a better power efficiency of 5 pJ/b/pin at 4.2 Gb/s/pin [3] for a simultaneous bidirectional (SBD) mobile memory I/O link. However, the DBI's differential signaling is incompatible with single-ended mobile memory interfaces, and the DBI link [3] has a limited I/O data rate and high power consumption for future low-power DDR (LP-DDR) mobile I/O. Serial links [4]-[6] could be a promising solution for the mobile memory I/O interface because they provide high bandwidth, reduce cost and power dissipation, and require fewer data bus lines. However, serial link transceivers typically require a long initialization time. Thus, they do not meet the requirement of fast switching between the active, standby, self-refresh, and power-down operation modes in mobile DRAM.
To mitigate these concerns, we propose a single-ended multilevel dual-band (MDB) memory interface that enhances circuit and system bandwidth and power efficiency. The proposed energy-efficient mobile memory interface utilizes pulse amplitude modulation (PAM) signaling and RF-band signaling and is capable of simultaneous bidirectional communication and reconfigurable data access. It also increases power efficiency and bandwidth between mobile CPUs and memory subsystems on a single-ended shared transmission line (T-line). Moreover, because multiple data streams are communicated on a single-ended shared T-line, the number of T-lines between the mobile CPU and memories is considerably reduced, resulting in more compact devices and low-cost packaging for the mobile communication interface and establishing the principles and feasibility of technologies for future mobile system applications.
This brief is organized as follows. Section II discusses the proposed MDB mobile memory interface architecture along with PAM and RF-band transceiver circuitries. The implementation results of the fabricated chip are presented in Section III. Finally, the conclusion is provided in Section IV.
1549-7747 © 2016 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
II. PROPOSED MDB MOBILE MEMORY INTERFACE
The proposed MDB mobile memory interface is shown in Fig. 1. Unlike the conventional baseband (BB)-only signaling, the proposed MDB signaling uses both the PAM and RF-band for simultaneous triple data stream communications on a shared single-ended T-line. The I/O interface bandwidth triples and the power efficiency improves by using the novel MDB signaling and sustaining the linear power consumption versus the bandwidth in both the PAM and RF-band. By applying the MDB link to LP-DDR I/O data (DQs) and to the command/address (C/A), the mobile DRAM access time can be reduced by requesting concurrent DRAM read/write operations on a shared memory channel.
Fig. 2 shows the SBD MDB I/O transceiver schematic of the memory controller side with an RF-band transmitter (RFTX) and a PAM receiver (PAM-RX). The RFTX is composed of an LC-VCO, an amplitude-shift keying (ASK) modulator, and an on-chip band-selective transformer with on-die termination (ODT) for impedance matching of a shared off-chip I/O channel. In the RFTX, an LC-VCO first generates an RF carrier at 20 GHz, and the ASK transmitter switches on/off the current flow through M1-M4 and increases the output voltage swing through the cross-coupled pair (M5 and M6) to complete the ASK modulation and enhance the signal-to-noise ratio on an off-chip channel. Transistors M7 and M8 are also employed in the RFTX modulator to allow a fast shut-off at the output. Using this switch reduces the amplitude of nonmodulated signals that pass through the T-line.
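As a rough illustration of how the two bands share one line, the following sketch sums an on/off-keyed 20-GHz carrier with a four-level baseband signal. It is a behavioral model only, not the circuit of Fig. 2. The voltage levels, carrier amplitude, sample rate, and the 4.6-GBaud PAM symbol rate (9.2 Gb/s at two bits per symbol) are assumptions chosen for illustration, and `mdb_waveform` is a hypothetical helper name.

```python
import math

CARRIER_HZ = 20e9        # RF carrier frequency stated in the text
RF_BIT_RATE = 4.2e9      # RF-band data rate reported in Section III
PAM_SYMBOL_RATE = 4.6e9  # assumed: 9.2 Gb/s sent as 2 bits per PAM-4 symbol
SAMPLE_RATE = 200e9      # assumed simulation rate (10 samples per carrier cycle)

PAM_LEVELS = [0.0, 0.1, 0.2, 0.3]  # assumed baseband levels in volts
RF_AMPLITUDE = 0.05                # assumed carrier amplitude in volts


def mdb_waveform(rf_bits, pam_symbols, n_samples):
    """Return the sum of an ASK (on/off) carrier and a PAM-4 baseband signal."""
    samples = []
    for n in range(n_samples):
        t = n / SAMPLE_RATE
        # Pick the bit and symbol that are active at time t (patterns repeat).
        rf_bit = rf_bits[int(t * RF_BIT_RATE) % len(rf_bits)]
        pam_symbol = pam_symbols[int(t * PAM_SYMBOL_RATE) % len(pam_symbols)]
        carrier = RF_AMPLITUDE * math.sin(2 * math.pi * CARRIER_HZ * t)
        samples.append(PAM_LEVELS[pam_symbol] + rf_bit * carrier)
    return samples


line_voltage = mdb_waveform(rf_bits=[1, 0, 1, 1], pam_symbols=[0, 3, 1, 2], n_samples=400)
```

In the real link the two bands are driven from opposite ends of the T-line; the model only shows their superposition at one point on the line.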
The PAM-RX amplifies the incoming data streams D2(PAM) and D3(PAM) and uses three comparators with dynamic ODT. The subtraction is done using the dual sampling comparators to increase the maximum achievable data rate. The comparators effectively recover the data from the channel by using reference voltages. The data is compared with the reference voltages, and the recovered data from the comparators is then decoded to 4-bit binary data. The comparators consist of differential amplifiers that have a stable gain over a wide common-mode range. Therefore, the output of the comparators connects to differential voltage buffers. Before transferring data to the decoder, a DFF circuit is utilized to hold data.
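The comparator-and-decoder step can be sketched behaviorally as below: three reference voltages slice a four-level input into a thermometer code, which is then mapped to binary. The reference voltages and the plain binary mapping to one bit per stream are assumptions for illustration. The decoder output format of the fabricated chip is not reproduced here, and both function names are hypothetical.

```python
# Assumed mapping from the three comparator outputs to (D2, D3).
THERMOMETER_TO_BITS = {
    (0, 0, 0): (0, 0),
    (1, 0, 0): (0, 1),
    (1, 1, 0): (1, 0),
    (1, 1, 1): (1, 1),
}


def pam4_slice(voltage, refs):
    """Compare the input with three ascending references (thermometer code)."""
    return tuple(1 if voltage > ref else 0 for ref in refs)


def pam4_decode(voltage, refs=(0.05, 0.15, 0.25)):
    """Recover two bits from one sampled PAM-4 voltage (assumed references)."""
    return THERMOMETER_TO_BITS[pam4_slice(voltage, refs)]


recovered = [pam4_decode(v) for v in (0.0, 0.1, 0.2, 0.3)]
```

Because the references are ascending, only the four thermometer patterns in the table can occur, so the lookup never misses.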
The digitally controlled ODT is utilized to improve signal integrity of the memory channel and to remove the impedance mismatch, concurrently. The ODT sets the common-mode voltage and removes the impedance mismatch for optimal signal integrity. In order to implement the ODT, passive resistors are connected in series with transistors, as shown in Fig. 2. The maximum data rate can be limited by the common-mode range, voltage resolution, and offset voltage of the comparator. One of the most important methods to increase the maximum achievable data rate is the employment of parallel sampling using comparators. However, if various common-mode ranges are applied to the comparators, they become unstable due to the noise at the different common modes. Thus, although the sampling method increases the data rate, it also increases power consumption and noise. In this brief, the common-mode range of the comparator should be carefully calibrated to recover the modulated signal correctly [7]. Thus, the noise problem can be solved by proper selection of the sampling range for each comparator.
Fig. 3 shows the MDB I/O transceiver of the DRAM side with an RF-band receiver (RFRX) and a PAM transmitter (PAM-TX). The RFRX consists of a transformer and a noncoherent demodulator. The on-chip band-selective transformer separates the MDB-modulated signals into the baseband PAM and RF-band and also helps convert a single-ended D1(RF) signal to a differential signal. The RFRX does not require a power-hungry phase and frequency synchronizer due to its use of a noncoherent detector that only senses the envelope of an incoming signal. The band-pass filtered RF-band carrier is applied to the differential mutual mixer and down-converted to the baseband data D1(RF). The PAM-TX contains input buffers, an encoder, leakage control logic, DFFs, a DAC-based driver, and impedance control logics.
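The noncoherent detection idea in the RFRX can be sketched as follows: the band-pass filtered signal is squared (a stand-in for self-mixing) and averaged over one bit period, so the decision depends only on the envelope and not on the carrier phase. This is a simplified behavioral model under assumed parameters, not the mixer circuit of Fig. 3, and `ask_envelope_detect` is a hypothetical name.

```python
import math


def ask_envelope_detect(samples, samples_per_bit, threshold):
    """Square-law detect an on/off-keyed carrier, one decision per bit period."""
    bits = []
    for start in range(0, len(samples) - samples_per_bit + 1, samples_per_bit):
        window = samples[start:start + samples_per_bit]
        # Squaring models self-mixing; the mean acts as a crude low-pass filter.
        energy = sum(x * x for x in window) / len(window)
        bits.append(1 if energy > threshold else 0)
    return bits


# Usage with assumed values: 5 carrier cycles per bit, 40 samples per bit.
amplitude, spb, phase = 0.05, 40, 1.0
tx_bits = [1, 0, 1, 1, 0]
rf = [b * amplitude * math.sin(2 * math.pi * 5 * k / spb + phase)
      for b in tx_bits for k in range(spb)]
rx_bits = ask_envelope_detect(rf, spb, threshold=amplitude ** 2 / 4)
```

The threshold sits halfway between the mean-square value of an "on" bit (about amplitude²/2) and an "off" bit (zero).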
Two data streams (D2(PAM) and D3(PAM)) are encoded and synchronized to a forwarded clock at the DRAM side. The PAM-TX with the encoder and thermometer code (a-c) generates a four-level PAM signal by selecting different pushing/pulling currents of the DAC-based driver [8]. To reduce the latency skews of the encoded signal, a three-bit synchronizer is utilized to generate a synchronized encoded signal. In a conventional PAM-TX, switches create a leakage current in the T-line. A leakage suppression control logic block is added to the PAM-TX to reduce the leakage current in DRAM power-down/nap mode. When the control logic circuit output (ENC) is high, the switches are turned off in DRAM power-down mode to save all leakage power. If the state of ENC is changed to low, the PAM-TX performs normal operation in DRAM active (or active standby) mode. The impedance control logic, which is the combination of resistors and transistors, is also integrated into the PAM driver to avoid impedance mismatch and reduce sensitivity to PVT variations.
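A behavioral sketch of the encoder path described above: two bits become a three-bit thermometer code (a, b, c), and the number of asserted bits selects one of four driver levels. The bit weighting, the 0.1-V unit step, and the use of `None` to represent a disabled driver when ENC is high are assumptions for illustration, and the function names are hypothetical.

```python
def pam4_thermometer(d2, d3):
    """Encode two bits as a thermometer code (a, b, c); d2 is assumed the MSB."""
    value = 2 * d2 + d3
    return tuple(1 if value > i else 0 for i in range(3))


def pam4_drive(d2, d3, enc=0, unit_step=0.1):
    """Return the driver level in volts, or None when ENC disables the driver."""
    if enc:
        # ENC high: all driver switches off (power-down mode in the text).
        return None
    return unit_step * sum(pam4_thermometer(d2, d3))


levels = [pam4_drive(d2, d3) for d2 in (0, 1) for d3 in (0, 1)]
powered_down = pam4_drive(1, 1, enc=1)
```

With a thermometer code, each step up in level turns on one more driver segment, which is why the level is simply proportional to the count of asserted bits in this model.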
III. MEASUREMENT RESULTS
The proposed MDB interface is designed and fabricated in 65 nm CMOS technology to demonstrate multiband bidirectional communication on a shared single-ended PCB T-line. The MDB memory interface consumes 31 mW with a 1.2-V supply, of which the RF-band transceiver and PAM transceiver consume 12.6 and 18.4 mW, respectively. Fig. 4 shows the die photo of an MDB transceiver, where the PAM and RF-band transceivers along with on-chip transformers occupy 0.13 mm². An FR-4 PCB has been employed to verify the capability of simultaneous bidirectional communication, as shown in Fig. 4. The simulated signal loss of the 5-cm FR-4 PCB T-line is −6.4 dB/Hz at 20 GHz, as shown in Fig. 5. The simulated multilevel MDB waveform at the T-line is shown in Fig. 3. The amplitude of the RF-band signal is degraded by the T-line. Two transceivers are integrated for either a controller or DRAM side so that two devices can implement the complete interface architecture shown in Fig. 1.
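To get a feel for how much the T-line degrades the RF-band amplitude, the quoted loss can be converted to a voltage ratio. The sketch assumes the figure means a 6.4-dB voltage (20·log10) loss at 20 GHz. That reading is an assumption, since the source quotes the unit as dB/Hz.

```python
def db_loss_to_amplitude_ratio(loss_db):
    """Convert a loss in dB (20*log10 voltage convention) to an amplitude ratio."""
    return 10 ** (-loss_db / 20)


# Assumed interpretation: 6.4 dB of loss at the 20-GHz carrier.
rf_amplitude_ratio = db_loss_to_amplitude_ratio(6.4)
print(f"RF-band amplitude after the T-line: {rf_amplitude_ratio:.3f} of the launched value")
```

Under that assumption, slightly less than half of the launched carrier amplitude reaches the far end of the 5-cm line.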
The measured power spectrum of the 20-GHz VCO, which is placed at the memory controller side, is shown in Fig. 6. Figs. 7 and 8 show the measured eye diagrams of the MDB interface. The measured eye diagrams are taken from the output driver. The total aggregate data rate is 13.4 Gb/s/pin over a 5-cm FR-4 T-line. The PAM band carries 9.2 Gb/s with maximum rms and peak-to-peak jitter of 7.60 and 47.16 ps, respectively. The RF band carries 4.2 Gb/s with maximum rms and peak-to-peak jitter of 10.09 and 56.42 ps, respectively. The minimum eye width and height are 182.1 ps and 132 mV, respectively, for the PAM transceiver and 177.1 ps and 112 mV, respectively, for the RF-band transceiver. These eye diagrams demonstrate that good signal integrity can be achieved at these data rates with the proposed MDB signaling.
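The reported rates and eye widths can be cross-checked with simple arithmetic. The sketch below assumes the PAM band sends two bits per symbol (4.6 GBaud), which the source does not state explicitly but which is consistent with a 182.1-ps eye: a 9.2-GBaud binary signal would have a unit interval of only about 109 ps.

```python
PAM_RATE_GBPS = 9.2   # reported PAM-band data rate
RF_RATE_GBPS = 4.2    # reported RF-band data rate
PAM_BITS_PER_SYMBOL = 2  # assumption: four-level PAM carrying two streams


def unit_interval_ps(symbol_rate_gbaud):
    """Symbol period in picoseconds for a symbol rate given in GBaud."""
    return 1000.0 / symbol_rate_gbaud


aggregate_gbps = PAM_RATE_GBPS + RF_RATE_GBPS
pam_ui_ps = unit_interval_ps(PAM_RATE_GBPS / PAM_BITS_PER_SYMBOL)
rf_ui_ps = unit_interval_ps(RF_RATE_GBPS)

# Eye width as a fraction of the unit interval (measured widths from the text).
pam_eye_fraction = 182.1 / pam_ui_ps
rf_eye_fraction = 177.1 / rf_ui_ps

print(f"aggregate = {aggregate_gbps:.1f} Gb/s/pin")
print(f"PAM: UI = {pam_ui_ps:.1f} ps, eye = {pam_eye_fraction:.2f} UI")
print(f"RF:  UI = {rf_ui_ps:.1f} ps, eye = {rf_eye_fraction:.2f} UI")
```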
The bit error rate (BER) is measured by a BERT scope, and the RF-band transceiver works error-free at 4.2 Gb/s (PRBS 2¹⁵ − 1). The BER is measured as < 1 × 10⁻¹⁵ by using a 2²³ − 1 PRBS from the Agilent-70843A. The BER is expected to be better than that of phase-sensitive modulation schemes since the receiver mixer only senses the incoming signal's amplitude, and frequency and phase synchronization between the RFTX and RFRX is not required. Table I compares the MDB interface performance to that of prior memory I/O interfaces. In contrast to [1]-[3] and [9]-[11], the MDB system transceives three data streams simultaneously on a shared T-line by incorporating a PAM transceiver with an ASK transceiver. Moreover, compared to the conventional memory I/O interfaces that use a differential T-line [2], [3], [9], [11], [12], the MDB interface utilizes a shared single-ended T-line. The differential signaling is incompatible with a current DDR memory interface, and it also occupies a large Si area. Although the power consumption is high due to using PAM signaling and an LC-VCO, the MDB still exhibits the highest aggregate data throughput (∼13.4 Gb/s/pin).
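For readers unfamiliar with PRBS test patterns, the sketch below generates a 2¹⁵ − 1 sequence with a linear-feedback shift register using the commonly used polynomial x¹⁵ + x¹⁴ + 1. It also estimates how many error-free bits are needed to bound the BER at a chosen confidence level. The polynomial choice, the 95% confidence level, and the helper names are assumptions. The source does not state the test duration or confidence level behind its BER figure.

```python
import math


def prbs_bits(order, tap, count, seed=1):
    """Generate bits from a Fibonacci LFSR for x^order + x^tap + 1."""
    state = seed
    mask = (1 << order) - 1
    out = []
    for _ in range(count):
        new_bit = ((state >> (order - 1)) ^ (state >> (tap - 1))) & 1
        state = ((state << 1) | new_bit) & mask
        out.append(new_bit)
    return out


def bits_for_zero_error_confidence(ber_limit, confidence):
    """Error-free bits needed to claim BER < ber_limit at the given confidence."""
    return -math.log(1.0 - confidence) / ber_limit


PRBS15_PERIOD = 2 ** 15 - 1
sequence = prbs_bits(order=15, tap=14, count=2 * PRBS15_PERIOD)

# Assumed 95% confidence; 4.2 Gb/s is the RF-band rate from the text.
bits_needed = bits_for_zero_error_confidence(1e-15, 0.95)
days_at_rf_rate = bits_needed / 4.2e9 / 86400
print(f"{bits_needed:.2e} error-free bits, about {days_at_rf_rate:.1f} days at 4.2 Gb/s")
```

The same generator produces a 2²³ − 1 pattern with `order=23, tap=18`, which corresponds to the widely used x²³ + x¹⁸ + 1 polynomial.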
IV. CONCLUSION
The design and fabrication of an MDB interface for a mobile DRAM I/O interface in 65-nm CMOS that obtains an aggregate data throughput of 13.4 Gb/s/pin on a shared single-ended 5-cm T-line is described in this brief. The MDB interface consumes 31 mW with a 1.2-V supply, of which the RF-band transceiver and PAM transceiver consume 12.6 and 18.4 mW, respectively. The BER is measured as < 1 × 10⁻¹⁵ by using a 2²³ − 1 PRBS. The proposed MDB interface is able to meet the required aggregate data throughput and energy efficiency demands of future mobile memory I/O link systems by providing higher aggregate data throughput (13.4 Gb/s/pin) and moderate energy efficiency (∼2.3 pJ/b/pin) compared with prior work.
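The energy-efficiency figure follows directly from the reported power and throughput, as this short check shows. It uses only numbers quoted in the brief; `energy_per_bit_pj` is a hypothetical helper name.

```python
def energy_per_bit_pj(power_mw, rate_gbps):
    """Energy per bit in pJ/b: mW divided by Gb/s gives pJ per bit."""
    return power_mw / rate_gbps


rf_power_mw = 12.6    # RF-band transceiver power from the text
pam_power_mw = 18.4   # PAM transceiver power from the text
total_power_mw = rf_power_mw + pam_power_mw

efficiency = energy_per_bit_pj(total_power_mw, 13.4)
print(f"total = {total_power_mw:.1f} mW, efficiency = {efficiency:.2f} pJ/b/pin")
```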
