Abstract-A sensor interface circuit based on impulse radio pulse width modulation (IR-PWM) is presented for low-power, high-throughput wireless data acquisition (wDAQ) systems with extreme size and power constraints. Two triple-slope analog-to-time converters (ATC) convert two analog signals, each up to 5 MHz in bandwidth, into PWM signals, and an IR transmitter with an all-digital power amplifier combines them while preserving the timing information by transmitting impulses at the PWM rising and falling edges. On the receiver side, an RF-LNA followed by an envelope detector recovers the incoming impulses, and a T-flipflop converts the impulse sequence back to PWM to be digitized by a time-to-digital converter (TDC). A detailed analysis and design guidelines for the ATC are presented, and a proof-of-concept prototype was fabricated for a capacitive micromachined ultrasound transducer imaging system in a 0.18-µm HV CMOS process, occupying 0.18 mm² active area and consuming 3.94 mW from a 1.8 V supply. The proposed TDC in this prototype yielded 7-bit resolution, while the entire wDAQ achieved 5.8 effective number of bits at 2 × 10 MS/s.
I. INTRODUCTION
Classic analog-to-digital converters (ADC) have been at the heart of almost every modern data acquisition (DAQ) system since the 1970s, and still constitute an active and thriving field with a wide variety of architectures and a large number of new publications and patents [1]. However, when it comes to certain applications with high data throughput and extreme size and power constraints, they may not represent the best choice. One of these applications is intravascular ultrasound imaging (IVUS), in which the ultrasonic transducers and their interfacing circuitry should fit at the tip of 3 to 8.2 French (F) catheters (1 mm to 2.7 mm in diameter) [2], [3]. Yet there is an effort underway to combine capacitive micromachined ultrasound transducers (CMUT) with high voltage (>60 V) readout circuitry at the tip of 1.1 F to 2.67 F guidewires (0.36 mm to 0.89 mm in diameter) to image stent deployment or assess an artery occlusion [4].
Manuscript received September 19, 2018; accepted October 16, 2018. Date of publication October 29, 2018; date of current version December 21, 2018. This work was supported in part by the National Institute of Biomedical Imaging and Bioengineering (NIBIB) Awards under Grant R21EB017365-02 and Grant R21NS108391-01. The associate editor coordinating the review of this paper and approving it for publication was Prof. Vedran Bilas.
[Figure: Operational diagram of pulse width modulated impulse radio (IR-PWM) wireless data acquisition (wDAQ). The information in the analog input signal (V) is converted into a pulse with the width of α × I. In FSK-PWM, a frequency-modulated sinusoidal wave is generated at two frequencies that represent '1' and '0' of the PWM pulse. In IR-PWM, sharp impulses are transmitted at every rising and falling edge of the PWM signal, as shown in the bottom trace.]
The acoustic pulse echo signal generated by a high-frequency CMUT occupies the 35-45 MHz spectrum [4], which after down-conversion to baseband would still need a sampling rate of at least 10 MS/s. Due to lack of space, interface electronics for CMUT transducers have traditionally been limited to transimpedance amplification (TIA), buffering, and delivery of the amplified analog signals across the catheter with a bundle of thin wires, one for each channel, followed by digitization and signal processing outside the patient's body to create the image. The large number of interconnects through the catheter increases its thickness and stiffness, while exposing the acquired signals to noise, interference, impedance mismatch, and crosstalk. There have been attempts to time-division multiplex (TDM) the analog signals [5] or to use optical readouts [6], with limited success. Even including ADCs at the tip of thin catheters, close to the ultrasound transducers, is challenging because of extreme size constraints [7].
Yet another challenge is that electrostatic microelectromechanical (MEMS) sensors and actuators require high voltage supplies to achieve sufficient sensitivity and precision [8] , [9] . The interface ASICs for these applications cannot be fabricated in low voltage deep submicron processes.
In this paper, for instance, we used a 180-nm high voltage (HV) CMOS process, which provides only 4 metal layers for interfacing with the overlying CMUTs. In such processes, when there are extreme size constraints, it is even more difficult to use conventional ADCs. As a result, for an application such as IVUS, there has always been a compromise between the number of channels, the diameter of the catheter, its field of view, and the resolution of the resulting image [2]. Aside from finding a way to deliver the acquired wideband data out of the body, a potential remedy for reducing the number of wires towards a thinner catheter is to send the data wirelessly over the short distance (∼10 cm) from inside the heart to a receiving patch antenna adhered to the skin on the patient's chest.
Here we propose the use of analog-to-time conversion (ATC) in lieu of a traditional ADC in the abovementioned demanding DAQ applications, and present a new architecture based on a combination of ATC, impulse radio (IR), and time-to-digital conversion (TDC). This architecture goes one step further and satisfies the strongly preferred wireless data communication approach, particularly in biomedical applications, by creating a wireless DAQ (wDAQ) with low power and small footprint on the transmitter (Tx) side, which is often placed inside the body. In earlier implementations of the ATC-wDAQ, we used frequency shift keying (FSK) to establish the wireless data link for a neural recording application [10]. However, the voltage-controlled oscillator (VCO) requires considerable power and area. Since the IR-PWM substitutes the carrier signal with narrow impulses while preserving key timing information, the Tx power efficiency in this design is considerably higher, and the simple IR-PWM Tx architecture also saves silicon area. Although this paper uses IVUS as the target application [4], the proposed wDAQ architecture can be used for other sensing applications, such as CMUT-based gas sensors [11], MEMS-based acoustic sensors [12], and neural interfaces [13]-[16].
The following section provides details on how to implement the proposed IR-PWM based wDAQ architecture, including schematics and operating mechanism. Section III includes the design guidelines, noise analysis, and key parameters that would affect the proposed wDAQ performance. In section IV, measurement results are presented and analyzed, followed by a brief discussion in section V, and conclusions.
II. IR-PWM WDAQ ARCHITECTURE
To define the design target for the proof-of-concept IR-PWM based wDAQ prototype, we have considered the fact that vascular ultrasound imaging, our exemplar application, requires a minimum of 50 dB signal-to-noise ratio (SNR) on the integrated echo signal data for generating an image with 40 dB dynamic range [17] . It can be shown that by adopting a method known as synthetic aperture imaging, SNR can be improved by
where N is the number of transducer elements [17]. In this case, the required SNR can be reduced down to 35 dB with a 6-element CMUT array [9]. Considering that the CMUT readout requires a minimum sampling rate of 10 MS/s [4], the dual-slope charge sampling (DSCS) analog front end (AFE) presented in [18] (Fig. 2), which was designed for sampling rates on the order of 1 MS/s, was not sufficient. However, it inspired the new design that is presented here. The operation of the DSCS-AFE in Fig. 2a is shown in Fig. 2b. Briefly, during the pre-charge phase (1), CAP+ and CAP− are pre-charged to V_REF+ and V_REF−, respectively. During the evaluation phase (2), CAP+/CAP− are charged/discharged by the OTA at a rate proportional to the input voltage. During the discharge phase (3), I_source/I_sink discharge/charge CAP+/CAP− at a constant rate, while a hysteresis comparator, which is set at the beginning of this phase, generates the PWM pulse. To achieve a high sampling rate, the settling times of the OTA and current sources need to be considered. In addition, increasing the sampling rate exacerbates the effects of charge injection and clock feedthrough on V_CAP+ and V_CAP−, which are directly related to the ATC noise performance. The amount of charge injection noise can be found from [19],
where C_ox, V_TH, W, and L are the gate capacitance per unit area, threshold voltage, width, and length of the MOS switches, respectively. The clock feedthrough noise is given by [20],
which indicates that to minimize noise, W and L of the switches should be minimized, while CAP+/CAP− should be maximized. On the other hand, since CAP+/CAP− charging/ discharging currents pass through these switches, their ON-resistance should be reduced by increasing their widths. Moreover, the OTA output current should be small to reduce power consumption and undesired voltage drop across these switches. CAP+/CAP− should also be small to achieve sufficient sampling rate by charging and discharging rapidly. These are conflicting requirements that eventually define the DSCS-AFE performance limits.
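The sizing trade-off above can be illustrated numerically. The sketch below uses common textbook first-order forms for channel charge injection, ΔV ≈ W·L·C_ox·(V_GS − V_TH)/(2C), and clock feedthrough, ΔV ≈ V_clk·C_ov/(C_ov + C). These forms, the equal split of channel charge, the overlap capacitance per unit width, and all device values are assumptions made for illustration; they are not the expressions or the values used in this design.

```python
def charge_injection_step(width, length, c_ox, v_gs, v_th, c_sample):
    """Voltage step on the sampling capacitor from channel charge.

    Assumes half of the channel charge W*L*C_ox*(V_GS - V_TH) lands on
    the sampling capacitor (a first-order simplification).
    """
    q_channel = width * length * c_ox * (v_gs - v_th)
    return q_channel / (2.0 * c_sample)


def clock_feedthrough_step(width, c_ov_per_width, v_clk, c_sample):
    """Voltage step coupled through the gate overlap capacitance."""
    c_ov = width * c_ov_per_width
    return v_clk * c_ov / (c_ov + c_sample)


# Hypothetical device values, chosen only to show the trends.
C_OX = 8.5e-3            # F/m^2 (8.5 fF/um^2)
C_OV_PER_WIDTH = 3e-10   # F/m   (0.3 fF/um)
V_GS, V_TH, V_CLK = 1.8, 0.5, 1.8
LENGTH = 0.18e-6         # m

for width_um in (1, 4, 16):
    for c_sample_pf in (0.8, 3.2):
        w = width_um * 1e-6
        c = c_sample_pf * 1e-12
        inj = charge_injection_step(w, LENGTH, C_OX, V_GS, V_TH, c)
        ft = clock_feedthrough_step(w, C_OV_PER_WIDTH, V_CLK, c)
        print(f"W={width_um:>2} um, C={c_sample_pf} pF: "
              f"injection={inj * 1e6:.1f} uV, feedthrough={ft * 1e6:.1f} uV")
```

Under these assumptions both error terms grow with switch width and shrink with sampling capacitance, which is the conflict with ON-resistance and sampling speed described above.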
A. Triple-Slope Charge Sampling
To address the above issues, we have designed a triple-slope charge sampling (TSCS) scheme, whose schematic diagram and operating principle are shown in Fig. 3 and Fig. 4, respectively. In TSCS, the switching functions have been transferred from the OTA output to its input. As shown in Fig. 3a, the TSCS comprises a reference voltage generator, an OTA, and a comparator. Reference voltages (V_REF+, V_REF−, and VCM) are generated from a resistive voltage divider, followed by three individual buffers. To reduce the switching noise on the reference voltages, on-chip capacitors of 3 pF are used at each buffer output. In TSCS, the OTA basically takes over the role played by I_source and I_sink in DSCS. Fig. 4 shows the timing and operation of the TSCS, which is divided into three phases. In the pre-charge phase (1), a positive reference voltage, V_REF = V_REF+ − V_REF−, is applied to the OTA differential input, resulting in the OTA increasing the voltage difference between V_CAP+ and V_CAP−. In the evaluation phase (2), the OTA input is connected to the differential input voltage, V_IN, converting it to a proportional pair of currents that charge/discharge CAP+/CAP−, respectively. Finally, in the discharge phase (3), a negative reference voltage, −V_REF = V_REF− − V_REF+, is applied to the OTA, causing the voltage difference between V_CAP+ (descending) and V_CAP− (ascending) to decrease down to zero, which we refer to as the 'end point'. At the end point, the comparator, which was turned on with its output set at the beginning of phase 3, is reset to generate a PWM pulse along with the RST signal, which returns the TSCS circuit to its initial state by setting CAP+/CAP− at VCM, turning off the comparator, and zeroing the differential OTA inputs, i.e. connecting them to VCM.
In TSCS, the time from the start of phase 3 to the end point is measured as the PWM pulse width, T_PW, which contains the analog sample information. Considering that the net charge variation from the onset of phase 1 to the end point is zero, the relation between T_PW and the differential input voltage, V_IN, can be expressed as,
where G_m is the transconductance of the OTA, t_1 and t_2 are the durations of phases 1 and 2, respectively, and T_PW is the PWM pulse width. Since t_1 = t_2,
Because t_1, t_2, and V_REF are pre-defined values, V_IN defines T_PW. Since CAP+/CAP− integrate the charge for t_2 = 25 ns in this prototype, their function is equivalent to low-pass filtering V_IN with a −3 dB bandwidth of [21],
With the prototype TSCS sampling at 10 MS/s, the low-pass filtering function of the PWM can suppress aliasing and high-frequency noise and interference. Since T_PW, which carries the analog sample information, only occupies phase 3, chosen to be 50 ns, i.e. half of the total sampling period, phases 1 and 2 can be used by another identical TSCS AFE, which can share the same communication channel via time division multiplexing (TDM), as shown in the Fig. 4 timing diagram, to double the wDAQ data throughput. Since the binary amplitude of the PWM does not contain any information, marking the start of phase 3 and the end point of the PWM in each half cycle is enough to deliver the sample information; these transitions are marked by impulses that are generated by the IR block.
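As a numerical illustration of the conversion described in this subsection, the sketch below evaluates the charge balance G_m·V_REF·t_1 + G_m·V_IN·t_2 − G_m·V_REF·T_PW = 0, which follows from the zero-net-charge condition stated in the text, and estimates the −3 dB bandwidth of a 25 ns integration window. The ideal boxcar-integrator model, the function names, and the choice of V_REF = 350 mV for the example are assumptions of this sketch rather than the exact expressions of the original equations.

```python
import math


def pulse_width(v_in, v_ref, t1, t2):
    """Ideal TSCS pulse width from charge balance (G_m cancels out)."""
    if abs(v_in) > v_ref:
        raise ValueError("v_in must lie within +/- v_ref")
    return t1 + (v_in / v_ref) * t2


def boxcar_gain(freq, t_int):
    """Magnitude response of an ideal integrate-over-t_int window."""
    x = math.pi * freq * t_int
    return 1.0 if x == 0.0 else abs(math.sin(x) / x)


def boxcar_bandwidth_3db(t_int):
    """Bisection for the -3 dB point inside the main lobe (0 .. 1/t_int)."""
    target = 1.0 / math.sqrt(2.0)
    lo, hi = 0.0, 1.0 / t_int
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if boxcar_gain(mid, t_int) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


T1 = T2 = 25e-9   # phase durations quoted in the text
V_REF = 0.35      # example reference voltage (assumption for this sketch)

for v_in in (-0.3, 0.0, 0.3):
    t_pw = pulse_width(v_in, V_REF, T1, T2)
    print(f"V_IN = {v_in:+.2f} V -> T_PW = {t_pw * 1e9:.2f} ns")

print(f"-3 dB bandwidth of a 25 ns window: "
      f"{boxcar_bandwidth_3db(T2) / 1e6:.1f} MHz")
```

With t_1 = t_2 the relation reduces to T_PW = t_1·(1 + V_IN/V_REF), so the pulse width stays between 0 and 50 ns for |V_IN| ≤ V_REF, and the boxcar model gives a bandwidth of roughly 0.443/t_2.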
B. IR-PWM Transmitter
In this paper, we focus on the TSCS design methodology and the key parameters that affect the resolution and power consumption. Moreover, several improvements over our previous work in [4] are discussed, such as a non-overlapping clock generator to prevent the two TSCS AFE outputs, CH1 and CH2 in Fig. 4, from interfering. If two impulses come too close, they merge into one impulse, which results in the loss of the information that they carry. To avoid this, we had to apply a minimum delay offset, t_off, determined by the amount of delay in the non-overlapping clock generator, at the expense of a small reduction in the dynamic range, as discussed in section III.E. Another improvement is buffering all reference voltages, V_REF+, V_REF−, and VCM, with class-AB amplifiers to help them settle quickly. Depending on their previous values, the reset period in Fig. 4 can be too short for V_CAP+ and V_CAP− to reach VCM. To address this problem, we have included an equalizer between V_CAP+ and V_CAP−, such that a small difference between V_CAP+ and V_CAP− has a negligible impact on the output, thanks to the common mode rejection of the comparator. We also implemented an intentional offset in the comparator, which is discussed later in this section.
Key components of the IR-PWM front end are the OTA and comparator, as shown in Fig. 3a. The OTA is a source-degenerated differential amplifier, shown in Fig. 5a. Since the linearity of the OTA is a key factor that determines the linearity of the entire ATC, the source degeneration resistor, R_S in Fig. 5a, which is meant to linearize it, should be carefully chosen. Since T_PW is determined by the ratio between the input voltage and the reference voltage, the absolute value of the transconductance is not important. However, if R_S is too small, we may not achieve enough linearity, and if it is too large, the OTA may settle too slowly. To quantify the linearity of the OTA, we simulated the OTA output current for all input voltages, found the best-fitting first-order polynomial by regression analysis, calculated the difference in output current, and plotted the worst difference in LSB. We simulated the OTA nonlinearity within an input voltage range of ±500 mV, beyond the designated input range of ±300 mV, towards a target 8-bit resolution, as noted in Fig. 6, showing that the nonlinearity decreases with increasing R_S. To achieve less than 0.5 LSB in INL, we need R_S ≥ 15 kΩ, and that was the chosen value for R_S. Since the OTA input experiences a sudden change at the onset of every phase, we simulated the OTA step response as well. The overshoot is less than 5% over the entire range of R_S, meaning that the OTA will be stable with sufficient phase margin.
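The linearity metric described above (best-fit line, residual, worst case in LSB) can be sketched as follows. The tanh-shaped transfer curve is a hypothetical stand-in for the simulated OTA, with `v_sat` playing the role of a linearization knob such as R_S, and defining one LSB as the fitted output span over the swept range divided by 2^n is an assumption of this sketch.

```python
import math


def fit_line(xs, ys):
    """Least-squares first-order polynomial: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def worst_inl_lsb(v_in, i_out, n_bits):
    """Largest deviation from the best-fit line, expressed in LSB."""
    slope, intercept = fit_line(v_in, i_out)
    lsb = abs(slope) * (max(v_in) - min(v_in)) / (2 ** n_bits)
    worst = max(abs(i - (slope * v + intercept)) for v, i in zip(v_in, i_out))
    return worst / lsb


def toy_ota_current(v, g_m, v_sat):
    """Hypothetical compressive transfer curve (not the simulated OTA)."""
    return g_m * v_sat * math.tanh(v / v_sat)


# Sweep +/-500 mV in 10 mV steps, as in the simulation range quoted above.
v_sweep = [-0.5 + 0.01 * k for k in range(101)]

for v_sat in (1.0, 2.0, 4.0):
    currents = [toy_ota_current(v, 70e-6, v_sat) for v in v_sweep]
    inl = worst_inl_lsb(v_sweep, currents, n_bits=8)
    print(f"v_sat = {v_sat:.1f} V -> worst INL = {inl:.2f} LSB")
```

In this toy model, a larger `v_sat` gives a straighter transfer curve and a smaller worst-case INL, mirroring the trend with R_S reported in Fig. 6.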
Since the OTA directly drives CAP+ and CAP−, which have large voltage variations, a two-stage OTA that isolates the input device from large voltage variations at the output is used. The cascode at the second stage minimizes the current variation caused by drain voltage (V_OUT+ and V_OUT−) variations, and the first-stage cascode increases the accuracy of the current mirror. Moreover, V_CAP+ and V_CAP− in Fig. 4 were limited to 200 mV to ensure that the MOSFETs connected to the output operate in the saturation region. To improve current matching between I_SS and I_SS2, another R_S and a VCM-gated NMOS (M3 in Fig. 5a) are added. Since the mirror ratio between M1 and M2 in Fig. 5a is 1:2, the width of M3 is chosen to be twice that of M_IN.
Based on the input dynamic range, V_REF should be selected carefully in this mechanism, as it determines the resolution of the TSCS. If V_REF is smaller than the dynamic range of V_IN, then T_PW exceeds the valid range. Hence, V_REF has to be larger than the input dynamic range of the PWM. On the other hand, if the V_IN dynamic range is much smaller than V_REF, the PWM output dynamic range will be limited and will require a high-resolution TDC. Therefore, to fit the input dynamic range within the pulse width dynamic range, we set the reference voltage at 350 mV.
Since the main switching activities are transferred to the OTA input, and driven by buffers, the switching noise does not directly affect the sampling capacitors. Nonetheless, noise at the input of the OTA should also be lowered to improve the overall noise performance. The effect of added buffers is twofold: low impedance termination for the switch and fast recovery of the OTA input voltage. When a switch has low impedance terminal, injected charge from the switch mostly flows into that terminal, preventing large variations in the high impedance OTA input. The buffer reinforces the OTA input voltage rapidly every time the switch closes and reduces the switching noise. Because of this fast recovery, the spectrum of the switching noise is pushed to higher frequencies, where it is attenuated by the low-pass filtering and charge integration effects of the OTA and sampling capacitors, respectively.
Since the comparator in the ATC should accurately catch the timing of the 'end point', a continuous-time comparator should be used. Dynamic comparators that are commonly used in ADCs are not suitable here, because they only compare at the edges of the clock cycles. We have adopted a comparator with a source-coupled differential pair, shown in Fig. 5b. Despite its positive feedback, the comparator has a delay in the nanosecond range. Considering that the maximum pulse width of the PWM signal is <50 ns, the delay significantly reduces the PWM output dynamic range. Since, in phase 3, V_CAP+ and V_CAP− change with a constant slope, the comparator delay would be constant, and we can set a deliberate offset by designing an asymmetrical comparator input pair to compensate for this delay. To save power, an enable switch (EN) is also used to turn off the comparator when it is not used. The comparator has built-in mismatch in its input pair to create the offset, as explained above. Yet, relatively large capacitors (3.2 pF) were used to minimize the mismatch between the sampling capacitors. Considering the slope of V_CAP+ − V_CAP− in phase 3, we need a built-in offset in the comparator to compensate for its delay. The offset voltage, V_OFF, can be found from,
where A is the ratio between input pairs. In simulations, the comparator delay was ∼2.5 ns, and we need a 21.25 mV offset to compensate for the delay. In our design, a 10% mismatch in device size introduced a 20.6 mV offset with a standard deviation of 1.4 mV. A small variation in V_OFF compared to the desired value from (7) is acceptable because it only adds a constant offset in the output that can be easily calibrated in the backend. Fig. 5c shows the impulse generator that transmits impulses at the rising and falling edges of the PWM signal. An inverter followed by a capacitor bank delays the PWM signal, and an XOR gate consisting of inverters and NAND gates generates sharp impulses with 3-bit binary adjustable pulse width. The impulse is applied to an all-digital PA that drives the Tx antenna. To adjust the spectrum of the transmitted impulse, we implemented a register and capacitor bank to change the inverter delay. The size of the inverter chain was set to achieve an output power of at least 0.2 mW with a 50 Ω load (antenna loading), which is a requirement of this application [4].
C. IR-PWM Receiver
The IR-PWM Rx, whose block diagram is shown in Fig. 3b, was implemented with commercial off-the-shelf (COTS) components. We used three LNAs (ADL 5542, Analog Devices), each with 20 dB gain and a bandwidth of 0.5 to 6 GHz, and an envelope detector (ADL 5511, Analog Devices), which has an input frequency range of 0 to 6 GHz. The Rx antenna receives the IR signal from the Tx, eventually the catheter tip inside the body in our application, the LNA amplifies it, an envelope detector down-converts the amplified signal, and the following LNAs amplify the envelope signal. We have also developed an Rx ASIC, which recovers the PWM signal from the amplified envelope signal. A diode-connected PMOS and a capacitor filter the envelope to remove undesirable pulses and prevent glitches, a size-adjustable inverter with controlled switching threshold converts the analog impulse into a digital pulse, and a T-flipflop followed by a digital buffer recovers the PWM signal.
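The edge-to-impulse conversion in the Tx and the T-flipflop recovery in the Rx can be illustrated with a simple discrete-time model. This is a logic-level sketch only: the sample-based delay, the function names, and the ideal noiseless channel are assumptions, and the analog stages (PA, antenna, LNA, envelope detector) are not modeled.

```python
def impulse_train(pwm, delay):
    """XOR of a PWM bit stream with a copy delayed by `delay` samples.

    Produces a `delay`-sample-wide impulse at every rising and falling edge.
    """
    if not 0 <= delay < len(pwm):
        raise ValueError("delay must satisfy 0 <= delay < len(pwm)")
    delayed = [0] * delay + pwm[:len(pwm) - delay]
    return [a ^ b for a, b in zip(pwm, delayed)]


def t_flipflop_recover(impulses):
    """Toggle the output on every rising edge of the impulse stream."""
    state, previous, recovered = 0, 0, []
    for sample in impulses:
        if sample == 1 and previous == 0:
            state ^= 1
        recovered.append(state)
        previous = sample
    return recovered


# A pulse wider than the impulse width survives the round trip.
wide = [0, 0, 1, 1, 1, 1, 1, 0, 0, 0]
tx = impulse_train(wide, delay=2)
print(tx)
print(t_flipflop_recover(tx) == wide)

# A pulse as narrow as the impulse width merges both impulses into one,
# so the falling edge is lost and the recovered PWM is wrong.
narrow = [0, 1, 1, 0, 0, 0]
print(impulse_train(narrow, delay=2))
print(t_flipflop_recover(impulse_train(narrow, delay=2)) == narrow)
```

The second case is a toy version of the impulse merging that motivates the minimum delay offset between consecutive impulses.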
In this proof-of-concept prototype, we used an oscilloscope (TDS-5054, Tektronix) with 40 GHz sampling rate and 500 MHz bandwidth as a TDC to capture the received signal, and delivered it to a computer for offline processing. The limitation of this method is the oscilloscope memory size, which is limited to 400,000 samples. Alternatively, a high-speed ADC can be used as the Rx, with an FPGA for time-to-digital conversion (TDC).
III. NOISE AND LINEARITY CONSIDERATIONS
In this section, by defining the main sources of noise and the way they affect T_PW, we present design guidelines for improving the resolution and overall performance of the proposed wDAQ architecture. This information is particularly helpful towards future designs.
A. Operational Transconductance Amplifier
Three non-idealities of the OTA contribute to the pulse width error, besides its nonlinearity: input offset, limited bandwidth, and output-referred noise. The effect of the input offset, V_OFF, which is caused by process mismatch, can be derived from (4),
As shown in (9), V_OFF turns into a gain error and an offset in the pulse width, which can be corrected with calibration.
To analyze the effect of the limited OTA bandwidth, we added a first-order low-pass filter (LPF) after an ideal OTA. The step response of the first-order LPF can be expressed as A(1 − exp(−t/τ)), where A is the amplitude and τ is the time constant. At the beginning of phase 1, the input voltage of the OTA changes from 0 to V_REF. Therefore, the OTA output current varies as,
Similarly,
Since the difference in charge stored in the sampling capacitors from the beginning of phase 1 to the end point is zero,
If we ignore the clock jitter, i.e. |t_1| = |t_2|, then,
Assuming exp(−T_PW/τ) ≪ 1/2^n and exp(−t_1/τ) ≪ 1/2^n, where n is the resolution of the wDAQ, then,
The output noise of the OTA distorts the sampling capacitor voltage. The PWM signal is generated in phase 3 (see Fig. 4), where the sampling capacitor voltage changes with a slope of G_m·V_REF/C. Therefore,
Any constant pulse width offset can be systematically removed. However, it still limits the dynamic range of the PWM signal. From (16), the OTA bandwidth was designed to be 250 MHz to keep the offset and error below 1 ns and 85 ps, respectively, the latter being 1/2^9 of the maximum pulse width with a 1 MHz sinusoidal wave. To prevent switches 1, 2, and 3 at the OTA input in Fig. 3a from limiting its bandwidth, considering the OTA input parasitic capacitance, their sizes were designed to provide an RC time constant of 63.6 ps, which is equivalent to a 2.5 GHz bandwidth.
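The effect of the first-order bandwidth model can be checked numerically. The sketch below follows the setup described above (an ideal OTA followed by a single-pole LPF) and solves the zero-net-charge condition for the pulse width; the normalization to G_m·V_REF, the bisection solver, and the function names are assumptions of this sketch, not the closed-form expressions of the original equations.

```python
import math


def time_constant(bandwidth_hz):
    """Time constant of a single-pole response with the given -3 dB bandwidth."""
    return 1.0 / (2.0 * math.pi * bandwidth_hz)


def bandwidth(tau):
    """-3 dB bandwidth of a single-pole response with time constant tau."""
    return 1.0 / (2.0 * math.pi * tau)


def pulse_width_with_lag(x, t1, t2, tau):
    """Pulse width when the OTA current passes through a single-pole LPF.

    x is V_IN / V_REF; currents are normalized to G_m * V_REF. For a
    single pole, integral(y) = integral(u) - tau * (y_end - y_start).
    """
    # Phase 1: input steps from 0 to +1.
    y1 = 1.0 - math.exp(-t1 / tau)
    charge = t1 - tau * (y1 - 0.0)
    # Phase 2: input steps from +1 to x.
    y2 = x + (y1 - x) * math.exp(-t2 / tau)
    charge += x * t2 - tau * (y2 - y1)

    def net_charge(t):
        # Phase 3: input steps from x to -1 and is held for time t.
        y3 = -1.0 + (y2 + 1.0) * math.exp(-t / tau)
        return charge - t - tau * (y3 - y2)

    lo, hi = 0.0, 2.0 * (t1 + t2)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if net_charge(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


TAU = time_constant(250e6)
print(f"tau for 250 MHz: {TAU * 1e9:.3f} ns")
print(f"bandwidth for 63.6 ps RC: {bandwidth(63.6e-12) / 1e9:.2f} GHz")

for x in (-0.5, 0.0, 0.5):
    ideal = 25e-9 * (1.0 + x)
    lagged = pulse_width_with_lag(x, 25e-9, 25e-9, TAU)
    print(f"x = {x:+.1f}: offset = {(lagged - ideal) * 1e9:.3f} ns")
```

In this model the pulse width offset is essentially equal to τ whenever the pulse is many time constants long, so a 250 MHz bandwidth gives an offset of about 0.64 ns, below the 1 ns target.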
The OTA noise is generally composed of two parts: flicker noise (V_f,OTA) and thermal noise. Because the flicker noise is inversely proportional to the frequency and device size, it is smaller than the thermal noise in large devices (large gate area). The OTA transconductance is 1/R_S, and the OTA noise bandwidth is 1/(4·R_S·C), where R_S is the source degeneration resistor, and C is the sampling capacitor. Hence, a larger sampling capacitance, as expected, helps to reduce the OTA noise.
Based on simulation results, the OTA and comparator noise can be minimized. We set the margin of the sampling capacitor voltage error at 48.75 μV, or 1/2^12 of the maximum sampling capacitor voltage variation (200 mV), such that the noise from the OTA and comparator would be negligible.
B. Power Supply Noise
As shown in Fig. 5a, the structure of the OTA is fully differential, and its differential output has a high supply noise rejection capability. Since process mismatch degrades the supply rejection capability, we simulated supply rejection with a 10% mismatch in R_S at the input differential pair. The OTA's transconductance from the supply voltage to the output current was −125 dB in this simulation, which is 40 dB less than the −85 dB transconductance from the OTA input to the output. We assume that the wDAQ is supplied by an ideal voltage source, and the main supply noise is from the voltage drop across interconnects. The maximum instantaneous current supplied to the wDAQ is below 50 mA, and the interconnect resistance is less than 0.3 Ω. Therefore, the IR drop would be less than 15 mV, resulting in input-referred noise on the order of 150 μV, which is only 6.4% of the LSB.
Simulations show that the maximum instantaneous current is caused by the comparator function and PA impulse generation. These happen either after the 'end point' or at the beginning of the discharge phase. Since the comparator has already made a decision at the 'end point', the effect of noise on the OTA after that point is negligible. At the beginning of the discharge phase, the OTA discharges at a constant current regardless of the input voltage, and the supply noise becomes a systematic error, which can be eliminated as an offset. Following the comparator, the wDAQ has several digital blocks. Since information is encoded in the time domain, variations in digital circuit delays appear as noise. According to our post-layout simulations, the delay variations resulting from changes in VDD = 1.8 V ± 15 mV are on the order of 45 ps, or 16.5% of the LSB. We can, therefore, conclude that the effect of supply noise due to sudden variations in current consumption is negligible in wDAQ operation.
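The supply noise budget above can be reproduced with a few lines of arithmetic. The two LSB definitions used below (a ±300 mV input range at the 8-bit target for the voltage LSB, and a 35 ns pulse width range at 7 bits for the time LSB) are assumptions of this sketch; they were chosen because they are consistent with the percentages quoted in the text.

```python
def db_to_ratio(db):
    """Convert a gain in dB to a linear amplitude ratio."""
    return 10.0 ** (db / 20.0)


# Values quoted in the text.
I_MAX = 50e-3            # A, maximum instantaneous supply current
R_INTERCONNECT = 0.3     # ohm
SUPPLY_GAIN_DB = -125.0  # supply to OTA output current
SIGNAL_GAIN_DB = -85.0   # OTA input to output current
DELAY_VARIATION = 45e-12 # s, digital delay variation for +/-15 mV on VDD

# Assumed LSB definitions (not stated explicitly in the text).
LSB_VOLTAGE = 0.6 / 2 ** 8   # V, +/-300 mV input range, 8-bit target
LSB_TIME = 35e-9 / 2 ** 7    # s, 35 ns pulse width range, 7 bits

ir_drop = I_MAX * R_INTERCONNECT
input_referred = ir_drop * db_to_ratio(SUPPLY_GAIN_DB - SIGNAL_GAIN_DB)

print(f"IR drop: {ir_drop * 1e3:.1f} mV")
print(f"input-referred supply noise: {input_referred * 1e6:.0f} uV "
      f"({100 * input_referred / LSB_VOLTAGE:.1f}% of the voltage LSB)")
print(f"delay variation: {100 * DELAY_VARIATION / LSB_TIME:.1f}% "
      f"of the time LSB")
```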
C. Comparator
As explained in section II.B, the comparator delay can be compensated. However, its input referred noise affects the sampling capacitors' voltages. The comparator input referred noise also follows (16) , and since the noises of the comparator and OTA are uncorrelated, the overall noise of the sampling capacitor can be found from,
D. Clock Jitter
Clock jitter affects the PWM since its control signals are directly generated from the master clock. Here, we evaluate the effect of clock jitter in each phase. From (6), the effect of clock jitter at the beginning of phase 1 (t_j1) is,
Considering the jitter at the beginning of phase 2 (t_j2),
Similar to (18), the effect of jitter at the beginning of phase 3 (t_j3) is,
Let us assume V_IN has a uniform distribution from −V_REF to V_REF, and zero average. Then the root-mean-square (RMS) of V_IN is V_REF/√3. To simplify the equations, we assume all random clock jitters are equal in duration (t_j) but uncorrelated,
If we assume V_IN is a sine wave with an amplitude of V_REF, then the root-mean-square (RMS) of V_IN is V_REF/√2, and
To reduce the noise caused by clock jitter, since a clock period is often more accurate than its duty cycle, we applied a clock at twice the required frequency, 40 MHz in this case, and divided it by two to obtain the desired 20 MHz clock.
E. Receiver and Wireless Link Bandwidth
Because of limited bandwidth of the antenna and FCC-regulated spectrum allocation [22] , the transmitted impulse will have finite rise and fall times. This will cause a certain amount of interference depending on the time difference between two adjacent pulses. Ultra-wideband (UWB) with an allocated bandwidth of 3.1 GHz ∼10.6 GHz is the most common band used by IR transceivers [20] . To show the fundamental limits of the proposed wDAQ accuracy due to wireless link limitations, we considered an impulse with flat frequency response within the UWB bandwidth, represented in the time domain as,
where f_h and f_l are the upper and lower boundaries of the allocated bandwidth. As mentioned in Section II.B, since we have to apply the delay offset (t_off) twice in each sample, at both the rising and falling edges, the dynamic range of the pulse width would be limited to T_samp − 2·t_off, where T_samp is the sampling period. Therefore, even though a longer t_off gives better immunity against loss of information (the equivalent of error bits in digital communication), it limits the dynamic range, which either reduces the resolution or demands better TDC specifications. To emulate the effect of interference among adjacent pulses, we considered two impulses, and calculated the resulting delay at the switching onset, when the signal passes the half-peak threshold, as depicted in Fig. 7. By comparing the applied and calculated delays, we can determine the timing error, T_PW,link. We swept the applied delay and found the maximum error, then collected the maximum errors while sweeping t_off. To find the optimal t_off, we calculated the maximum error rate from,
As mentioned above, the resolution of the PWM is defined where INL and DNL are both less than 0.5 LSB. Therefore, the relationship between t_off and resolution leads to,
We calculated (30) and (31) from the ideal UWB impulse, and plotted the results in Fig. 8a. Fig. 8b shows the corresponding accuracy requirement on the Rx TDC. We can conclude that when everything else is ideal, except for the limited bandwidth, 11-bit resolution at the sampling rate of 20 MHz is possible, assuming the TDC can provide 24 ps timing accuracy. In Fig. 8c, the spectrum of the IR-PWM is depicted. Since IR-PWM modulates the time between two impulses, the spectrum does not have uniform dips, but its outline is similar to the spectrum of an ideal impulse.
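The relation between the delay offset, the pulse width dynamic range, and the TDC timing step can be sketched as below, together with the time-domain shape of a pulse whose spectrum is flat between f_l and f_h. The impulse expression is the standard inverse Fourier transform of an ideal band-limited spectrum, and treating the 50 ns slot of the 20 MHz combined sampling rate as T_samp is an assumption of this sketch.

```python
import math


def pwm_dynamic_range(t_slot, t_off):
    """Pulse width range left after reserving t_off at both edges."""
    return t_slot - 2.0 * t_off


def tdc_lsb(t_slot, t_off, n_bits):
    """Timing step the TDC must resolve for n_bits over that range."""
    return pwm_dynamic_range(t_slot, t_off) / (2 ** n_bits)


def ideal_band_impulse(t, f_low, f_high):
    """Pulse with a unit-height flat spectrum from f_low to f_high."""
    if t == 0.0:
        return 2.0 * (f_high - f_low)
    return (math.sin(2.0 * math.pi * f_high * t)
            - math.sin(2.0 * math.pi * f_low * t)) / (math.pi * t)


T_SLOT = 50e-9  # s, one channel slot at the 20 MHz combined rate (assumed)

print(f"range with t_off = 7.5 ns: "
      f"{pwm_dynamic_range(T_SLOT, 7.5e-9) * 1e9:.1f} ns")
print(f"11-bit LSB with t_off = 0: {tdc_lsb(T_SLOT, 0.0, 11) * 1e12:.1f} ps")

# Envelope of the ideal UWB impulse decays away from t = 0.
peak = ideal_band_impulse(0.0, 3.1e9, 10.6e9)
for t_ns in (0.0, 0.1, 0.5, 1.0):
    value = ideal_band_impulse(t_ns * 1e-9, 3.1e9, 10.6e9)
    print(f"t = {t_ns:.1f} ns: p(t)/p(0) = {value / peak:+.3f}")
```

With no offset at all, 11 bits over a 50 ns slot corresponds to a step of about 24.4 ps, which is the same order as the TDC accuracy quoted above; any nonzero t_off tightens this requirement.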
IV. MEASUREMENT RESULTS
A prototype IR-PWM Tx ASIC was fabricated in the TowerJazz 0.18-μm power management CMOS process, occupying an active area of 0.18 mm², as shown in Fig. 9. The IR-PWM Tx consumes 3.94 mW at 1.8 V. The dynamic energy consumption of the IR-PWM was 197 pJ per sample, which is equivalent to 28.1 pJ/bit if we consider 7 bits of resolution for the wDAQ system. The energy consumption of the IR Tx without the PWM block was 8.86 pJ/bit with 33.6% power-added efficiency. Since the process used for this prototype is not optimized for RFICs, the Tx design was not meant to be comparable to state-of-the-art UWB transmitters. The target IVUS application, on the other hand, requires a communication distance of only a few cm from the tip of the catheter in the heart to the Rx antenna on the chest.
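The energy figures above follow directly from the power, the combined sampling rate, and the resolution. A minimal check, assuming the 2 × 10 MS/s combined rate from the abstract (function names are illustrative only):

```python
def energy_per_sample(power_w, sample_rate_hz):
    """Average energy spent per converted sample."""
    return power_w / sample_rate_hz


def energy_per_bit(power_w, sample_rate_hz, n_bits):
    """Average energy per bit at the given resolution."""
    return energy_per_sample(power_w, sample_rate_hz) / n_bits


POWER = 3.94e-3        # W, measured Tx power
SAMPLE_RATE = 2 * 10e6 # S/s, two channels at 10 MS/s each
RESOLUTION = 7         # bits

print(f"{energy_per_sample(POWER, SAMPLE_RATE) * 1e12:.0f} pJ/sample")
print(f"{energy_per_bit(POWER, SAMPLE_RATE, RESOLUTION) * 1e12:.1f} pJ/bit")
```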
Process limitations and parasitics resulted in increased impulse width and shifted the measured spectrum of the impulses from 3 ∼ 5 GHz down to 0.6 ∼ 1.5 GHz. In the design phase, t_off = 3 ns was considered between consecutive impulses based on the analysis in section III.E. Based on the measured impulse spectrum, however, we found that if two IR-PWM pulses are closer than 7.5 ns, the interference increases above the acceptable range. V_REF was also adjusted from 350 mV to 450 mV. Table I benchmarks the proposed wDAQ against recently reported wireless data acquisition systems.
Since in this architecture the TDC generates the digital code, it plays a role in defining the wDAQ resolution. In this prototype, the recovered PWM signal was digitized by an oscilloscope with a 50 ps sampling period, which provides 9-bit resolution considering the PWM signal dynamic range of 35 ns. Following the INL and DNL measurements, defined in [13], the resolution of the IR-PWM prototype was 7 bits, with INL and DNL within ±0.5 LSB, as shown in Fig. 10. The resolution was one bit less than what we expected from simulation results (8-bit), which is likely due to a 10% lower than expected resistance of the OTA degeneration resistors in Fig. 5a. Fig. 11a shows the IR-PWM measurement setup. We used log periodic antennas (WA5VJB, Kent Electronics, Sugar Land, TX) for both the Tx and Rx sides in this prototype, which covered the 0.85 ∼ 6.5 GHz range. The Tx and Rx antennas were placed 15 cm apart to achieve high enough SNR with a low error rate. Fig. 11b shows the Rx ASIC, whose schematic is depicted in Fig. 3b, and which was fabricated in a 0.35-μm standard CMOS process, occupying 0.2 mm². To measure the effective number of bits (ENOB), we conducted a single-tone test with a 1 MHz sinusoidal wave. The oscilloscope memory limitation confined the measured signal to 20 μs. The PWM data from the COTS Rx and ASIC Rx outputs were recorded separately with 50 Ω termination, instead of their following stages, to prevent signal distortion, and decoded offline to be plotted in Fig. 12. Since each stage has a certain amount of delay, the edges of the pulse and PWM outputs are not aligned. Based on the decoded data, the SNR, spurious-free dynamic range (SFDR), and signal-to-noise and distortion ratio (SINAD) are measured and depicted in Fig. 13. The SINAD of the Tx output and of the recovered PWM signal are 39.7 dB and 36.8 dB, respectively, which correspond to ENOB of 6.3 and 5.8 bits, respectively.
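The nominal TDC resolution and the ENOB values quoted above can be reproduced as follows. This is a minimal sketch: the function names are illustrative, and the ENOB expression is the usual (SINAD − 1.76)/6.02 conversion.

```python
import math


def tdc_nominal_bits(dynamic_range_s, sample_period_s):
    """Whole number of bits resolved by a given timing step."""
    return math.floor(math.log2(dynamic_range_s / sample_period_s))


def enob(sinad_db):
    """Effective number of bits from SINAD in dB."""
    return (sinad_db - 1.76) / 6.02


print(f"TDC: {tdc_nominal_bits(35e-9, 50e-12)} bits "
      f"for a 35 ns range at a 50 ps step")
print(f"Tx output: {enob(39.7):.1f} bits")
print(f"Recovered PWM: {enob(36.8):.1f} bits")
```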
To determine the noise contribution of the various blocks, following the discussion in Section III, the clock jitter, input signal noise, and Rx signal amplitude were measured. A 40 MHz square wave from a function generator (Tektronix AFG 3102) was used as the clock; its measured jitter, with an RMS value of 126.4 ps, is depicted in Fig. 14. The input sine wave was generated by the same function generator, with an SNR of 55.3 dB. To quantify the effect of each noise source in the same domain, the equivalent SNR and number of bits (NOB) are calculated from
$$\text{Equivalent NOB} = \frac{\text{Equivalent SNR} - 1.76}{6.02}.$$
Since we applied a sine wave as the input, the corresponding PWM noise (T_PW,CLK) according to (28) was 159.4 ps. Since the OTA and comparator outputs were not accessible, noise simulation results were used, which indicate an input-referred noise of 141 nV for the comparator and an output-referred noise of 40.5 μV for the OTA, including the √(kT/C) noise of the sampling capacitors, which is the dominant noise source. Simulation results also showed that the G_m of the OTA was 70 μA/V with a -3 dB bandwidth of 250 MHz. From (18), T_PW,C was 7.7 ps. Because τ of the OTA is 0.636 ns and the minimum T_PW is 7.5 ns, the assumption for (16) is valid for the entire input dynamic range. From (16), the OTA induced an equivalent offset of ∼0.64 ns and, for the 1 MHz sine wave, an RMS pulse-width error of 47.05 ps. As discussed earlier in this section, based on the measured impulse, we chose t_off = 7.5 ns, from which we calculated T_PW,link to be 49.8 ps.
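The OTA time constant quoted above is consistent with a single-pole response at the simulated bandwidth. The sketch below assumes a single-pole model, τ = 1/(2π f_-3dB), and assumes that the condition behind (16) is that the minimum pulse width spans many time constants; both are assumptions made for illustration, not statements taken from Section III.

```python
import math

f_3db_hz = 250e6     # simulated -3 dB bandwidth of the OTA
t_pw_min_s = 7.5e-9  # minimum PWM pulse width

# Single-pole assumption: tau = 1 / (2 * pi * f_3dB).
tau_s = 1.0 / (2.0 * math.pi * f_3db_hz)

# Number of time constants available within the shortest pulse.
n_tau = t_pw_min_s / tau_s

print(f"tau = {tau_s * 1e9:.3f} ns")
print(f"minimum T_PW spans {n_tau:.1f} time constants")
```

Under these assumptions the shortest pulse covers more than eleven time constants, which supports treating the OTA delay as a nearly constant offset of about one τ.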
The equivalent SNR and ENOB are calculated and summarized in Table II. If we assume that all noise sources are uncorrelated, the expected equivalent SNR of the Tx output from these noise sources is 45.32 dB. If they are fully correlated, however, the equivalent SNR can drop to 41.14 dB. To simplify the analysis, we ignored the relationship between the noise sources. However, since the clock jitter changes the duration of the integration, it is strongly correlated with the other noise sources. Therefore, in Table II we calculated the expected SNR and equivalent NOB as a range from the best case, when all noise sources are uncorrelated, to the worst case, when all noise sources are correlated. The measured SNR at the Tx output, 44.4 dB, fell within the expected range of 41.14 dB to 45.32 dB. Similarly, the total equivalent SNR, including the Tx, Rx, and wireless link noise, was 42.2 dB, which fell within the expected range of 40.6 dB to 45.57 dB. As shown in Fig. 8, the IR-PWM has 7-bit resolution based on INL, but non-linearity causes harmonic distortion, which degrades SINAD [25]. Therefore, the measured ENOB of the prototype IR-PWM was 5.8 bits, which is lower than the ENOB range expected from the noise sources alone (6.45 to 7.27 bits). Table II summarizes the noise contributions from the various sources and their impact on the overall ENOB.
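The best-case and worst-case bounds can be illustrated numerically. The sketch below combines the RMS pulse-width errors quoted in the text as a root-sum-square (uncorrelated sources) and as a plain sum (fully correlated sources), and then converts the SNR range reported above into an equivalent NOB range. It is an illustration only: it does not reproduce the SNR values in Table II, which also depend on the input-signal noise and on the equations of Section III, and the dictionary keys are labels chosen here.

```python
import math

# RMS pulse-width errors quoted in the text, in picoseconds.
timing_noise_ps = {
    "T_PW,CLK": 159.4,   # clock jitter
    "T_PW,C": 7.7,       # from (18)
    "OTA": 47.05,        # OTA-induced error for the 1 MHz sine wave
    "T_PW,link": 49.8,   # wireless link
}

def combine_uncorrelated(values):
    """Best case: independent sources add in power (root-sum-square)."""
    return math.sqrt(sum(v * v for v in values))

def combine_correlated(values):
    """Worst case: fully correlated sources add in amplitude."""
    return sum(values)

def equivalent_nob(snr_db):
    """Equivalent number of bits for a given equivalent SNR in dB."""
    return (snr_db - 1.76) / 6.02

best_ps = combine_uncorrelated(timing_noise_ps.values())
worst_ps = combine_correlated(timing_noise_ps.values())

# Total equivalent SNR range reported in the text (worst case, best case).
nob_worst = equivalent_nob(40.6)
nob_best = equivalent_nob(45.57)

print(f"combined timing noise: {best_ps:.1f} ps (uncorrelated) "
      f"to {worst_ps:.1f} ps (correlated)")
print(f"equivalent NOB range: {nob_worst:.2f} to {nob_best:.2f} bits")
```

The clock-jitter term dominates both sums, which matches the observation that clock jitter is the main noise contributor in this prototype.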
V. DISCUSSION

A. Power Efficiency
Advantages of the ATC over the conventional ADC in this application are:
1) Due to its simpler architecture, the ATC is expected to occupy a smaller on-chip area in a given process, while consuming considerably less power than common high-speed ADC architectures [10]. In fact, the ATC is similar to the analog part of the well-known family of single-, dual-, and multi-slope ADCs. However, the digital part in this case is moved to the receiver (Rx) side, outside the patient's body, where size and power are not nearly as constrained, allowing for implementation of high-resolution TDCs in deep sub-μm processes or high-speed FPGAs.
2) The ATC does not require a high-frequency clock, further reducing the dynamic power consumption of the DAQ while creating a less noisy system-on-a-chip (SoC) environment. Successive-approximation-register (SAR) ADCs, which are commonly used in low-power applications, generate one bit per clock cycle, requiring a clock frequency of N times the sampling rate. In comparison, the ATC only requires a clock frequency equal to the sampling rate, thanks to TDM.
3) The total on-chip capacitance required for implementing an ATC block is considerably less than that of low-power ADC architectures, such as SAR, resulting in a smaller die size, which is important in HV processes with large transistors or with few metal layers, as used in applications such as IVUS [26]. Moreover, the ATC is not as sensitive to matching among small capacitors as the SAR ADC is [27].
4) Considering that ATC involves integrating charge over the period of each sample, it inherently reduces high-frequency noise.
5) Finally, we have shown in [28] that wirelessly transmitting the PWM signal produced by the ATC over a short distance can be more power efficient on the Tx side than transmitting the serial-data bit stream produced by the conventional ADC approach. In conventional IR-UWB communication, three encoding techniques that do not require complex signal processing are widely used: binary phase shift keying (BPSK), on-off keying (OOK), and pulse position modulation (PPM). BPSK and PPM change the polarity and the timing of the impulses, respectively, for '0' and '1' in every bit of data, while OOK only sends an impulse when the data is '1'. Considering that most of the power consumption would be in the PA and assuming that the probabilities of '1's and '0's in a serial-data bit stream are equal, the expected number of transmitted impulses in OOK would be half of that in BPSK and PPM, cutting the Tx power consumption almost in half. The IR-PWM goes even further and transmits only two impulses for each sample, as shown in Fig. 1, while representing all N data bits, where N is the ATC resolution. As a result, the Tx power in IR-PWM is only 1/N of the BPSK and PPM, and 2/N of the OOK [28].
B. Limitations of the Current Prototype and Future Improvements
Since the guidewire IVUS [4] has very strict power and area limitations, we cannot afford to have any off-chip components, which precludes crystal-based high-accuracy clock generation. Hence, in this application, the power carrier, which is generated outside the body by accurate crystal-based circuitry, is also used as the timing/clock source [4]. Since power is delivered through a pair of long wires, with parasitics and interference, we cannot expect it to be a low-jitter timing reference. That is why we have concluded that the clock jitter is the main source of noise and a key limiting factor in achieving higher resolution in the current wDAQ prototype. If we broaden the spectrum of IR-PWM by using very thin coaxial cables, or use this architecture in other applications that offer higher bandwidth, the impulses will be sharper, reducing T_PW,link. Although the sampling rate of IR-PWM is directly related to the clock frequency, its resolution is not. If the application does not have strict constraints on the sampling rate, a simple low-frequency (40 MHz) ring oscillator would be a good enough clock source [29]. On the other hand, if the application requires a certain accuracy and sampling rate, a relaxation oscillator or a digitally controlled oscillator can be used at the expense of area [30].
VI. CONCLUSION
In this paper, we propose an IR-PWM based wDAQ architecture, which transfers analog data as a pseudo-digital time-based signal with significantly improved transmission efficiency. Proof-of-concept Tx and Rx circuits are described for the proposed wDAQ system, and its noise performance has been analyzed in detail and measured. The current IR-PWM prototype has 7-bit resolution with 2 × 10 MS/s data throughput and an energy consumption of 28.1 pJ/bit at a 1.8 V supply. A detailed discussion and analysis of the noise sources indicated the practical limitations of the IR-PWM method and the areas where its performance and resolution can be further improved. This concept can be used for ultra-low-power, wideband data acquisition systems that can afford only very small footprints.
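The energy-per-bit figure appears to follow from the reported power, throughput, and nominal resolution. The sketch below assumes that the 28.1 pJ/bit value is the 3.94 mW power consumption reported in the abstract divided by the aggregate bit rate of two 10 MS/s channels at 7 bits per sample; this interpretation is an assumption, and the variable names are illustrative.

```python
power_w = 3.94e-3      # power consumption reported in the abstract
channels = 2           # two ATC channels
sample_rate_hz = 10e6  # 10 MS/s per channel
bits_per_sample = 7    # nominal resolution from the INL/DNL measurement

# Aggregate bit rate, assuming every sample carries 7 bits.
bit_rate_bps = channels * sample_rate_hz * bits_per_sample

# Energy per bit in picojoules.
energy_per_bit_pj = power_w / bit_rate_bps * 1e12

print(f"bit rate: {bit_rate_bps / 1e6:.0f} Mb/s")
print(f"energy per bit: {energy_per_bit_pj:.1f} pJ/bit")
```

Using the measured ENOB of 5.8 bits instead of the nominal 7 bits in the same formula would give a higher energy per effective bit.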
