A fully differential fourth-order 1-bit continuous-time delta-sigma ADC designed in a 65 nm process for portable ultrasound scanners is presented in this paper. The circuit design, implementation and measurements on the fabricated die are shown. The loop filter consists of RC-integrators, programmable capacitor arrays, resistors and voltage feedback DACs. The quantizer contains a pulse generator, a high-speed clocked comparator and a pull-down clocked latch to ensure constant delay in the feedback loop. Using this implementation, the small and low-power solution required for portable ultrasound scanner applications is achieved. The converter has a supply voltage of 1.2 V, a bandwidth of 10 MHz and an oversampling ratio of 16, leading to an operating frequency of 320 MHz. The design occupies a die area of 0.0175 mm². Simulations with extracted parasitics show an SNR of 45.2 dB and a current consumption of 489 µA. However, when a model of the measurement setup is added, the simulated SNR degrades to 42.1 dB. The measured SNR and current consumption are 41.6 dB and 495 µA, which closely match the simulations. Several dies have been measured, and an estimate of the die spread distribution is given.
Keywords Ultrasound · Portable · Scanners · Integrated · Continuous-Time · Delta-Sigma · ADC
Introduction
Ultrasound scanning is a widely used technique in medical applications due to its operating simplicity, non-invasive nature, live imaging capabilities and extended diagnosis range. However, the commonly used static ultrasound scanners are expensive and large, and they have no power consumption limitations since they are plugged into the AC mains. Because the supply power is virtually unlimited, the electronics of a static scanner are built from generic discrete components, which typically have a higher performance than required and thus use excessive power.
In the last decade, portable ultrasound scanners have been emerging. Research on their implementation has increased since they are a small and flexible alternative to traditional static ultrasound scanners. There are several challenges in the design of a portable ultrasound scanner. Firstly, due to the reduced size, the maximum power dissipation of the scanner is approximately 2 W, which guarantees that it does not become uncomfortably hot to handle. Secondly, since the device is battery supplied, the maximum power consumption of the electronics is also limited, which rules out the use of generic discrete components. An application-specific integrated circuit (ASIC) solution is required to custom design the electronics and minimize the area and power consumption. Implementing the electronics using ASICs leads to the best signal-to-noise ratio (SNR) for the available area and power budget, which directly translates into the best picture quality achievable.
A portable ultrasound scanner comprises several channels, and each of them consists of a transducer, a transmitting circuit (Tx) and a receiving circuit (Rx). The Tx excites the transducer with high-voltage signals in order to generate ultrasound waves. The Rx amplifies, delays and digitizes the signal induced in the transducer by the reflected waves. Since both size and power are critical to achieve a handheld device, all blocks need to be optimized accordingly.
In this paper, the design, implementation and measurements of a fully differential fourth-order 1-bit continuous-time delta-sigma analog-to-digital converter (CTDS ADC) are presented. A continuous-time implementation was chosen over a discrete-time one due to the required high operating frequency and low power requirements [1, 2]. Optimized performance is needed to achieve a usable picture quality, which was studied in [3]. The circuit is designed to be used in a 64-channel synthetic aperture sequential beamforming (SASB) [4] [5] [6] [7] portable ultrasound scanner based on capacitive micromachined ultrasonic transducers (CMUTs). The circuit is fabricated in a 65 nm process with a supply voltage of 1.2 V.

Fig. 1 Structure of the fully differential fourth-order 1-bit continuous-time delta-sigma ADC with two resonators for optimal zero placing.
CTDS ADC topology and specifications
In [8], the 64-channel portable ultrasound scanner system based on CMUTs was studied and the most adequate topology and specifications were derived. A fourth-order 1-bit CTDS ADC topology was chosen, and a summary of the SNR, bandwidth (BW), oversampling ratio (OSR), supply voltages (V_SS/V_DD), common-mode level (V_CM) and maximum differential input voltage (u_in,max) is shown in Table 1. The relatively low SNR requirement for the ADC is possible due to the in-handle SASB [4] [5] [6] [7] of the scanner, which sums the 64 channels into one. This technique requires a fine delay accuracy of 3.125 ns in order to temporally align the signals from each channel before summing them. As a consequence, a CTDS ADC implementation was chosen over a Nyquist-rate ADC. Using a 10 MHz bandwidth Nyquist-rate ADC would require interpolation filters to achieve the required delay accuracy. These types of filters are complex, process dependent and highly area and power demanding, which is not acceptable for portable ultrasound scanner applications. In contrast, a 1-bit oversampled converter at a sampling frequency of 320 MHz inherently embeds the necessary delay accuracy in the bit stream. As a result, a simple and efficient inverter-based clocked digital delay line with switches can be used as a delay block.
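The relation between the bandwidth, the oversampling ratio and the delay accuracy can be checked with a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and it assumes that the delay step of a sampled stream equals one sample period and that a Nyquist-rate converter would sample at exactly twice the bandwidth.

```python
def sampling_frequency_hz(bandwidth_hz: float, osr: int) -> float:
    """Oversampled rate: fs = 2 * BW * OSR."""
    return 2.0 * bandwidth_hz * osr


def delay_step_s(sample_rate_hz: float) -> float:
    """Finest delay obtained by shifting a stream by whole samples."""
    return 1.0 / sample_rate_hz


BW_HZ = 10e6  # bandwidth from the specifications
OSR = 16      # oversampling ratio from the specifications

fs_hz = sampling_frequency_hz(BW_HZ, OSR)
ctds_step_s = delay_step_s(fs_hz)
# Assumption: the Nyquist-rate alternative samples at exactly 2 * BW.
nyquist_step_s = delay_step_s(2.0 * BW_HZ)
interpolation_factor = nyquist_step_s / ctds_step_s

print(f"fs = {fs_hz / 1e6:.0f} MHz")
print(f"CTDS delay step = {ctds_step_s * 1e9:.3f} ns")
print(f"Nyquist-rate delay step = {nyquist_step_s * 1e9:.1f} ns")
print(f"interpolation factor needed = {interpolation_factor:.0f}")
```

Under these assumptions, the one-sample delay step of the 320 MHz bit stream equals the required 3.125 ns, whereas a converter sampling at 20 MHz would need its delay resolution refined by a factor of 16.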
Due to the low SNR requirements, the thermal noise was found to be negligible compared to the inherent quantization noise, which is rare in CTDS ADC design. Typically, the signal-to-quantization-noise ratio (SQNR) of a CTDS ADC is designed 10-12 dB higher than the specified SNR to accommodate thermal noise, non-idealities and transistor-level implementation limitations. In this design, the thermal noise can be neglected, which, as can be seen later, significantly affects the design choices and implementation of the CTDS ADC. Due to the low impact of the thermal noise, all the tradeoffs of the design have been biased towards low current consumption instead of noise performance. Furthermore, it allows more headroom to be used to accommodate non-idealities and transistor-level implementation limitations. Note that the main goal is to find the design with the lowest current consumption that satisfies the specifications. Therefore, the SNR is purposely degraded by reducing the current until the targeted SNR is reached.

The block-level structure of the fully differential CTDS ADC is shown in Fig. 1, where the input and output are u_in = u_d,in+ − u_d,in− and v_out = v_d,out+ − v_d,out−, respectively. The loop filter consists of four operational transconductance amplifier (OTA) based RC-integrators, resistor coefficients and two feedback voltage digital-to-analog converters (DACs). The RC-integrators are grouped in pairs in order to create two resonators, which optimally place the zeros of the transfer function to improve the SQNR. The design was done according to the optimization method shown in [9]. Additionally, the integrating capacitors are designed to be programmable in order to compensate for corners and process variations. The quantizer is implemented with a pulse generator, a high-speed clocked comparator and a pull-down clocked latch. In the next section, the design of each of the blocks of the CTDS ADC is thoroughly described.
Block design
In this section, the design of each block of the CTDS ADC is shown. In each subsection, the specifications, topology and design choices of the block are discussed. Circuitry for portable ultrasound scanners has to be small and efficient; therefore, all the blocks of the ADC are designed to minimize their die area and power consumption. Note that in all schematics, the bulks of the PMOS and NMOS transistors are connected to the positive supply (V_DD) and negative supply (V_SS), respectively, unless otherwise indicated.
Operational Transconductance Amplifier
The specifications for the OTA are a gain (A_v) of 40 dB, a gain-bandwidth (GBW) of 1.3 GHz, a phase margin (PM) of 35° and a slew rate (SR) of 120 V/µs. The load of the OTA is the sum of the nominal value of the integrating capacitor and the parasitics of the circuitry. The value of the nominal capacitance was chosen to be small in order to minimize the current consumption of the OTA. As a result, the coefficient resistors are large and noisy. Typically, this is a problem in CTDS ADCs since the thermal noise limits the SNR. In that case, much smaller coefficient resistors need to be used, increasing the size of the capacitors and thereby the current in the OTAs. Nonetheless, the SNR of this design is not thermal noise limited; hence, the integrating capacitors should be chosen as small as possible to achieve a power-efficient design. The nominal value of the integrating capacitors was chosen to be 100 fF, leading to coefficient resistors in the order of 100 kΩ. Smaller capacitor values were not used in order not to compromise the matching and parasitic sensitivity of the design.
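The tradeoff between resistor size and capacitor size can be made concrete with a short sketch. It assumes the usual normalization of an RC-integrator coefficient to the sampling frequency, k = 1/(R·C·f_s), and uses a round 100 kΩ as a stand-in for the order of magnitude quoted above. The actual loop filter coefficients are not given in the text, so the numbers are illustrative only and the function names are hypothetical.

```python
FS_HZ = 320e6  # sampling frequency of the converter


def integrator_coefficient(r_ohm: float, c_farad: float, fs_hz: float = FS_HZ) -> float:
    """Normalized integrator gain k = 1 / (R * C * fs) (assumed normalization)."""
    return 1.0 / (r_ohm * c_farad * fs_hz)


def capacitor_for_coefficient(k: float, r_ohm: float, fs_hz: float = FS_HZ) -> float:
    """Capacitance that keeps the same coefficient k for a given resistor."""
    return 1.0 / (k * r_ohm * fs_hz)


# Illustrative values: 100 fF nominal capacitor, resistor in the order of 100 kOhm.
k_nominal = integrator_coefficient(100e3, 100e-15)

# A ten times smaller (less noisy) resistor needs a ten times larger capacitor
# to keep the same coefficient, which raises the load driven by the OTA.
c_low_noise = capacitor_for_coefficient(k_nominal, 10e3)

print(f"k with 100 kOhm and 100 fF: {k_nominal:.4f}")
print(f"C needed with 10 kOhm: {c_low_noise * 1e15:.0f} fF")
```

Because the coefficient depends only on the RC product, any reduction of the resistor for noise reasons is paid for directly in capacitor size, which is why a design that is not thermal noise limited can keep the capacitors small.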
The most demanding parameter of the specifications is the GBW, and it needs to be achieved with the minimum current possible. The symmetrical OTA topology shown in Fig. 2 is selected since it has a very high current-to-GBW ratio, low offset and high output swing, and is symmetrical and tolerant to mismatch [10]. Cascode transistors M_8a/M_8b and M_9a/M_9b have to be added to boost the gain. The cascodes limit the output swing of the OTA; however, this is not an issue since the output swing of the OTA-based integrators can be adjusted by scaling the filter coefficients. The main disadvantage of symmetrical OTAs is their high level of thermal noise; however, as stated before, due to the low SNR required, the thermal noise is not a limiting factor. The bias current in the inner branch is generated by M_6 and is mirrored with a gain of five by the current mirrors formed by M_2a/M_2b and M_3a/M_3b. The common-mode feedback (CMFB) consists of M_4a/M_4b and M_5, which detect the output levels and adjust the current in the outer branches to compensate for deviations.
The OTA was simulated with extracted parasitics including process variations and mismatch and the performance obtained is shown in Table 2 . The nominal value and the maximum and minimum value simulated across all the process variations and mismatch are noted. All the specifications are satisfied even in the worst case of each parameter. A small margin for each parameter is left for any post-silicon variations. Note that the same OTA design is used for all four integrators. However, in future versions of the CTDS ADC the current in the second, third and fourth OTAs will be scaled down due to their lower requirements.
Programmable capacitor array
Due to process variations, the values of the resistors and capacitors can vary by up to ±20% in the worst-case corners. Therefore, the coefficients of the loop filter, which depend inversely on the RC product, can vary significantly, resulting in performance degradation and even instability. In order to compensate for these variations, the 100 fF integrating capacitors are implemented as programmable capacitor arrays so that the capacitance value can be adjusted. The schematic of the array can be seen in Fig. 3. The bits b_n control whether the corresponding capacitor C_n is connected to the input/output of the OTA or disconnected and shorted to ground. In this design, three control bits (n = 1, 2, 3) are used, leading to eight possible capacitor values combining C_0 = 60 fF, C_1 = 10 fF, C_2 = 20 fF and C_3 = 40 fF. The extra control bit, rst, works as a reset signal of the CTDS ADC by shorting the input/output of the OTAs.
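The eight capacitor values and a possible way of choosing the control word can be sketched as follows. The capacitor values come from the text. The packing of b_1 to b_3 into an integer code (b_1 as least significant bit) and the selection rule are assumptions made for illustration, not the calibration procedure used by the authors.

```python
C0_FF = 60                   # fixed capacitor C_0
BIT_WEIGHTS_FF = (10, 20, 40)  # C_1, C_2, C_3 switched by b_1, b_2, b_3
NOMINAL_FF = 100             # nominal integrating capacitance


def array_capacitance_ff(code: int) -> int:
    """Capacitance for a 3-bit code (assumed packing: b_1 is the LSB)."""
    if not 0 <= code <= 7:
        raise ValueError("code must be a 3-bit value")
    switched = sum(w for bit, w in enumerate(BIT_WEIGHTS_FF) if (code >> bit) & 1)
    return C0_FF + switched


def best_code(rc_scale: float) -> int:
    """Code whose scaled capacitance brings the RC product closest to nominal.

    rc_scale is a hypothetical factor describing how much the fabricated
    RC product deviates from its nominal value (1.0 means no deviation).
    """
    return min(range(8),
               key=lambda c: abs(rc_scale * array_capacitance_ff(c) - NOMINAL_FF))


all_values_ff = [array_capacitance_ff(code) for code in range(8)]
print(all_values_ff)
print(best_code(1.0), best_code(1.25), best_code(1 / 1.2))
```

With this packing, the array spans 60 fF to 130 fF in 10 fF steps, and the nominal 100 fF corresponds to enabling only C_3.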
High-speed clocked comparator
The sampling frequency of the oversampled ADC is 320 MHz; therefore, a very fast comparator is needed. Furthermore, in order to get consistent comparisons with the same starting state, the comparator needs to be reset every cycle. The topology used is the one suggested in [11], and it is shown in Fig. 4. Two small inverters acting as buffers are added at the comparator outputs, v_o+ and v_o−, to load them equally. As a result, the comparator behaves more symmetrically for both comparator decisions. The comparator has two different phases. Firstly, when the clock clk_c is low, the comparator is disabled and both outputs v_o+ and v_o− are pulled up to V_DD. Secondly, when clk_c goes high, the starting state of the comparator is unstable since both v_o+ and v_o− are high. A small differential signal at the input pair of the comparator, M_10a/M_10b, will pull down either v_o+ or v_o− through the two positive feedback paths formed by [...]
Pull-down clocked latch
Even though the comparator is symmetric and equally loaded, the amplitude of its differential input signal determines the comparison time. The comparator takes longer to compare small differential signals and is quicker at deciding for larger differential signals. This would result in varying delays in the feedback signals, which would decrease the SNR and could even cause instability. Hence, a clocked latch is needed. The latch provides a time-consistent output independently of the comparator behaviour. Firstly, clk_c enables the comparator and, after a decision time, clk_l enables the latch, passing the comparator decision to the outputs of the CTDS ADC, v_d,out+ and v_d,out−. The outputs are consistently generated on the rising edge of clk_l; hence, the feedback signals are synchronized and have a constant delay. The schematic of the pull-down clocked latch can be seen in Fig. 5. It consists of a latch formed by M_20a/M_20b and M_21a/M_21b and two pull-down branches composed of M_18a/M_18b and M_19a/M_19b. When the clock clk_l is low, both branches are disconnected, and the latch maintains its state. When clk_l is high, one of the branches pulls down one of the nodes of the latch, forcing a state. The pulling strength of both branches is consistent every cycle since v_co+ and v_co− are always either V_DD or V_SS when the latch is enabled.
Pulse generator
In order to control both the comparator and the latch, the enabling pulses clk_c and clk_l need to be generated. As shown in Fig. 6, there are three states per cycle: the comparison time (t_c), the latch time (t_l) and the reset time (t_r). During t_c, only the comparator is enabled. During t_l, both the comparator and the latch are enabled. Finally, during t_r, both the comparator and the latch are disabled. It is important to notice that the comparator can stay enabled during the latch time since M_14a/M_14b are designed to be very strong; hence, the comparator inputs cannot flip the comparator outputs. This allows for a much simpler and more robust control scheme where it is not critical to turn off the comparator before the output is latched.
The pulse generator is implemented with a simple inverter delay line and logic gates that generate clk_c and clk_l, as shown in Fig. 7. Note that the input clock, clk_in, is buffered internally in order to minimize the routing effects from the clock source to the die. This simple design consumes a small current and is resistant to process and mismatch variations since, even though t_c, t_l and t_r can vary, these states cannot overlap due to the inherent structure of the circuit. The loop delay of the CTDS ADC from the comparator to the DACs is largely dominated by t_c, which is determined by the delay across the inverter delay line. The layout of this block affects the unit delay of an inverter; therefore, all the timing simulations need to be done with extracted parasitics. Following the specifications found through simulations in [8], the loop delay cannot be higher than 300 ps. Simulations with extracted parasitics of the implementation, including corners and mismatch variations, show that the loop delay varies from 210 ps to 298 ps with a nominal value of 252 ps, which satisfies the specifications.
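The three-state sequence described above can be expressed as a small behavioural model. This is a sketch only: it assumes the cycle starts with the comparison time, and the durations used in the example (250 ps for t_c and 1000 ps for t_l) are hypothetical placeholders, since the text does not give the individual state durations.

```python
PERIOD_PS = 3125  # one clock period at 320 MHz, in picoseconds


def pulse_levels(t_ps: int, t_c_ps: int, t_l_ps: int, period_ps: int = PERIOD_PS):
    """Return (clk_c, clk_l) at time t_ps for the three-state cycle.

    Assumed ordering within a cycle: comparison (t_c), latch (t_l), reset (t_r).
    """
    if t_c_ps + t_l_ps >= period_ps:
        raise ValueError("no time left for the reset state")
    phase = t_ps % period_ps
    clk_c = phase < t_c_ps + t_l_ps              # comparator on during t_c and t_l
    clk_l = t_c_ps <= phase < t_c_ps + t_l_ps    # latch on during t_l only
    return clk_c, clk_l


# Hypothetical durations, chosen only to exercise the three states.
T_C_PS, T_L_PS = 250, 1000
for t in (100, 500, 2000):
    print(t, pulse_levels(t, T_C_PS, T_L_PS))
```

In this model the latch is never enabled while the comparator is disabled, which mirrors the non-overlapping behaviour attributed to the delay-line structure.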
Voltage feedback DAC
The two DACs of the system are implemented as simple voltage DACs for simplicity, ease of matching and area reduction. They consist of a PMOS and an NMOS forming a transmission gate that connects the feedback
depending on the gate signals v_d,out+ and v_d,out− (Fig. 8). These transmission gates need to be fast; therefore, small transistors should be used. Due to the large coefficient resistors in the order of 100 kΩ, the on-resistance of the small transistors is negligible. In order to obtain consistent, symmetric feedback pulses, both DACs should match; hence, several unit-size transistors are used in each MOSFET device. The total width and length used for the DAC transistors are W/L = 200/60 nm.
CTDS ADC simulations
After the assembly of all the blocks, the layout of the full CTDS ADC, with a total area of 0.0175 mm², is shown in Fig. 9. The area distribution is as follows: the OTAs, including their bias circuit, occupy 3100 µm² (17.7%), the capacitor arrays 7600 µm² (43.4%), the coefficient resistors 6300 µm² (36%), and the pulse generator, comparator, latch and DACs combined occupy 500 µm² (2.9%). It can be seen that the majority of the area is occupied by the loop filter, and the area of the quantizer is significantly smaller. In order to achieve better matching, the resistors and capacitors were split into several unit-size devices and interleaved. Moreover, dummy resistors and capacitors were placed around each matching structure. Similarly, the transistors were matched using common-centroid techniques and dummy devices.
The output spectrum of the full CTDS ADC with extracted parasitics (PEX) is shown in Fig. 10. Note that due to the optimal zero placing and the even order of the CTDS ADC, there are no zeros at DC. The nominal SNR and current consumption simulated with extracted parasitics are 45 dB and 489 µA, respectively. Of the total current, 443 µA is spent in the OTAs (90.6%), 22 µA in the quantizer (4.5%) and 24 µA in the DACs (4.9%). The current consumption is clearly dominated by the loop filter, mainly by the OTAs. The supply voltage is 1.2 V; hence, the power consumption of the CTDS ADC is 0.587 mW.
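The current breakdown and the resulting power figure can be cross-checked with a few lines of arithmetic. The sketch below uses only the values quoted above; the variable names are illustrative.

```python
SUPPLY_V = 1.2

# Simulated current per block, in microamperes (values from the text).
CURRENTS_UA = {"OTAs": 443.0, "quantizer": 22.0, "DACs": 24.0}

total_ua = sum(CURRENTS_UA.values())
shares_percent = {name: 100.0 * i_ua / total_ua for name, i_ua in CURRENTS_UA.items()}
power_mw = SUPPLY_V * total_ua * 1e-3  # V * uA = uW, then converted to mW

print(f"total current = {total_ua:.0f} uA")
for name, share in shares_percent.items():
    print(f"{name}: {share:.1f} %")
print(f"power = {power_mw:.3f} mW")
```

The block currents add up to the quoted 489 µA, and the product with the 1.2 V supply gives 0.5868 mW, which rounds to the stated 0.587 mW.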
The CTDS ADC is also simulated with extracted parasitics in the corners and with temperature and supply variations. A summary of the typical, maximum and minimum values obtained over all combinations is shown in Table 3. The capacitor array is adjusted for each simulation to compensate for the capacitance variation and adjust the loop filter coefficients accordingly. The design falls within specifications even across the corners and variations.
[Fig. 10: output spectrum of the CTDS ADC; horizontal axis: normalized frequency (f/f_s)]
Results and discussion
The CTDS ADC was fabricated in a 65 nm process, and a die picture of the integrated circuit (IC) taken with a microscope is shown in Fig. 11. A printed circuit board (PCB) was designed to test the circuit. The PCB contains low-dropout (LDO) voltage regulators and decoupling capacitors to stabilize the circuit references and supply voltages. The PCB voltages are fed using two Rigol DP832 programmable DC power supplies. The master clock of the CTDS ADC is supplied by a low-jitter, high-accuracy AD9516-3 clock generator. The differential input signals are generated with a Tektronix AFG3102C function generator, and the differential outputs of the ADC are measured using a Rohde & Schwarz RTO 1024 oscilloscope. A picture of the measurement setup can be seen in Fig. 12.

In order to compare the simulations with the measurement results accurately, the clock jitter, supply variations and the parasitic resistances, capacitances and inductances introduced by the measurement setup were estimated and modeled. The most relevant parasitics in the setup are the coupling capacitances from the ESD protection of the pads, the inductances and resistances from the bondwires and IC socket and the [...] This phenomenon will not be an issue on the final IC since the supplies will be on-chip. Hence, this degradation is only present because the ADC is tested as an isolated block and not as part of the system. However, if this degradation had to be overcome in the current IC, the die would need to be flip-chip bonded directly onto a PCB to avoid packaging. The frequency response measured on the IC (Meas.) can be seen in Fig. 13. Additionally, the simulations with extracted parasitics (PEX) and the simulations with extracted parasitics and the measurement setup modeled (PEX*) are shown to ease the comparison. The frequency response and SNR fit closely with the simulated results with the measurement setup modeled.
Furthermore, the measured current consumption is 495 µA, which is also very close to the simulated 491 µA. The CTDS ADC is designed to be connected inside a die, without receiving or delivering any signals directly outside of the IC. Consequently, when the circuit is used in a portable ultrasound scanner, the SNR-degrading effects caused by the measurement setup would not be present. Due to the high correlation between simulations and measurements, the CTDS ADC is expected to operate inside an Rx channel with a performance similar to the simulations with extracted parasitics, 45 dB.

In order to assess the spread of the SNR and power consumption (P_ADC) over several dies, the approach suggested in [12] is used. In Fig. 14, the measured SNR and power consumption of 20 dies and their respective median range, M, and percentiles P_15.87 and P_84.13 for a confidence level of 95% are shown. The SNR performance spread is proportionally small, with P_15.87 and P_84.13 located at 41.05 dB and 42.51 dB, respectively; hence, a consistent performance across dies is expected. The power consumption spread is also low, with P_15.87 and P_84.13 located at 0.548 mW and 0.636 mW, respectively, but it is more significant than the SNR spread. This is attributed to the biasing circuit of the OTAs, where small transistors were used to reduce the parasitic capacitances and thereby the power consumption. As a result, the die variations affect the total power consumption of the CTDS ADC more significantly. In the future, a biasing circuit with higher variation tolerance will be investigated.
A performance comparison between this design and other CTDS ADCs with similar specifications is provided in [...] figure of merit (FoM) of energy per conversion is used (1). Both the measured performance (Meas.) and the expected performance without the measurement setup SNR degradation (PEX) are included. As can be seen, this design achieves a comparatively low FoM using a very small die area and low power consumption, which enables channel scalability, a necessary factor for portable ultrasound scanners. Note that the die area, which is one of the strengths of this design, is not part of the FoM; hence, it is not taken into account in the comparison.
\[ \mathrm{FoM} = \frac{P_{ADC}}{2 \cdot BW \cdot 2^{\frac{SNR - 1.76\,\mathrm{dB}}{6.02\,\mathrm{dB}}}} \qquad (1) \]
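The figure of merit in (1) can be evaluated numerically as a sanity check. The sketch below uses the measured power and SNR reported in this paper for the first case. For the second case it assumes that the simulated power (1.2 V × 489 µA) and the 45.2 dB SNR from the abstract are the intended inputs, which is an assumption rather than something stated explicitly.

```python
def enob_bits(snr_db: float) -> float:
    """Effective number of bits from the SNR, as used in the exponent of (1)."""
    return (snr_db - 1.76) / 6.02


def fom_j_per_conversion(power_w: float, bandwidth_hz: float, snr_db: float) -> float:
    """Energy per conversion: P / (2 * BW * 2**ENOB)."""
    return power_w / (2.0 * bandwidth_hz * 2.0 ** enob_bits(snr_db))


BW_HZ = 10e6

# Measured: 0.594 mW and 41.6 dB.
fom_meas = fom_j_per_conversion(0.594e-3, BW_HZ, 41.6)

# Assumed inputs for the PEX case: 1.2 V * 489 uA and 45.2 dB.
fom_pex = fom_j_per_conversion(1.2 * 489e-6, BW_HZ, 45.2)

print(f"FoM (Meas.) = {fom_meas * 1e15:.0f} fJ/conversion")
print(f"FoM (PEX)   = {fom_pex * 1e15:.0f} fJ/conversion")
```

Under these assumptions the two results land close to the 302 fJ/conversion and 197 fJ/conversion figures reported in the conclusions.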
In order to put the power consumption of the CTDS ADC in perspective, the full portable ultrasound scanner is considered. A 64-channel portable ultrasound scanner, containing 64 ADCs, has an approximate total power budget of 2 W. Using 64 of the designed CTDS ADCs consumes only 38 mW, which is a very small portion of the total budget. As a result, this efficient ADC implementation allows most of the power budget to be spent in the other parts of the scanner.
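The system-level figure follows directly from the per-channel measurement. A minimal check, assuming the measured 0.594 mW per ADC and treating the 2 W budget as exact:

```python
CHANNELS = 64
ADC_POWER_MW = 0.594        # measured power of one CTDS ADC
SYSTEM_BUDGET_MW = 2000.0   # approximate total budget of the scanner

total_adc_power_mw = CHANNELS * ADC_POWER_MW
budget_share_percent = 100.0 * total_adc_power_mw / SYSTEM_BUDGET_MW

print(f"ADC power for {CHANNELS} channels = {total_adc_power_mw:.1f} mW")
print(f"share of the budget = {budget_share_percent:.1f} %")
```

The 64 converters together take roughly 2% of the budget, leaving the remainder for the other blocks of the scanner.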
Conclusions
In this paper, a fully differential fourth-order 1-bit continuous-time delta-sigma ADC designed in a 65 nm process for portable ultrasound scanners is presented. The ADC has a supply voltage of 1.2 V and a BW of 10 MHz, and operates at an f_s of 320 MHz. The aim of the design is to reduce the area and minimize the power consumption, achieving a small, efficient and scalable design that can be used in a portable ultrasound scanner. Due to the low SNR specifications, the design is inherently dominated by quantization noise, which is uncommon for CTDS ADCs and expands the possible design space. The loop filter uses OTA-based RC-integrators and voltage feedback DACs. The quantizer is composed of a high-speed clocked comparator and a pull-down clocked latch, which are both controlled by a pulse generator. The design occupies a die area of 0.0175 mm² and is robust to corners and process variations. The power consumption measured on the fabricated IC is 0.594 mW, and the measured SNR degraded by the measurement setup is 41.6 dB, leading to a FoM of 302 fJ/conversion. However, the expected SNR of the CTDS ADC used internally on an IC, without measurement setup degradation, is 45 dB, leading to a FoM of 197 fJ/conversion. Measurements across several dies show a consistent design with a small SNR die spread.
