Abstract: This paper presents a delta-sigma based readout architecture targeting electrocortical recording in brain stimulation applications. The proposed architecture can accurately record a peak input signal of up to 240 mV in a power-efficient manner without saturating or employing offset rejection techniques. The readout architecture consists of a delta-sigma modulator with an embedded analog front-end. The proposed architecture achieves a total harmonic distortion of −95 dB by employing a current-steering DAC and a multi-bit quantizer implemented as a tracking ADC. A system prototype is implemented in a 0.18 μm CMOS triple-well process and has a total power consumption of 54 μW. Measurement results, across 10 packaged samples, show approximately 14-ENOB over a 300 Hz bandwidth with an input-referred noise of 5.23 μV_rms, a power-supply/common-mode rejection ratio of 100 dB/98 dB, and an input impedance larger than 94 MΩ.
I. INTRODUCTION
THE development of readout architectures for application-specific integrated circuits is driven by the proliferation of smart implants in next-generation health care systems [1]. A prime example is the implantable neural recording interface [2], which can facilitate a wide range of medical applications due to the integration of wireless telemetry [3], [4], multi-channel recording [5], [6], on-chip feature extraction [7], data compression [8], bio-impedance measurements [9] and closed-loop brain stimulation [10], [11].
Traditionally, the main purpose of a neural recording interface has been to precisely measure the slowly changing bio-potentials, which are generated by the neural activity in the brain, under very stringent constraints (e.g., concerning power, area, noise, common-mode rejection, and distortion) [7], [12]. An additional challenge is the high dynamic range associated with bio-potential measurements [4], [9], [12]-[14]. For instance, an electrocortical (ECoG) signal is usually sensed, as depicted in Fig. 1, via an invasive microelectrode grid array, which is implanted on the surface of the brain. The electrochemical interaction between the electrode and the tissue produces an offset voltage, referred to as the half-cell potential. Ideally, in a differential measurement, two identical electrodes should cancel out their half-cell potentials [15]. In practice, a random differential electrode offset of up to a hundred millivolts can appear [4], [13], [16], [17]. Moreover, the situation worsens when the recording functionality is combined with brain stimulation, which introduces a build-up of charge between the electrodes [18]-[20] as well as undesired in-band stimulation artifacts [16], [17]. Several approaches have been reported to cope with these problems, and they can be classified into the following categories: AC coupling, analog DC servo loop, digitally assisted servo loop and high dynamic range neural recording.
The most straightforward method of rejecting the electrode offset is AC coupling, which performs passive high-pass filtering with a very low cut-off frequency [14], [21]-[23]. However, the total common-mode rejection is limited by the capacitor matching accuracy, which is further deteriorated by the impedance imbalance between the sensing and reference terminals of the analog front-end [24], [25]. An alternative solution is to use an analog DC servo loop, which performs active high-pass filtering, at the cost of an integrator with a considerable time constant [9], [10]. The main disadvantage of these approaches is that the cut-off frequency of each recording channel is prone to different variations [4]. A digitally assisted servo loop can be used to overcome this issue by controlling the corner frequency in the digital domain [4], [13], [26]-[28]. However, the stability of a digitally assisted servo loop is impacted by the latency of the digital filter. Another drawback of this solution is that a charge-redistribution DAC is directly driven by the fluctuating electrode-tissue impedance, which can introduce distortion in the readout due to finite input impedance. To circumvent this issue, a DC-coupled neural recording interface with a current-steering DAC was used in [4] and [26], but with a limited peak input (±50 mV). Although the aforementioned techniques are effective for rejecting the electrode offset in traditional neural recording interfaces, a different approach is needed for brain stimulation systems. In particular, neural recording interfaces with a high effective number of bits (ENOB > 12 bits) are needed to linearly capture the ECoG signal in the presence of large stimulation artifacts, without saturating the readout architecture [16], [17]. However, even these interfaces still rely on the aforementioned offset rejection techniques.
This paper presents a power-efficient ΔΣ-based readout architecture with 14-ENOB for a peak input signal of ±240 mV over a 300 Hz bandwidth, which can facilitate high-fidelity ECoG recording in the presence of large stimulation artifacts and differential electrode offset, as shown in Fig. 1. The proposed system consists primarily of a ΔΣ modulator that has inherent immunity to flicker noise, common-mode noise and power supply noise. A DC-coupled instrumentation amplifier is embedded inside the modulator to achieve high linearity at peak input signal and to provide a high-impedance interface with the electrodes. The output voltage swing requirement of the instrumentation amplifier is relaxed by employing a current-steering DAC and a multi-bit quantizer implemented as a tracking ADC. The system prototype is fully verified and the measurement results show consistent performance across ten packaged samples. The remainder of the paper is organized as follows. The proposed system architecture is explained in Section II. The circuit implementation is described in Section III. The measurement results are reported and discussed in Sections IV and V, respectively. Finally, the paper is concluded in Section VI.
II. SYSTEM ARCHITECTURE

A. Design Challenges
To build up an understanding of the proposed readout architecture, key concepts are introduced progressively, as shown in Fig. 2. The readout architecture presented in Fig. 2a consists of a DC-coupled instrumentation amplifier (IA) followed by a multi-bit ΔΣ modulator [29]. The chopping technique is used to mitigate the flicker noise contribution of the readout architecture, as well as to separate the input signal spectrum from any low-frequency common-mode or power supply noise. A dynamic element matching (DEM) technique is employed to circumvent the linearity limitations of a multi-bit DAC. However, the readout architecture in Fig. 2a introduces two key design challenges: the need for power-efficient quantization of a chopper-modulated input signal and the high linearity requirements of the instrumentation amplifier.
The first challenge can be resolved by employing a ΔΣ modulator that provides noise shaping with a low-pass (instead of the traditional high-pass) characteristic, such that the oversampled narrow-band input signal is located at a carrier (chopping) frequency. This type of modulator is referred to as a high-pass (or chopper-stabilized) ΔΣ modulator, because the gain of the loop filter is highest at half of the sampling frequency, f_s/2, which corresponds to a high-pass filter in discrete time [29]-[32]. The carrier frequency of the narrow-band input signal has to be at f_chop = f_s/2. Otherwise, the high-pass modulator reverts to a band-pass topology and the order of noise shaping decreases accordingly. The second challenge can be mitigated either by employing offset rejection techniques or, as proposed in this work, by embedding the IA inside the modulator to take advantage of the multi-bit quantization and to enable high linearity in a power-efficient manner [33]. As shown in Fig. 2b, the IA is placed after the comparison with the DAC, which bounds the voltage swing at the IA's input to the magnitude of the least significant bit (LSB) of the DAC. Consequently, the input feed-forward (IFF) path is DC-coupled to the microelectrode array, which decreases the input impedance of the readout architecture because of the sampling operation. Removing the IFF path is a potential solution to this issue. However, that would put a more stringent requirement on the amplifiers (OTAs) that are needed to implement the loop filter, H(z). A better solution, as implemented in the proposed readout, is to reroute the IFF path from the IA's output and add a digital counter to track the magnitude of the difference between the input signal and the DAC output (i.e., the quantization error), as shown in Fig. 2c.
In this way, the DAC is updated only when the magnitude of the X(z) signal increases beyond the thresholds of the comparators in the flash ADC, as in the case of a tracking ADC [34]. As highlighted in Fig. 2c, the key building blocks of a tracking ADC are the flash ADC, the counter and the DAC. More importantly, the quantization error, which is needed to enable the tracking operation, is contained in the IFF path. The benefit of this solution is twofold: the input impedance, Z_in, increases, and the number of comparators in the flash ADC is reduced.
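As a rough illustration of the high-pass (chopper-stabilized) noise shaping discussed in this subsection, the following Python sketch applies the z → −z substitution to a textbook second-order noise transfer function. The function (1 − z^−1)^2 is an assumed stand-in and not the NTF of the proposed readout, and the helper names are hypothetical.

```python
import cmath
import math


def ntf_lowpass_modulator(z):
    """Textbook 2nd-order NTF of a conventional modulator (zeros at DC)."""
    return (1 - 1 / z) ** 2


def ntf_highpass_modulator(z):
    """Same NTF after substituting z -> -z (zeros move to f_s/2)."""
    return ntf_lowpass_modulator(-z)


def magnitude(ntf, f_over_fs):
    """Magnitude of an NTF at a frequency normalized to the sampling rate."""
    z = cmath.exp(2j * math.pi * f_over_fs)
    return abs(ntf(z))


if __name__ == "__main__":
    for f in (0.0, 0.25, 0.5):
        print(f,
              magnitude(ntf_lowpass_modulator, f),
              magnitude(ntf_highpass_modulator, f))
```

In this simplified picture the quantization noise is suppressed around f_s/2 after the substitution, which is where the chopped input signal is placed.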
B. IFF Path Design
To avoid direct feedthrough loops in the modulator shown in Fig. 2c, the IFF path has to be sampled in a different phase than the sampling phase (S) of the loop filter, called the tracking phase (T). Consequently, these two phases introduce a phase delay in the signal path. The impact of this phase delay is analyzed in MATLAB/Simulink by modeling it with a 2nd-order Thiran all-pass filter, which has an excellent phase delay approximation and a flat group delay at low frequencies [35]. The transfer function of a 2nd-order Thiran all-pass filter is given by

$$T(z) = \frac{a_2 + a_1 z^{-1} + z^{-2}}{1 + a_1 z^{-1} + a_2 z^{-2}} \qquad (1)$$
where a_1 = 0.4 and a_2 = −0.0286 are the filter coefficients, which correspond to a phase delay of half a clock period. Additionally, T(z) has a net delay of two clock periods, which can affect the modeling of the loop. Therefore, the system model in Fig. 3 is rearranged compared to Fig. 2c to move T(z) out of the loop. In this way, the net delay of T(z) does not have an impact on the stability of the system model. The model for the instrumentation amplifier, IA(z), appears twice since it impacts both V_in(z) and D_out(z), which correspond to the up-converted input and output of the readout, respectively. The IFF path in Fig. 2c is decomposed into a feed-forward signal path, which is modeled as a delay of z^−2 to cancel out the net delay in T(z), and a feedback path represented as a delay of z^−1. For the sake of simplicity, the 'mixer-Counter-DEM-mixer-DAC' path shown in Fig. 2c is substituted in the system model with a functionally equivalent digital high-pass filter, (1 + z^−1)^−1.
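The quoted coefficients can be cross-checked against the standard closed-form Thiran design. The Python sketch below is only a sanity check under the assumption that the filter has order 2 and a total delay of 1.5 samples; that delay value is inferred here from the coefficients and is not stated in the text. The function names are hypothetical.

```python
import cmath
import math


def thiran_coeffs(order, delay):
    """Closed-form Thiran all-pass coefficients a_1..a_order (delay in samples)."""
    coeffs = []
    for k in range(1, order + 1):
        prod = 1.0
        for n in range(order + 1):
            prod *= (delay - order + n) / (delay - order + k + n)
        coeffs.append((-1) ** k * math.comb(order, k) * prod)
    return coeffs


def allpass_response(a1, a2, omega):
    """Frequency response of (a2 + a1 z^-1 + z^-2) / (1 + a1 z^-1 + a2 z^-2)."""
    zi = cmath.exp(-1j * omega)  # z^-1 evaluated on the unit circle
    return (a2 + a1 * zi + zi ** 2) / (1 + a1 * zi + a2 * zi ** 2)


if __name__ == "__main__":
    a1, a2 = thiran_coeffs(order=2, delay=1.5)  # assumed delay, see lead-in
    omega = 0.01  # rad/sample, a low frequency
    h = allpass_response(a1, a2, omega)
    print("a1, a2:", a1, a2)
    print("magnitude:", abs(h))
    print("phase delay in samples:", -cmath.phase(h) / omega)
```

The all-pass structure keeps the magnitude at unity, so only the phase (delay) of the signal path is affected in this model.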
C. Linear Model Analysis
The established linear model, given in Fig. 3, can provide a qualitative understanding of the proposed readout architecture and its implications on the circuit implementation. For instance, a conventional modulator with an IFF path will have a signal transfer function with unity gain over the entire Nyquist bandwidth. In the proposed readout, however, this is not the case, as can be seen in the following expression:
where G ≈ 0.8 is the gain of the flash ADC and the IA's transfer function, IA(z), is approximately given by:
where τ ≈ T_s/10 is the settling time of the IA. The loop filter, H(z), is implemented as a feed-forward filter, which is adopted from [29], and is given by:
The magnitude of STF(z), shown in Fig. 4a, has a high-pass characteristic with unity gain over a wide range of input frequencies near f_s/2, but it does not significantly change the requirements of the modulator given in [29].
Similarly, the instrumentation amplifier also introduces a phase delay, which has an important impact on the stability of the modulator, in particular on the noise transfer function, which is given by:
Even though the modulator is operating at a relatively low sampling frequency, f_s = 153.6 kHz, with an oversampling ratio of 256, IA(z) is part of the feedback path and its phase delay has a similar effect to excess loop delay (ELD). As can be seen in Fig. 4b, peaking occurs in the magnitude response of NTF(z), which is caused by the interaction of the poles in the unit circle shown in Fig. 4c. Since the delayed output of IA(z) is also added to the output of H(z), IA(z) also fulfills the role of the ELD compensation filter [36]. In order to maintain stable operation, the voltage swing at node X(z) cannot exceed the full-scale range of the flash ADC, which is represented as a linear gain block, G, in Fig. 3. This implies a minimum number of comparators, which can be determined from the following expressions:
As can be seen from the magnitude responses of (6) and (7) given in Fig. 4d, the quantization error, E(z), can increase by a factor of 2.77 (or 8.85 dB), while the out-of-band tones of V_in(z) can appear with at most unity gain. Due to the gain in (6), the voltage swing at X(z) can increase beyond the first quantization range of the flash ADC, i.e., ±V_LSB/2. In this case, an additional pair of comparators is required to maintain the tracking operation when the swing at X(z) reaches the next quantization range, i.e., ±3·V_LSB/2. These additional comparators function as redundancy, which protects the tracking ADC from overloading. Consequently, the flash ADC requires a minimum of four comparators to capture these five level crossings (including zero) at X(z). On the other hand, if the input signal contains large out-of-band interferers at odd multiples of the chopping frequency, then low-pass filtering can be applied to ensure that the voltage swing at X(z) remains within the quantization range.
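The numbers in this paragraph can be checked with a few lines of Python. The sketch assumes, for illustration only, that the quantization error is bounded by ±V_LSB/2 before it is amplified; the constant and function names are hypothetical.

```python
import math

ERROR_GAIN = 2.77   # peak gain applied to the quantization error (from the text)
E_MAX_LSB = 0.5     # assumption: |quantization error| <= V_LSB / 2


def to_db(ratio):
    """Convert a voltage ratio to decibels."""
    return 20 * math.log10(ratio)


def worst_case_swing_lsb(gain=ERROR_GAIN, e_max=E_MAX_LSB):
    """Worst-case swing at X(z), in units of V_LSB, under the assumed bound."""
    return gain * e_max


if __name__ == "__main__":
    swing = worst_case_swing_lsb()
    print("gain in dB:", to_db(ERROR_GAIN))
    print("worst-case swing in V_LSB:", swing)
    print("exceeds first range (0.5 V_LSB):", swing > 0.5)
    print("within second range (1.5 V_LSB):", swing < 1.5)
```

Under this assumption the swing leaves the first quantization range but stays inside the second one, which is in line with the use of two comparator pairs.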
The system architecture description and analysis demonstrate that the operation of the proposed readout is similar to that of a conventional modulator. However, as shown in Fig. 3, the instrumentation amplifier is part of both the signal path and the feedback path, which makes its circuit integration with the loop filter a key challenge for achieving high linearity.
III. CIRCUIT IMPLEMENTATION
A. Loop Filter
At the core of the proposed readout is a 2nd-order feed-forward switched-capacitor filter, whose single-ended representation is shown in Fig. 5. The switched-capacitor (SC) implementation has been chosen for its reported potential power efficiency [29]. Alternatively, recently reported work in [37] suggests that a continuous-time modulator with chopped integrators at half the sampling rate could also be a power-efficient solution.
Stable operation is established by matching the sampling capacitors, C_s, in the IFF path and the first active block of the loop filter. In this way, a compensation filter for the ELD introduced by the IA is implemented without additional complexity. Apart from the sampling (S) and tracking (T) phases, the loop filter employs an asynchronous reset (R) phase and a preset (P) phase. The R phase, which is not explicitly shown in Fig. 5, is used to set the initial condition of the integrators, while the P phase is used to equalize the charge in the parasitic capacitors at the input terminals of OTA_1,2 before mixing. Additionally, a passive mixer is placed in series with C_s of the IFF path to create a simple switch-RC filter [38], which improves the input signal response at the multiples of the chopping frequency while suppressing out-of-band interferers that can potentially overload the internal quantizer.
Reconfiguration of the loop filter has been implemented to validate the immunity of the proposed readout to flicker noise. The loop filter can be reconfigured to a high-pass or a low-pass mode by adjusting the polarity of the second feed-forward path from OTA_1 and by disabling the switching activity of the complementary chopping clocks, f_chop,A1 and f_chop,B1 (see Fig. 5). Correspondingly, the SC high-pass filter reverts to a conventional integrator, and the adjustment in polarity of the OTA_1 output is needed to satisfy the mathematical substitution of the z variable in H(z) with −z, which completes the transformation from a high-pass to a low-pass topology. The chopping frequency is set to half of the sampling rate of the modulator (76.8 kHz). As a result, a higher flicker noise corner of the IA can be tolerated, but the influence of the parasitic capacitance on the system input impedance is more pronounced. Simulation results show that the input impedance of the IA (without packaging, ESD diodes, and I/O pad parasitics) is limited to approximately 200 MΩ, which is sufficient for an ECoG recording interface [12]. The parasitic capacitance of the transistors in the chopper (mixer) circuit is negligible due to their small dimensions (W/L = 800 nm/180 nm). Moreover, the corresponding magnitude of charge injection (≈ fC) is considerably lower than the magnitude reported in the literature for brain stimulation (≈ nC) [18]-[20]. Additionally, the proposed readout architecture operates with a 0.6 V common-mode voltage, which is beneficial for reducing the thermal noise density and signal-dependent charge injection introduced by the switches.
In the high-pass mode, the dominant noise source is the thermal kT/C noise, which is generated primarily by the first high-pass filter and secondarily by the IA and DAC. Because the system is limited by the kT/C noise, low noise performance has to be traded off against capacitor area. Also, a substantial sampling capacitance places a tougher requirement on the IA specifications, which increases the power dissipation. Alternatively, lower noise can be achieved by increasing the sampling rate of the modulator while maintaining the same cut-off frequency in the digital filter (see Fig. 1), but at the cost of higher power consumption in both the analog and digital domains. Moreover, a higher sampling rate, and subsequently a higher chopping frequency, reduces the input impedance of the system. Therefore, we choose C_s and f_s to be 6.6 pF and 153.6 kHz, respectively, as a trade-off between noise, area/power consumption, input impedance and IA linearity.
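The timing quantities that follow from the chosen sampling rate can be tabulated with a short Python sketch. It only combines values quoted in the text (f_s = 153.6 kHz, an oversampling ratio of 256, chopping at f_s/2); the variable and function names are hypothetical.

```python
F_S_HZ = 153.6e3   # sampling frequency from the text
OSR = 256          # oversampling ratio from the text


def signal_bandwidth_hz(f_s=F_S_HZ, osr=OSR):
    """Signal bandwidth implied by the oversampling ratio: f_s / (2 * OSR)."""
    return f_s / (2 * osr)


def chopping_frequency_hz(f_s=F_S_HZ):
    """Chopping frequency of the high-pass modulator: f_s / 2."""
    return f_s / 2


def sampling_period_s(f_s=F_S_HZ):
    """Sampling period T_s = 1 / f_s."""
    return 1.0 / f_s


if __name__ == "__main__":
    print("bandwidth [Hz]:", signal_bandwidth_hz())
    print("f_chop [Hz]:", chopping_frequency_hz())
    print("T_s [us]:", sampling_period_s() * 1e6)
```

These values agree with the 300 Hz bandwidth and the 76.8 kHz chopping frequency quoted elsewhere in the paper.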
B. Current Feedback Instrumentation Amplifier
A low-power, low-noise, high common-mode rejection, current-feedback topology is chosen for the IA [9]. The circuit diagram of the IA is shown in Fig. 6a. To achieve low power consumption despite the relatively high chopping frequency, all transistors in the signal path are biased in the weak inversion region, while the transistors in the biasing network are forced into moderate inversion to reduce their thermal noise contribution. For the input stage, M_1,2, PMOS transistors are chosen due to their lower flicker noise contribution and improved isolation from substrate noise. Alternatively, triple-well transistors could have been used, but the large design rule separation between the different wells would impact the matching accuracy. In order to introduce design freedom in setting up the DC operating point for the input and output stages, the intermediate stage is implemented as an NMOS folded-cascode with M_3,4, which increases the loop gain and, in turn, the linearity of the IA, but at the cost of an increased noise budget. The output stage, M_5 and M_6, provides voltage buffering and current feedback via the source degeneration resistor, R_2. Consequently, the output stage experiences both the input and output voltage swings, which negatively impacts the linearity of the current-feedback IA. However, by including the IA inside a multi-bit modulator, this issue is partially resolved, since the output voltage across R_2 is now reduced to the LSB of the DAC. The remaining part of the issue is the input signal swing, which induces a large current across resistor R_1. Thus, we choose R_1 and R_2 to be approximately 70 kΩ as a trade-off between linearity and noise performance. Additionally, the linearity performance is also affected by the IA loop gain at the chopping frequency and its harmonics.
Considering the difficulties of stabilizing a high loop-gain amplifier with multiple poles, a compensation capacitor, C_C ≈ 660 fF, is added in parallel with the drain of M_3,4 to define a dominant pole. The position of the high-frequency poles introduced by the source nodes of M_1,2 and M_3,4 is carefully controlled by maintaining low parasitic capacitance. Moreover, the pole introduced by C_s is effectively canceled out by choosing R_3 (shown in Fig. 5) to be approximately the same as R_2. The combination of R_3 and C_s produces an equivalent resistance on the order of 1 MΩ, which has a negligible loading effect on the IA gain. Considering the noise performance, the value of R_3 is theoretically irrelevant in a kT/C noise limited system. However, it has a direct impact on filtering the IA thermal noise before aliasing. The value of R_3 also impacts the settling time, and consequently the ELD, which is determined by τ ≈ (R_2/2 + R_3)·C_s. Based on system analysis and circuit simulation results, τ ≈ 0.7 μs is chosen as a trade-off between the IA noise aliasing and ELD.
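The settling-time expression can be evaluated with the component values quoted in the paper. The Python sketch below assumes R_2 = R_3 = 70 kΩ exactly (the text only states approximate values) and uses the standard switched-capacitor equivalent resistance 1/(f_s·C_s) as a rough stand-in for the loading seen by the IA; the names are hypothetical.

```python
R2_OHM = 70e3      # approximate value from the text
R3_OHM = 70e3      # assumed equal to R2, as the text suggests
C_S_F = 6.6e-12    # sampling capacitor from the text
F_S_HZ = 153.6e3   # sampling frequency from the text


def settling_time_constant_s(r2=R2_OHM, r3=R3_OHM, c_s=C_S_F):
    """tau ~ (R2 / 2 + R3) * C_s, as given in the text."""
    return (r2 / 2 + r3) * c_s


def sc_equivalent_resistance_ohm(f_s=F_S_HZ, c_s=C_S_F):
    """Textbook switched-capacitor equivalent resistance, 1 / (f_s * C_s)."""
    return 1.0 / (f_s * c_s)


if __name__ == "__main__":
    tau = settling_time_constant_s()
    print("tau [us]:", tau * 1e6)
    print("T_s / 10 [us]:", 1e6 / F_S_HZ / 10)
    print("R_eq [MOhm]:", sc_equivalent_resistance_ohm() / 1e6)
```

With these assumptions τ comes out close to 0.7 μs, near T_s/10, and the equivalent resistance is close to 1 MΩ.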
C. Current-Steering DAC
The DAC is implemented with a current-steering topology, which provides a high output impedance and enables the IA to be interfaced to the electrodes via a high-impedance terminal. In principle, the DAC injects a current offset at the start of each sampling phase, which is compared against the induced current of the up-converted input signal. The resulting residue represents the quantization error, which is on the order of ±R_2·I_LSB. The current-steering DAC also provides a biasing current for transistors M_5,6 via ten unit elements, as can be seen in Fig. 6a. When the DAC<9:0> thermometer code has the same number of logical "1" and "0" values, the biasing current is equally distributed among the IDAC+ and IDAC− current branches. Each element is implemented as a PMOS current source with a cascode, as can be seen in Fig. 6b, so that the equivalent output impedance of all unit elements does not introduce even-order distortion [39]. The minimum transistor dimensions are defined by the matching accuracy between the unit elements. Behavioral-level simulations showed that a maximum standard deviation of 1% can be tolerated in the drain-source current of a unit element when first-order dynamic element matching (DEM) is applied to the DAC. Based on the Monte Carlo results with both device and process mismatch, a W/L ratio of 80 μm/8 μm is selected. The chopping of the IDAC output is performed in the digital domain by applying the XOR operation between the f_chop,A1 clock and the DAC<9:0> signal. For testing purposes, an external capacitor (C_ext ≈ 1 μF) is used to provide a clean biasing reference.
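The digital chopping of the thermometer code can be sketched in Python as follows. The mapping from a signed code to the ten-element thermometer word (five elements on at code zero) is an assumption made for illustration, and the function names are hypothetical; the DEM scrambling is omitted.

```python
N_UNITS = 10  # number of DAC unit elements (from the text)


def thermometer(code):
    """Map a signed code in [-5, 5] to a 10-bit thermometer word (assumed mapping)."""
    ones = N_UNITS // 2 + code
    if not 0 <= ones <= N_UNITS:
        raise ValueError("code out of range")
    return [1] * ones + [0] * (N_UNITS - ones)


def chop(bits, chop_clk):
    """Digital chopping: XOR every bit with the chopping clock level (0 or 1)."""
    return [b ^ chop_clk for b in bits]


def differential_units(bits):
    """Units steered to IDAC+ minus units steered to IDAC-."""
    ones = sum(bits)
    return ones - (len(bits) - ones)


if __name__ == "__main__":
    word = thermometer(2)
    print(word, differential_units(word))
    print(chop(word, 1), differential_units(chop(word, 1)))
```

Inverting all bits swaps the two current branches, so the differential output changes sign while a balanced code stays balanced.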
D. Tracking ADC
As explained in Section II, the tracking ADC consists of the aforementioned current-steering DAC, which is linearized with the DEM, as well as a flash ADC with a reduced number of comparators and a digital counter.
1) Flash ADC:
The output of the loop filter, X(z), is quantized by the flash ADC implemented with four comparators, while being assisted by a digital counter and the current-steering DAC to achieve the same functionality as a flash ADC with ten comparators. Each comparator consists of a differential difference amplifier (DDA) and a current-driven strong-arm latch, as can be seen in Fig. 7a. The propagation delay of the latch will vary depending on the magnitude of the signal swing at X(z). As a result, simple combinational logic is used to generate asynchronous pulses (CLK_1,2,3 in Fig. 7a) for the subsequent digital circuits when the outputs of all comparators resolve. An alternative solution, though with more significant power consumption and circuit complexity, would be to use a higher clock frequency and a timer to precisely generate the necessary clock pulses.
The voltage references for the comparators are implemented via a resistor ladder driven by a current source, equal to half of the DAC unit value, to further reduce the voltage swing at node X(z). The common-mode voltage of the ladder is regulated to 0.6 V by using a feedback amplifier. The ladder resistors have the same value as R_2 and are placed close together on the die to ensure proper matching.
2) Counter and Tree-Structured DEM: The digital counter, shown in Fig. 7a, operates as a bi-directional shift register, which can shift by one or two steps or maintain the previous value. The outputs of the flash ADC control the direction and amount of the shifting, as depicted in Fig. 7a. The challenging part of the counter control logic is that it has to determine the polarity of a fully differential signal X(z) without a direct point of reference, due to the reduced number of comparators. A simple solution to this problem, as implemented in the proposed work, is to detect which of the four comparators activates first when the counter is at zero (D_out<3:0> = 0), which is functionally equivalent to zero-crossing detection. The sign extraction logic is triggered approximately t_1 = 1 ns (see Fig. 7a) after the rising edge of the CLK. The output of the counter, which is triggered t_2 = 10 ns after CLK_1, is converted to a 3-bit binary number, D_out<2:0>, and a sign bit, D_out<3>. The final signed binary output D_out<3:0> has to be first down-converted for the DEM to function correctly. The down-conversion is achieved by toggling D_out<3> in the sign extraction logic during every clock cycle. An example waveform of the digital output, D_out<3:0>, is shown in Fig. 7b, where we can see that it usually changes between adjacent quantization levels, but in some cases it can change by two LSBs. Thus, the tracking ADC monitors the slowly changing X(z) and updates D_out<3:0>, and consequently the DAC output, when X(z) increases beyond the comparator thresholds.
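The tracking behavior described above can be illustrated with a simplified behavioral model in Python. This is only a sketch: it ignores the loop filter, the chopping and the sign extraction logic, and it assumes comparator thresholds at ±0.5 and ±1.5 LSB and an output code range of ±5. All names are hypothetical.

```python
import math

THRESHOLDS_LSB = (-1.5, -0.5, 0.5, 1.5)  # assumed comparator thresholds
MAX_CODE = 5                              # assumed code range of +/-5


def flash_step(residue_lsb):
    """Return the counter step (-2..+2) from the four comparator decisions."""
    up = sum(residue_lsb > t for t in THRESHOLDS_LSB if t > 0)
    down = sum(residue_lsb < t for t in THRESHOLDS_LSB if t < 0)
    return up - down


def track(samples_lsb):
    """Follow a slowly varying input (in LSB) with a stepped up/down counter."""
    code = 0
    codes = []
    for v in samples_lsb:
        step = flash_step(v - code)               # residue = input - DAC output
        code = max(-MAX_CODE, min(MAX_CODE, code + step))
        codes.append(code)
    return codes


if __name__ == "__main__":
    wave = [4 * math.sin(2 * math.pi * k / 2000) for k in range(2000)]
    out = track(wave)
    print("largest residue [LSB]:", max(abs(v - c) for v, c in zip(wave, out)))
```

For a slowly varying input the code moves by one step at a time, and the two outer thresholds only come into play when the residue jumps by more than one LSB.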
A tree-structured DEM, presented in [29], is selected over other linearization techniques because of the following advantages [40]: 1) it is not as susceptible to generating spurious tones in the signal band, 2) it can be implemented with an arbitrary number of DAC levels and 3) the required digital logic is relatively simple, and it introduces only the combinational network delay (low latency). The internal registers of the tree-structured DEM, which control the switching logic, can be updated during the falling edge of CLK_2, which does not introduce additional latency. As a result, the tree-structured DEM functions as a look-up table, which is randomized for each subsequent digital input. A parallel register is added between the DEM<9:0> output and the DAC<9:0> input, as shown in Fig. 7a, to prevent any glitches from propagating to the IA during logic transitions. The parallel register is triggered by the rising edge of CLK_3, which is delayed by t_3 = 10 ns after CLK_2.
E. Clock Generation, Biasing Network and Input CMFB
The non-overlapping clocks for the SC circuits are generated with a pair of cross-coupled NOR gates and delays. As mentioned in Section III-A, the common-mode level at the source/drain of an NMOS switch is reduced to 0.6 V compared to the popular choice of V_DD/2, which eliminates the need for CMOS switches or clock bootstrapping. The chopping clock is created by using a T flip-flop, which divides the sampling clock frequency by two. The delays for setting up the triggering events (for the counter, DEM and DAC register) are implemented with cascaded inverter stages, which are loaded with MOS capacitors to allow for a compact integration with the digital circuitry. The internal biasing current is implemented with a PTAT current reference (constant-Gm) circuit. The input common-mode feedback (I-CMFB) of the IA is set to 0.6 V with a single-ended differential pair biased with 1 μA and stabilized externally using a 10 pF capacitor.
IV. MEASUREMENT RESULTS
The proposed readout architecture is implemented in a 0.18 μm CMOS triple-well process. The die micrograph, with an overlying transparent image of the layout with block annotations, is shown in Fig. 8. The active die area, excluding the bonding pads, the logo, and the I/O drivers, is approximately 0.43 mm². There are four separate power domains in the chip, as can be seen from the power cuts in Fig. 8: two for the analog and two for the digital domains. The IA, DAC and the biasing network fall under one 1.8 V analog domain, while the SC circuits and the flash ADC have a separate analog domain. Similarly, in the digital domain, the DEM and the remaining on-chip logic are powered with a 1.8 V digital supply, which is separated from the digital I/O supply. Additionally, n-well trenches are used to increase the lateral isolation of the noise-sensitive blocks. All digital circuits are placed in deep n-wells to improve the shielding of the substrate from the digital activity. The chip is bonded to a 44-pin plastic leaded chip carrier package for testing purposes. The packaged chip, i.e., the device-under-test (DUT), is characterized using a custom evaluation board (EVB).
A. Measurement Setup
An ultra-low distortion signal generator (Keysight U8903B) is used to create a pure test-tone inside the signal bandwidth, as required for measuring the linearity of a DUT. The balanced test-tone is first processed by a common-mode (CM) level shifter (AD8475), which sets the common-mode voltage to 0.6 V. The common-mode level of the test-tone is coarsely defined via an external current source and is regulated by the I-CMFB of the IA. The on-board anti-aliasing RC filter is implemented with a 200 pF ceramic capacitor and 1 kΩ resistors with a ±0.1% tolerance rating. The voltage supply and reference levels are provided to the DUT via on-board voltage regulators (LP5912, TPS7A88, TPS7A8300). A function/arbitrary waveform generator (Agilent 33250A) is used to provide the external oversampling clock. The output data stream is captured by a digital logic analyzer (Tektronix TLA621), which is triggered externally by the aforementioned oversampling clock. A Hann² window is applied to the data stream, before calculating a 65536-point fast Fourier transform (FFT), to prevent signal leakage due to the lack of synchronization between the clock and the signal generator. The resulting power spectral density (PSD) is averaged eight times to reduce the measurement uncertainty in the results. The DC offset introduced by the custom EVB is removed in post-processing.
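To give a feel for the spectral analysis settings, the following Python sketch computes the FFT bin width and two common figures of merit for a squared Hann window. It assumes that the window in the measurement setup is the Hann window raised to the second power and uses the periodic window definition; both are assumptions, and the function names are hypothetical.

```python
import math

F_S_HZ = 153.6e3   # sampling frequency from the text
N_FFT = 65536      # FFT length from the text


def hann_squared(n_points):
    """Periodic Hann window raised to the second power (assumed window)."""
    return [(0.5 - 0.5 * math.cos(2 * math.pi * n / n_points)) ** 2
            for n in range(n_points)]


def coherent_gain(window):
    """Mean of the window samples (amplitude scaling of a coherent tone)."""
    return sum(window) / len(window)


def enbw_bins(window):
    """Equivalent noise bandwidth in FFT bins: N * sum(w^2) / sum(w)^2."""
    return len(window) * sum(w * w for w in window) / sum(window) ** 2


if __name__ == "__main__":
    w = hann_squared(N_FFT)
    print("bin width [Hz]:", F_S_HZ / N_FFT)
    print("coherent gain:", coherent_gain(w))
    print("ENBW [bins]:", enbw_bins(w))
```

The wider main lobe of such a window is the price paid for low leakage, which matters when the tone is not synchronized to the clock.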
B. Measurement Results
The measured power consumption of the DUT, excluding the I/O pads, is approximately 54 μW. Post-layout simulations show that the power distribution is as follows: 33.2 μW is consumed cumulatively by the IA, DAC, biasing network and I-CMFB; 11.7 μW is consumed mostly by the switched-capacitor circuits and partly by the flash ADC; the remaining 9.1 μW is dissipated in the digital circuits.
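The simulated power breakdown can be summed up and expressed as shares of the total with a short Python sketch. The dictionary keys are hypothetical labels for the three groups named in the text.

```python
# Post-layout power breakdown from the text, in microwatts.
POWER_UW = {
    "IA, DAC, biasing, I-CMFB": 33.2,
    "SC circuits and flash ADC": 11.7,
    "digital circuits": 9.1,
}


def total_power_uw(breakdown=POWER_UW):
    """Sum of all contributions in microwatts."""
    return sum(breakdown.values())


def shares_percent(breakdown=POWER_UW):
    """Share of each contribution relative to the total, in percent."""
    total = total_power_uw(breakdown)
    return {name: 100.0 * p / total for name, p in breakdown.items()}


if __name__ == "__main__":
    print("total [uW]:", total_power_uw())
    for name, share in shares_percent().items():
        print(name, round(share, 1), "%")
```

The three contributions add up to the measured 54 μW, with the analog front-end group taking the largest share.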
The PSD measurements, when the DEM is turned on and off, are shown in Fig. 9. The harmonic distortion, which is generated by the device mismatches in the current-steering DAC, dominates the output spectrum of the DUT when the DEM is turned off. Note that the even-order harmonics are not canceled since they are caused by the mismatch errors between the unit elements (and not by the transistor nonlinearity). The spurious-free dynamic range (SFDR) improves by approximately 25 dB when the DEM is activated and peaks at around 95 dBc. Subsequently, the SFDR is no longer limited by the mismatch errors in the DAC, but by the nonlinearity of the IA. Based on the measured PSD, the signal-to-noise-and-distortion ratio (SNDR) is approximately 88.1 dB for an in-band tone at 37.5 Hz with an amplitude of −4.7 dBFS. Considering the frequency resolution of the FFT and the applied averaging in the measurement setup, we can conclude that the SNDR is limited by the in-band noise and not by the nonlinearity of the DUT. These findings are consistent in all available packaged samples, as shown in Fig. 10. As previously explained in Section II-C, there is a noticeable peak in the noise-shaped spectrum in Fig. 9, which is more pronounced in the measurements than predicted by the simple linear model, due to a gain error in the IA at the chopping frequency. It is worth noting that the DC voltage offset of the DUT spreads to the two adjacent bins, as highlighted in Fig. 9 and Fig. 11, due to the windowing function and should not be mistaken for flicker noise.
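The relation between the measured SNDR and the quoted effective number of bits can be checked with the standard conversion ENOB = (SNDR − 1.76 dB)/6.02 dB. The Python sketch below uses this textbook formula; the function names are hypothetical.

```python
def enob_from_sndr(sndr_db):
    """Standard conversion from SNDR in dB to effective number of bits."""
    return (sndr_db - 1.76) / 6.02


def sndr_from_enob(enob_bits):
    """Inverse conversion: SNDR in dB for a given effective number of bits."""
    return 6.02 * enob_bits + 1.76


if __name__ == "__main__":
    print("ENOB at 88.1 dB SNDR:", enob_from_sndr(88.1))
    print("SNDR needed for 14 bits [dB]:", sndr_from_enob(14))
```

An SNDR of 88.1 dB corresponds to slightly more than 14 bits, which matches the approximately 14-ENOB figure reported for the prototype.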
The noise floor and the DC offset of the DUT are measured by connecting the differential inputs of the CM level shifter to the ground of the EVB. The results of the noise measurement are shown in Fig. 11, while a histogram of the in-band rms noise values across samples is given in Fig. 12. The high-pass modulator outperforms its low-pass counterpart in both the rms noise and the average input DC offset. The DC offset of the HP modulator is approximately 25.85 μV rms across all samples. As expected, the characteristic slope (−10 dB/decade) of the flicker noise dominates the in-band frequencies when the low-pass mode is activated. By configuring the loop filter for high-pass operation, the rms noise is reduced by more than three times on average, as shown in Fig. 12. Note that the spurious out-of-band tones that are present in the noise measurements are caused by the pattern noise generated by the modulator for a constant input (offset).
The dynamic range of the DUT is measured by sweeping the input amplitude at 37.5 Hz from −77.7 to −3.5 dBFS and is shown in Fig. 13. Considering that dBFS refers to
$$\mathrm{dBFS} = 20 \log_{10}\left(\frac{v_{\mathrm{diff,in}}}{V_{\mathrm{FS}}}\right),$$
where V_FS is the DAC full-scale value of approximately 301.37 mV peak and v_diff,in is the input amplitude, a peak SNDR/SNR of 88/89 dB is measured for an input amplitude of −4.82 dBFS. The proposed architecture achieves 92 dB of dynamic range (DR). As can be seen from Fig. 10, the results are in good agreement across all samples. Similarly, the peak SNDR/SNR performance remains relatively consistent for various in-band test frequencies, as shown in Fig. 14.
The power supply rejection ratio (PSRR) measurements are performed by applying a −7.5 dBFS test tone at 57.5 Hz together with the 1.8 V supply voltage. The corresponding PSRR is approximately 100 dB on average, as shown in Fig. 15. The same setup is used for the common-mode rejection ratio (CMRR) measurements, but with the common-mode voltage set to 0.6 V. The measured CMRR of the DUT is approximately 98 dB on average, as shown in Fig. 15. The largest impedance mismatch between the two recording electrodes, Z_elect1 and Z_elect2, for which the intrinsic CMRR of the system deteriorates by 3 dB, is approximately 1.2 kΩ (assuming Z_in >> Z_elect1 + Z_elect2). The DUT input impedance, Z_in, is limited to 94 MΩ at 57.5 Hz due to the additional parasitic capacitance from the chip package.
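As a quick numerical check of the quantities in this paragraph, the sketch below converts between dBFS and peak input voltage and estimates the CMRR bound set by electrode mismatch. It assumes that dBFS is defined as 20·log10(v_diff,in/V_FS) and uses a simple voltage-divider model, CMRR ≈ Z_in/ΔZ for Z_in much larger than the electrode impedances; that model and the helper names are illustrative assumptions, not the authors' stated derivation.

```python
import math

V_FS_PEAK = 0.30137  # DAC full-scale value in volts (peak), taken from the text

def dbfs_to_volts(level_dbfs, v_fs=V_FS_PEAK):
    """Peak differential input voltage for a level given in dBFS."""
    return v_fs * 10.0 ** (level_dbfs / 20.0)

def volts_to_dbfs(v_peak, v_fs=V_FS_PEAK):
    """Level in dBFS for a peak differential input voltage."""
    return 20.0 * math.log10(v_peak / v_fs)

def mismatch_limited_cmrr_db(z_in, delta_z):
    """CMRR bound from an electrode mismatch delta_z (assumed model, z_in >> delta_z)."""
    return 20.0 * math.log10(z_in / delta_z)

v_peak_sndr = dbfs_to_volts(-4.82)                  # amplitude at peak SNDR
level_240mv = volts_to_dbfs(0.240)                  # level of a 240 mV peak input
cmrr_bound = mismatch_limited_cmrr_db(94e6, 1.2e3)  # 94 MOhm input, 1.2 kOhm mismatch
```

Under these assumptions the −4.82 dBFS input corresponds to roughly 173 mV peak, and a 1.2 kΩ mismatch against a 94 MΩ input impedance gives a bound close to the measured intrinsic CMRR of 98 dB, which is in line with the mismatch at which the text reports a 3 dB deterioration.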
V. DISCUSSION
A performance comparison of the proposed system with the state-of-the-art ASICs for ECoG recording is summarized in Table I. The total harmonic distortion (THD) of the proposed architecture surpasses that of the state-of-the-art, in particular when considering the maximum peak-to-peak differential input (v_pp,max). In conventional readout architectures, the THD and noise are usually reported separately for the analog front-end and the ADC, without capturing the achievable ENOB of the entire system. Therefore, we combine the noise, v_in,n, and the THD performance (for v_pp,max) reported in the literature to calculate the peak SNDR, and subsequently the ENOB, by using the following expression:
Compared to the SNDR expression from [27], in (8) we also include the linearity performance of the system. The assumption is that the THD is dominated by the third-order harmonic, which is reasonable for a differential architecture but also optimistic, since it neglects the other harmonics. The ENOB of the system is then calculated as:
$$\mathrm{ENOB} = \frac{\mathrm{SNDR}' - 1.76}{6.02}.$$
Having the ENOB, we can now compare the system-level power efficiency of different readout designs by using the well-known Walden figure-of-merit (FOM_W), which is given by:
$$\mathrm{FOM}_W = \frac{P_{\mathrm{total}}}{2 \cdot \mathrm{BW} \cdot 2^{\mathrm{ENOB}}},$$
where P_total is the total power consumption, and BW is the signal bandwidth. The FOM_W represents the amount of invested energy in the readout architecture and has recently been used to compare neural recording interfaces for action potentials [41]. As can be seen in Table I, the proposed ECoG readout achieves an excellent power efficiency of 4.37 pJ/conversion-step with an ENOB of 14.33. Additionally, according to the Schreier FOM, the proposed solution achieves a power efficiency of 155 dB, which is among the best of the state-of-the-art, due to the high ENOB. The presented chip prototype, however, occupies a large area compared to the state-of-the-art, primarily due to the sampling capacitors in the first high-pass filter. Nevertheless, if we calculate the product of the FOM_W and the area, a metric recently introduced in [41], the proposed architecture shows a moderate utilization of both power and area. On the other hand, the input-referred noise (IRN) is also comparatively higher due to the discrete-time operation of the loop filter. A potential solution, at the cost of a reduced maximum input range, is to increase the IA voltage gain to approximately 4 (i.e., 12 dB), which should reduce the total IRN by half and the peak input swing to ≈ ±50 mV, which is currently a more popular value among the state-of-the-art.
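The figures quoted in this discussion can be cross-checked with a few lines of Python. The sketch assumes the usual Walden form FOM_W = P_total/(2·BW·2^ENOB) and an SNDR-based Schreier form FOM_S = SNDR + 10·log10(BW/P_total); the function names are illustrative, and the choice of SNDR rather than DR in the Schreier FOM is an assumption made because it comes close to the quoted 155 dB.

```python
import math

def enob_from_sndr(sndr_db):
    """Effective number of bits from the SNDR in dB."""
    return (sndr_db - 1.76) / 6.02

def walden_fom(p_total_w, bw_hz, enob):
    """Walden FOM in joules per conversion step (assumed form)."""
    return p_total_w / (2.0 * bw_hz * 2.0 ** enob)

def schreier_fom(sndr_db, p_total_w, bw_hz):
    """Schreier FOM in dB, using the SNDR-based variant (assumption)."""
    return sndr_db + 10.0 * math.log10(bw_hz / p_total_w)

P_TOTAL = 54e-6   # total power in watts, from the measurement results
BW = 300.0        # signal bandwidth in hertz

enob = enob_from_sndr(88.0)                          # peak SNDR of 88 dB
fom_w_pj = walden_fom(P_TOTAL, BW, 14.33) * 1e12     # pJ per conversion step
fom_s_db = schreier_fom(88.0, P_TOTAL, BW)           # dB
```

With the measured 54 μW, 300 Hz and 88 dB, these expressions give values close to the 14.33-ENOB, 4.37 pJ/conversion-step and 155 dB quoted in the text.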
VI. CONCLUSION
This paper has presented a delta-sigma based readout architecture with a 92 dB dynamic range, which can be used for ECoG recording in the presence of brain stimulation artifacts and electrode offset. The readout architecture comprises a high-pass delta-sigma modulator with a tracking ADC and an embedded current-feedback IA. The proposed system can record a peak input signal of ±240 mV over a 300 Hz bandwidth with a 14.33-ENOB. Additionally, the proposed readout architecture achieves an excellent power efficiency and a moderate area efficiency of 4.38 pJ/conv.-step and 1.88 pJ·mm⁻²/conv.-step, respectively.
