Abstract-This paper presents a low-power ECG recording system-on-chip (SoC) with on-chip low-complexity lossless ECG compression for data reduction in wireless/ambulatory ECG sensor devices. The chip uses a linear slope predictor for data compression, and incorporates a novel low-complexity dynamic coding-packaging scheme to frame the prediction error into a fixed-length 16 bit format. The proposed technique achieves an average compression ratio of 2.25× on the MIT/BIH ECG database. Implemented in a standard 0.35 µm process, the compressor uses 0.565 K gates/channel, occupies 0.4 mm² for four channels, and consumes 535 nW/channel at 2.4 V for ECG sampled at 512 Hz. Its small size and ultra-low power consumption make the proposed technique suitable for wearable ECG sensor applications.
I. INTRODUCTION
HEALTHCARE spending is increasingly becoming the major source of expenditure in many countries, with the U.S. alone spending roughly 18% of its GDP on healthcare. Cardiovascular diseases are one of the leading causes of this spending. Aging populations and increasing life expectancies are expected to drive these expenses up sharply in the near future. The way forward to rein in these costs and improve quality of life is to focus on prevention and early detection of diseases by proactively monitoring the individual's health condition using low-cost wireless wearable sensors.
The main challenge in the development of a low-cost wearable electrocardiogram (ECG) sensor is the design of an ultralow-power ECG chip, which can acquire, process, and wirelessly transmit the ECG signal to a remote doctor via a personal gateway in real time. A high level of integration, with built-in signal acquisition and data conversion [1] , helps reduce the size and cost of such a sensor. The single largest source of power consumption in the sensor is the wireless transceiver. In the scenario of continuous ECG monitoring, a large amount of ECG data is acquired and it has to be either stored locally in a flash memory or transmitted wirelessly to a sensor gateway, resulting in large memory and high energy consumption at the sensor. In some cases, on-chip SRAM blocks are used to buffer the ECG data in order to facilitate burst-mode transmission. However, this results in large chip area [2] and increases the overall cost of the device. Local data compression is an attractive option for such devices. By reducing the amount of data through compression, it helps to minimize the power consumed by the radio for wireless transmission while reducing the size of on-chip SRAM/flash memory and sensor battery. Although lossy compression techniques provide higher compression ratios (CR), we focus on lossless schemes so as to prevent the loss of any signal useful in the diagnostic procedure [3] . Furthermore, lossy compression techniques have not been approved by medical regulatory bodies in many countries and hence cannot be used in commercial devices due to liability concerns. Most of the existing literature on lossless ECG compression predominantly focuses on achieving higher CR. In the context of wireless sensors and ambulatory devices, ultra-low-power operation, low-complexity implementation, and multi-channel support are also important to make sure that the energy and memory savings obtained from the compression are higher than what is consumed by the compressor itself.
This paper describes the development of an ECG acquisition chip with fully integrated lossless compression engine (first presented in [4] ) with ultra-low power, which can reduce the system-level power consumption by half without any loss of signal quality. The compression scheme does not require use of costly memory at transmitter or receiver and always generates a convenient fixed-length data output which avoids the need for further packaging. The design has low hardware complexity and achieves low power consumption.
The rest of the paper is organized as follows. In Section II, the system architecture is presented. The analog front-end is detailed in Section III. The compression scheme and performance evaluation are given in Section IV. The architecture of the proposed scheme and its implementation are discussed in Section V. Measurement results are shown in Section VI. Conclusions are given in Section VII. 
II. SYSTEM ARCHITECTURE OF ECG SOC CHIP
The system block diagram of the proposed ECG SoC is shown in Fig. 1. The front-end consists of four ECG recording channels, a multiplexer (MUX), and a 12 bit successive approximation (SAR) ADC. The digital back-end includes a lossless compression block, a real-time clock (RTC) module, and a serial peripheral interface (SPI). To improve the ECG signal quality and suppress the 50/60 Hz power-line interference, a driven-right-leg (DRL) circuit is included. The output of the DRL is connected to the right-leg (RL) electrode to stabilize the subject potential and improve the common-mode suppression. A low-power 32.768 kHz crystal oscillator driver and a CMOS bandgap reference are also integrated on-chip in order to minimize the number of off-chip auxiliary circuits. The ADC sampling rate is configurable to either 256 or 512 Hz to balance signal quality against the amount of data. The whole chip is designed to work from a single 2.4 to 3.0 V power supply.
III. ANALOG FRONT-END
The analog front-end (AFE) is often a bottleneck of the system in terms of noise and linearity performance. The input-referred noise needs to be low enough for accurate biomedical data acquisition. The signal distortion should be less than 1% even at the 3 V full-scale output. Moreover, as the sampling rate of the ADC is 256/512 Hz, a higher-order low-pass filter with a cut-off frequency below 100/200 Hz is required to minimize aliasing errors. The limited power budget discourages extra active anti-aliasing filters. In our design, the signal bandwidth is reduced by designing a low-bandwidth operational transconductance amplifier (OTA) for both the instrumentation amplifier (IA) and the programmable gain amplifier (PGA). This section highlights the circuit design considerations and trade-offs for the analog front-end.
Fig. 2 shows the architecture of a single-channel ECG front-end. The AFE includes a low-noise IA, a PGA, and a rail-to-rail output buffer (BUF). The IA amplifies the ECG signal with a fixed gain of 125. Tunable pass-band gain is achieved by tuning the PGA gain through the control bits G<1:0>. For anti-aliasing purposes, the low-pass cut-off frequency can be adjusted within 35-175 Hz by changing the PGA's frequency response. The unity-gain buffer that follows improves the settling time of the MUX output signal, reducing the residual errors [1].
A. ECG Channel With Pseudo Resistors
The IA adopts the capacitively coupled technique to block the input DC offset. While the amplitude of a typical ECG is on the order of millivolts, the DC offset between the differential Ag/AgCl wet electrodes could be up to 200 mV, or even higher when using dry electrodes. To avoid saturating the amplifiers and to increase the input dynamic range, the input offset must be cancelled properly.
A simple and power-efficient way to block the DC offset is to use a high-pass filter. Since the low-frequency component of the ECG trace around 0.5 Hz still contains important information for ST segment analysis, the high-pass corner needs to be set at 0.05 Hz or lower. Large capacitors and resistors are hence required to achieve such a low cut-off frequency. In our design, pseudo resistors [5] replace the passive resistors to save chip area. The pseudo resistors are normally constructed from two or more diode-connected PMOS transistors in parallel. The bulk of each PMOS is connected to the source or drain, so that the performance of the pseudo resistor is less dependent on the absolute voltage applied to it.
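To give a feel for the component values involved, the following sketch evaluates the first-order high-pass corner f_c = 1/(2πRC). It assumes that the corner is set by the pseudo resistor together with the roughly 0.5 pF capacitance quoted for the IA stage; the function names are illustrative and not part of the design.

```python
import math

def highpass_corner_hz(resistance_ohm, capacitance_f):
    """Corner frequency of a first-order RC high-pass filter."""
    return 1.0 / (2.0 * math.pi * resistance_ohm * capacitance_f)

def required_resistance_ohm(corner_hz, capacitance_f):
    """Smallest resistance that keeps the corner at or below corner_hz."""
    return 1.0 / (2.0 * math.pi * corner_hz * capacitance_f)

# Assumed values: 0.05 Hz target corner and 0.5 pF capacitance.
r_min = required_resistance_ohm(0.05, 0.5e-12)
print(f"required resistance: {r_min / 1e12:.1f} Tohm")
```

Under these assumptions the required resistance is several teraohms, which is why a passive resistor is impractical on chip and a pseudo resistor is used instead.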
Two types of pseudo resistors are used in the IA and the PGA stage respectively. The simulated resistance versus input voltage across the pseudo resistors is plotted in Fig. 3. As the capacitance in the IA stage is around 0.5 pF, the equivalent resistance of the pseudo resistor in this stage has to be on the order of teraohms to keep the high-pass corner at 0.05 Hz or lower. Because the input amplitude is small for the IA stage, the pseudo resistor there does not need to support a large input, so the Type A design with two PMOS transistors is adequate. For the PGA stage, however, the input amplitude could be as high as 1.5 V. Since the resistance of the Type A design is around 1 MΩ at 1.5 V, the current flowing through the pseudo resistor would be about 1 µA when the output amplitude is large, causing loading errors at the output. It is therefore necessary to use a different pseudo resistor structure with higher resistance over a wide input range. By cascading two Type A designs in series, the resulting resistance of the Type B design is 10× higher than Type A at 1.5 V. As shown in the test results in Section VI, the cascaded pseudo resistor helps to achieve less than 0.4% total harmonic distortion (THD) at 3 V output.
B. Operational Transconductance Amplifier (OTA)
The OTA transconductance can be changed through the input pairs, as given by (1). To improve the noise-to-power efficiency, all of the input transistors are biased in the subthreshold regime. The thermal noise current of a MOS transistor operated in weak inversion [6] can be modeled as

$$\overline{i_n^2} = 2 n k T g_m \, \Delta f \qquad (2)$$

where $n$ is the subthreshold slope factor, which is around 1.3 as simulated in the target technology. The noise contributions of the cascode transistors and the tail current sources are negligible. Also, the first stage dominates the noise for a typical two-stage OTA. Based on these simplifications, the input-referred thermal noise of this OTA is approximated by (3).
The noise efficiency factor (NEF) [7] is used to benchmark the noise-to-power trade-off. It is defined by

$$\mathrm{NEF} = V_{\mathrm{rms,in}} \sqrt{\frac{2 I_{\mathrm{tot}}}{\pi \cdot U_T \cdot 4kT \cdot \mathrm{BW}}} \qquad (4)$$

where $V_{\mathrm{rms,in}}$ is the input-referred rms noise, $I_{\mathrm{tot}}$ is the total current, and BW is the amplifier bandwidth. The drain currents of the individual branches are expressed as fractions of the total current. Under the EKV model [8], the $g_m$ of a subthreshold MOS transistor is approximately

$$g_m \approx \frac{I_D}{n U_T} \qquad (5)$$

where $U_T \approx 26$ mV. If we further ignore the flicker noise and the noise contributions from the remaining transistors and the common-mode feedback (CMFB) circuit, the optimal NEF for this OTA architecture is given by (6). Nonetheless, it should be noted that the current of the first stage cannot exceed the total current $I_{\mathrm{tot}}$. To minimize the NEF, the share of the current flowing through the inverter-based branch should be maximized to take advantage of the current reuse between its transistors. If all the current flows to the inverter-based amplifier branch, the minimum NEF of 1.3 can be achieved. Unfortunately, pursuing the lowest NEF would impair other design targets such as the bandwidth limit and the settling time, as discussed in the following.
First, unlike in high-speed designs, it is advantageous to limit the bandwidth of the IA for ECG capturing. The intrinsic low-pass characteristic of the OTA helps to suppress the high-frequency noise and artifacts without any extra active filters. This issue becomes even more critical as the sampling rate is 256/512 Hz or lower. Since the bandwidth of the IA is given by (7), both the IA pass-band gain and the Miller-compensation capacitance values have to be increased to limit the bandwidth. As the gain is also determined by the capacitor ratio given in Fig. 2, either approach demands significant capacitor area on chip. Alternatively, the transconductance can be reduced, with the side effect of increasing the noise floor.
Second, a large output slew rate is required to mitigate the output distortion. Because the IA gain is designed to be 125, harmonic distortion could be introduced even at the IA stage. The most straightforward way to improve the linearity is to increase the static current of the output stage. Simulation also shows that a large output-stage current improves the baseline recovery time after resetting the IA. But excessive current in the second stage would inevitably affect the power utilization and the transconductance. An output boosting technique called quasi-floating gate [9] is also used to push the second stage of the OTA into class-AB operation. The gates of two second-stage transistors are partially controlled by the first stage's outputs through small coupling capacitors, so that the transconductance of the second stage is enhanced.
Last but not least, the common-mode feedback circuit should be carefully designed to avoid stability issues. As some of the drain currents are controlled by the common-mode feedback circuit, making the CMFB-controlled share of the current too small would cause the CMFB to fail to adjust the common-mode current. On the other hand, making it too large is likely to introduce CMFB stability issues. Moreover, the CMFB circuit itself requires a minimum current to ensure sufficiently fast common-mode settling and loop stability.
With all the trade-offs mentioned above, we allocate half of the total current to the second stage to improve the output linearity and the baseline recovery time after reset. Both current ratios are set to 0.1, considering the bandwidth upper limits. The optimal NEF now becomes 2.4. It is worth noting that this simplified calculation does not include the flicker noise, which is reduced by using a large width and length for all input transistors.
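As a numerical illustration of the NEF benchmark, the sketch below implements the standard NEF definition from [7] and applies it to an idealised, thermal-noise-only differential input stage in weak inversion. The stage model is an assumption made for illustration (noise current density 2nkT·g_m per device, g_m = I_D/(n·U_T), first-order noise bandwidth of π/2·BW, no second stage or CMFB current); it is not the full expression used for the optimal NEF of this OTA, and all function names are hypothetical.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant, J/K
Q_E = 1.602176634e-19   # elementary charge, C

def thermal_voltage(temp_k=300.0):
    """Thermal voltage U_T = kT/q, about 26 mV at room temperature."""
    return K_B * temp_k / Q_E

def nef(v_rms_in, i_total, bandwidth_hz, temp_k=300.0):
    """Noise efficiency factor for a given input noise, current and bandwidth."""
    u_t = thermal_voltage(temp_k)
    return v_rms_in * math.sqrt(
        2.0 * i_total / (math.pi * u_t * 4.0 * K_B * temp_k * bandwidth_hz))

def ideal_input_noise_rms(i_branch, devices_per_side, n_slope,
                          bandwidth_hz, temp_k=300.0):
    """Input-referred rms noise of an idealised differential stage.

    devices_per_side = 1 models a plain differential pair;
    devices_per_side = 2 models an inverter-based branch in which an
    NMOS and a PMOS reuse the same branch current (assumed equal g_m).
    """
    u_t = thermal_voltage(temp_k)
    gm = i_branch / (n_slope * u_t)
    # Two sides, each contributing 2nkT / (devices_per_side * gm) in V^2/Hz.
    psd = 2.0 * (2.0 * n_slope * K_B * temp_k) / (devices_per_side * gm)
    return math.sqrt(psd * (math.pi / 2.0) * bandwidth_hz)

i_branch = 0.5e-6   # illustrative branch current, not a measured value
for devices in (1, 2):
    v_n = ideal_input_noise_rms(i_branch, devices, 1.3, 250.0)
    print(devices, round(nef(v_n, 2.0 * i_branch, 250.0), 3))
```

With n = 1.3 this idealised model gives an NEF of about 1.838 for a plain differential pair and 1.3 when the current is fully reused, which agrees with the minimum NEF of 1.3 quoted in the text. Spending current in the second stage and the CMFB circuit raises the value above this bound.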
C. Multiplexer (MUX) and Analog-to-Digital Converter (ADC)
All channels are multiplexed to the ADC for AD conversion using an analog MUX. The ADC is based on the dual-capacitive-array architecture proposed in [10] . The MUX is implemented using bootstrapped switches as shown in Fig. 5 . The bootstrapped technique allows reduced-size NMOS to be used as the switch. Consequently, the associated parasitic capacitance and on-resistance of the switches are much smaller as compared to the conventional transmission gate. This not only ensures that the system bandwidth is not limited by the multiplexer but also minimizes the signal interference between channels. Since the breakdown voltage is 5 V in the target technology, device M2 is chosen as a diode-connected NMOS. With NMOS threshold voltage of about 1 V, this limits the boosted voltage to less than 5 V under a 3 V supply. As shown in the timing diagram, the MUX multiplexes four inputs to the ADC in sequential order. Its switching is misaligned with ADC sampling and sufficient settling time is assigned before ADC sampling to minimize signal distortion. Buffers are inserted before and after the MUX in order to address the driving issue.
IV. LOSSLESS DATA COMPRESSION SCHEME
The block diagram of the proposed compression-decompression scheme is illustrated in Fig. 6. A short-term linear predictor is used to model the ECG and de-correlate the input signal. Here, the current sample value, $x(n)$, is estimated from past samples:

$$\hat{x}(n) = \sum_{k=1}^{p} a_k \, x(n-k) \qquad (8)$$

where $\hat{x}(n)$ is the estimate of the sample $x(n)$, $a_k$ is the predictor coefficient, and $p$ is the order of the predictor. Knowing that an ECG signal has minimal variation between samples and that most of the signal energy is in the low frequencies, differentiators of orders ranging from 1 to 4 with integer coefficients are tested using MIT/BIH data to select an optimal predictor with minimal complexity. The coefficients of the 1st to 4th order predictors are $\{1\}$, $\{2, -1\}$, $\{3, -3, 1\}$, and $\{4, -6, 4, -1\}$, respectively. The predicted value is then subtracted from the actual value to obtain the prediction error, i.e., $e(n) = x(n) - \hat{x}(n)$, of the current sample. For optimal compression, the prediction error should be as low as possible.
Fig. 7 shows the prediction errors from MIT/BIH tape 101 for various predictors. It is observed from Fig. 7 that, due to the variation in signal statistics, predictors perform differently for various segments of the ECG signal. For segments with large amplitude variation, higher order predictors perform better, and for slowly varying segments, lower order predictors perform better. Since large amplitude variations are mostly limited to the QRS segment, whose duration is less than 12% of one ECG cycle, lower order predictors are expected to have better average prediction performance. To verify this and find the predictor with the highest average performance, the mean absolute prediction error (MAPE) and root-mean-square prediction error (RMSPE) of the four predictors are computed for all the records in the MIT/BIH Arrhythmia database using (9) and (10):

$$\mathrm{MAPE} = \frac{1}{N} \sum_{n=1}^{N} |e(n)| \qquad (9)$$

$$\mathrm{RMSPE} = \sqrt{\frac{1}{N} \sum_{n=1}^{N} e(n)^2} \qquad (10)$$

where $N$ is the number of samples in the ECG record. The second-order predictor yields the lowest overall MAPE and RMSPE and is therefore chosen for de-correlation of the ECG signal. The MAPE and RMSPE of two sample ECG records, tapes 104 and 203, are given in Table I to show the performance differences. After de-correlation, the dynamic range of $e(n)$ is much smaller than that of the ECG signal, as shown in Fig. 8.
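The predictor comparison described above can be sketched as follows. The coefficient sets are assumed to be the binomial (differentiator) coefficients for orders 1 to 4, the MAPE and RMSPE are taken as the usual mean absolute and root-mean-square error, and the input is a synthetic ramp rather than an MIT/BIH record; all names are illustrative.

```python
import math

# Integer-coefficient differentiator predictors (assumed binomial form).
# Entry k of each tuple multiplies x(n - k - 1).
PREDICTORS = {
    1: (1,),
    2: (2, -1),
    3: (3, -3, 1),
    4: (4, -6, 4, -1),
}

def prediction_errors(samples, order):
    """Return e(n) = x(n) - x_hat(n) for every n with enough history."""
    coeffs = PREDICTORS[order]
    errors = []
    for n in range(order, len(samples)):
        estimate = sum(a * samples[n - k]
                       for k, a in enumerate(coeffs, start=1))
        errors.append(samples[n] - estimate)
    return errors

def mape(errors):
    """Mean absolute prediction error."""
    return sum(abs(e) for e in errors) / len(errors)

def rmspe(errors):
    """Root-mean-square prediction error."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))

# Synthetic test signal: a linear ramp with a slope of 3 codes per sample.
ramp = [10 + 3 * n for n in range(20)]
for order in sorted(PREDICTORS):
    errs = prediction_errors(ramp, order)
    print(order, mape(errs), rmspe(errs))
```

On a pure ramp the first-order predictor leaves a constant error equal to the slope, while the second-order (slope) predictor removes it completely. Real ECG records behave less ideally, which is why the comparison in the text is made over the whole database.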
Note that, to achieve lossless compression, a maximum of $(m+2)$ bits is needed to fully represent the prediction error $e(n)$, where $m$ is the bit width of $x(n)$. With the proposed scheme, only the prediction error needs to be transmitted instead of the original ECG samples. At the receiver, the exact reverse process has to be carried out to reconstruct the original data, as in Fig. 6.
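The worst-case width of the prediction error can be checked with a short calculation. The sketch below assumes the second-order slope predictor, e(n) = x(n) - 2x(n-1) + x(n-2), and unsigned sample codes; both function names are illustrative.

```python
def twos_complement_width(value):
    """Smallest number of bits that holds value in two's complement."""
    width = 1
    while not -(1 << (width - 1)) <= value <= (1 << (width - 1)) - 1:
        width += 1
    return width

def worst_case_error_width(sample_bits):
    """Bits needed for e(n) = x(n) - 2x(n-1) + x(n-2) in the worst case."""
    lo, hi = 0, (1 << sample_bits) - 1      # assumed unsigned sample range
    e_max = hi - 2 * lo + hi                # x(n)=hi, x(n-1)=lo, x(n-2)=hi
    e_min = lo - 2 * hi + lo                # x(n)=lo, x(n-1)=hi, x(n-2)=lo
    return max(twos_complement_width(e_max), twos_complement_width(e_min))

for bits in (11, 12):
    print(bits, worst_case_error_width(bits))
```

For 11 bit samples this gives 13 bits, matching the 13 bit error width quoted elsewhere in the text for 11 bit input data; for the 12 bit ADC of this chip it gives 14 bits.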
To reduce the bit width of $e(n)$, a coding scheme can be used without incurring any data loss. Variable-length coding schemes like Huffman and arithmetic coding [11] are commonly used; they produce prefix-free codes [12] that can be packed closely. Though these approaches produce relatively optimal bit representations, the complexity of the encoder and decoder is quite high [11], [13]. For example, the Huffman coding method associates the most frequently occurring symbols with short codewords and the less frequently occurring symbols with long codewords. This symbol-codeword association table has to be pre-constructed using a statistical dataset. The implementation of this table, for a fully lossless compression, would require a large on-chip memory (i.e., $2^{13}$ locations for a 13 bit $e(n)$) [13], which would eventually compromise the savings in SRAM area achieved by the use of data compression. A suboptimal approach [12], selective Huffman coding, encodes only frequently used symbols with Huffman codes and leaves the remaining data unencoded, at the expense of a decrease in compression ratio. The hardware complexity of [12] is lower compared to the statistical approach. However, it still needs a symbol lookup table at the encoder as well as the decoder. In addition, these coding schemes produce variable-length codes at the output, which require further packaging into fixed-length packets before they can be stored in fixed-word-length SRAM/flash memory or interfaced through a standard I/O like SPI. This repackaging usually needs complex hardware like that proposed in [14].
We propose a simple coding-packaging scheme, which combines encoding and data packaging in one single step. It has very low hardware complexity and achieves small area and low power while producing a fixed-length 16 bit output. The flowchart of the scheme is presented in Fig. 9, where the error signal is represented in 2's complement format. As shown in Fig. 8, most error samples center on zero and hence can be represented by a few bits. Therefore, we only select the necessary LSBs and remove any MSBs that do not carry any information. However, these data cannot be packed closely because they do not have the prefix-free nature of Huffman codes. Consequently, a data framing structure, shown in Table II, with a unique header for different frame types, is formed to pack the error samples of varying widths into a fixed-length 16 bit output.
The dynamic data packaging scheme, as shown in Fig. 9, uses a priority encoding scheme to frame fixed-length data from samples of multiple bit widths. As the error data is received, the algorithm checks whether the maximum amplitude of a group of the last several signal samples exceeds the value that a particular frame type from Table II can accommodate. If it does, the algorithm proceeds with the next best framing option. The order of priority of frame generation is D, C, A, B, E. For Type E frames, the original sample itself is sent instead of the prediction error.
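The priority-based frame selection can be sketched as below. Table II is not reproduced in this text, so the frame table used here (number of samples and bits per sample for each type) is purely hypothetical; it was chosen only so that every payload leaves room for a header within 16 bits and so that Type D holds six samples. The greedy grouping and the handling of a short final group are likewise assumptions, not the chip's exact behaviour.

```python
FRAME_BITS = 16

# Hypothetical frame table in priority order:
# (frame type, samples per frame, bits per sample).
# The real Table II layout and header codes may differ.
FRAME_TABLE = [("D", 6, 2), ("C", 4, 3), ("A", 3, 5), ("B", 2, 7)]

def fits(value, width):
    """True if value is representable in `width` bits of two's complement."""
    return -(1 << (width - 1)) <= value < (1 << (width - 1))

def select_frames(errors, samples):
    """Greedily group prediction errors into frames by priority.

    errors[i] is the prediction error of samples[i]. When no error
    frame fits, a Type E frame carries the original sample instead.
    """
    frames = []
    i = 0
    while i < len(errors):
        for name, count, width in FRAME_TABLE:
            group = errors[i:i + count]
            if len(group) == count and all(fits(e, width) for e in group):
                frames.append((name, group))
                i += count
                break
        else:
            frames.append(("E", [samples[i]]))   # send the raw sample
            i += 1
    return frames

# Illustrative data only, not taken from an ECG record.
errors = [0, 1, -1, 0, 1, -2, 3, -4, 2, 1, 15, -16, 7, 60, -64, 500]
samples = list(range(100, 116))
for frame in select_frames(errors, samples):
    print(frame)
```

With this hypothetical table the 16 example samples are packed into five frames of types D, C, A, B, and E, in that order. Small errors share a frame with many neighbours, while a large excursion falls back to a raw sample.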
A. Performance Evaluation
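The bit compression ratio (BCR) of the scheme is estimated as in (11):

$$\mathrm{BCR} = \frac{\text{number of uncompressed bits}}{\text{number of compressed bits}} \qquad (11)$$

It indicates the number of uncompressed bits corresponding to each compressed bit.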
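As a worked example of this ratio, the sketch below computes the BCR for a fixed-length 16 bit frame output. The sample and frame counts are made-up illustrative numbers, not measured results, and the function name is hypothetical.

```python
def bit_compression_ratio(num_samples, bits_per_sample,
                          num_frames, frame_bits=16):
    """Uncompressed bits divided by compressed bits."""
    uncompressed = num_samples * bits_per_sample
    compressed = num_frames * frame_bits
    return uncompressed / compressed

# Illustrative numbers: 16 samples of 11 bits packed into 5 frames.
print(bit_compression_ratio(16, 11, 5))
```

In this example 176 uncompressed bits are carried in 80 compressed bits, so each compressed bit stands for 2.2 uncompressed bits.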
The proposed algorithm is evaluated using ECG records from 48 patients in the MIT/BIH Arrhythmia database, sampled at 360 Hz [15]. An average compression ratio of 2.25× is achieved across all the records. The compression performance of the proposed algorithm with the MIT/BIH Arrhythmia database is given in Table III and Fig. 10. The proposed coding scheme achieves 4% better performance than selective Huffman coding, while generating fixed-length coded output. Its compression ratio is around 15% lower than that of ideal Huffman coding, but at significantly lower hardware cost, as will be discussed in the next section. To ensure that the proposed scheme can be applied to any ECG dataset, we also tested the scheme with the MIT/BIH compression test database, which consists of 168 patient records [16]. An average compression ratio of 2.198 is achieved for all the patients.
IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 49, NO. 11, NOVEMBER 2014
V. ARCHITECTURE FOR THE PROPOSED COMPRESSOR
An overview of the hardware architecture for a 4-channel compressor is shown in Fig. 11. The design takes 12 bit, multiplexed 4-channel data from the ADC output. The incoming data is serially multiplexed in sequential channel order. Since the compressor works on each channel's data independently, de-multiplexing has to be performed before further processing the data. The data from each channel is identified by a 2 bit channel-select header appended to the ECG sample by the ADC. Each of the data streams is fed into a separate slope predictor for computing the prediction error. The individual predictors are gated with the channel-select signal to reduce the effective data switching. Since the incoming data is serially multiplexed while the slope predictor requires two consecutive samples from the same channel to produce the linear prediction, we use buffers to form two consecutive samples from the serially multiplexed input stream. This way, the same arithmetic hardware for computing the linear prediction can be shared among all channels.
When initializing the compression operation, the first two samples of all channels have to be explicitly transmitted or predetermined. This is to aid the initialization of the decompression process at the receiver. In our design, we predetermined the first two samples of all channels as zeroes, i.e., predictor data registers were set to zeroes during initialization. As a result, initial samples need not be explicitly transmitted, which in turn simplifies the hardware implementation.
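The effect of the zero-initialised predictor registers can be illustrated with a minimal encoder/decoder pair for a single channel. This is a behavioural sketch of the slope predictor, x_hat(n) = 2x(n-1) - x(n-2), and says nothing about the actual register-level implementation; the function names are illustrative.

```python
def encode(samples):
    """Prediction errors of the slope predictor, registers reset to zero."""
    prev1 = prev2 = 0                      # x(n-1) and x(n-2) after reset
    errors = []
    for x in samples:
        errors.append(x - (2 * prev1 - prev2))
        prev2, prev1 = prev1, x
    return errors

def decode(errors):
    """Inverse process at the receiver, starting from the same zero state."""
    prev1 = prev2 = 0
    samples = []
    for e in errors:
        x = e + (2 * prev1 - prev2)
        samples.append(x)
        prev2, prev1 = prev1, x
    return samples

ecg_like = [5, 7, 9, 11, 40, 90, 60, 12, 10, 9]   # illustrative values
assert decode(encode(ecg_like)) == ecg_like
print(encode([5, 7, 9, 11]))   # [5, -3, 0, 0]
```

Because both sides start from the same all-zero state, the first two errors simply absorb the start-up transient and no initial samples need to be sent separately.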
The next step is to compute the minimum bit width required for representing each prediction error sample, $e(n)$, in 2's complement format. The bit width required for each error sample is computed and loaded into a 6-word register. Accordingly, the error amplitude comparisons required for data framing can be reduced to comparing the bit widths of the error samples. Since the data framing format shown in Table II only accepts five types of data packets (i.e., 2, 3, 5, 7, and 8 bits), only bit widths at these granularities are computed by the BW compute blocks shown in Fig. 12(a), each of which checks whether the incoming sample can be represented using the corresponding number of bits. A priority encoder is used to find the minimum bit width required for representing the sample. The encoder assigns the highest priority to the block output that corresponds to the lowest required bit width. The encoding table is shown in Fig. 12(c).
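The bit-width computation can be mimicked in software as follows. The sketch assumes a 14 bit two's-complement error word (12 bit samples plus two bits) and tests each allowed width by checking that the leading bits of the word are all equal, which is one common way to detect redundant sign bits; it is not a transcription of the circuit in Fig. 12, and the names are hypothetical.

```python
ERROR_BITS = 14                     # assumed error word width
ALLOWED_WIDTHS = (2, 3, 5, 7, 8)    # granularities named in the text

def leading_bits_equal(word, k, total_bits=ERROR_BITS):
    """True if the top (total_bits - k + 1) bits are all 0 or all 1.

    In that case the k least significant bits already hold the value
    together with its sign.
    """
    top = word >> (k - 1)
    mask = (1 << (total_bits - k + 1)) - 1
    return top == 0 or top == mask

def quantized_width(error, total_bits=ERROR_BITS):
    """Smallest allowed width for `error`, or None if none is large enough."""
    word = error & ((1 << total_bits) - 1)   # two's-complement encoding
    for k in ALLOWED_WIDTHS:                 # priority: smallest width first
        if leading_bits_equal(word, k, total_bits):
            return k
    return None

for e in (1, -2, 2, -3, 4, 127, -128, 128):
    print(e, quantized_width(e))
```

Scanning the widths from smallest to largest plays the role of the priority encoder: the first width that passes the check is the one reported. A result of None corresponds to a sample that cannot be carried in an error frame.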
The schematic of the bit-width computation logic is illustrated in Fig. 12(b). The circuit checks whether the leading MSBs of the error sample, represented in 2's complement, are all '0' or all '1'. If so, the sample can be represented with the correspondingly smaller number of bits. The error samples are then packaged into fixed 16 bit frames, based on the minimum required bit widths of the error samples. The overview of the framing block hardware is given in Fig. 13. The original data samples, error samples, and corresponding minimum required bit widths are loaded into a 6-word register. Once the register is full, the framing controller packs the data from the register into a single 16 bit frame based on the input from the frame-enabling logic. The frame-enabling logic evaluates whether the current samples in the register can be packed into one of the five frame types shown in Table II. This is implemented using simple comparators, which compare the bit-width values in the register with those required for each frame type. The implementation of frame-enabling logic is shown in 
A. Automatic Resynchronization
Similar to other schemes in the literature [11] , [17] , the reconstructed signal will be in error and cannot be recovered if there is a packet loss in the wireless channel. To mitigate this issue, we devised a simple technique wherein we periodically send resynchronization frames (i.e., frame Type E), so that even if a packet loss occurs, the decompression can be restored by the time the next resynchronization frame appears. These frames contain the original samples, not the prediction error, and hence the receiver is able to resynchronize the register values of the decompression block, even in the case of packet loss. In the case of a packet loss, the maximum amount of data lost will be limited to the time between the packet loss and the appearance of the next resynchronization frame, as illustrated in Fig. 15(a) . The resynchronization frames are transmitted every 4 seconds in this design. The selection of a 4 s interval for resynchronization is based on observations made in past, i.e., the corrupted data is limited to less than 1% if there is a packet loss every 10 minutes [18] . Certainly, this interval can be adjusted based on transceiver performance. In case of a much higher rate of loss, error correction schemes provided by the transceiver or higher level storage and retransmission mechanisms should be enabled. This is true even for transmission of noncompressed biomedical data, as data loss is generally unacceptable.
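The reasoning behind the 4 s interval can be checked with simple arithmetic. The sketch assumes the worst case in which each packet loss corrupts the data for a full resynchronization interval; the function name is illustrative.

```python
def worst_case_corrupted_fraction(resync_interval_s, time_between_losses_s):
    """Upper bound on the fraction of data corrupted by packet loss."""
    return resync_interval_s / time_between_losses_s

# One packet loss every 10 minutes, resynchronization every 4 seconds.
fraction = worst_case_corrupted_fraction(4.0, 10 * 60.0)
print(f"{100 * fraction:.2f}% of the data corrupted at most")
```

With these numbers at most about 0.67% of the data is affected, which is consistent with the bound of less than 1% stated in the text.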
The control signal for enabling the resynchronization, RESYNC_EN, is generated for the framing controller as shown in Fig. 15(b) and (c) . In typical laboratory conditions, using the proposed chip, such errors never occurred.
B. Framing Controller
The framing controller block creates 16 bit data frames based on the input from the prediction error. To keep track of the number of valid samples in the registers after each framing operation, a 3 bit counter, as shown in Fig. 16 , is used to indicate that registers are fully loaded and ready for a framing operation if its value reaches 6. The state machine for the controller is shown in Fig. 17 .
At reset, the state machine is in state INIT and the counter is incremented by '1' for every data word loaded into the register. The state machine output sets the multiplexer select signal SEL (Fig. 16) to '5', which enables the counter to count up by '1' through the corresponding multiplexer input. Once the counter reaches '6', the state machine goes to state BUF_FULL and starts to identify the optimal framing option, beginning with frame Type D, which can pack up to 6 samples at a time. The state machine uses the control signal CTRL (Fig. 17), which is a 5 bit signal concatenated from the frame-enabling logic and RESYNC_EN signals. If CTRL matches the Type D pattern, with the remaining bits treated as "don't care", Type D frames can be formed from the current register data. The state machine then goes to state Frame_D, sets the SEL signal to '0', resets the counter, and generates a Type D frame at the circuit output. After this, the state machine waits for 6 clocks to fully load the local register, by setting SEL to '5' and increasing the counter by '1' on every clock. Similar operations take place for all other frame types. When RESYNC_EN is asserted, the state machine always chooses Type E frames by setting SEL to '4'. For Type E frames, original samples instead of error samples are used.
VI. MEASUREMENT RESULTS AND COMPARISON
The design is implemented in a standard 0.35 µm CMOS process. The measurement is performed at 2.4 V and 3.0 V. The measured results at 2.4 V are shown in Table IV. The total input-referred noise integrated from 0.5 Hz to 250 Hz is less than 1.5 µVrms. The high-pass corner is less than 0.01 Hz, and the full-scale 3 V THD is only 0.08%. The total front-end current, including all four ECG channels, ADC, DRL, bandgap, and crystal oscillator circuit, is 12.5 µA. The digital back-end consumes 0.89 µA and has a core area of 0.2 × 2.0 mm². Fig. 18 shows the die photograph. The total chip area is 2.94 × 2.15 mm². Fig. 19 gives an example of the compressed signal and the ECG signal reconstructed from it. The signal tapped out from the ADC is overlapped with the reconstructed data and no data losses are observed in the decompressed ECG signal. Fig. 20 shows the evaluation prototype and gateway application developed to monitor ECG. Fig. 21 shows the input-referred noise spectrum of the ECG channel. The total input-referred noise is 1.47 µVrms, integrated from 0.5 Hz to 250 Hz. The thermal noise floor is at 55 nV/√Hz, with the noise corner at around 50 Hz. This design achieves an NEF of 3.31.
The digital back-end was measured at a sampling frequency of 512 Hz. The compression block was operated at 32 kHz and was clock gated by the ADC's "end of conversion" signal. SPI readout was operated at 2 MHz. Since the chip includes an analog front-end, the compression performance was measured using an ST-Electromedicina ST-10 ECG signal generator. The measurement shows that the chip achieves an average compression ratio of 2.55 under different heart rate conditions. A comparison of the proposed compressor with other recently published designs is given in Table V. This design achieves the lowest power and complexity with a negligible reduction in compression ratio. As the compression ratio varies with sampling rate, for a fair comparison, we present two sets of results. The design in [11] gives a 5.8% higher compression ratio at the cost of 23 times the gate count, due to the high memory requirements of the Golomb coding scheme. Note that the design in [11] also supports 4 EEG channels and 1 DOT operation. The selective Huffman coding scheme in [17] achieves 8% more compression at the cost of 6.4 times the gate count. It should also be noted that only 9 bits are used in [17] to represent the prediction error, $e(n)$, for 11 bit input data when the data is left uncoded. Based on our understanding, this will result in data loss and therefore may not be considered fully lossless. For a fully lossless representation, $e(n)$ has to be represented by at least 13 bits. In our study, we obtained 2.15× compression using the selective Huffman coding scheme with a full bit-width representation for uncoded signals, as shown in Table III. The power consumption of the proposed design is 535 nW for 1 channel, while the design in [17] consumes 36.4 µW for 1 channel. Furthermore, the output generated by [17] has variable bit lengths and has to be further packaged into a fixed length for practical interfacing purposes, which involves further hardware cost [11], [14].
Overall, the proposed design gives the lowest power and gate count for compression without making any compromises on data integrity.
VII. CONCLUSION
This paper presents a low-power ECG SoC with lossless data compression for wearable devices. The compressor achieves an average CR of 2.25 using MIT/BIH test data. The design consumes 535 nW/channel and has a core area of 0.4 mm² in a 0.35 µm process. In comparison with existing methods, the proposed algorithm and hardware implementation demonstrate the lowest power consumption and are therefore suitable for wearable wireless devices.
