Abstract-A compressive sampling (CS) photoplethysmographic (PPG) readout with embedded feature extraction to estimate heart rate (HR) directly from compressively sampled data is presented. It integrates a low-power analog front end with a digital back end that performs feature extraction to estimate the average HR over a 4 s interval directly from compressively sampled PPG data. The application-specific integrated circuit (ASIC) supports a uniform sampling mode (1x compression) as well as CS modes with compression ratios of 8x, 10x, and 30x. CS is performed by nonuniformly subsampling the PPG signal, while feature extraction is performed using least-squares spectral fitting through the Lomb-Scargle periodogram. The ASIC consumes 172 µW of power from a 1.2 V supply while reducing the relative LED driver power consumption by up to 30 times without significant loss of the information relevant for accurate HR estimation.
, [2] . PPG acquisition is a single-point measurement that requires no electrodes, which increases patient comfort, a crucial factor for continuous monitoring of HR. However, PPG acquisition requires the tissue to be optically stimulated using an LED, which makes the LED driver the dominant power consumer, with its power consumption often ranging from the milliwatt range to tens of milliwatts [3].
A high-sensitivity, LED-less PPG acquisition system that utilizes ambient light as the source for stimulating the tissue has been proposed in [4]. However, its applicability is limited, particularly under low ambient light and low perfusion conditions, which necessitate the use of an LED as the light source. Compressive sampling (CS)-based PPG acquisition has emerged as an attractive alternative for reducing the LED driver power consumption in PPG acquisition systems [3]. CS-based PPG acquisition systems employ random stimulation and sampling at a sub-Nyquist rate instead of conventional uniform stimulation and sampling at the Nyquist rate, thereby reducing the effective duty cycle of the LED driver, which results in a proportional reduction in its power consumption. CS-based acquisition, however, suffers from the drawback of requiring a computationally intensive convex optimization process to recover the signal. In conventional CS-based acquisition systems, the acquired data are transmitted over a wireline/wireless link to a base station, where the reconstruction is performed [5]. This approach, however, has the following drawbacks. 1) For systems where real-time analysis of the features in the signal is important, such as continuous HR monitors, near-sensor reconstruction/feature extraction is desirable. 2) The need for a wireless/wireline link strains the power budget of energy-scarce sensor interfaces. To overcome these limitations, a hardware accelerator capable of performing signal reconstruction on the sensor node has been implemented in [6]. While this accelerator achieves real-time reconstruction, it consumes up to 10 mW of power, which can potentially outweigh the reduction in LED driver power consumption obtained through CS.
In this paper, a fully integrated CS PPG acquisition application-specific integrated circuit (ASIC) [7] is presented for low-power HR estimation. As illustrated in Fig. 1, the implemented ASIC comprises a readout and signal processing chain that is interfaced to an off-chip photodiode (PD), which is excited through an off-chip LED. The presented ASIC not only employs CS to reduce the LED driver power consumption but also embeds a low-power digital back end (DBE) capable of extracting the HR information directly from the CS PPG signal without requiring complex reconstruction techniques. This advances the state of the art by reducing the relative LED driver power consumption by up to 30x, while retaining the information relevant for the reliable extraction of HR. Furthermore, the integrated DBE extracts HR information directly from CS data through a least-squares spectral fitting technique, while incurring a power penalty of only 7.2 μW, thereby circumventing the need for embedded complex reconstruction.
Fig. 1. System overview of the compressively sampled (CS) photoplethysmography (PPG) readout for heart rate monitoring. Integrated digital back end (DBE) performs feature extraction to estimate heart rate directly from compressively sampled data.
This paper is organized as follows. Section II gives an overview of CS and its application to PPG acquisition. It also describes the extraction of relevant features for HR estimation directly from CS data through a least-squares spectral fitting technique. Section III describes the ASIC architecture and implementation details, including the design of the analog front end (AFE), which comprises a transimpedance amplifier (TIA), a switched integrator (SI), and a 12-bit SAR analog-to-digital converter (ADC), and the detailed description of the DBE, which accelerates the feature extraction. Section IV summarizes the ASIC measurement results. Finally, Section V concludes the paper.
II. BACKGROUND

A. Overview of CS for Photoplethysmography
CS is an alternative signal acquisition paradigm which asserts that a certain class of signals can be faithfully recovered from far fewer samples or measurements than traditional Nyquist-based sampling requires [8]. This acquisition protocol relies on the inherent structure of the signal, namely its sparsity on a given basis, and on the incoherence of the sampling scheme with that basis. In mathematical terms, the process of signal acquisition can be described as follows. Let X be the N-dimensional signal vector, which is K-sparse on a basis Ψ. A linear transformation of X through Ψ results in an N-dimensional K-sparse vector

$$S = \Psi X \tag{1}$$

where S is the sparse representation of X on Ψ. Instead of acquiring X, CS acquires a lower dimensional projection Y of X, obtained by linearly transforming X through a measurement matrix Φ

$$Y = \Phi X \tag{2}$$

where Y is an M-dimensional measurement vector (M ≪ N). The amount of data reduction is quantified through the compression ratio (CR), defined as

$$\mathrm{CR} = \frac{N}{M} \tag{3}$$

As described earlier, for faithful signal recovery, the basis transform Ψ and the measurement matrix Φ need to be incoherent.
PPG signals have been shown to be sparse on frequency bases in general and on the discrete cosine transform basis in particular [9]. Since the frequency basis is maximally incoherent with the canonical basis, the measurement matrix simplifies to a reduced order identity matrix. Fig. 2 shows a partial measurement matrix structure. Uniform sampling can be viewed as a linear transformation of the input signal vector X with an N × N identity matrix. An M × N reduced order identity matrix is formed by choosing M rows from the N × N identity matrix at random. The M rows chosen at random correspond to the M sampling instants in the time domain (with the row index corresponding to the sample index). Since the rows, and hence the sampling instants, are chosen at random, CS of the PPG signal is equivalent to randomly subsampling the signal. In practice, pseudorandom subsampling schemes are used, which perform on par with fully random samplers. The same pseudorandom sequence can be reused for every discrete window of length T_acq seconds (see Fig. 3). Compared to the conventional PPG sampling scheme shown in Fig. 3, where the signal is uniformly sampled at f_s,N, CS-based PPG acquisition acquires the signal at an average sampling rate f_s,CS given by

$$f_{s,CS} = \frac{f_{s,N}}{\mathrm{CR}} \tag{4}$$
Conventional PPG acquisition systems have an LED driver duty cycle D = T_ON × f_s,N, and the power consumption of the LED driver is proportional to this duty cycle. CS-based PPG acquisition systems, on the other hand, have an LED driver duty cycle of T_ON × f_s,CS and hence reduce the LED driver power consumption by a factor of CR.
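The equivalence between the reduced order identity matrix and plain random subsampling, and the resulting duty cycle scaling, can be illustrated with a minimal sketch. The seed, the stand-in signal, and the LED on-time `T_ON` are hypothetical values chosen only for illustration; the actual pseudorandom sequences used on chip are not given in the text.

```python
import random

def make_sampling_instants(n, cr, seed=0):
    """Pick M = n // cr distinct sample indices out of n, in time order."""
    rng = random.Random(seed)  # fixed seed: the same pattern is reused every window
    return sorted(rng.sample(range(n), n // cr))

def measurement_matrix(n, rows):
    """Reduced order identity matrix: one row of the N x N identity per kept sample."""
    return [[1 if col == r else 0 for col in range(n)] for r in rows]

def apply_matrix(phi, x):
    """Y = Phi X as a plain matrix-vector product."""
    return [sum(p * v for p, v in zip(row, x)) for row in phi]

N = 512        # sample slots in a 4 s window at 128 Hz
CR = 8         # compression ratio
F_S_N = 128.0  # uniform sampling rate in Hz
T_ON = 100e-6  # hypothetical LED on-time per sample, in seconds

rows = make_sampling_instants(N, CR)
phi = measurement_matrix(N, rows)
x = [float(i) for i in range(N)]   # stand-in signal, not PPG data
y = apply_matrix(phi, x)           # matrix form of the acquisition
y_direct = [x[r] for r in rows]    # the same thing as plain subsampling

f_s_cs = F_S_N / CR                # average CS sampling rate
duty_uniform = T_ON * F_S_N        # LED duty cycle, uniform sampling
duty_cs = T_ON * f_s_cs            # LED duty cycle, CS mode
```

Because each row of the matrix holds a single one, the matrix product never needs to be computed in hardware; keeping only the selected samples is sufficient.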
B. Information Extraction From Compressively Sampled Signal
Traditional CS-based acquisition systems offload the acquired samples to a base station, where signal recovery is performed. However, as described in Section I, this approach has disadvantages for applications where real-time continuous monitoring is desirable. Moreover, in certain applications, complete signal recovery is not required, provided that the information needed to extract the key parameters can be recovered from the CS data. This is the case for PPG acquisition systems for HR monitoring. Fig. 4 shows a time domain PPG signal segment and its frequency domain representation. HR is typically estimated from the time domain PPG signal by estimating the time difference between successive peaks in the signal amplitude. Alternatively, the average HR within a short observation interval (T_acq = 4 s in the current work) can be estimated in the frequency domain from the frequency corresponding to the peak in the spectrum, f_pk, and is given in beats per minute (bpm) by

$$\mathrm{HR} = 60 \times f_{pk} \tag{5}$$

Typically f_pk assumes a value between 0.5 and 5 Hz, which corresponds to an average HR range of 30-300 bpm. Hence, to estimate the average HR from a compressively sampled PPG signal, it is sufficient to extract its power spectral density (PSD).
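As a point of reference for the uniformly sampled case, the frequency domain estimate can be sketched as a direct search for the spectral peak between 0.5 and 5 Hz, followed by the conversion from hertz to beats per minute. The synthetic 1.25 Hz sinusoid and the plain DFT loop are assumptions made for illustration; this is not the on-chip method, which operates on CS data.

```python
import cmath
import math

def peak_frequency(x, fs, f_lo=0.5, f_hi=5.0):
    """Return the DFT bin frequency with maximum power inside [f_lo, f_hi]."""
    n = len(x)
    mean = sum(x) / n
    best_f, best_p = None, -1.0
    for k in range(1, n // 2):
        f = k * fs / n
        if f < f_lo or f > f_hi:
            continue
        acc = sum((v - mean) * cmath.exp(-2j * math.pi * k * i / n)
                  for i, v in enumerate(x))
        p = abs(acc) ** 2
        if p > best_p:
            best_f, best_p = f, p
    return best_f

def heart_rate_bpm(f_pk):
    """Average heart rate in bpm from the spectral peak frequency in Hz."""
    return 60.0 * f_pk

FS = 128.0   # uniform sampling rate in Hz
T_ACQ = 4.0  # observation window in seconds
n = int(FS * T_ACQ)
# Synthetic stand-in for PPG: large static level plus a small 1.25 Hz component.
x = [1.0 + 0.02 * math.sin(2 * math.pi * 1.25 * i / FS) for i in range(n)]
hr = heart_rate_bpm(peak_frequency(x, FS))
```

With a 4 s window the DFT bins are 0.25 Hz apart, so this reference estimate resolves HR in steps of 15 bpm unless a finer frequency grid is evaluated.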
State-of-the-art feature extraction techniques that extract frequency domain features directly from CS data have relied on the Johnson-Lindenstrauss (JL) lemma [10]. The JL lemma asserts that the inner products for a subset of vectors are preserved up to a factor of 1 ± ε (ε < 1) under random projections [11]. This implies that the PSD, and hence the energy of the signal, is preserved under random projections within an accuracy of ε. However, for accurate extraction of the PSD from the projected data (equivalent to CS data) using a JL lemma-based approach, the factor ε needs to be as small as possible (ε ≪ 1). For PPG signals, where the measurement matrix Φ is a reduced order identity matrix, it has been shown in [11] that the PSD features are not well preserved (ε ≈ 0.68 for a CR of 10x). Hence, the state-of-the-art feature extraction techniques for CS data are not readily applicable for accurate HR extraction from CS PPG signals.
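In this work, the use of least-squares spectral fitting techniques is explored as an alternative approach for PSD estimation from CS PPG signals. In particular, the Lomb-Scargle periodogram (LSP) is used as the PSD estimator for the randomly subsampled PPG signal. Let x(t_j), j = 1, 2, ..., M be the CS PPG samples. The LSP estimates the PSD P(ω) of x(t_j) as a function of the angular frequency ω as

$$P(\omega) = \frac{1}{2}\left\{ \frac{\left[\sum_{j=1}^{M} \left(x(t_j)-\mu\right)\cos\omega(t_j-\tau)\right]^2}{\sum_{j=1}^{M}\cos^2\omega(t_j-\tau)} + \frac{\left[\sum_{j=1}^{M} \left(x(t_j)-\mu\right)\sin\omega(t_j-\tau)\right]^2}{\sum_{j=1}^{M}\sin^2\omega(t_j-\tau)} \right\} \tag{6}$$

where μ is the mean of x(t_j) and τ is given by

$$\tan(2\omega\tau) = \frac{\sum_{j=1}^{M}\sin 2\omega t_j}{\sum_{j=1}^{M}\cos 2\omega t_j} \tag{7}$$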
In the context of CS, the measurement matrix Φ is known a priori and hence the sampling instants t_j are predetermined. Therefore, from (7), τ can be predetermined for a given frequency ω, and so can the quantities cos ω(t_j − τ) and sin ω(t_j − τ) in (6). These facts are utilized in simplifying the design of the feature extraction unit (FEU), which is part of the DBE.
III. ASIC IMPLEMENTATION
Fig. 5 shows the architecture of the single channel CS PPG acquisition ASIC. The ASIC embeds an AFE, which performs nonuniformly subsampled acquisition of the PPG signal, and a DBE, which performs the HR estimation directly from the CS PPG signal and also doubles as the timing controller that synchronizes the building blocks. The AFE integrates a programmable gain TIA, the output of which is interfaced to an SI, which improves the SNR. The output of the SI is buffered and digitized through a 12-bit SAR ADC. A sub-1V bandgap reference is integrated on-chip to provide stable bias and reference signals. The DBE comprises a control unit (CU) that generates the control signals required for the LED driver, the AFE, and the ADC, as well as the required internal timing and synchronizing signals. Direct memory access (DMA) is integrated into the DBE and transfers the incoming data from the ADC into one of the data memory (DMEM) banks. The FEU, also part of the DBE, accelerates the LSP computation to enable extraction of HR directly from the CS PPG signal. The DBE is clocked by an external clock at 32 kHz. The ASIC also provides wide programmability of the AFE gain and bandwidth settings and of the CR, and it can therefore be tailored for a wide range of signals.
A. AFE Architecture
The first stage of the readout channel is a TIA that is interfaced to an off-chip PD. The TIA converts the PPG signal, which is acquired as a current at the output of the PD, into a voltage signal that is further processed by the signal processing chain in the voltage domain. The TIA is realized by employing resistive feedback (R_f) around a two-stage Miller compensated OTA, as shown in Fig. 6. When connecting a TIA to a PD, stability issues can arise due to the reverse bias junction capacitance (C_p) of the PD. Fig. 7 shows the measured reverse bias junction capacitance of the PD that forms part of the commercial Nellcor-compatible transmission-type finger probe used in the current work. As can be seen, the PD presents a large C_p ranging from 145 to 155 pF across the channel reference voltage (V_ref) range, and hence a compensation capacitor (C_f) is added in parallel with R_f to introduce an LHP zero and thus improve the stability margin of the TIA. The TIA in the current work has a programmable transimpedance gain of 10, 50, 100, and 250 kΩ, while the feedback capacitance (C_f) can be programmed from 2 to 22 pF, thus allowing the ASIC to be interfaced with a wide range of PDs.
The PPG signal, measured as the current at the input of the TIA, consists of a large static component on top of which a relatively small pulsatile (AC) component rides. This AC component is typically 1%-4% of the static component and contains the information relevant for HR extraction. In order to relax the channel dynamic range requirements, the static component of the current has to be rejected early in the signal processing chain. This is achieved by interfacing a 5-bit current DAC (IDAC) (see Fig. 8), capable of sourcing up to 10 μA of current, at the input of the TIA.
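How a 5-bit IDAC code might be chosen to cancel the static photocurrent can be sketched as follows. The linear transfer characteristic, the mapping of the maximum code to the full 10 μA, and the nearest-code selection rule are all assumptions of this sketch; the text only states the resolution and the maximum current. The 1.6 μA dc level is the value reported in Section IV.

```python
IDAC_BITS = 5
IDAC_FULL_SCALE_A = 10e-6  # maximum IDAC current, from the text

def idac_code_for(dc_current_a):
    """Return (code, residual) for the code whose current is closest to the dc level.

    Assumes an ideal linear DAC in which code 2**IDAC_BITS - 1 gives full scale.
    """
    max_code = 2 ** IDAC_BITS - 1
    lsb = IDAC_FULL_SCALE_A / max_code
    code = max(0, min(max_code, round(dc_current_a / lsb)))
    return code, dc_current_a - code * lsb

code, residual = idac_code_for(1.6e-6)  # dc photocurrent reported in Section IV
```

Under these assumptions the residual static current after cancellation is bounded by half an LSB, roughly 0.16 μA, which is what relaxes the dynamic range needed from the TIA and the stages that follow it.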
The OTA used to realize the TIA is shown in Fig. 6. It uses a standard two-stage Miller compensated topology with a PMOS input pair, with two modifications. 1) The NMOS active load is source degenerated with a resistor R_s, the resistance of which is 12.5 kΩ. 2) Enable switches (En) are added to turn OFF the OTA in the optional power-down mode. Degenerating the active load has advantages from the noise point of view, as follows. The input referred PSD of the thermal voltage noise of the first stage of the OTA without active load degeneration is given by
With the NMOS active load degenerated, the input referred noise PSD is given by
When properly degenerated (g_mn R_s ≫ 1), the noise contribution of M_n1 and M_n2 is negligible. Given that the noise contribution of R_s is smaller than that of M_n1,2, and that R_s contributes negligible flicker noise, the overall input referred voltage noise is reduced for the OTA used in the current work [12]. The flicker noise contribution of the input pair is minimized by the use of PMOS devices with relatively large area (160 μm/1 μm) for the input pair. As described in Section II, a CS-based PPG acquisition system operates with a very low duty cycle, and hence additional power savings in the readout chain can be obtained by disabling the OTA between successive sampling instants through the enable switches.
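The noise argument above can be made concrete with a textbook first-order sketch. The expressions below are not necessarily the authors' equations (8) and (9); they assume thermal noise only, a noise coefficient γ common to both device types, and negligible output resistance and body effect. Here g_mp and g_mn denote the transconductances of the PMOS input devices and the NMOS load devices, k is Boltzmann's constant, and T is the absolute temperature.

```latex
% Without degeneration: two input devices plus two load devices
\overline{v_{n,in}^{2}} \approx \frac{8kT\gamma}{g_{mp}}\left(1 + \frac{g_{mn}}{g_{mp}}\right)

% With the load degenerated by R_s: the load transistor noise is attenuated,
% and the thermal noise of R_s appears instead
\overline{v_{n,in,deg}^{2}} \approx \frac{8kT}{g_{mp}^{2}}\left[\gamma g_{mp}
  + \frac{\gamma g_{mn}}{\left(1 + g_{mn}R_s\right)^{2}}
  + \frac{1}{R_s}\left(\frac{g_{mn}R_s}{1 + g_{mn}R_s}\right)^{2}\right]

% Limit for g_{mn} R_s \gg 1
\overline{v_{n,in,deg}^{2}} \approx \frac{8kT\gamma}{g_{mp}} + \frac{8kT}{g_{mp}^{2}R_s}
```

In this approximation the load contribution falls from 8kTγ g_mn / g_mp² to 8kT / (g_mp² R_s), so degeneration lowers the thermal noise whenever 1/R_s < γ g_mn.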
The output of the TIA is fed into an SI, which is realized by incorporating a switched capacitor in feedback around the OTA in Fig. 6. The output of the TIA is converted into a current signal through R_int, which is then integrated onto C_int for a duration of T_int. This results in a voltage gain for the SI stage given by

$$A_{SI} = \frac{T_{int}}{R_{int}\,C_{int}} \tag{10}$$
In the current work, R_int and T_int are fixed to 30 kΩ and 30.5 μs (1 period of the 32 kHz clock), respectively, while C_int is 3-bit programmable and has a range between 50 and 250 pF, thus providing programmable gain for the SI stage.
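The gain range implied by these component values can be checked with a short calculation. The sketch assumes an ideal integrator, so that the gain magnitude is T_int / (R_int × C_int), and it assumes the nominal 32 kHz clock is a 32.768 kHz watch crystal frequency, which is what a 30.5 μs period corresponds to. Only the two end points of the C_int range are evaluated, since the intermediate steps are not given in the text.

```python
R_INT = 30e3           # integrating resistor in ohms, from the text
T_INT = 1.0 / 32768.0  # one clock period, about 30.5 us (assumed 32.768 kHz clock)

def si_gain(c_int_farads):
    """Voltage gain magnitude of an ideal switched integrator."""
    return T_INT / (R_INT * c_int_farads)

gain_max = si_gain(50e-12)   # smallest C_int gives the largest gain
gain_min = si_gain(250e-12)  # largest C_int gives the smallest gain
```

Under these assumptions the SI contributes a gain between roughly 4 and 20 on top of the transimpedance gain of the TIA.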
The SI stage, apart from providing additional gain, also acts as a noise limiting filter [13]. This is particularly important in pulsed PPG acquisition systems, where the thermal noise originating from the OTA of the TIA exhibits noise peaking at high frequencies. This noise peaking is due to the large reverse bias junction capacitance of the PD (C_p) coupled with relatively low values of the TIA feedback capacitance (C_f), which leads to a high-frequency OTA noise transfer function given by

$$H_{n}(f) \approx 1 + \frac{C_p}{C_f} \tag{11}$$
This high-frequency noise folds back into the baseband upon sampling, thereby increasing the effective noise bandwidth (ENBW). The presence of the SI, which provides sinc filtering, introduces zeros in the signal as well as the noise transfer functions at frequencies that are integral multiples of 1/T_int and thus reduces the ENBW and the noise that would be aliased back into the baseband (see Fig. 9).
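The location of the sinc zeros can be illustrated with a minimal sketch of the magnitude response of an ideal integrator that averages over a window of T_int. The response is normalized to unity at dc, and the 32.768 kHz clock is an assumption consistent with the 30.5 μs integration time; neither detail is stated in this form in the text.

```python
import math

T_INT = 1.0 / 32768.0  # integration window in seconds (assumed 32.768 kHz clock)

def si_magnitude(f, t_int=T_INT):
    """Normalized magnitude response |sin(pi f T) / (pi f T)| of an ideal integrator."""
    x = math.pi * f * t_int
    return 1.0 if x == 0 else abs(math.sin(x) / x)

# Zeros fall at integer multiples of 1/T_int; the PPG band is left untouched.
zeros = [si_magnitude(k / T_INT) for k in (1, 2, 3)]
in_band = si_magnitude(5.0)
```

The PPG band below 5 Hz sits far below the first zero near 32.8 kHz, so the signal passes essentially unattenuated while the peaked high-frequency noise is suppressed before sampling.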
The output of the SI is then digitized using the 12-bit SAR ADC shown in Fig. 10, which employs a split capacitor DAC with a unit capacitance (C_u) of 800 fF to reduce the area requirements. A level-shifting sampling approach has been used to enable a rail-to-rail input range [14]. The sampling instants of the ADC are controlled by the CU that forms part of the DBE. The digitized data at the output of the ADC are fed into the DBE for further processing to extract the HR.
B. DBE Architecture
The DBE, shown in Fig. 11, comprises a CU, which in turn consists of a timing control block and a RISC controller. The timing control generates the timing signals necessary for the proper operation of the LED drivers, the AFE, and the ADC, as well as the internal signals required for the synchronization of the DBE. The RISC controller controls the subsystems in the DBE as per the settings stored in the configuration and instruction registers. Two data memory banks, DMEM0 and DMEM1, store the incoming CS PPG data in a ping-pong fashion. Each memory bank is 12 bits wide and has a depth of 512 to enable storage of 4 s worth of PPG data when uniformly sampled at 128 Hz. The data at the output of the ADC are moved into one of the memory banks through a DMA controller. An FEU accelerates the PSD estimation of CS PPG data through the LSP. The PSD coefficients are written back into a data memory (DMEM) which is 18 bits wide with a depth of 64. The DBE supports four different compression levels: uniform sampling (1x), 8x, 10x, and 30x. An external clock of 32 kHz provides the master clock for the DBE, and the required auxiliary clocks are generated internally.
The timing control block internally divides the 32 kHz clock by 256 to generate a 128 Hz clock. In uniform sampling mode, this 128 Hz clock acts as the sampling clock (o_samp), based on which the rest of the control signals required for the LED driver (LED_Pulse) and the AFE (PD_Act, INT_clk, CH_Samp, and INT_Rst) are generated, as shown in Fig. 12. When CS acquisition mode is enabled by selecting a non-unity CR, the 128 Hz clock drives a 9-bit counter, which addresses one of the three 512-bit lookup tables (LUTs) (selected based on the CR) in which the sampling instants, corresponding to the entries of the measurement matrix, are stored. The output of the LUT serves as the sampling clock (o_samp), based on which the rest of the control signals are generated as explained above. When the optional power-down mode is enabled, the enable signal (see En in Fig. 6) is generated by the timing control block in addition to the above signals. The enable signal is asserted at the rising edge of PD_Act and deasserted at the falling edge of INT_Rst.
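The generation of the sampling clock described above can be modelled behaviourally as a clock divider followed by a 9-bit counter that indexes a 512-bit LUT. In this sketch the LUT contents are generated pseudorandomly with an arbitrary seed, and a CR that does not divide 512 is rounded down; both are assumptions, since the stored sequences are not given in the text.

```python
import random

LUT_BITS = 512       # one bit per 128 Hz slot in a 4 s window
CLOCK_DIVIDER = 256  # master clock ticks per 128 Hz slot

def build_lut(cr, seed=1):
    """Hypothetical LUT: set 512 // cr randomly chosen slots to 1."""
    rng = random.Random(seed)
    bits = [0] * LUT_BITS
    for slot in rng.sample(range(LUT_BITS), LUT_BITS // cr):
        bits[slot] = 1
    return bits

def o_samp(master_ticks, lut=None):
    """Yield the o_samp level (0 or 1) for every master clock tick.

    With lut=None the divided clock is used directly (uniform sampling mode).
    """
    counter = 0
    for tick in range(master_ticks):
        if tick % CLOCK_DIVIDER != 0:
            yield 0
            continue
        yield 1 if lut is None else lut[counter]
        counter = (counter + 1) & 0x1FF  # 9-bit counter wraps every 512 slots

WINDOW_TICKS = LUT_BITS * CLOCK_DIVIDER  # master clock ticks in one 4 s window
uniform_pulses = sum(o_samp(WINDOW_TICKS))
cs_pulses = {cr: sum(o_samp(WINDOW_TICKS, build_lut(cr))) for cr in (8, 10, 30)}
```

Because the counter wraps after 512 slots, the same sampling pattern repeats in every 4 s window, which is what allows the LSP coefficients to be pre-evaluated.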
The FEU performs the PSD estimation of the CS PPG signal using the LSP described in Section II. The FEU performs a 64-point LSP over a frequency range of 0.5-3.5 Hz, resulting in a frequency resolution of 0.047 Hz. This translates into a resolution of 3 bpm in determination of HR over a range of 30-210 bpm, which is conformant to ANSI-AAMI standards for HR meters [15] . Sum of absolute values is used instead of squared values in (6) to simplify the hardware implementation. Since the sampling instants are known a priori, further simplification of hardware is done by storing the pre-evaluated sine and cosine coefficients in ROM, which are appropriately referenced depending on the CR. With the above simplifications, a modified LSP can be expressed as
where P is the 64×1 vector of LSP coefficients, C_T and S_T are the pre-evaluated cosine and sine transformation matrices of dimension 512×64, respectively, and X is the mean subtracted input CS PPG data acquired over a duration of 4 s, with a dimension of 512×1. Therefore, the LSP computation in this case reduces to a matrix transformation with predetermined transformation matrices.
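A software sketch of such a matrix-form LSP with absolute values in place of squares is given below. Several details are assumptions rather than the authors' implementation: the normalization terms of the periodogram are folded into the pre-evaluated coefficients, only the sampled instants are stored (instead of full 512-row matrices), floating point is used instead of fixed point, and the sampling pattern and test signal are synthetic.

```python
import math
import random

N = 512                # sample slots in one 4 s window
FS = 128.0             # uniform sampling rate in Hz
N_FREQ = 64            # number of LSP frequency bins
F_LO, F_HI = 0.5, 3.5  # analysed frequency range in Hz

def lsp_coefficients(t):
    """Pre-evaluate one cosine and one sine coefficient column per frequency bin."""
    freqs, cos_cols, sin_cols = [], [], []
    for k in range(N_FREQ):
        f = F_LO + k * (F_HI - F_LO) / N_FREQ
        w = 2.0 * math.pi * f
        # tau from tan(2 w tau) = sum(sin 2 w t_j) / sum(cos 2 w t_j)
        tau = math.atan2(sum(math.sin(2 * w * tj) for tj in t),
                         sum(math.cos(2 * w * tj) for tj in t)) / (2 * w)
        c = [math.cos(w * (tj - tau)) for tj in t]
        s = [math.sin(w * (tj - tau)) for tj in t]
        c_norm = math.sqrt(sum(v * v for v in c))  # folded-in normalization (assumption)
        s_norm = math.sqrt(sum(v * v for v in s))
        freqs.append(f)
        cos_cols.append([v / c_norm for v in c])
        sin_cols.append([v / s_norm for v in s])
    return freqs, cos_cols, sin_cols

def modified_lsp(x, cos_cols, sin_cols):
    """Sum of absolute cosine and sine projections of the mean subtracted samples."""
    mean = sum(x) / len(x)
    xc = [v - mean for v in x]
    return [abs(sum(a * b for a, b in zip(cc, xc))) +
            abs(sum(a * b for a, b in zip(sc, xc)))
            for cc, sc in zip(cos_cols, sin_cols)]

def estimate_hr(x, freqs, cos_cols, sin_cols):
    """Linear search for the peak coefficient, then convert Hz to bpm."""
    p = modified_lsp(x, cos_cols, sin_cols)
    k_pk = max(range(len(p)), key=p.__getitem__)
    return 60.0 * freqs[k_pk]

rng = random.Random(7)
idx = sorted(rng.sample(range(N), N // 10))  # 51 sampling instants, CR of about 10x
t = [i / FS for i in idx]
freqs, cos_cols, sin_cols = lsp_coefficients(t)
x = [2048 + 200 * math.sin(2 * math.pi * 1.2 * tj) for tj in t]  # 72 bpm test tone
hr = estimate_hr(x, freqs, cos_cols, sin_cols)
```

Replacing squares with absolute values changes the relative weighting of neighbouring bins, so the peak may move by a bin or two compared with the exact periodogram; this sketch makes no attempt to quantify that effect for real PPG data.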
The mean of the incoming CS PPG data over a 4 s interval is calculated by accumulating the samples as they arrive at the input of the DBE and dividing by 4 × f_s,CS, where f_s,CS is the average sampling frequency given by (4). The division is performed using a nonrestoring division algorithm. The mean subtracted samples are then fed into an eight-way multiply-accumulate (MAC) unit, shown in Fig. 13, which accelerates the matrix multiplication operation in (12). Of the eight MAC units, four are assigned to the multiplication with the cosine coefficients, while the rest handle the sine coefficient multiplications, so that the FEU requires 8192 clock cycles to compute the LSP coefficients of the 4 s PPG signal segment. The LSP coefficients are then truncated to 18 bits and written to DMEM, where a linear search is performed to determine the peak in the LSP coefficients and the corresponding frequency bin, from which the 8-bit average HR is estimated using (5). The HR thus estimated is then stored in an internal register, and the HR_DONE signal is asserted to indicate the availability of the result. In the measurement setup, an external microcontroller (ARM Cortex M3) reads the HR data and stores or wirelessly transmits it. Low-power techniques, including clock gating, are employed to reduce the power consumption of the DBE given the low duty cycle operation of the system.
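The mean calculation can be illustrated with a bit-serial sketch of unsigned nonrestoring division. The register widths and the example sample values are assumptions; the text names the algorithm but does not describe its implementation.

```python
def nonrestoring_divide(dividend, divisor, n_bits):
    """Unsigned nonrestoring division; returns (quotient, remainder).

    The partial remainder is allowed to go negative and is corrected once
    at the end, instead of being restored after every failed subtraction.
    """
    if divisor <= 0 or not 0 <= dividend < (1 << n_bits):
        raise ValueError("operands out of range")
    a = 0         # partial remainder, may be negative
    q = dividend  # quotient register, initially holds the dividend
    mask = (1 << n_bits) - 1
    for _ in range(n_bits):
        msb = (q >> (n_bits - 1)) & 1
        a = 2 * a + msb          # shift the (A, Q) register pair left by one bit
        q = (q << 1) & mask
        if a >= 0:
            a -= divisor         # remainder non-negative: subtract
        else:
            a += divisor         # remainder negative: add back
        if a >= 0:
            q |= 1               # quotient bit is 1 when the result is non-negative
    if a < 0:
        a += divisor             # single final correction step
    return q, a

# Mean of 64 accumulated 12-bit samples (4 s at an average rate of 16 Hz, CR = 8).
samples = [2048] * 64            # hypothetical mid-scale ADC codes
total = sum(samples)             # fits in 18 bits
mean, _ = nonrestoring_divide(total, len(samples), 18)
```

Each iteration needs only one addition or subtraction, which is the usual reason for preferring the nonrestoring form in small datapaths.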
IV. MEASUREMENT RESULTS
The ASIC is fabricated in a 0.18 μm process and occupies an area of 10 mm². Fig. 14 shows the chip micrograph of the fabricated ASIC. To characterize the ASIC, an external LED is driven with a dc current and the response at the output of the readout channel is recorded in uniform sampling mode. Fig. 15 shows the output of the SI for one such stimulus along with the ADC sampling clock (CH_samp) and the integrator clock (INT_clk), where it can be seen that the SI starts integrating the output of the TIA when INT_clk is high, resulting in a ramp for a dc current stimulation. The output of the SI is then sampled on a rising edge of CH_samp, followed by the reset of the SI, thereby verifying the functionality of the timing control of the DBE as well as the functionality of the AFE. To further validate the functionality of the ASIC, the LED is modulated by a sinusoidal current with a frequency of 1.2 Hz (corresponding to an HR of 72 bpm), and the resulting PD current is read out at the output of the SI for CRs of 8x and 30x, as shown in Fig. 16. This further validates the functionality of the timing control and the AFE in the CS acquisition mode.
Table I notes: 4) Includes AFE, ADC, DBE (while executing feature extraction), and bias power consumption, with power-down mode disabled. 5) Off-chip LED driver; LED power consumption is subject to the SNR, the skin tone of the subject, and the efficiency of the LED used in the setup. NA: not applicable. NR: not reported.
To demonstrate the recovery of the channel from a saturation event arising from increased optical coupling (for example, due to motion), a direct optical exposure event is triggered while modulating the LED with a sinusoidal current. Thanks to the presence of the IDAC, the channel successfully recovers from the saturation event (see Fig. 17).
An in vivo acquisition of the PPG signal is performed both in uniform sampling mode and in CS mode with a CR of 10x through transmission pulse oximetry on the index finger. The probe is shielded while performing the measurements to avoid interference from ambient light. The signal acquired in uniform sampling mode is digitally low-pass filtered with a cut-off frequency of 5 Hz and is shown in Fig. 18.
The performance of the FEU is characterized by modulating the LED with a sinusoidal current, the frequency of which is swept from 0.5 to 3.4 Hz to cover the HR range of 30-204 bpm. The LED modulation is chosen so that the ac component of the photocurrent is approximately 20 nA_pp. Sinusoidal modulation is used instead of PPG signals from a standard database [16] for the following reasons. 1) Signals in the database do not have golden annotations for HR against which to benchmark the performance of the ASIC. 2) PPG signals are extremely sparse in the frequency domain and can therefore be approximated with sinusoids. The output of the readout is then compressively sampled with CRs of 8x, 10x, and 30x, and feature extraction is performed on the acquired data. Since the feature extraction process estimates the frequency corresponding to the peak in the PSD, under ideal conditions the estimated peak frequency (f_pk) is identical to the input frequency. Fig. 19 shows the extracted peak frequency for different CRs. The peak frequency serves as a proxy to estimate the HR using (5). The HR thus measured exhibits a worst-case error of 10 bpm at 30x compression for a nominal HR of 96 bpm. This error is still conformant to ANSI-AAMI standards for HR meters [15].
The ASIC consumes a total power of 172 μW from a supply of 1.2 V for the entire system without power-down mode enabled. The power consumption of the ASIC is dominated by the AFE, which consumes 158.8 μW, while the ADC and the DBE consume 6 and 7.2 μW, respectively (see Fig. 20). On the other hand, the LED driver power consumption scales from 1200 μW in uniform sampling mode (1x CR) to 43 μW at 30x CR, thanks to the compressively sampled acquisition paradigm. The LED driver power consumption is measured while acquiring the PPG signal of a healthy individual. At the reported power levels, the resulting photocurrent is measured to have an ac component of 45 nA_pp, while the dc component is measured to be 1.6 μA. At lower CRs, the LED driver continues to dominate the power consumption of the system, while at higher CRs the AFE limits the power consumption due to fundamental noise limitations. Table I summarizes the key performance metrics of the implemented ASIC and compares them against state-of-the-art PPG acquisition systems. Compared to the state of the art, CS-based PPG acquisition enables up to 30x reduction in the power consumption of the LED driver, thanks to the DBE, which accelerates the LSP to enable feature extraction directly from CS data and accurate HR estimation with minimum power penalty. While [4] consumes lower power than the current work, it does not describe the robustness and accuracy of HR determination under low ambient light or low perfusion conditions (low SNR conditions). Under such conditions, it is likely that LED-based stimulation is required, in which case the proposed CS-based PPG acquisition enables a reduction of the LED driver power consumption proportional to the CR.
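The reported power figures can be cross-checked with a short calculation. The ideal model in which the LED driver power scales exactly as 1/CR is an assumption of this sketch; the measured value at 30x is 43 μW rather than the ideal 40 μW.

```python
# Reported ASIC power breakdown in microwatts.
P_AFE_UW, P_ADC_UW, P_DBE_UW = 158.8, 6.0, 7.2
P_LED_UNIFORM_UW = 1200.0  # LED driver power in uniform sampling mode

def led_power_ideal_uw(cr):
    """LED driver power assuming exact proportionality to the duty cycle."""
    return P_LED_UNIFORM_UW / cr

asic_total_uw = P_AFE_UW + P_ADC_UW + P_DBE_UW
system_total_uw = {cr: asic_total_uw + led_power_ideal_uw(cr) for cr in (1, 8, 10, 30)}
# Compression ratio at which the ideal LED power equals the ASIC power.
crossover_cr = P_LED_UNIFORM_UW / asic_total_uw
```

In this idealized model the LED driver stops being the dominant consumer at a CR of about 7, which is consistent with the observation that the AFE limits the system power at higher CRs.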
The robustness of the ASIC under varying SNR conditions is demonstrated by performing in vivo acquisition of PPG under four different conditions, changing the LED driver current (drawn from a 5 V supply) while adjusting the IDAC setting to cancel most of the dc component. Table II shows excerpts of the recorded PPG signals after filtering, along with information on the different setups, the resulting ac component of the acquired signals, and the HR value calculated by the ASIC at a CR of 10x as well as with uniform sampling. The ac component of the photocurrent varies from 3 nA_pp for an LED driver peak current of 18 mA to 12 nA_pp when the LED driver peak current is increased to 314 mA. The HR estimated from the uniformly sampled PPG signal using an FFT serves as the reference. The PPG signal is then compressively acquired at a CR of 10x, and the average HR estimated by the ASIC is compared against the reference. As can be seen in Table II, the error in the average HR estimated at 10x CR is within 2 bpm under varying SNR conditions. The LED driver power consumption, on the other hand, scales in proportion to the CR, from 6.1 mW to 615 μW for an acquired ac component of photocurrent of 12 nA_pp.
V. CONCLUSION
A CS PPG readout with embedded feature extraction to estimate HR directly from compressively sampled data is presented. The ASIC advances the state of the art by reducing the relative LED driver power consumption by up to 30x while retaining the relevant signal information. An integrated DBE performs feature extraction to estimate the PSD and the average HR over a 4 s interval of the compressively acquired PPG signal through the LSP. The estimated HR conforms to the accuracy requirements specified by ANSI-AAMI standards for HR meters. The ASIC consumes 172 μW of power from a 1.2 V supply, with the DBE consuming only 7.2 μW, thus avoiding the energy penalties of wireless/wireline transmission and/or embedded signal reconstruction.
