Low Power Circuits for Smart Flexible ECG Sensors by Zhao, Yang
LOW POWER CIRCUITS FOR SMART FLEXIBLE ECG 
SENSORS 
YANG ZHAO 
A DISSERTATION SUBMITTED TO 
THE FACULTY OF GRADUATE STUDIES 
IN PARTIAL FULFILLMENT OF THE REQUIREMENTS 
FOR THE DEGREE OF  
DOCTOR OF PHILOSOPHY 
GRADUATE PROGRAMME IN ELECTRICAL ENGINEERING AND 
COMPUTER SCIENCE 
YORK UNIVERSITY 
TORONTO, ONTARIO 
AUGUST 2019 
© YANG ZHAO, 2019 
 
ABSTRACT 
Cardiovascular diseases (CVDs) are the world's leading cause of death. In-home heart 
condition monitoring effectively reduces the hospitalization rate of CVD patients. A flexible 
electrocardiogram (ECG) sensor provides an affordable, convenient and comfortable 
in-home monitoring solution. The three critical building blocks of the ECG sensor, i.e., 
the analog front-end (AFE), the QRS detector, and the cardiac arrhythmia classifier (CAC), 
are studied in this research.  
A fully differential difference amplifier (FDDA) based AFE that employs a DC-coupled 
input stage increases the input impedance and improves the CMRR. A parasitic 
capacitor reuse technique is proposed to improve the noise/area efficiency and CMRR. 
An on-body DC bias scheme is introduced to deal with the input DC offset. Implemented 
in a 0.35µm CMOS process with an area of 0.405mm², the proposed AFE consumes 
0.9µW at 1.8V and shows an excellent noise efficiency factor (NEF) of 2.55 and a CMRR 
of 76dB. Experiments show that the proposed AFE not only picks up a clean ECG signal 
with electrodes placed as close as 2cm under both resting and walking conditions, but 
also captures the distinct α-wave after an eye blink in EEG recording. 
A personalized QRS detection algorithm is proposed that achieves an average 
positive prediction rate of 99.39% and a sensitivity of 99.21%. The user-specific 
template avoids the complicated models and parameters used in existing algorithms while 
covering most situations in practical applications. Detection is based on the 
correlation coefficient between the user-specific template and the ECG 
segment under detection. The proposed one-target clustering reduces the number of loops required.  
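As a rough illustration of the correlation-based comparison summarized above, the following Python sketch slides a template over a signal and flags windows whose Pearson correlation coefficient with the template is a local maximum above a threshold. The function names, the synthetic spike template, and the fixed 0.9 threshold are assumptions made for this example only; the peak detection, one-target clustering, and adaptive thresholding of the proposed algorithm are not reproduced here.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat window carries no shape information
    return cov / math.sqrt(var_x * var_y)


def match_template(signal, template, threshold=0.9):
    """Return window start indices whose correlation with the template
    is a local maximum at or above the (assumed, fixed) threshold."""
    w = len(template)
    scores = [pearson(signal[i:i + w], template)
              for i in range(len(signal) - w + 1)]
    hits = []
    for i, s in enumerate(scores):
        left = scores[i - 1] if i > 0 else -1.0
        right = scores[i + 1] if i + 1 < len(scores) else -1.0
        if s >= threshold and s > left and s >= right:
            hits.append(i)
    return hits


# Hypothetical template: a crude triangular spike standing in for a QRS complex.
template = [0.0, 0.2, 1.0, 0.2, 0.0]
# Synthetic signal: two scaled copies of the spike on a flat, offset baseline.
signal = ([0.5] * 10 + [0.5 + 2 * v for v in template]
          + [0.5] * 10 + [0.5 + 3 * v for v in template] + [0.5] * 10)
beats = match_template(signal, template)
```

Because the Pearson coefficient is invariant to gain and offset, both spikes match the template even though their amplitudes differ, which is one reason a correlation measure suits a user-specific template.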
A continuous-in-time discrete-in-amplitude (CTDA) artificial neural network 
(ANN) based CAC is proposed for the smart ECG sensor. The proposed CAC achieves 
over 98% classification accuracy for 4 types of beats defined by AAMI (Association for 
the Advancement of Medical Instrumentation). The CTDA scheme significantly reduces 
the number of input samples and simplifies the sample representation to one bit. Thus, the 
number of arithmetic operations and the ANN structure are greatly simplified. The 
proposed CAC is verified on an FPGA and implemented in a 0.18μm CMOS process. 
Simulation results show it can operate at clock frequencies from 10kHz to 50MHz. 
The average power for a patient with a heart rate of 75bpm is 13.34μW. 
 
ACKNOWLEDGEMENTS 
I would first like to thank my supervisor, Dr. Yong (Peter) Lian, for his extensive guidance in 
my research and throughout my four years of academic life. I still remember his warm welcome 
and thorough introduction when I first arrived. His insightful and patient support in my 
course study, scholarship applications, research, industrial cooperation, and paper writing 
inspired me and contributed to the success of this research. The invaluable experience of 
working with Dr. Lian opened the door to scientific study in my life. 
I would also like to acknowledge Prof. Ebrahim Ghafar-Zadeh and Prof. 
Sebastian Magierowski, the members of my supervisory committee. Their important 
suggestions and comments on my research proposal and topic shaped the final structure of 
this study. 
I would like to express my thanks to my colleagues. I'm grateful for the great help 
from Zhongxia Shang in each design. The testing arrangements assisted by Chen Xi 
facilitated this research; I appreciate his time and effort. Without the valuable design tips 
from Dr. Chundong Wu and the low-power design experience of Dr. Xiaoyang Zhang, 
this research wouldn't have gone so smoothly. Much gratitude goes to Qingsong Xie for 
the discussions that helped improve the efficiency of my algorithm. 
In addition, I would like to express my gratitude to Prof. Guoxing Wang for his 
support with testing equipment during my internship. I would also like to thank Dr. Yu Pu 
for his patient assistance with the ASIC implementation of the CAC. 
Finally, I would like to thank my family and friends for their support and counsel. 
The encouragement of my parents dispelled my research anxieties. Parties, sports, and many 
other activities with my friends gave me a rich and colourful research life.  
  
LIST OF PUBLICATIONS 
[1]. Y. Zhao, Z. Shang and Y. Lian, "A 2.55 NEF 76dB CMRR DC-Coupled Fully 
Differential Difference Amplifier Based Analog Front-end for Wearable 
Biomedical Sensors," in IEEE Transactions on Biomedical Circuits and 
Systems. [Early Access] DOI: 10.1109/TBCAS.2019.2924416. 
[2]. Z. Shang, Y. Zhao, and Y. Lian, "A Low Power Frequency Tunable FSK 
Receiver Based on the N-path Filter," in IEEE Transactions on Circuits and 
Systems II: Express Briefs. [Early Access] DOI: 10.1109/TCSII.2019.2931840. 
[3]. J. Zhang, H. Zhang, J. Xu, Y. Zhao, J. Li, G. Hu, J. Wang, R. Zhang, Y. Lian, 
"A Low Energy ASIC for Triple-chamber Cardiac Pacemakers with Contact 
Resistance Measurement," Microelectronics Journal, vol. 60, pp. 65-74, Feb. 
2017. 
[4]. Y. Zhao, S. Lin, Z. Shang and Y. Lian, "Classification of Cardiac Arrhythmias 
Based on Artificial Neural Networks and Continuous-in-Time Discrete-in-
Amplitude Signal Flow," 2019 IEEE International Conference on Artificial 
Intelligence Circuits and Systems (AICAS), Hsinchu, Taiwan, 2019, pp. 175-178. 
[5]. Y. Zhao, Z. Shang and Y. Lian, "Auto Generation of High-Performance Fixed-
Point Multiplier for Artificial Neural Networks," 2019 IEEE International 
Conference on Artificial Intelligence Circuits and Systems (AICAS), Hsinchu, 
Taiwan, 2019, pp. 1-5. 
[6]. Y. Zhao, Z. Shang and Y. Lian, "User Adaptive QRS Detection Based on One 
Target Clustering and Correlation Coefficient," 2018 IEEE Biomedical Circuits 
and Systems Conference (BioCAS), Cleveland, OH, 2018, pp. 1-4. 
[7]. Y. Zhao, H. E. Farag and Y. Lian, "Dynamic Threshold Algorithm with 
Simplified Appliance Identification for Smart Meter Privacy," 2016 IEEE 
Electrical Power and Energy Conference (EPEC), Ottawa, ON, Canada, 
October 12-14, 2016, pp. 1-6. 
[8]. Z. Shang, Y. Zhao and Y. Lian, "Low Power FIR Filter Bank for EEG 
Processing Using Frequency-Response Masking Technique," 2018 IEEE 23rd 
International Conference on Digital Signal Processing (DSP), Shanghai, China, 
November 19-21, 2018. 
[9]. Z. Shang, Y. Zhao and Y. Lian, "Removal of Baseline Wander Noise in ECG 
Signal Using Asymmetrical Frequency-Response Masking Bandpass Filters," 
2018 40th Annual International Conference of the IEEE Engineering in 
Medicine and Biology Society (EMBC), Honolulu, HI, USA, July 17-21, 2018, 
pp. 6002-6005. 
[10]. Z. He, Y. Lian, G. Wang, Z. Peng, Y. Zhao, M. Wang and B. Meng, "A Heart 
Rate Measurement System Based on Ballistocardiogram for Smart Furniture," 
2018 14th of the biennial Asia Pacific Conference on Circuits and Systems 
(APCCAS), Chengdu, China, October 26-30, 2018. 
[11]. Z. Shang, Y. Zhao and Y. Lian, "An APWM Controlled LLC Resonant 
Converter for a Wide Input Range and Different Load Conditions," 2017 IEEE 
12th International Conference on ASIC (ASICON), Guiyang, China, Oct. 25-28, 
2017, pp. 608-611. 
[12]. Z. Shang, Y. Zhao and Y. Lian, "Low Power FIR Filter Design for Wearable 
Devices using Frequency Response Masking Technique," 2017 IEEE 12th 
International Conference on ASIC (ASICON), Guiyang, China, Oct. 25-28, 
2017, pp. 516-519. 
[13]. Q. Xie, M. Wang, Y. Zhao, Z. He, Y. Li, G. Wang, and Y. Lian, "A 
Personalized Beat-to-Beat Heart Rate Detection System from 
Ballistocardiogram for Smart Home Applications," IEEE Transactions on 
Biomedical Circuits and Systems, submitted. 
[14]. Y. Zhao, Z. Shang and Y. Lian, "An Event-driven Patient Specific ANN 
Cardiac Arrhythmia Classifier," IEEE Transactions on Biomedical Circuits and 
Systems, in preparation. 
 
TABLE OF CONTENTS 
Abstract
Acknowledgements
List of Publications
Table of Contents
List of Tables
List of Figures
List of Acronyms
CHAPTER 1 Introduction
1.1 Motivation
1.2 Flexible ECG sensor
1.3 Existing Issues in the Designing of Flexible ECG Sensor
1.4 Objectives of Research
CHAPTER 2 Review of ECG Interface for Wearable Applications
2.1 Brief Introduction of ECG Signal Analog Front-End
2.1.1 Design Challenges for AFE
2.1.2 Resistive Three-Amplifier AFE
2.1.3 AC-coupled Pseudo Resistor AFE
2.1.4 DC-coupled Pseudo Resistor AFE
2.1.5 Chopper Stabilization
2.1.6 Differential Difference Amplifier
2.1.7 Summary of AFE
2.2 QRS detection
2.3 Machine Learning based Cardiac Arrhythmia Classification
2.4 Summary
CHAPTER 3 A 2.55 NEF 76dB CMRR DC-Coupled FDDA AFE
3.1 FDDA DC-coupled AFE
3.1.1 FDDA Instrumentation Amplifier
3.1.2 Back-to-back Connected Pseudo Resistor
3.1.3 On-body DC Biasing
3.1.4 Design Considerations of Proposed FDDA
3.1.5 Noise Analysis
3.1.6 Layout of Large Size Input Transistor
3.1.7 PGA
3.2 Simulation Results
3.3 Measurement Results
3.4 Conclusion
CHAPTER 4 Personalized QRS Detection Based on One Target Clustering and Correlation Coefficient
4.1 User Adaptive Detection
4.1.1 Peak Detection
4.1.2 One Target Clustering
4.1.3 Correlation Coefficient
4.2 Adaptive Thresholding
4.3 Experiment Results
4.4 Conclusion
CHAPTER 5 An Event-driven Patient Specific ANN-CAC
5.1 Data Preparation
5.1.1 Identity of ECG
5.1.2 Conditional Data Grouping Scheme (CGS)
5.2 Proposed CTDA ANN-CAC
5.2.1 CTDA vs Nyquist Sampling
5.2.2 Processing CTDA Signal for ANN-CAC
5.2.3 Topology of Proposed CTDA ANN-CAC
5.3 Design Considerations
5.3.1 SRAM
5.3.2 SPI
5.3.3 Multiplier
5.3.4 Top FSM
5.3.5 FSM for Conditional Accumulation (FSM-CA)
5.3.6 FSM for Normal Neural Operation (FSM-NN)
5.4 Training of ANN-CAC
5.4.1 Backpropagation
5.4.2 Biased Training
5.5 Experiment Results
5.5.1 Verification on FPGA
5.5.2 ASIC Implementation
5.5.3 Benchmarking
5.6 Conclusions
CHAPTER 6 Conclusions and Future Works
6.1 Fulfilled Objectives and Future Works
6.2 Research Contributions
Bibliography
 
LIST OF TABLES 
Table 2-1-1. Minimum requirements of medical grade ECG AFE.
Table 2-1-2. Summarize of different AFE used in biomedical sensor interface circuits.
Table 3-3-1. Measured gain for the 9 AFE chips.
Table 3-3-2. AFE performance comparison with other related works.
Table 4-1-1. Pseudo code of proposed one target clustering.
Table 4-3-1. Performance Evaluation on MIT-BIH Database
Table 4-3-2. Comparison with Other Algorithms
Table 5-1-1. Heart beat label and corresponding meaning in MIT-BIH
Table 5-1-2. Mapping of MIT-BIH labels to AAMI types
Table 5-1-3. Statistics of heartbeat number in each record and the proposed CGS
Table 5-4-1. Details of the proposed biased training.
Table 5-5-1. Current summary at 75bpm heart rate.
Table 5-5-2. Classification confusion matrix based on simulation and FPGA verification.
Table 5-5-3. Classification accuracy comparison.
Table 5-5-4. Performance comparison with other implementations.
LIST OF FIGURES 
Fig. 1-1-1. The diseases diagnose expenditures in United States from 2014 to 2015.
Fig. 1-1-2. Projected CVD cost for different age in USA.
Fig. 1-1-3. Conduction potentials at different regions of heart in one cardiac cycle [6].
Fig. 1-1-4. Preventive oriented cardiac healthcare system.
Fig. 1-2-1. From portable, wearable, to flexible ECG sensor.
Fig. 1-3-1. Block diagram of the smart flexible ECG sensor.
Fig. 2-1-1. Models of different electrodes: wet Ag/AgCl electrode with 100KΩ impedance; dry and insulated electrode with 10MΩ impedance; non-contact electrode with over 200MΩ impedance [32].
Fig. 2-1-2. Illustration of the 12-lead electrodes' position [33].
Fig. 2-1-3. Influence of electrode impedance to the input signal acquisition.
Fig. 2-1-4. Block diagram for ECG measurement regarding the power line interference [36].
Fig. 2-1-5. Testbench for ECG AFE measurement [40].
Fig. 2-1-6. Topology of AFE with DRL [46].
Fig. 2-1-7. Two configurations of diode connected pseudo resistor [52].
Fig. 2-1-8. TBA and PGA of programmable AC-coupled AFE [56].
Fig. 2-1-9. Comparison of AC-coupled AFE with its variant with passive input resistor network [57].
Fig. 2-1-10. Active electrode and negative capacitance for impedance boosting [61].
Fig. 2-1-11. Topology of three-amplifier DC-coupled AFE.
Fig. 2-1-12. DC-coupled AFE with baseline stabilizer [66].
Fig. 2-1-13. Feedback filter techniques to suppress the DC offset [67].
Fig. 2-1-14. Operational principles of CS amplifier [79].
Fig. 2-2-1. A typical ECG pattern.
Fig. 3-1-1. Topology of proposed FDDA DC-coupled IA.
Fig. 3-1-2. Simulated resistance for diode connected PR in Fig. 2-1-7(a).
Fig. 3-1-3. Simulated resistance for diode connected PR in Fig. 2-1-7(b).
Fig. 3-1-4. On-body DC bias for EEG and ECG monitoring.
Fig. 3-1-5. Schematic of proposed FDDA.
Fig. 3-1-6. Simulated integrated input referred noise at different transistor size and bias current.
Fig. 3-1-7. Overview of 417μm×263μm input transistor layout.
Fig. 3-1-8. Symmetrically placed transistor unit.
Fig. 3-1-9. Topology of PGA.
Fig. 3-2-1. AFE transient simulation at different gain.
Fig. 3-2-2. Transient current with 8 duplicated AFE at different gain.
Fig. 3-2-3. Simulated frequency responses of the proposed AFE.
Fig. 3-2-4. Monte Carlo simulation of frequency response at minimum gain.
Fig. 3-2-5. Noise simulation of the AFE.
Fig. 3-3-1. AFE chip micrograph.
Fig. 3-3-2. Measured AFE frequency response at three different gain settings.
Fig. 3-3-3. Tested AFE CMRR at three different gain settings.
Fig. 3-3-4. Measured AFE input referred noise.
Fig. 3-3-5. Analysis of harmonic distortion of the Proposed AFE.
Fig. 3-3-6. Monitored ECG signal using the proposed AFE.
Fig. 3-3-7. Acquired EEG with distinct α component after eye blink.
Fig. 3-3-8. Results of 2-hour ECG recording.
Fig. 4-1-1. Diagram of proposed user adaptive QRS detection algorithm.
Fig. 4-1-2. Diagram of the proposed one target clustering.
Fig. 4-1-3. Extracted normal and PVC QRS template by OTC.
Fig. 4-1-4. PCC values on record 208 with baseline drift and PVC distortions.
Fig. 4-2-1. Diagram of adaptive thresholding.
Fig. 4-3-1. GUI of the implemented QRS detection in MATLAB.
Fig. 5-1-1. Illustration of ECG identity.
Fig. 5-2-1. CTDA vs Nyquist sampling [177].
Fig. 5-2-2. Topology of CTDA ECG sensor.
Fig. 5-2-3. Mapping of LC-ADC output and adjacent RR intervals to ANN-CAC input.
Fig. 5-2-4. Topology of the proposed patient specific CTDA ANN-CAC.
Fig. 5-3-1. Operational principle of SRAM.
Fig. 5-3-2. Timing of customized SPI.
Fig. 5-3-3. Generated PP for 16-bit unsigned multiplicand and signed multiplier.
Fig. 5-3-4. Radix-4 booth encoding diagram.
Fig. 5-3-5. Delay reduction with rearrange of input order.
Fig. 5-3-6. Rearrange the position of HA in TDM multiplier.
Fig. 5-3-7. Structure and pipeline data flow of the customized multiplier.
Fig. 5-3-8. Transition diagram for top FSM.
Fig. 5-3-9. Data flow of FSM-CA.
Fig. 5-3-10. Transition diagram for FSM-CA.
Fig. 5-3-11. Transition diagram for FSM-NN.
Fig. 5-4-1. Comparison between conventional training and the proposed biased training.
Fig. 5-5-1. Implementation verified on Pynq-Z2 board.
Fig. 5-5-2. Resource utilization on Artix-7 FPGA verification.
Fig. 5-5-3. Power statistic for the FPGA implementation at 2.5MHz.
Fig. 5-5-4. Layout and performance summary of the proposed CTDA ANN-CAC.
Fig. 5-5-5. Simulated classification speed and power peaks at different clock frequency.
Fig. 5-5-6. Energy summary at 25MHz and 10KHz for 75bpm heart rate.
Fig. 5-5-7. Classification accuracy details on each records in none (N) groups.
 
LIST OF ACRONYMS 
 
AAMI Association for the Advancement of Medical Instrumentation 
ADC Analog to digital converter 
AFE Analog front-end 
ANN Artificial neural network 
ASIC Application specific integrated circuit 
BT Biased training 
CAC Cardiac arrhythmia classifier 
CGS Conditional grouping scheme 
CMRR Common mode rejection ratio 
CNN Convolutional neural network 
CPA Carry propagate adder 
CSD Canonical signed digit 
CTDA Continuous-in-time discrete-in-amplitude 
CVD Cardiovascular disease 
DDA Differential difference amplifier 
DRL Driven right leg 
ECG Electrocardiogram 
EEG Electroencephalogram 
FA Full adder 
FDDA Fully differential difference amplifier 
FN False negative 
FP False positive 
FSM Finite state machine 
HA Half adder 
IA Instrumentation amplifier 
LA Left arm 
LC Level crossing 
MBE Modified Booth encoding 
ML Machine learning 
MLII Modified limb Lead II 
NEF Noise efficiency factor 
OTC One target clustering 
PCB Printed circuit board 
PCC Pearson correlation coefficient 
PD Peak detection 
PGA Programmable gain amplifier 
PMU Power management unit 
PP Partial product 
PR Pseudo resistor 
PVC Premature ventricular contraction 
RA Right arm 
ReLU Rectified linear unit 
RMS Root mean square 
SE Sensitivity 
TDM Three dimensional reduction 
THD Total harmonic distortion 
TP True positive 
VCS Vertical compress slice 
WHO World Health Organization 
WPD Windowed peak detection 
  
Chapter 1  
Introduction 
Cardiovascular diseases (CVDs) are the world's leading cause of death. Early detection 
helps improve the quality of life of CVD patients. In-home monitoring provides a 
convenient and affordable solution for detecting CVD symptoms. The wearable 
electrocardiogram (ECG) sensor is one of the best candidates for in-home heart 
monitoring. The flexible ECG sensor is the future of wearable ECG devices because of its 
non-invasiveness, close fit, and compactness. This research aims to design the critical 
building blocks, i.e., the analog front-end (AFE), QRS detector, and cardiac arrhythmia 
classifier (CAC), for such a flexible ECG sensor. 
1.1 Motivation 
Diseases related to the heart or the blood vessels, e.g., veins, arteries and capillaries, are 
referred to as CVDs. They include coronary heart disease, stroke, congestive heart failure 
and other conditions that affect the cardiovascular system [1]. People with an unhealthy 
lifestyle, such as lack of sleep and exercise, an unhealthy diet, excessive drinking and 
smoking, and a high level of mental stress, are at high risk of CVDs. 
According to the World Health Organization (WHO), CVDs are the number one 
cause of death globally [2]. An estimated 17.9 million people died from CVDs in 2016, 
representing 31% of all global deaths [2]. The direct and indirect costs of CVDs between 
2014 and 2015 were $351.2 billion in the USA according to the American Heart Association 
2019 Statistics [3]. According to the 2019 US statistics, the total annual diagnosis 
expenditures due to CVDs from 2014 to 2015 were higher than those due to any other 
major cause, as shown in Fig. 1-1-1 [3]. The number of CVD patients is on the rise due to aging, 
leading to an increasing total CVD cost as shown in Fig. 1-1-2 [3]. WHO predicts that 
43.9% of the US population will have some form of CVD and that the total direct cost of 
CVDs will increase to over $1100 billion by 2035 [3]. It is reported that CVDs cost 
Canadians a total of $21.2 billion per year in direct and indirect costs, and the number 
may reach $293 billion by 2040 [4].  
 
Fig. 1-1-1. The diseases diagnose expenditures in United States from 2014 to 2015. 
 
 
Fig. 1-1-2. Projected CVD cost for different age in USA. 
 
Fig. 1-1-3. Conduction potentials at different regions of heart in one cardiac cycle [6]. 
 
An electrocardiogram (ECG) is a graphical representation of the continuously changing 
electrical field of the heart and a reflection of the heart's electrical activity [5]. The 
generation of the ECG can be summarized as (1) initiation of the electrical stimulus at the 
sinoatrial (SA) node, (2) depolarization and contraction of the atria, (3) depolarization of 
the atrioventricular (AV) node, (4) depolarization of the ventricles, (5) start of 
repolarization from the epicardial surface of the left ventricle, and (6) inward spread of 
repolarization to finish one cardiac cycle. The typical potentials at different heart regions and the 
corresponding ECG are presented in Fig. 1-1-3 [6]. 
Early detection and prevention help manage CVDs and improve the quality of 
life of CVD patients. Detection based on perceived warning symptoms of CVDs, 
such as chest pain, shortness of breath, and angina, is not effective: by the time such 
symptoms are noticed it is normally too late for preventive action, and clinical treatment 
may even be delayed. Monitoring 
the heart activities, including electrical depolarization and repolarization and the 
mechanical contractions in response to the heart's electrical signals, provides a direct and 
timely approach to the detection of CVDs. This is because abnormal heart activities can 
be detected at a very early stage from the ECG. Meanwhile, the 
analysis of the heart activities enables real-time monitoring of the heart condition, 
providing sufficient details for further diagnosis. Thus, monitoring the heart's electrical 
activities by examining the ECG of the subject is an efficient and effective way of 
preventing CVDs.  
Real-time ECG monitoring brings enormous benefits to CVD patients. A recent 
study shows that home monitoring effectively lowers heart failure hospitalization rates 
[7], which not only reduces healthcare cost but also improves the quality of life of the 
patients. A preventive oriented cardiac healthcare system for real-time heart condition 
monitoring is illustrated in Fig. 1-1-4. The body-worn sensors collect vital signs and send 
raw data to the data center via a body gateway. The data center analyzes the raw data 
and detects potential problems. If an abnormality is found and confirmed by doctors at 
the medical center, a warning message is sent to the person under monitoring and a 
follow-up action may be taken. In such a system, the portable ECG sensor is the 
foundation since it collects the raw ECG signal and directly interacts with the patients. 
 
Fig. 1-1-4. Preventive oriented cardiac healthcare system. 
1.2 Flexible ECG sensor 
Portable ECG sensors are not new. The well-known Holter monitor, shown at the top of 
Fig. 1-2-1 [8], is a typical example. It is a 3-lead ECG system. However, it has several 
drawbacks, such as limited usage time, a low patient acceptance rate due to the many 
wires and electrodes, and skin irritation caused by the additional tape needed for good 
skin-electrode contact. Moreover, the swinging of the wires leads to low diagnostic yield 
because of the noise and distortion introduced during ECG recording, i.e. motion 
artefacts caused by body movement. With the advancement of technology, wearable ECG 
sensors are getting smaller and lighter, and produce better recording quality. As 
illustrated in the middle of Fig. 1-2-1, a wearable ECG sensor is normally a single-lead 
ECG system [9]. Compared with the portable ECG, it is much easier to use and allows 
real-time reading of the ECG waveform through a built-in wireless transmitter and a 
mobile phone. Although it is lighter than the Holter and requires no wires attached to the 
chest, the weight of the electronics and battery still pulls on the electrodes, causing 
significant motion artefacts under non-rest conditions.  
Flexible ECG sensors, such as epidermal electronics [10], are recent 
developments that are very thin and comfortable to wear. As shown at the bottom of Fig. 
1-2-1 [11], flexible ECG sensors are built with bendable, breathable and stretchable thin 
materials that conform to the body shape without loss of functionality [12]. In this way, 
the dry electrodes on the thin substrate can be comfortably attached to the chest, leading 
to better user acceptance. The recorded ECG signals have far fewer motion artefacts than 
those of wearable sensors, especially during exercise. Moreover, the skin-friendly, thin 
and light substrate makes wearing a flexible ECG sensor almost imperceptible. Among all 
types of ECG sensors, the flexible ECG sensor is therefore the most attractive one, since 
it offers not only real-time, high-quality recording and high diagnostic yield, but also the 
best user experience. Flexible ECG sensors are thus gaining wide interest among 
researchers and clinicians, and are ideal candidates for the preventive oriented cardiac 
healthcare system. 
 
Fig. 1-2-1. From portable, wearable, to flexible ECG sensor. 
1.3 Existing Issues in the Designing of Flexible ECG Sensor 
As shown in Fig. 1-3-1, a typical smart flexible ECG sensor is composed of an ECG 
application-specific integrated circuit (ASIC), a battery, and electrodes. The ASIC 
consists of an analog front-end (AFE), an analog-to-digital converter (ADC), a heart-rate 
detector or QRS detector (QRS-D), a cardiac arrhythmia classifier (CAC), a power 
management unit (PMU), and a wireless transceiver. It performs the main functions of 
ECG recording: 1) conditioning the ECG signal with the AFE; 2) converting the analog 
signal to its digital representation with the ADC; 3) extracting the heart rate via the QRS 
detector; 4) detecting the arrhythmia beat types with the CAC. The AFE performance 
directly affects the quality of the acquired ECG signal. QRS detection is the first step in 
arrhythmia classification and also provides heart rate information. In the whole wireless 
ECG sensor, the wireless transmission of raw ECG data consumes most of the power, i.e. 
more than 80% of the total [13]. Wireless transmission should therefore be used only if 
ECG abnormalities are detected by the classifier, so the on-chip CAC plays an important 
role in reducing the wireless data transmission. It should consume much less power than 
a wireless transceiver while achieving high classification accuracy, so that raw ECG data 
transmissions due to false alarms are reduced. This research aims to develop these three 
essential components (AFE, QRS detector, and CAC) of the ECG ASIC for flexible ECG 
sensors.  
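
The power argument above can be made concrete with a short calculation. The sketch below is only an illustration: the 85 µW / 15 µW split and the 5% transmit duty cycle are hypothetical numbers chosen to respect the "more than 80%" share quoted from [13], and the function name is not taken from the thesis.

```python
def average_power_uw(always_on_uw: float, tx_uw: float, tx_duty: float) -> float:
    """Average power when the transmitter is active for a fraction tx_duty of the time."""
    if not 0.0 <= tx_duty <= 1.0:
        raise ValueError("tx_duty must lie in [0, 1]")
    return always_on_uw + tx_duty * tx_uw


# Hypothetical budget: 85 uW for the radio, 15 uW for AFE, ADC, QRS detector and CAC.
streaming = average_power_uw(15.0, 85.0, 1.0)   # raw ECG transmitted continuously
gated = average_power_uw(15.0, 85.0, 0.05)      # radio on only for flagged beats

print(f"streaming: {streaming:.2f} uW, gated: {gated:.2f} uW")
```

Under these assumed numbers the average power is dominated by the always-on blocks once transmission is gated, which is why the classifier itself must stay in the microwatt range.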
 
Fig. 1-3-1. Block diagram of the smart flexible ECG sensor.  
 
The design of the three main building blocks for a flexible ECG sensor faces many 
more challenges than the design for wearable ones. Firstly, the dry electrode used in a 
flexible ECG sensor shows over 10MΩ contact impedance. It affects the recording 
quality of the AFE in two ways: one is the signal attenuation due to the large 
impedance; the other is the distortion induced by the variation and mismatch of the 
skin-electrode impedance due to body movement. Increasing the input impedance of the 
AFE helps boost the input signal amplitude while minimizing the impact of variation in 
the skin-electrode contact impedance. For example, if the AFE input impedance is 1GΩ, 
the effect of 10MΩ is smaller than 1%. A high input impedance also helps reduce the 
input mismatch and thus improves the CMRR (Common Mode Rejection Ratio) to 
suppress power line interference. Thus, an input impedance greater than 1GΩ is 
preferred for the AFE of the flexible ECG sensor. 
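
The 1% figure follows from treating the electrode and the AFE input as a simple resistive divider. A minimal sketch, assuming purely resistive impedances and using a hypothetical helper name, compares the 1GΩ target with the 200MΩ level quoted for existing designs:

```python
def source_loss_fraction(z_in_ohm: float, z_el_ohm: float) -> float:
    """Fraction of the source signal dropped across the electrode impedance."""
    return z_el_ohm / (z_in_ohm + z_el_ohm)


loss_1g = source_loss_fraction(1e9, 10e6)      # 1 GOhm AFE input, 10 MOhm electrode
loss_200m = source_loss_fraction(200e6, 10e6)  # 200 MOhm AFE input, same electrode

print(f"1 GOhm input:   {loss_1g:.2%} of the signal is lost")
print(f"200 MOhm input: {loss_200m:.2%} of the signal is lost")
```

With a 1GΩ input the loss stays just below 1%, while at 200MΩ it is several percent, so any variation of the contact impedance translates into a correspondingly larger amplitude change.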
Secondly, flexible ECG sensors normally adopt the dry-electrode approach so 
that the electrodes can be printed on a thin substrate. The use of dry electrodes introduces 
an imbalance in the skin-electrode contact resistance between a pair of electrodes 
attached to the chest.  As a result, large power line noise could be picked up by the AFE if 
the CMRR is not high. Furthermore, due to the large gain used in a flexible ECG sensor 
(for the reason given in the next paragraph), the power line noise could saturate the AFE. 
Thus, a high CMRR of greater than 70dB is highly desired for reducing power line noise.  
Thirdly, a portable ECG sensor, such as the Holter, places electrodes far away from 
each other. For wearable and flexible ECG sensors, the distance between two electrodes 
is limited, normally to within 10cm. For example, in a Lead-II configuration using a 
Holter, the two electrodes can be placed at the right and left shoulders with a distance 
greater than 2cm, where the standard ECG signal can be recorded. In the case of a flexible 
ECG sensor, the distance can be just 2cm. The electrode distance affects the strength of 
the picked-up ECG signal, i.e. the longer the distance, the larger the ECG amplitude. 
Therefore, the ECG amplitude obtained with a flexible ECG sensor is weaker than that of 
the standard recording. To obtain a clear signal, higher gain and lower noise are required 
of the AFE. 
What is more, a bulky battery cannot be integrated into a flexible ECG sensor. A 
microwatt or nanowatt AFE circuit, QRS detector and CAC are necessary for the use of a 
tiny on-chip battery. The design of ultralow power ECG circuits is a tough task for two 
reasons. On one hand, there is a limit on how far the power of an AFE can be reduced 
while keeping acceptable noise performance, so an AFE with the best power/noise 
efficiency should be designed for the flexible ECG sensor. On the other hand, the 
accuracy of the QRS detector and the CAC should be trustworthy. Machine learning (ML) 
algorithms are required for confidence in arrhythmia alerting. However, it is almost 
impossible to achieve a microwatt ML based CAC using the conventional Nyquist based 
signal flow, which calls for innovation in the design of the on-chip CAC. 
Last but not least, discrete components are not welcome in a flexible ECG sensor 
because they are difficult to integrate on a thin substrate and reduce the reliability of the 
sensor under stretching, bending or deformation of the flexible material. Ideally, there 
should be only one ASIC on the flexible substrate. Moreover, the size of the ASIC and 
the number of footprints should be small for easy integration and manufacturing of the 
flexible ECG sensor. In addition, more discrete components mean larger size and higher 
cost. Thus, the flexible ECG sensor should be designed in a compact, fully integrated 
way, which challenges the sensor circuit design. 
1.4 Objectives of Research 
This research aims to solve several critical issues in the design of sensor interface circuits 
for smart flexible ECG sensors, as discussed in the previous section. Firstly, a fully 
integrated AFE with programmable gain and bandwidth should be designed for the 
compactness and robustness of the flexible ECG sensor. Secondly, high input impedance 
of the AFE should be achieved to handle motion artefacts and closely located electrodes. 
Thirdly, an ultra-low power AFE, QRS detector and CAC have to be implemented to 
meet the tight power budget of the flexible ECG sensor. Lastly, a patient-specific ML 
based QRS detector and CAC are essential for the smart flexible ECG sensor.  
With our proposed solutions, we aim to eventually achieve: an ultralow power, 
high performance AFE with 1GΩ input impedance to handle the dry electrodes (most 
existing designs are around 200MΩ), over 75dB CMRR (state-of-the-art ultralow power 
designs show around 70dB) to suppress power line interference, over 60dB gain (most 
designs present gain below 60dB), less than 1.5μVrms input referred noise 
(state-of-the-art designs without chopper stabilization show several μVrms of noise) to 
pick up a clear ECG signal over a short electrode distance, and a band-pass filter covering 
0.5-100Hz to filter out unwanted components; a robust and simple QRS detector that 
achieves 99% heart beat detection accuracy; and a μW machine learning based CAC with 
classification accuracy of over 95%. The goal is to implement a fully integrated, 
intelligent ECG processing ASIC with tens of μW power and less than 2mm² area for 
flexible ECG sensors. 
The thesis is organized as follows. After the Introduction, we present the literature 
review and the current research progress on AFE, QRS detector and CAC in Chapter 2.  
In the design of the AFE, we focus on improving the immunity to motion artefacts 
and reducing power consumption without too much sacrifice of noise and area 
performance. Inspired by the extremely high input impedance of the DC-coupled AFE 
and the reported high CMRR of the DDA, a simple yet acceptable fully differential DDA 
DC-coupled AFE will be presented in Chapter 3 [15]. 
For the QRS detector, a user-specific light-weight algorithm with detection 
accuracy of over 99% is proposed to handle the diversity of ECG signals encountered in 
practical use, as reported in Chapter 4 [16]. 
A 13.34µW patient-specific artificial neural network (ANN) CAC with 
classification accuracy of over 90%, following the evaluation matrix recommended by the 
Association for the Advancement of Medical Instrumentation (AAMI), is implemented 
for arrhythmia classification. The network structure and the optimized multiplier will be 
presented in Chapter 5. 
Chapter 2  
Review of ECG Interface for Wearable Applications 
AFE, QRS detector, and CAC are three critical building blocks of an ultra-low power 
ECG-on-Chip that provides high recording quality and cardiac arrhythmia classification 
accuracy for flexible ECG sensor. In this chapter, literature reviews of the three blocks 
are presented. 
The organization of this chapter is as follows. In Section 2.1, a brief introduction 
to IA is presented. In Section 2.2, QRS detections are reviewed. In Section 2.3, machine 
learning based CACs are discussed. The summary is given in Section 2.4. 
2.1 Brief Introduction of ECG Signal Analog Front-End 
The quality of ECG signal depends heavily on the analog front-end (AFE) of an interface 
chip. The requirements include high input impedance, low noise, high common mode 
rejection ratio (CMRR) as well as low power. There are many attempts in the design of 
interface chip that satisfies these requirements. In the following, we will present why 
these requirements are important for wearable sensor interface chip and how to achieve 
these design goals. 
2.1.1 Design Challenges for AFE 
The design of ECG sensor interface circuits faces enormous challenges. In this section, 
we start with an introduction to the skin-electrode interface, and then highlight the 
challenges that result from it.  
The ECG signal, with its distinguishing features of 1-5mV amplitude and 
0.05-100Hz bandwidth [17], should be sensed by an AFE with large input dynamic range, 
high input impedance, high common mode rejection ratio, good noise performance, and 
an ultra-low high-pass cut-off corner frequency, in order to reduce baseline drift, handle 
motion artefacts, and minimize the effects of electrode contact impedance mismatch and 
electrical interference [18], [19]. 
To transport the bio-potential, electrodes are needed to connect the body to the 
AFE [20]. The commonly used wet electrode, e.g. the silver/silver chloride (Ag/AgCl) 
electrode, offers simplicity, good signal stability/quality and low cost. It was once the 
prevalent choice of physicians. However, during prolonged recordings [21], [22], the wet 
gel of the electrodes brings drawbacks: (1) the gel has to be reapplied to keep the signal 
stable during long-term recordings [23], (2) skin preparation has to be carried out to 
improve the interfacial impedance [24], (3) the use of electrolytic gel can cause 
inflammation [25], [26]. Recently, dry electrodes [27]-[29], insulated electrodes [30] and 
non-contact electrodes [31] have been studied to improve the signal quality and user 
acceptance. Electrical models of the skin-electrode interface for different kinds of 
electrodes are summarized in Fig. 2-1-1 [32]. Generally, the interface can be taken as a 
series of RC elements with finite impedance. 
 
Fig. 2-1-1. Models of different electrodes: wet Ag/AgCl electrode with 100kΩ 
impedance; dry and insulated electrode with 10MΩ impedance; non-contact electrode 
with over 200MΩ impedance [32]. 
 
 
Fig. 2-1-2. Illustration of the 12-lead electrode positions [33]. 
 
The source signal of an ECG recording system is composed of the ECG, 
electromyography (EMG), skin potential, and the polarization potential induced at the 
electrode interface by DC current. Besides the unwanted source signals, the ECG is 
simultaneously distorted by instrument noise and by the electric and magnetic 
interference brought by the power line, radio frequency sources, and body movement. 
When the muscle cells at the electrode site are electrically or neurologically activated, the 
electric potential generated by those activations is added in series with the ECG. To 
obtain an excellent ECG recording during exercise, the 12-lead diagnostic ECG is 
proposed [33]. The recommended 12-lead positions are shown in Fig. 2-1-2. 
It is commonly agreed by physiologists that skin potential varies from -70mV to 
10mV at static state [34]. Thakor and Webster hypothesized that it is induced by the 
difference of metabolic activity between the dead cells lying in the stratum corneum layer 
and the cells lying in the stratum germinativum layer [35]. When stretching the skin, the 
skin potential increases by several millivolts [36]. The deformation of skin during an 
ECG recording, e.g. pressing, squeezing or stretching, changes the initial skin potential. 
This deformation drifts the baseline of the recorded ECG, leading to unreadable or failed 
recordings. Not only skin potential but also skin impedance variations during movement 
[34] distort the ECG. These distortions are called motion artefacts. In early ECG 
monitoring systems, cable movement was also a cause of motion artefacts [36], which 
can be easily solved by shielding the cable. The motion artefact introduced by skin 
deformation can be dramatically reduced by using dry or non-contact electrodes. High 
impedance is the common feature of those electrodes. To reduce the input signal 
attenuation and common mode interference, the input impedance of the AFE should 
therefore be extremely high.  
As shown in Fig. 2-1-3 for a low frequency case, one electrode has an impedance 
mismatch of ΔZ compared with the other electrode, whose impedance is Zel. Assuming 
the two input impedances of the designed AFE are finite and equal to Zin, the differential 
input produced by the common mode AC signal VCM, denoted $V_{in}^{CM}$, can be 
derived as 

$$\left.\begin{aligned} V_{in}^{+} &= \frac{Z_{in}}{Z_{el}+\Delta Z+Z_{in}}\,V_{CM} \\ V_{in}^{-} &= \frac{Z_{in}}{Z_{el}+Z_{in}}\,V_{CM} \end{aligned}\right\} \Rightarrow V_{in}^{CM} = V_{in}^{-}-V_{in}^{+} = \frac{\Delta Z\,Z_{in}}{(Z_{el}+\Delta Z+Z_{in})(Z_{el}+Z_{in})}\,V_{CM}. \tag{2-1-1}$$

Fig. 2-1-3. Influence of electrode impedance to the input signal acquisition. 

This phenomenon, in which differences in the skin-electrode impedance and/or the 
common mode input impedance convert the common mode signal into a differential 
signal, is called the 'potential divider effect'. Similarly, $V_{in}^{d}$, the differential input 
produced by the original differential input Vd, is determined by 

$$V_{in}^{d} = \frac{Z_{in}}{Z_{el}+Z_{in}}\,V_{d}. \tag{2-1-2}$$

It indicates that the desired differential signal is attenuated by the finite input impedance 
of the AFE. A typical dry electrode composed of conductive graphite lightly impregnated 
with aluminum features 1.36MΩ impedance at 0.1Hz and 1Hz [37]. To keep the 
attenuation below 1% of the original differential input, the input impedance should be 
about 100 times larger than the electrode impedance, i.e. 136MΩ. In this case, Zin is 
much larger than Zel and ΔZ, so (2-1-1) can be approximately rewritten as 

$$\frac{V_{in}^{CM}}{V_{CM}} \approx \frac{\Delta Z}{Z_{in}}. \tag{2-1-3}$$

Thus, the inherent CMRRΔZ decided by the input impedance is given as 

$$CMRR_{\Delta Z} = 20\log\frac{Z_{in}}{\Delta Z}. \tag{2-1-4}$$
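
The relations (2-1-2) to (2-1-4) are easy to evaluate numerically. The sketch below assumes purely resistive impedances, the approximate form CMRRΔZ ≈ 20log10(Zin/ΔZ), and a hypothetical 1MΩ electrode mismatch; the function names are illustrative only.

```python
import math


def min_input_impedance(z_el_ohm: float, max_attenuation: float) -> float:
    """Smallest Zin for which Zel / (Zin + Zel) stays at or below max_attenuation."""
    return z_el_ohm * (1.0 - max_attenuation) / max_attenuation


def cmrr_dz_db(z_in_ohm: float, delta_z_ohm: float) -> float:
    """Inherent CMRR set by the electrode mismatch, valid when Zin >> Zel and Zin >> dZ."""
    return 20.0 * math.log10(z_in_ohm / delta_z_ohm)


z_min = min_input_impedance(1.36e6, 0.01)   # dry electrode of [37], 1 % attenuation
cmrr = cmrr_dz_db(1e9, 1e6)                 # 1 GOhm input, assumed 1 MOhm mismatch

print(f"Zin needed for <1% attenuation: {z_min / 1e6:.1f} MOhm")
print(f"CMRR_dZ at 1 GOhm input: {cmrr:.1f} dB")
```

The exact bound is 99 times the electrode impedance, which the text rounds to roughly 100 times. Each tenfold increase of Zin adds 20dB to CMRRΔZ for a fixed mismatch.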
The common mode AC signal generated by motion artefacts contributes part of 
VCM, while another portion comes from power line interference. The block diagram of 
the model in [38] is illustrated in Fig. 2-1-4. The total interference is composed of the 
interference current through the body, the interference current into the amplifier and the 
interference current into the measurement cables. A right-leg driven (DRL) circuit is 
required to reduce the interference. 
 
Fig. 2-1-4. Block diagram for ECG measurement regarding the power line interference 
[36]. 
 
If we assume electrodes with zero impedance, then the interference that appears at 
the output due to the common mode AC signal is determined purely by the CMRR of the 
AFE. As discussed in [39], the total CMRR of a cascaded system is produced by the 
combination of the individual stages, which is given as 

$$\frac{1}{CMRR_{total}} = \frac{1}{CMRR_{\Delta Z}} + \frac{1}{CMRR_{IA}}. \tag{2-1-5}$$
To achieve a high system CMRR, both CMRRΔZ and CMRRIA should be sufficiently large. 
In practice, one of them will be the limiting factor. CMRRIA is mainly determined by the 
architecture of the designed AFE and the mismatch of its components. For CMRRΔZ, we 
can use (2-1-4) to calculate the inherent CMRRΔZ from the interference VCM (coming 
from motion artefacts and power line interference) and the mismatch of the electrode 
impedance. Once the mismatch and VCM are fixed, CMRRΔZ can be effectively improved 
by increasing the input impedance of the AFE. So an effective way to achieve a proper 
system CMRR is to combine a moderate CMRRIA of the AFE with an extremely high 
CMRRΔZ obtained through extremely high AFE input impedance. Therefore, whether for 
handling motion artefacts or suppressing power line interference, high input impedance 
is one of the key performance indicators in a high performance ECG AFE design. 
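
The way the smaller term limits the system CMRR in (2-1-5) can be checked with a few lines. This is a sketch under the assumption that (2-1-5) adds the reciprocals of the linear (not dB) CMRR values; the stage values below are hypothetical.

```python
import math


def combine_cmrr_db(*stages_db: float) -> float:
    """Combine per-mechanism CMRR values given in dB by adding their linear reciprocals."""
    inverse_sum = sum(10.0 ** (-c / 20.0) for c in stages_db)
    return -20.0 * math.log10(inverse_sum)


equal_stages = combine_cmrr_db(80.0, 80.0)     # two equal contributions
limited_by_ia = combine_cmrr_db(100.0, 76.0)   # CMRR_dZ = 100 dB, CMRR_IA = 76 dB

print(f"80 dB and 80 dB  -> {equal_stages:.2f} dB")
print(f"100 dB and 76 dB -> {limited_by_ia:.2f} dB")
```

Two equal contributions cost about 6dB, whereas a CMRRΔZ well above CMRRIA leaves the total close to CMRRIA, which matches the strategy of pairing a moderate CMRRIA with a very high CMRRΔZ.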
Input DC offset is another challenge in ECG AFE design. When the electrode is in 
contact with a conducting medium such as an electrolyte, a polarization potential is 
observed at the boundary region. This phenomenon is called electrode polarization, and it 
gives rise to a DC offset. According to the British and American standards 
IEC 60601-2-47:2001 and ANSI/AAMI EC38:2007, the ECG AFE should be capable of 
dealing with a maximum offset of ±300mV [40], [41].  
Trade-offs among noise, power dissipation and area are another challenge. The 
never-ending pursuit of better user experience leads to the development of near-zero-power 
ECG sensors with a tiny battery or energy harvester [42]. When the power budget of 
a CMOS amplifier drops to the nanowatt level, flicker noise (or 1/f noise) can take over 
from thermal noise and dominate the total amplifier noise, which is especially severe for 
the low frequency ECG signal [43]. The dominant flicker noise limits the minimum 
detectable signal [44]. It can be effectively reduced either by an auxiliary technique such 
as chopper modulation or by large-size PMOS input pairs, but the reduction comes at the 
price of either more power or larger area. So how to reduce the 1/f noise at the lowest 
power and smallest area should be considered. 
Table 2-1-1. Minimum requirements of medical grade ECG AFE. 
| Requirement | Range | Unit | Standard |
|---|---|---|---|
| Input Dynamic Range | 10 | mV p-v | IEC 60601-2-47 |
| Electrode Offset | ±300 | mV | IEC 60601-2-47 |
| Input Impedance | >10 | MΩ | IEC 60601-2-47 |
| CMRR | 60 (50-60Hz); 30 (100-120Hz) | dB; dB | IEC 60601-2-47 |
| Gain Accuracy | Error<10 and Amplitude<±10 | %; µV | IEC 60601-2-47 |
| Gain Stability over 24h | <3 | % | IEC 60601-2-47 |
| Noise | <50 | µV | IEC 60601-2-47 |
| Crosstalk | Error<5 and Amplitude<0.2mV | %; µV | IEC 60601-2-47 |
| Frequency Response | 0.67-40/50 or 0.05-55 | Hz; Hz | IEC 60601-2-47 |
| Timing Accuracy | <30 over 24h | s | IEC 60601-2-47 |
| Temporal Alignment | Error<20 | ms | IEC 60601-2-47 |
 
Table 2-1-1 summarizes the minimum requirements for an ECG AFE [40]. Those 
requirements focus on the performance needed to guarantee the transient response 
(timing accuracy), frequency response, tolerance of electrode polarization, rejection of 
interference (CMRR, input impedance) and immunity to crosstalk from defibrillation. 
The recommended performance testbench for the AFE is given in Fig. 2-1-5. The 
recorder and the entire test circuit are shielded by foil wrapped around them and 
connected to earth. The electrodes are shielded by the common mode signal coming from 
the DRL to reduce interference. The capacitor C1 between the power line and the driven 
shield and the capacitor Cx between earth and the driven shield are used to model the 
power line interference. The switch-controlled battery simulates the electrode 
polarization, and a parallel connection of a 51kΩ resistor R, a 47nF capacitor C and a 
switch imitates the electrode. 
 
Fig. 2-1-5. Testbench for ECG AFE measurement [40]. 
In summary, high input impedance, high CMRR, low input DC-offset, and high 
power/noise/area efficiency are main design challenges of AFE. 
2.1.2 Resistive Three-Amplifier AFE 
In the early stage, the AC-coupled AFE with three amplifiers and gain ratio resistors was 
the best choice [45]. However, it has three drawbacks. First, power line interference is 
pronounced due to poor CMRR. Second, μF off-chip capacitors are required for a sub-Hz 
high-pass cut-off frequency because on-chip gain ratio resistors are limited to the MΩ 
range. Last, the resistive feedback contributes extra thermal noise. 
 The right-leg driven (DRL) circuit is commonly used to address the CMRR 
issue of the three-amplifier AC-coupled AFE. The overall topology of an AFE with DRL 
is simplified in Fig. 2-1-6 [46]. 
 
Fig. 2-1-6. Topology of AFE with DRL [46]. 
The parasitic input common mode signal introduced by the power line, $V_{in}^{CM}$, is 
determined by the coupling impedance between the power line and the body, Zintf, and 
the grounding electrode impedance, Zgnd, as in (2-1-6), where $V_{pl}$ denotes the power 
line voltage: 

$$V_{in}^{CM} = \frac{Z_{gnd}}{Z_{intf}+Z_{gnd}}\,V_{pl}. \tag{2-1-6}$$

When the active grounding electrode of Fig. 2-1-6 is introduced, the impedance of the 
grounding electrode is reduced by a factor of 1+A, where A is the loop gain. Then 
$V_{in}^{CM}$ is effectively reduced to 

$$V_{in}^{CM} = \frac{\dfrac{Z_{gnd}}{1+A}}{Z_{intf}+\dfrac{Z_{gnd}}{1+A}}\,V_{pl}. \tag{2-1-7}$$

The active grounding is also called DRL. 
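
The benefit of active grounding can be estimated by evaluating the divider of (2-1-6) and (2-1-7) with and without the loop gain. The impedance values and the loop gain below are hypothetical and treated as purely resistive; the sketch only shows the scaling.

```python
def cm_pickup_fraction(z_intf_ohm: float, z_gnd_ohm: float, loop_gain: float = 0.0) -> float:
    """Fraction of the power line voltage appearing as input common mode signal."""
    z_eff = z_gnd_ohm / (1.0 + loop_gain)   # grounding impedance seen with DRL feedback
    return z_eff / (z_intf_ohm + z_eff)


passive = cm_pickup_fraction(1e9, 100e3)                  # plain grounding electrode
active = cm_pickup_fraction(1e9, 100e3, loop_gain=100.0)  # DRL with assumed loop gain A = 100

print(f"passive ground: {passive:.3e}")
print(f"DRL, A = 100:   {active:.3e}")
print(f"improvement:    {passive / active:.1f}x")
```

Because Zintf is much larger than Zgnd, the common mode pickup falls almost in proportion to 1+A.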
2.1.3 AC-coupled Pseudo Resistor AFE 
To remove the off-chip capacitor in the resistive three-amplifier AFE, a GΩ feedback 
resistor is necessary. However, an off-chip passive GΩ resistor is quite expensive. The 
study of the "adaptive element", which uses a diode connected transistor as a resistor 
[47], enables on-chip GΩ resistors.  
Inspired by the adaptive element, the authors of [48], [49] proposed subthreshold 
biased transistors to achieve pseudo resistors (PRs) of hundreds of GΩ, and [50] applied 
a diode connected transistor with 15.9GΩ resistance to biomedical amplifiers to achieve 
a low high-pass cut-off frequency for the first time. Since the typical topology of the PR 
AFE was reported in [51] for a neural recording front-end, PRs have been widely used in 
biomedical signal acquisition systems. 
 
Fig. 2-1-7. Two configurations of diode connected pseudo resistor [52]. 
 
Sixteen different passive pseudo resistors and their corresponding resistances are 
studied in [52]. It is demonstrated that both diode connected pseudo resistors in Fig. 
2-1-7(a) and Fig. 2-1-7(b) show over 20TΩ under a small voltage across them, e.g. 
0.15V. When the voltage across the pseudo resistor reaches 1.5V, the configuration in 
Fig. 2-1-7(a) remains over 20TΩ while the configuration in Fig. 2-1-7(b) drops to 
100MΩ. As the output swing in the PGA stage may be rail-to-rail, adopting Fig. 2-1-7(b) 
in the PGA would introduce large non-linearity. Fig. 2-1-7(a) is a simple yet consistent 
choice for the AFE. 
A typical AC-coupled PR AFE is composed of a low noise amplifier (LNA), 
feedback capacitors, and diode connected PRs [51]. Thanks to the on-chip GΩ PR, it is 
easy to achieve a fully integrated AFE with a sub-Hertz high-pass cut-off frequency using 
pF capacitors. Due to the limited output headroom of the LNA in [51], the noise 
efficiency factor (NEF) cannot go lower than 2.9. To address this issue, the flipped 
voltage follower (FVF) based OTA was proposed in [53].  
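
The link between resistor size and capacitor size can be illustrated with the first-order corner frequency f = 1/(2πRC). The sketch assumes the high-pass corner is set by a single RC pair (the pseudo resistor and the feedback capacitor in the PR AFE); the component values are illustrative and not taken from the cited designs.

```python
import math


def highpass_cutoff_hz(r_ohm: float, c_farad: float) -> float:
    """Corner frequency of a first-order RC high-pass network."""
    return 1.0 / (2.0 * math.pi * r_ohm * c_farad)


resistive = highpass_cutoff_hz(1e6, 1e-6)        # 1 MOhm on-chip resistor, 1 uF off-chip capacitor
pseudo = highpass_cutoff_hz(1e12, 1e-12)         # 1 TOhm pseudo resistor, 1 pF on-chip capacitor
mixed = highpass_cutoff_hz(1e6, 1e-12)           # 1 MOhm resistor with a 1 pF capacitor

print(f"1 MOhm with 1 uF: {resistive:.3f} Hz")
print(f"1 TOhm with 1 pF: {pseudo:.3f} Hz")
print(f"1 MOhm with 1 pF: {mixed / 1e3:.1f} kHz")
```

The same sub-Hertz corner that needs a μF capacitor with a MΩ resistor is reached with a pF capacitor once the resistance is raised by six orders of magnitude, which is what the pseudo resistor provides.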
To achieve programmable gain and bandwidth, tunable bandwidth amplifier 
(TBA) connecting with programmable gain amplifier (PGA) are adopted in [54]-[56]. 
The TBA and PGA used in [56] are given in Fig. 2-1-8(a) and Fig. 2-1-8(b) respectively.  
 
Fig. 2-1-8. TBA and PGA of programmable AC-coupled AFE [56]. 
 
The high-pass cut-off of the TBA in Fig. 2-1-8(a) can be changed by adjusting 
either C1 or the pseudo resistor.  As C1 also contributes to the overall gain, it is more 
practical to tune the pseudo resistor through its bias voltage. The currents through M3 
and M4 are tuned by the gating signals CtrlHFP1, CtrlHFP2, and CtrlHFP3. For a 
symmetrical waveform across the biased pseudo resistor consisting of M1 and M2, M1 
and M2 are activated alternately, one in each half cycle. This presents a symmetrical 
tunable pseudo resistance, which eliminates the signal distortion of the single biased 
unbalanced PR used in [54], [55]. 
Fig. 2-1-8(b) takes the off-state switch resistance Rx into consideration. The 
transfer function for the PGA is given by 
 ( )  
  
  
 
       
             
.      (2-1-8) 
As Cx is in the picofarad range and Rx is comparable to 10^13 ohm, the non-overlapping 
pole and zero will distort the high-pass band. The flip-over-capacitor scheme shown at 
the bottom of Fig. 2-1-8(b) eliminates this distortion by excluding Rx from the feedback 
loop through the use of switch-controlled capacitors. 
Though the classical AC-coupled PR AFE inherently rejects DC drift, its low 
input impedance is not good for handling motion artefacts. Two main techniques can be 
applied to the AC-coupled PR AFE to boost the input impedance: a passive input resistor 
network [57] and input positive feedback [58]-[65]. 
 
Fig. 2-1-9. Comparison of AC-coupled AFE with its variant with passive input resistor 
network [57]. 
 
The bias resistor R2 in Fig. 2-1-9(a) is replaced by the series connected resistors 
R1 and R'1 coming from the input terminals in Fig. 2-1-9(b). Therefore, the common 
mode signal sees no DC path to ground, and the common mode resistance is enhanced to 
over 100MΩ [57]. 
 
Fig. 2-1-10. Active electrode and negative capacitance for impedance boosting [61]. 
 
Using feedback circuits, active shielding (or active electrodes) and negative 
capacitance have been proposed to boost the input impedance to the GΩ range. As in 
Fig. 2-1-10(a) [61], the active electrode keeps the voltage across the external parasitic 
capacitance constant. Assuming the open loop gain of the amplifier in the unity gain 
buffer is AV, Cext is reduced to Cext/(1+AV). The negative capacitance generated by the 
positive feedback circuit is shown in Fig. 2-1-10(b). In this circuit, G(s) is designed with 
constant amplitude in the signal pass-band so that the generated impedance is frequency 
independent. The transfer function in Fig. 2-1-10(b) is given as  

$$Z(s) = \frac{1}{sC_{INT} - sC_{PF}\,(G(s)-1)}, \tag{2-1-9}$$

where CINT is the internal input capacitance to be cancelled, and -CPF(G(s)-1) is the 
generated negative capacitance. The negative capacitance is designed to equal -CINT to 
achieve an extremely large input impedance. As reported in [62], the impedance can be 
as large as 400GΩ. 
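
Both boosting mechanisms reduce the capacitance seen at the input, which can be illustrated numerically. The sketch assumes a real, frequency-flat feedback gain G in the pass-band and an input impedance set only by the residual capacitance CINT - CPF(G-1); all component values are hypothetical.

```python
import math


def shielded_capacitance(c_ext_farad: float, a_v: float) -> float:
    """External parasitic capacitance remaining after active shielding."""
    return c_ext_farad / (1.0 + a_v)


def input_impedance_ohm(freq_hz: float, c_int_farad: float,
                        c_pf_farad: float = 0.0, gain: float = 1.0) -> float:
    """Magnitude of the capacitive input impedance after negative-capacitance cancellation."""
    c_residual = c_int_farad - c_pf_farad * (gain - 1.0)
    if c_residual == 0.0:
        return math.inf
    return 1.0 / (2.0 * math.pi * freq_hz * abs(c_residual))


c_shielded = shielded_capacitance(100e-12, 1000.0)       # 100 pF cable, assumed A_V = 1000
plain = input_impedance_ohm(50.0, 10e-12)                # 10 pF input, no cancellation
boosted = input_impedance_ohm(50.0, 10e-12, 9.9e-12, 2.0)  # 99 % of C_INT cancelled

print(f"shielded C_ext: {c_shielded * 1e15:.1f} fF")
print(f"|Z_in| at 50 Hz: {plain / 1e6:.0f} MOhm -> {boosted / 1e9:.1f} GOhm")
```

Cancelling 99% of CINT raises the impedance a hundredfold, but over-cancellation makes the residual capacitance negative, which is consistent with the stability concern and tuning requirement of positive feedback.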
2.1.4 DC-coupled Pseudo Resistor AFE 
Complex impedance boosting is essential to enhance the input impedance of the 
AC-coupled AFE. However, the positive feedback suffers from stability issues, so an 
external tuning capacitor [62] or complicated auto-tuning circuits [61] are required. The 
feedback amplifier and the tuning circuits also consume extra power and area. Since the 
input is directly connected to the gate of the amplifier transistor, the DC-coupled AFE 
provides a simpler way to improve the input impedance. There have been several 
attempts at DC-coupled AFEs in recent years [45], [66]-[69]. 
 
Fig. 2-1-11. Topology of three-amplifier DC-coupled AFE. 
  29 
The typical topology of a DC-coupled pseudo resistor AFE is illustrated in Fig. 
2-1-11 [66]. The circuit topology is quite similar to the three-amplifier topology except 
that all the resistors are replaced by capacitors and diode connected PRs (as in Fig. 
2-1-7). Since it is DC-coupled, the input impedance is inversely proportional to the 
parasitic capacitance of the input transistor of the low noise amplifier (LNA). For a 
picofarad range parasitic capacitance and sub-100Hz operating frequency, the reactance 
of the parasitic capacitance is at the GΩ level, which is high enough to reject the 
potential divider effect. The input DC bias is provided by the PR. The AFE has two 
amplification branches, and each branch amplifies both the common mode and the 
differential signal with the same gain. Assume the input common mode signal is 
$v_{in}^{CM}$ and the monitored bioelectrical signal is vbio. Writing $A_{v1}$ and 
$A_{v2}$ for the closed-loop gains of the two branches, each set by the gain ratio 
capacitors of that branch, we have 

$$v_{out} = A_{v1}\left(v_{in}^{CM}+\frac{v_{bio}}{2}\right) - A_{v2}\left(v_{in}^{CM}-\frac{v_{bio}}{2}\right) = (A_{v1}-A_{v2})\,v_{in}^{CM} + \frac{1}{2}(A_{v1}+A_{v2})\,v_{bio}. \tag{2-1-10}$$

Thus, the CMRR can be derived as 

$$CMRR = \frac{A_{v1}+A_{v2}}{2\,(A_{v1}-A_{v2})}. \tag{2-1-11}$$
CMRR is determined by the mismatch of the gain ratio capacitors of the AFE. Since the
three-amplifier topology requires an extra LNA compared with the AC-coupled approach,
the area left for the ratio capacitors is smaller than in the AC-coupled one under the
same area constraint, resulting in larger mismatch and worse CMRR. In [69], an open-loop
DC-coupled AFE with complicated analog-digital mixed feedback loops is used to improve
the CMRR and noise performance, achieving 75dB CMRR. However, the complex
compensation scheme consumes more power.
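The two numerical claims above can be checked with a short sketch. The 1 pF capacitance, the 100 Hz frequency, the nominal gain of 100 and the 1 % mismatch are hypothetical values chosen only for illustration, and the CMRR model (differential gain equal to the mean of the two branch gains, common mode gain equal to their difference) is an assumed simplification of the three-amplifier AFE.

```python
import math


def capacitive_reactance(c_farad, f_hz):
    """Return |X_C| = 1 / (2*pi*f*C) in ohms."""
    return 1.0 / (2.0 * math.pi * f_hz * c_farad)


def cmrr_db(gain_p, gain_n):
    """CMRR in dB when the outputs of two branches are subtracted.

    Assumed model: differential gain = mean of the branch gains,
    common mode gain = difference of the branch gains.
    """
    a_diff = 0.5 * (gain_p + gain_n)
    a_cm = abs(gain_p - gain_n)
    return 20.0 * math.log10(a_diff / a_cm)


# Hypothetical numbers: 1 pF gate capacitance at the 100 Hz band edge.
x_in = capacitive_reactance(1e-12, 100.0)
print(f"input reactance: {x_in / 1e9:.2f} GOhm")

# Hypothetical 1 % mismatch between two branches with nominal gain 100.
print(f"CMRR: {cmrr_db(100.5, 99.5):.1f} dB")
```

Under these assumptions the reactance is in the GΩ range, and a 1 % gain mismatch alone limits the CMRR to about 40dB, which illustrates why ratio capacitor matching dominates in this topology.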
 
Fig. 2-1-12. DC-coupled AFE with baseline stabilizer [66]. 
 
Since bioelectrical signals have extremely weak drive capability, the input DC
bias for the DC-coupled AFE is commonly provided by the GΩ PR R2, as in Fig. 2-1-11. It
is worth noting that the bias provided by the PR is unpredictable due to leakage.
Furthermore, the leakage of the input transistor is not negligible if the transistor is
made large for better noise performance. This leakage reduces the input impedance to a
level close to that of the bias PR. In such a case, the unpredictable bias resistance
produces an uncontrollable input DC offset. To address this, a baseline stabilizer is
added to the typical DC-coupled AFE in [66]. The circuit architecture is depicted in
Fig. 2-1-12. When a motion artifact suddenly unbalances the input baseline, the PGA
output saturates and the drift is detected by the baseline stabilizer. The reset action
then quickly brings the saturated state back to the normal state by shorting the
unbalanced input of the PGA to the system DC common mode bias.
Different from the reset method discussed above, feedback techniques with an analog
filter [70]-[72], an analog-digital mixed filter [73], and a digital filter [74]-[75],
[67] have been reported to address the DC offset at the input of the DC-coupled AFE.
Fig. 2-1-13(a) shows the topology of an analog filter feedback path that cancels the
offset [67]. The
integrator contributes to the following overall transfer function
$$H(s) = \frac{G}{1 + G/(s\tau)} = \frac{s\tau G}{s\tau + G},$$    (2-1-12)
where G is the gain of A1 and $\tau$ is the time constant of the feedback integrator.
When s=0, the overall gain is 0, which means the DC components are fully suppressed by
this feedback loop. A differential version of this topology is discussed in [72]. To
achieve the required response speed, A2 in Fig. 2-1-13(a) consumes a large amount of
power. In addition, the feedback loop contributes extra noise to the input.
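The DC suppression of an integrator feedback loop can be illustrated numerically. The sketch below assumes a first-order loop in which a forward gain G is wrapped by an ideal integrator 1/(s·tau), giving H(s) = s·tau·G / (s·tau + G); this form, the value G = 100 and the time constant tau = 10 s are assumptions for illustration and are not taken from [67].

```python
import math


def highpass_gain(f_hz, g, tau_s):
    """|H| at frequency f for the assumed H(s) = s*tau*G / (s*tau + G)."""
    s = complex(0.0, 2.0 * math.pi * f_hz)
    return abs(s * tau_s * g / (s * tau_s + g))


G = 100.0     # hypothetical forward gain of A1
TAU = 10.0    # hypothetical integrator time constant in seconds
corner_hz = G / (2.0 * math.pi * TAU)   # -3 dB high-pass corner of this model

for f in (0.0, corner_hz, 100.0):
    print(f"{f:8.3f} Hz -> gain {highpass_gain(f, G, TAU):7.3f}")
```

In this model the gain is exactly zero at DC, reaches G/sqrt(2) at the corner frequency G/(2·pi·tau), and approaches G well above it.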
 
Fig. 2-1-13. Feedback filter techniques to suppress the DC offset [67]. 
Fig. 2-1-13(b) gives the model of the mixed-signal filter. The DC offset extracted
from VOUT is fed to a coarse and a fine controlled current source to cancel the offset at
the input. Both Fig. 2-1-13(c) and Fig. 2-1-13(d) present topologies of purely digitally
controlled feedback loops. They use an on-chip or off-chip low-pass FIR or IIR digital
filter (H(z)) to extract the offset as digital codes from the output of the ADC. The
difference is that Fig. 2-1-13(c) modulates the input transistor width with the filtered
codes to suppress the offset, while Fig. 2-1-13(d) reconstructs the analog offset from
the filtered codes via a DAC consisting of a capacitor array.
2.1.5 Chopper Stabilization 
In conventional AC-coupled and DC-coupled AFEs, the only way to reduce the 1/f
noise is to use input transistors with large gate area. The area is costly and
inefficient for processes with high surface state densities [76]. Another technique is
to use lateral bipolar mode input transistors, which reduce the 1/f noise by 40dB but
result in an offset between 1 and 10mV [77]. Circuit techniques, e.g., correlated double
sampling (CDS) [78] and chopper stabilization (CS) [79], are also applied to reduce the
1/f noise.
CDS, also called auto-zeroing, cancels the 1/f noise by subtracting a previously
sampled, correlated version of the noise from the current noise. Due to the residual
noise, which corresponds to the folded-over broadband white noise and flicker noise, the
noise reduction is limited [79]. Another issue of CDS is that the input-output shorting
action requires a high slew rate to slew the output back and forth in each sampling
period. Moreover, the speed of the amplifier should be much higher than the sampling
frequency. Overall, high power consumption is required.
 
Fig. 2-1-14. Operational principles of CS amplifier [79]. 
 
The operation principle of a CS amplifier is illustrated in Fig. 2-1-14 [79].
First, the input is multiplied by the chopping square wave via the modulator inserted
between the input and the amplifier. In this way, the input is modulated to the
frequencies around the odd harmonics of the chopping frequency. After the second
modulation, the output is demodulated back to the even harmonics of the chopping signal
(including DC). The noise and offset of the amplifier, however, are modulated only once,
by the second multiplier, and hence most of the noise and offset power is transported to
the odd harmonics of the chopping square wave, as illustrated by the second noise
waveform in Fig. 2-1-14. Lastly, after the low-pass filter, the main signal power is
recovered while the 1/f noise and offset power are removed. To avoid signal aliasing due
to the chopping, the chopping frequency should be higher than twice the signal
bandwidth. Also, the gain of the amplifier at the chopping frequency should be high
enough to guarantee the designed amplification, so the cut-off frequency of the
amplifier should be higher than the chopping frequency. Because of the four multiplier
switches, extra spikes (bottom right of Fig. 2-1-14) are added by charge injection
during the switching transitions, which results in parasitic noise.
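The chopping principle described above can be sketched in discrete time. The model below is idealized and assumes an amplifier with unlimited bandwidth, a pure DC offset in place of 1/f noise, no charge injection spikes, and a moving average over one chopping period as the low-pass filter. The sample rate, chopping frequency, gain, offset and the 10 Hz test tone are hypothetical values.

```python
import math


def chopper_amplify(signal, gain, offset, half_period):
    """Idealized chopper amplifier followed by a moving-average low-pass filter."""
    n = len(signal)
    # Square wave of +1/-1 with `half_period` samples per half cycle.
    chop = [1.0 if (i // half_period) % 2 == 0 else -1.0 for i in range(n)]
    # First modulator: the signal is shifted up to the chopping frequency.
    modulated = [x * c for x, c in zip(signal, chop)]
    # Amplifier: its offset is added after the first modulator.
    amplified = [gain * (x + offset) for x in modulated]
    # Second modulator: the signal returns to baseband, the offset is chopped.
    demodulated = [y * c for y, c in zip(amplified, chop)]
    # Low-pass filter: average over exactly one chopping period.
    period = 2 * half_period
    return [sum(demodulated[i:i + period]) / period
            for i in range(n - period + 1)]


FS = 10_000          # sample rate in Hz (hypothetical)
HALF_PERIOD = 5      # samples per half chopping cycle, i.e. 1 kHz chopping
GAIN, OFFSET = 100.0, 5e-3
ecg_like = [1e-3 * math.sin(2 * math.pi * 10 * i / FS) for i in range(2000)]

plain = [GAIN * (x + OFFSET) for x in ecg_like]            # no chopping
chopped = chopper_amplify(ecg_like, GAIN, OFFSET, HALF_PERIOD)

print(f"mean without chopping: {sum(plain) / len(plain):.4f} V")
print(f"largest chopped output: {max(chopped):.4f} V")
```

Without chopping, the amplified offset (GAIN × OFFSET) rides on the output. With chopping, the offset becomes a square wave at the chopping frequency that averages out in the filter, so only the amplified signal remains.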
In a CS AFE, spikes are introduced by the input modulator. Clock edge staggering
is proposed in [80] to mitigate the distortion. In the past few years, many designs have
adopted the CS technique in biomedical sensors [81]-[88] for the sake of high CMRR, low
offset, and low noise. In [81], the basic CS AFE structure is proposed using a selective
band-pass gm-C cell. To further improve the offset cancellation, a DC servo loop is
proposed in [82]. To relax the amplifier bandwidth and hence reduce the power of the CS
amplifier, [83]-[85] proposed the integrator feedback CS (IFCS). In the IFCS structure,
the distortion due to the switching of the chopper modulator is also reduced by
subtracting the averaged distortion power from the input. Heterodyne chopper readout
circuits for ECG extraction and QRS complex detection have also been shown to be
possible [86]. Chopper stabilization combined with the CDS noise cancelling technique is
discussed in [87], with a common gate (CMG) stage in parallel with a common source (CMS)
transimpedance stage.
2.1.6 Differential Difference Amplifier 
The differential difference amplifier (DDA) is a candidate for implementing a
high-performance AFE, featuring inherent DC-coupled high input impedance, circuit
simplicity and low power.
In the past two decades, many DDA IAs have been reported for their outstanding
CMRR [89]-[97]. The programmable resistive DDA AFE proposed in [89] introduces extra
thermal noise, and an extra band-pass filter is required for bandwidth control. Also,
the offset in this initial DDA AFE structure is not well treated. To eliminate the
effect of the amplifier offset and enhance the noise performance, the CS technique has
been applied to the DDA topology [90]. To suppress the slowly changing baseline drift,
an AC-coupled feedback AFE is proposed in [91]. In [92], a dual DC servo loop is added
to compensate the baseline drift. A sub-Hz high-pass filter, formed by the input
parasitic capacitance in parallel with a PR, is introduced to suppress the slowly
changing baseline drift [93]. However, it transforms the original differential-input
DDA into a single-ended DDA and thus halves the utility of the DDA. A two-stage AFE
composed of a DDA, an OTA and capacitive feedback combined with PRs was proposed in [94]
to achieve a band-pass AFE. DDA AFEs generally show excellent CMRR. In [95], a
chopper-stabilized DDA AFE with a CMRR of 85dB is proposed for bio-potential
acquisition. The rail-to-rail DDA reported in [96] achieves 137dB CMRR for biosensors. A
CMRR as high as 150dB is possible with a DDA, as shown by the simulation result in [97].
These results prompt us to design a simple yet acceptable AFE to address the CMRR issue
of the three-amplifier DC-coupled ones.
2.1.7 Summary of AFE 
Characteristics of different AFE structures for biomedical sensors are summarized in
Table 2-1-2. The AC-coupled AFE shows the lowest input impedance. Impedance boosting
techniques enhance the input impedance significantly but require extra power and area.
Thus, an AC-coupled AFE with auxiliary circuits, such as impedance boosting and a
DC-servo loop, is not the best candidate for ultra-low power flexible ECG interfacing.
The DC-coupled AFE shows inherently high input impedance but suffers from poor CMRR.
Without a chopper, the noise performance can be improved only by enlarging the input
transistor size or by increasing the bias current of the AFE input transistor. With a
chopper, the modulator deteriorates the input impedance, and the clocking and auxiliary
compensation circuits require the highest power among all topologies. Therefore, a
DC-coupled AFE without chopper should be the best choice for ultra-low power purposes.
However, the poor CMRR of the three-amplifier DC-coupled AFE should be addressed to
suppress power line interference. The DDA AFE shows the highest CMRR among the
structures without chopper. Combining the DDA with DC-coupling may provide a simple yet
acceptable AFE with high input impedance, ultra-low power, low noise and acceptable
CMRR.
Table 2-1-2. Summary of different AFEs used in biomedical sensor interface circuits.
Technique (*1)                        Power   Noise  CMRR  Anti-offset  Input impedance  Area
AC-coupled                            H (*3)  M      M     M            L                M
AC-coupled with DC-servo loop         M       M      M     H            L                L (*2)
AC-coupled with impedance boosting    M       M      M     M            H                M
AC-coupled with chopper               L       H      H     H            L                H
DC-coupled                            M       M      L     L            H                M
DC-coupled with auto reset            M       M      L     M            H                M
DDA                                   M       M      H     M            M                M
*1. All the techniques use pseudo resistors to achieve an acceptable high-pass cut-off
frequency, and all the amplifiers operate in the subthreshold region with μA current to
suppress noise.
*2. Off-chip capacitors may be required to implement the DC-servo loop.
*3. The benchmark of each technique is indicated by three letters: H, M and L stand for
high, middle and low performance, respectively.
2.2 QRS detection 
As shown in Fig. 2-2-1, a normal ECG mainly contains five waves, P, Q, R, S and T, which
can be further described by the PR interval, PR segment, QRS complex, ST segment and
QT interval for the convenience of medical analysis. The characteristics of these
intervals, complexes and segments are given below.
 
Fig. 2-2-1. A typical ECG pattern. 
 
P wave: The P wave indicates the depolarization of the atria prior to atrial
contraction and generally has positive polarity. In the time domain, the P wave has an
average duration of less than 0.12s and an average amplitude of 300µV. The spectrum of
the P wave is within 15Hz.
PR interval: The PR interval is the duration between the beginning of the P wave and
the beginning of the QRS complex. It reflects the time the stimulation pulse takes to
spread from the atria to the AV node. The normal duration of the PR interval varies from
0.12s to 0.2s among different persons. A PR interval shorter than 0.12s indicates a
stimulus bypassing the AV node, while a PR interval longer than 0.2s suggests a
first-degree heart block.
QRS complex: The QRS complex reflects the rapid depolarization of the ventricles. Due
to the large muscle mass of the ventricles, the QRS complex shows the highest amplitude,
ranging from 1mV to 3mV, and hence it is commonly used for automatic heart rate
detection. The normal duration of the QRS complex is in the range of 0.06s to 0.12s. A
QRS complex wider than 0.12s indicates a disruption of the heart conduction system or a
ventricular rhythm such as ventricular tachycardia. A low-amplitude QRS complex, much
less than 1mV, suggests pericardial effusion or infiltrative disease, while an unusually
high QRS complex indicates left ventricular hypertrophy.
ST segment: The ST segment is the duration from the end of the S wave to the beginning
of the T wave. The typical duration of the ST segment ranges from 0.08s to 0.12s. The
elevated ST segment suggests ischemia while the depressed ST segment indicates
myocardial infarction.
T wave: The T wave represents the repolarization of the ventricles, and its duration
is less than 0.3s. A peaked T wave can be a sign of very early myocardial infarction or
hyperkalemia, and an inverted T wave generally suggests myocardial ischemia, metabolic
abnormalities, etc.
RR interval: The RR interval is the duration between two adjacent R peaks. It shows the
time of one entire cardiac cycle and hence represents the instantaneous heart rate. It
is the critical parameter for computing the heart rate, which is related to heart
arrhythmias. The variation of the RR interval is called the heart rate variability
(HRV), which is related to the subject's heart health and stress level.
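The interval definitions above translate directly into simple computations. The following sketch derives RR intervals and instantaneous heart rate from R peak times and applies the 0.12s and 0.2s PR interval limits quoted above. The R peak times are hypothetical, and the standard deviation of the RR intervals (often called SDNN) is used here as one common HRV measure, which is an illustrative choice rather than the measure used in this work.

```python
import statistics


def rr_intervals(r_peak_times_s):
    """RR intervals (s) between consecutive R peak times (s)."""
    return [b - a for a, b in zip(r_peak_times_s, r_peak_times_s[1:])]


def heart_rate_bpm(rr_s):
    """Instantaneous heart rate in beats per minute from one RR interval."""
    return 60.0 / rr_s


def classify_pr_interval(pr_s):
    """Apply the 0.12 s and 0.2 s PR interval limits quoted in the text."""
    if pr_s < 0.12:
        return "short"
    if pr_s > 0.20:
        return "long"
    return "normal"


r_peaks = [0.00, 0.80, 1.62, 2.40, 3.25]      # hypothetical R peak times in s
rr = rr_intervals(r_peaks)
rates = [heart_rate_bpm(x) for x in rr]
sdnn_ms = 1000.0 * statistics.stdev(rr)       # one common HRV measure (SDNN)

print([round(r, 1) for r in rates])
print(round(sdnn_ms, 1), classify_pr_interval(0.16))
```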
The above-mentioned morphological features of the ECG signal constrain the decision
rules of QRS detection algorithms. The study of QRS detection can be traced back to the
1970s [98], [99]. Detection accuracy was the most important performance metric for early
QRS detectors, since the algorithms were designed for ECG machines, such as those in
intensive care units, where power is not constrained. The types of QRS detectors can be
summarized as [100]:
(A) Digital differentiation and filter based methods: the digitized ECG is
differentiated to reflect the signal slope [101]-[103]. As the R peak produces the
highest peak in the ECG derivative, the derivative is compared with a threshold to
locate the R peak. The main task is to reduce the noise interference and to obtain a
proper threshold. Nonlinear digital filters, such as multiplication of backward
differences [104], were studied to suppress the noise. An adaptive threshold was
commonly used to update the threshold dynamically, accompanied by a refractory period
restriction that two beats should be at least 200ms apart [98]. These methods are simple
yet effective for hardware implementation.
(B) Wavelet transform based methods: the discrete wavelet transform (DWT)
decomposes the ECG waveform into different scales. Different frequency components are
reflected on those scales. The local maxima of the decomposed coefficients on each scale
represent the positions of the signal peaks, which is known as singularity detection
[105]. Pure DWT based QRS detection methods rely on singularity detection by finding
the local maxima on the decomposed scales [106]-[108]. A QRS detector based on DWT
singularity detection can be implemented with filter banks [109]. The DWT is used not
only as a detection method but also as a pre-processing technique in the detector,
e.g., for ECG denoising [110] and feature extraction [111].
(C) Artificial neural network (ANN) based methods: Compared with conventional
methods, an ANN can be trained to approximate highly nonlinear functions. The detection
accuracy and robustness are generally better than those of handcrafted model based
methods [112]. However, the training requires a large number of labeled samples, and the
trained network needs thousands of multiplications and accumulations. Thus, neural
network methods are commonly adopted for arrhythmia classification [113] instead of
detection. Nevertheless, there are still reports on QRS detection using ANNs for their
superior accuracy [114], [115]. Besides classification and QRS detection, the ANN can
also be used for pre-processing, e.g., separating the noise and the QRS complex in an
otherwise unusable noisy ECG signal [116].
(D) Other methods: Inspired by other mathematical prediction models, there are
methods such as Hidden Markov Chain (HMC) [117], Genetic Algorithm (GA) [118],
Maximum a Posteriori (MAP) [98], Hilbert Transform (HT) [119], Shannon Energy
Envelope [120], Mathematical Morphology [121], and Zero-Crossing Counts [122] for QRS
detection. These less popular methods either lack detection accuracy, such as the
HMC, GA, and HT, or are too computationally intensive, e.g., the MAP.
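The differentiation and threshold approach of type (A) can be sketched in a few lines. The example below is a minimal illustration and not one of the cited algorithms: it uses a squared backward difference, a fixed threshold at a fraction of the largest slope energy instead of an adaptive one, and a 200ms refractory period. The synthetic ECG (Gaussian R and T waves on a slow baseline wander) and all parameter values are hypothetical.

```python
import math


def synth_ecg(fs, duration_s, beat_times_s):
    """Synthetic ECG: Gaussian R and T waves on a slow baseline wander."""
    ecg = []
    for i in range(int(fs * duration_s)):
        t = i / fs
        x = 0.2 * math.sin(2 * math.pi * 0.3 * t)          # baseline wander
        for tb in beat_times_s:
            x += math.exp(-0.5 * ((t - tb) / 0.010) ** 2)  # R wave, 1 mV
            x += 0.3 * math.exp(-0.5 * ((t - tb - 0.25) / 0.040) ** 2)  # T wave
        ecg.append(x)
    return ecg


def detect_qrs(ecg, fs, threshold_ratio=0.4, refractory_s=0.2):
    """Return R peak sample indices from the squared backward difference."""
    slope_sq = [0.0] + [(b - a) ** 2 for a, b in zip(ecg, ecg[1:])]
    threshold = threshold_ratio * max(slope_sq)   # fixed, not adaptive
    hold = int(refractory_s * fs)                 # refractory period in samples
    peaks, i = [], 0
    while i < len(ecg):
        if slope_sq[i] > threshold:
            stop = min(i + hold, len(ecg))
            window = ecg[i:stop]
            peaks.append(i + window.index(max(window)))  # R peak in the window
            i = stop                                     # skip refractory period
        else:
            i += 1
    return peaks


FS = 360
beats_s = [0.5, 1.3, 2.1, 2.9]
peaks = detect_qrs(synth_ecg(FS, 3.5, beats_s), FS)
print(peaks, [round(p / FS, 3) for p in peaks])
```

The derivative suppresses the slow baseline wander and the broad T wave, so only the steep R wave slopes cross the threshold. The refractory period prevents the falling edge of the same R wave from being counted as a second beat.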
Evaluated on the MIT-BIH database [123], the accuracy of a QRS detector can
easily surpass 99%. Thus, low power design is currently the main challenge for
real-time QRS detector implementation.
Combining data compression with digital differentiation, a joint QRS detection
is proposed in [124], where a detection accuracy of over 99.6% is achieved with just
133nW power consumption for the detector. ECG compression with QRS detection is a
favourable choice for power reduction, and many other designs adopt a similar
architecture [125]. A 20.9nW real-time event-driven QRS detector is proposed in [126];
it cooperates with a level-crossing ADC and achieves a detection accuracy of over 97%.
The event-driven detector is attractive because it is compatible with state-of-the-art
low power ADC designs for ECG acquisition, in which a level-crossing sampling scheme is
adopted for the ECG signal. Using the curve length transform and WT, the authors of
[127] designed a 642nW ECG feature extractor which can extract the P wave, QRS complex,
and T wave of the ECG segments for wearable devices. There are many other low power
implementations [128]-[133] using simple, power-efficient QRS detection algorithms,
e.g., level-crossing sampling [131] and wavelet down-sampling [133], and low power
circuit techniques such as burst transfer and multi-voltage operation [130].
Except for the computation-intensive ANN based QRS detection, it is worth noting
that all other existing algorithms try to use predefined parameters or constraints for
all types of patients. To be robust in handling special cases, they end up with
complicated models with a large number of parameters. The diversity of ECG waveforms
due to the user's age, gender, region and heart condition, as well as the monitoring
environment, limits the use of feature or model based QRS detection algorithms, not to
mention the time-evolving property caused by the variation of physical condition during
long-term monitoring. Patient-specific methods are becoming a new trend for ECG
waveform classification [134] and for heart rate detection based on the
ballistocardiogram [135], but are rarely studied for QRS complex detection. A
patient-specific QRS detector with low computational complexity is needed.
2.3 Machine Learning based Cardiac Arrhythmia Classification 
Similar to the QRS detector, the design considerations for Cardiac Arrhythmia
Classification (CAC) lie in two aspects: classification accuracy and energy efficiency.
Classification accuracy is mainly decided by the selected algorithm, while energy
efficiency is related to both the algorithm and the implementation details.
The implementation of ultralow power CAC has been studied for quite a long time.
Approaches running on general purpose processor cores, such as the MSP430 [136] and a
customized RISC core [137], only perform feature extraction with simple algorithms,
e.g., quad-level vector and waveform skeleton analysis, because their inefficient code
execution would otherwise exceed the power budget of ambulatory devices. For better
processing performance and energy efficiency, discrete wavelet decomposition/transform
(DWT) and simple machine learning (ML) methods, e.g., support vector machine (SVM),
with low computational complexity were proposed in recent years to realize CAC
application-specific integrated circuits (ASICs) [138]-[143]. In [141], pseudo
down-sampling wavelet transform and inverse wavelet transform were adopted to detect and
analyze the ECG features. Both architecture-level and circuit-level low power design
techniques, e.g., dynamic clocking and ultralow voltage operation, were used to achieve
a 457nW feature extraction processor. However, classification is still unavailable.
Classifiers consuming several tens of microwatts, using maximum likelihood, SVM, and
k-nearest neighbour techniques, were proposed to classify two types of cardiac
arrhythmia by analyzing the features extracted by a multivariate autoregressive feature
extractor [138]. To further save power, a single ML method is implemented, e.g., the
2.78µW Bayes classifier with 86% accuracy for the detection of ventricular arrhythmia
[140], [141] and the classifier with an energy dissipation of 48.99nJ/classification
[142] that identifies three types of cardiac arrhythmia using a cascaded SVM with
granular resampling and adaptive speculative mechanisms. In [143], a 96pJ/clock
classifier was implemented to detect atrioventricular block using the time domain
P-QRS-T complex features. Due to the limited power budget, the reported ASIC based CACs
are merely able to classify one or two types of cardiac arrhythmia with acceptable
accuracy.
According to the Association for the Advancement of Medical Instrumentation
(AAMI) recommendation, there are five types of heart beats in total, i.e., normal beats
(N), supraventricular ectopic beats (S), ventricular ectopic beats (V), fusion beats (F)
and unclassified beats (Q) [144]. Without considering type Q, one normal and three
abnormal beat types should be classified. High-performance CACs with full classification
types and high accuracy are commonly based on deep learning models, i.e., artificial
neural networks (ANNs) and their variants [134], [145]-[149]. Due to the heavy
computational burden, ANN based CACs (ANN-CACs) run on workstations, coprocessors and
separate accelerators. In order to reduce the computational load while accelerating the
training and classification process, morphology feature extraction is used to reduce the
number of input samples [148]. However, the choice of features depends on specific
patients and thus cannot cover the diversity and evolution of the ECG due to the gender,
age, region, race, clinical environment, and recording time of the patients [149]. To
overcome the non-ideal performance of a global network, raw patient-specific training
data instead of extracted features is adopted to train a network for each patient to
improve the accuracy [134], i.e., the patient-specific network.
Using raw data samples as the input of the network offers better practical
applicability but requires more computational resources, since the number of network
inputs is much higher than that of feature based networks. The input data volume can be
reduced via down-sampling and a lower resolution analog-to-digital converter (ADC), at
the cost of part of the information. However, there is a limit on the reduction of the
amplitude resolution, and down-sampling only marginally relaxes the computational
complexity. The level-crossing ADC (LC-ADC) has been demonstrated as the most
energy-efficient ADC for bursty ECG signals [150]-[168]. Different from Nyquist
sampling, where samples are evenly distributed in time, the LC-ADC generates a
continuous-in-time and discrete-in-amplitude (CTDA) signal, i.e., the number of samples
is proportional to the slope of the signal, so that more information is extracted from
fast-changing parts while few or even no samples are taken on slowly changing or steady
signal segments.
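The behaviour of level-crossing sampling can be sketched with a simple discrete-time model. A real LC-ADC operates in continuous time with comparators, so the function below is only an illustration: it emits an event each time the input has moved one step `delta` away from the last reported level. The test signal, given in integer units (for example microvolts), is hypothetical.

```python
def level_crossing_sample(samples, delta):
    """Return (index, level) events of a level-crossing sampler model.

    An event is produced each time the input moves at least `delta`
    away from the most recently reported level.
    """
    events = []
    level = samples[0]
    for i, x in enumerate(samples):
        while x - level >= delta:      # upward crossings
            level += delta
            events.append((i, level))
        while level - x >= delta:      # downward crossings
            level -= delta
            events.append((i, level))
    return events


# Flat baseline, a steep ramp (QRS-like edge), then a flat segment again.
flat = [0] * 100
ramp = list(range(0, 1001, 10))
signal = flat + ramp + [1000] * 100

events = level_crossing_sample(signal, 100)
print(len(signal), "uniform samples ->", len(events), "level-crossing events")
```

All events fall on the ramp, while the flat segments produce no samples at all, which is the property that makes this sampling scheme attractive for bursty ECG signals.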
Using the CTDA signal flow, the ANN structure as well as the arithmetic operations
of the CAC can be dramatically simplified. To the best of our knowledge, such a CTDA CAC
has not been studied.
Besides the power consideration, imbalanced data could lead to poor detection
accuracy. To address the imbalance issue in the global ANN-CAC, Chazal proposed the
cross-validation data sets [146], where the 44 records from the MIT-BIH database [123]
are divided into two datasets with the same number of records and approximately the same
proportion of beats, each containing a mixture of routine and complex arrhythmia
recordings: the 50000 heart beats of the first data set (DS1) are used for evaluating
the CAC performance, while the other 50000 heart beats from the second data set (DS2)
are used for the ANN CAC training. In the patient-specific CAC, the network is
individually trained for each record, i.e., each patient has their own CAC weights. In
[134], the patient-specific CAC is trained for each record in DS1. The training samples
come from two parts: commonly used heart beat samples from DS2 and patient-specific
heart beats taken from the first 5 minutes of the specific record from DS1. In this way,
the training samples are balanced. The rest of the heart beats in the specific record
are used for evaluation. Although the number of training samples is balanced, the
patterns within each labeled type are diverse due to the ECG identity. A proper grouping
scheme for the training samples is necessary for the patient-specific ANN CAC.
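The patient-specific data split described above can be sketched as follows. The function name, the (sample_index, label) beat representation and the placeholder beats are hypothetical. Only the rule itself comes from the text: common beats plus the first 5 minutes of the record are used for training, and the rest of the record for evaluation. The 360Hz value is the sampling rate of the MIT-BIH arrhythmia recordings.

```python
def split_patient_record(common_beats, record_beats, fs_hz, minutes=5):
    """Build a patient-specific training set and an evaluation set.

    Each beat is a (sample_index, label) pair. Beats from the first
    `minutes` of the record join the common pool for training; the
    remaining beats of the record are kept for evaluation.
    """
    cutoff = int(minutes * 60 * fs_hz)
    personal = [b for b in record_beats if b[0] < cutoff]
    evaluation = [b for b in record_beats if b[0] >= cutoff]
    return common_beats + personal, evaluation


FS_HZ = 360                                             # MIT-BIH sampling rate
common = [(0, "N"), (0, "S"), (0, "V"), (0, "F")]       # placeholder common beats
record = [(i * FS_HZ, "N") for i in range(600)]         # one beat per second, 10 min

training, evaluation = split_patient_record(common, record, FS_HZ)
print(len(training), len(evaluation))
```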
2.4 Summary 
In this chapter, the design considerations and challenges for the AFE, the QRS
detector, and the CAC are discussed through a review of existing reports. AC-coupled
AFEs with auxiliary circuits, e.g., impedance boosting, chopper stabilization, and a
DC-servo loop, are not the best choice for ultralow power flexible ECG sensors. Thus, a
DC-coupled AFE is adopted in this research to achieve a medical grade AFE with low
noise, ultralow power, high input impedance, and acceptable CMRR. The design of QRS
detectors is a mature topic. Recent research mainly focuses on optimizing the power
consumption of the detector. Due to the ECG diversity, a personalized QRS detector is
needed but rarely studied. For the CAC design, the deep learning based patient-specific
CAC outperforms other CACs. How to reduce the computational complexity and how to
balance the training samples are the two main challenges, which will be the focus of
this research.
Chapter 3  
A 2.55 NEF 76dB CMRR DC-Coupled FDDA AFE 
Wearable ECG sensors are cost-effective tools for managing cardiovascular diseases.
The analog front-end (AFE) is one of the critical components in biomedical sensors. A
medical grade AFE should follow the medical equipment standard, such as IEC 60601-2-47
for ambulatory electrocardiographic (ECG) systems [40], in which the required gain,
bandwidth, noise, input offset, input impedance, common mode rejection ratio (CMRR),
etc., are defined. For flexible ECG sensors, the design of the AFE faces more challenges
for two reasons. First, the motion artefact due to body movements could saturate the AFE
output. Second, wearable biomedical sensors need a low power solution for long battery
life and small sensor size. It is necessary to increase the AFE input impedance to
handle the motion artefact. It is also important to achieve high CMRR to minimize
ambient noise. For low power implementation, the circuits should be kept as simple as
possible.
An AC-coupled AFE with impedance boosting circuits falls short in power
consumption. The clocking and auxiliary compensation of the chopper-stabilized AFE face
a similar problem. Thus, for our ultralow power flexible ECG sensor, a DC-coupled AFE
with inherently high input impedance is the best choice. In this chapter, several
techniques are presented to achieve an AFE with ultralow power, high input impedance,
low noise and high CMRR. The proposed AFE uses a fully differential difference amplifier
(FDDA) to replace the three-amplifier topology for the improvement of CMRR. An on-body
common mode DC voltage is introduced to bias the bioelectrical signal in order to handle
the uncontrollable input DC offset. To reduce the flicker noise, large input transistors
are often used. We propose to reuse the parasitic capacitance of the large input
transistors to improve noise performance and area efficiency. We also use a current
mirror load in the first stage of the FDDA to replace the common mode feedback (CMFB)
for better power efficiency.
The rest of this chapter is organized as follows. Details of the proposed FDDA 
DC-coupled AFE and design considerations are presented in Section 3.1. Simulation 
results are provided in Section 3.2. Measurement results as well as the performance 
summary are given in Section 3.3. Concluding remarks are drawn in Section 3.4.
3.1 FDDA DC-coupled AFE 
3.1.1 FDDA Instrumentation Amplifier 
The conventional three-amplifier DC-coupled Instrumentation Amplifier (IA) has two
amplification branches. The input DC bias is provided by Vcm via a PR [66], as in
Fig. 2-1-11. The CMRR is mainly determined by the mismatch of the gain ratio capacitors.
Due to the limited chip area, the feedback ratio capacitors are commonly selected as one
or a few times the minimum allowable MIM capacitor; otherwise the gain ratio capacitors
would be unacceptably large. Hence, this topology generally presents poor CMRR.
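The proposed IA, as shown in Fig. 3-1-1, isolates the common-mode input from
the amplification branches by utilizing an FDDA, where only the differential signal is
amplified. The inputs of the FDDA are defined by:
$$\begin{bmatrix} v_d \\ v_{cp} \\ v_{cn} \\ v_{cd} \end{bmatrix} = \begin{bmatrix} 1 & -1 & -1 & 1 \\ 1/2 & 1/2 & 0 & 0 \\ 0 & 0 & 1/2 & 1/2 \\ 1/2 & -1/2 & 1/2 & -1/2 \end{bmatrix} \begin{bmatrix} v_{pp} \\ v_{pn} \\ v_{np} \\ v_{nn} \end{bmatrix},$$    (3-1-1)
where $v_d$, $v_{cp}$, $v_{cn}$, and $v_{cd}$ are respectively defined as the
differential input signal, the non-inverting common mode signal, the inverting common
mode signal, and the differential common mode signal.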
 
Fig. 3-1-1. Topology of proposed FDDA DC-coupled IA. 
 
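The linear model of the FDDA is given by:
$$v_{out} = A_d\left(v_d + \frac{v_{cm}}{CMRR}\right),$$    (3-1-2)
where $A_d$ represents the differential gain and $v_{cm}$ is the common mode signal. The
second term in the parentheses can be rewritten as [169]
$$\frac{v_{cm}}{CMRR} = \frac{v_{cp}}{CMRR_p} + \frac{v_{cn}}{CMRR_n} + \frac{v_{cd}}{CMRR_d}.$$    (3-1-3)
In (3-1-3), $v_{cp}$, $v_{cn}$ and $v_{cd}$ are the common mode signals defined in
(3-1-1), and all the CMRRs are in dB; CMRRp and CMRRn are only related to the mismatch
of the input pair; CMRRd is determined by both the input pair mismatch and the tail
current mismatch, as given by (3-1-4) and (3-1-5) [169], respectively.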
[Equation (3-1-4)]
[Equation (3-1-5)]
In (3-1-4),    and    are geometry-dependent amplification factor separately for 
inverting and non-inverting pairs. In (3-1-5),   and    are expected ideal values, while     
and     are respectively the tail current of the practical inverting and non-inverting 
branches. The impact of tail current mismatch can be easily suppressed by increasing the
output impedance of the tail current source, via a cascoded current mirror or by
increasing the transistor length to several times (e.g. 4×) the default length. To
reduce the mismatch of the input transistors, large size and symmetrical layout are
necessary. The large transistors provide extra benefits in addition to better CMRR in
the proposed FDDA. The parasitic capacitance (CGS+CGB), referred to as C5 in Fig. 3-1-1,
contributes to better gain consistency among different chips without extra area cost.
That is because the unit size of a transistor is much smaller than that of a MIM
capacitor, i.e., many more symmetrically placed units can be used for transistors than
for capacitors within the same area.
For a single device, the absolute value of a MOS capacitor is less precise than
that of a MIM capacitor. In this process, the minimum size of a MIM capacitor is
5μm×5μm. The area taken by 100 minimum MIM capacitors can be used to symmetrically place
over 800 transistors with a size of W/L=0.5μm/2μm, so the matching of the MOS capacitors
is better than that of the MIM capacitors. In addition to the larger number of units for
reducing geometric mismatch, the on-body bias relaxes the input DC offset to reduce
non-ideal effects. Thus, the gain variation from chip to chip is smaller with MOS
capacitors than with MIM capacitors.
Furthermore, the capacitor area of C2 can be reduced since part of the capacitance
is provided by C5, i.e., part of the ratio capacitance is supplied by the input
transistors. In this design, 4 large input transistors with a size of 5mm/2μm are used
to improve the noise performance and offset. From the simulation, the parasitic
capacitance contributes about 1/3 of the gain, i.e., the ratio of C5 to C2 is around
1/2. This helps to reduce the chip area. Although the reuse of C5 improves the area
utilization in terms of the CMRR and noise performance, the MIM capacitors are not
totally replaced by C5 for two reasons. The first is the area budget: for the same
capacitance, C5 occupies roughly 7/3 of the area of C2. The second is the increased gate
leakage. The impedance at DC is targeted at over 1GΩ in this design, so C5 cannot be too
large.
Assume that the two Gm cells of the FDDA are matched, and ignore the CMRR term
in (3-1-3). Then we have
          ((       )  (       )).   (3-1-6) 
According to the circuit in Fig. 3-1-1, (3-1-6) can be transformed to: 
          ((       )  
         
       (        )
(       )). (3-1-7) 
Simplifying (3-1-7) under the assumption that the open-loop gain is very large (the designed
value is 70dB), the transfer function of the FDDA IA is given by:
$H(s) = \dfrac{1 + sR_1(C_1 + C_2 + C_5)}{1 + sR_1 C_1}$.      (3-1-8)
That is, the FDDA IA has unity DC gain, a high-pass cut-off frequency of 1/(2πR1C1), and
a midband gain of (C2+C5+C1)/C1. R1 is a back-to-back connected pseudo resistor [66] with
a resistance of hundreds of GΩ. The high-pass cut-off is designed to be sub-Hertz; its
precise value is obtained from chip measurement. The low-pass cut-off frequency is
determined by the FDDA bandwidth and the load (C1, the Miller capacitor in the FDDA, and
the input capacitor of the PGA).
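The behaviour of the IA transfer function can be checked numerically. The following Python sketch is illustrative only. It assumes the form H(s) = (1 + sR1(C1+C2+C5))/(1 + sR1C1), which is consistent with the unity DC gain and the midband gain (C2+C5+C1)/C1 stated above. The values of C1 and R1 are hypothetical (they are not given in the text); they are chosen so that the midband gain equals the IA gain of 75 reported in Section 3.1.7 and the high-pass corner falls at the measured 0.4Hz.

```python
import math

def fdda_ia_gain(freq_hz, r1, c1, c_in):
    """Magnitude of H(s) = (1 + s*R1*(C1 + C_in)) / (1 + s*R1*C1) at s = j*2*pi*f.

    c_in stands for C2 + C5 (MIM input capacitor plus reused parasitic capacitance).
    """
    s = 1j * 2 * math.pi * freq_hz
    return abs((1 + s * r1 * (c1 + c_in)) / (1 + s * r1 * c1))

# Hypothetical component values, not taken from the thesis.
C1 = 1e-12                           # feedback capacitor, assumed 1 pF
C_IN = 74 * C1                       # C2 + C5, so that (C1 + C2 + C5) / C1 = 75
R1 = 1 / (2 * math.pi * 0.4 * C1)    # resistance that puts the corner at 0.4 Hz

dc_gain = fdda_ia_gain(0.0, R1, C1, C_IN)        # unity at DC
corner_gain = fdda_ia_gain(0.4, R1, C1, C_IN)    # about 75 / sqrt(2)
midband_gain = fdda_ia_gain(40.0, R1, C1, C_IN)  # approaches 75

print(f"R1 = {R1 / 1e9:.0f} GOhm")
print(f"gain at DC, 0.4 Hz, 40 Hz: {dc_gain:.2f}, {corner_gain:.1f}, {midband_gain:.1f}")
```

With the assumed 1pF feedback capacitor, the required R1 is close to 400GΩ, which is in line with the hundreds of GΩ quoted for the pseudo resistor.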
3.1.2 Back-to-back Connected Pseudo Resistor 
As discussed in Chapter 2 Section 2.1.3, different diode-connected pseudo resistors have
different properties, and the resistance also depends on the fabrication process. The
extremely large resistance cannot be calculated precisely, but its order of magnitude can
be estimated via simulation. The simulated behaviour of the diode-connected pseudo
resistors in Fig. 2-1-7(a) and Fig. 2-1-7(b) is given in Fig. 3-1-2 and Fig. 3-1-3,
respectively. For both configurations, the W/L is set to 0.7μm/1μm.
 
Fig. 3-1-2. Simulated resistance for diode connected PR in Fig. 2-1-7(a). 
 
 
Fig. 3-1-3. Simulated resistance for diode connected PR in Fig. 2-1-7(b). 
 
The simulation resolution for absolute current and gm is set to 10^-16 A and 10^-16 S,
respectively, to handle the evaluation of TΩ-level resistance. In Fig. 3-1-2, the
resistance stays above 10 TΩ and is proportional to the voltage across the resistor. In the
PGA, the output swing is large, and the above-10 TΩ property helps suppress the variation
of output nonlinearity. Therefore, the configuration in Fig. 2-1-7(a) is chosen for the
PGA. In Fig. 3-1-3, the situation is different: the resistance drops as the voltage across
the resistor increases. When the voltage reaches 1V, the resistance is about 770MΩ. If the
voltage is within several tens of millivolts, the resistance stays around 11TΩ. This is a
useful feature in the IA. Normally, the output of the IA is several tens of millivolts,
i.e., once stable feedback is established in the IA, the voltage variation across the
feedback pseudo resistor is no more than tens of millivolts. When motion artefacts occur,
the voltage across the pseudo resistor becomes large, and the feedback should recover
quickly. The reduced resistance of the configuration in Fig. 2-1-7(b) helps, because the
low-resistance feedback path carries a larger current. Thus, in the IA stage, the
configuration in Fig. 2-1-7(b) is selected.
It should be noted that the simulated TΩ-level resistance is not realistic. The measured
resistance is tens of GΩ. The discrepancy is due to the limited accuracy of the simulation
model. Nevertheless, the simulation result is helpful for analysis since the variation of
the resistance with the voltage across the resistor coincides with the measurement results.
3.1.3 On-body DC Biasing 
Employing large input transistors increases the leakage current, making the leakage
resistance comparable with the pseudo resistor. In this case, an inaccurate pseudo-resistor
bias, such as the bias via R2 in Fig. 2-1-11, results in an uncontrollable input offset. We
introduce an extra DC electrode, as shown in Fig. 3-1-4, to deal with the DC offset. The
DC electrode is placed away from the two input electrodes to provide the DC bias Vcm
for the bioelectrical signal.
 
Fig. 3-1-4. On-body DC bias for EEG and ECG monitoring. 
 
Two examples of electrode placement are illustrated in Fig. 3-1-4, for EEG and ECG
monitoring respectively. For ECG, as the DC-coupled input impedance is high
enough, the two input electrodes placed on the chest can be as close as 2cm.
The Vcm electrode is placed on the left upper arm to provide the DC bias voltage via the
path consisting of body tissue (Rbody) and the input electrodes (10nF in parallel with 1MΩ
for a dry electrode [33]). The equivalent resistance, Rbody in series with 1MΩ, is
significantly smaller than the pseudo resistor and the equivalent input resistance of the
gate leakage, so the resulting DC offset is negligible. The position of the Vcm electrode
is not limited to the left upper arm. It can be placed at any point that is at least 5cm
away from the input electrodes, e.g., the point used for a driven right leg (DRL)
electrode. The minimum distance from the Vcm electrode to the input electrodes was found
from our experiments: the 5cm distance guarantees the acquired signal quality. If the
distance is smaller, the signal amplitude may be attenuated because the DC biasing
electrode absorbs part of the signal. For EEG acquisition, the
electrode connected to the non-inverting terminal Vinp is placed on the forehead, and the
other input electrode is on an earlobe. As the amplitude of the EEG signal is much smaller
than that of ECG, the two input electrodes cannot be placed as close together as for ECG.
The Vcm electrode is attached to the other earlobe for biasing.
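The argument for on-body biasing can be illustrated with a simple resistive divider. The sketch below is a simplification under stated assumptions: the gate leakage is modelled as a single resistor to ground equal to the 1GΩ DC input resistance targeted in this design, Vcm is assumed to be mid-supply, and the tissue resistance and the pseudo-resistor value are hypothetical numbers that are not taken from the text.

```python
def dc_bias_at_gate(v_cm, r_source, r_leak):
    """DC voltage reaching the gate through r_source when the gate leaks via r_leak."""
    return v_cm * r_leak / (r_source + r_leak)

V_CM = 0.9           # assumed mid-supply bias for the 1.8 V supply
R_LEAK = 1e9         # DC input resistance targeted in the design (1 GOhm)
R_ELECTRODE = 1e6    # dry-electrode resistance quoted from [33]
R_BODY = 100e3       # hypothetical tissue resistance
R_PSEUDO = 100e9     # hypothetical pseudo resistor, order of hundreds of GOhm

on_body = dc_bias_at_gate(V_CM, R_BODY + R_ELECTRODE, R_LEAK)
via_pseudo = dc_bias_at_gate(V_CM, R_PSEUDO, R_LEAK)

print(f"bias through body and electrode: {on_body * 1e3:.1f} mV")
print(f"bias through pseudo resistor:    {via_pseudo * 1e3:.1f} mV")
```

Under these assumptions the on-body path loses about 1mV of the bias, whereas the pseudo-resistor path loses almost all of it, which is the reason the leakage of the large input transistors rules out pseudo-resistor biasing.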
3.1.4 Design Considerations of Proposed FDDA 
Besides the large-size input transistors (MN7-MN10), two more modifications are made to
the conventional FDDA for better power efficiency and circuit stability, as marked in Fig.
3-1-5 by the red dashed boxes. Firstly, the CMFB module that supplies the MP1 and MP2 bias
is replaced by a current mirror load connection, as denoted by the top box. If a CMFB is
used, the loop runs from VP and VN to the gates of MP1 and MP2, and it contains two main
poles. One is at the node VP (or VN) and the other is at the gates of MP1 and MP2. Assuming
it employs the same CMFB structure as the output stage, shown on the right side of the
red dotted line in Fig. 3-1-5, the two poles can be approximated by
   
 
(          )[(    )                          ]
,  (3-1-9) 
   
 
(           )[(      )    (           )]
,   (3-1-10) 
where,    is the gain for output stage. Due to the Miller effect and the large-size input
transistors, the capacitance in (3-1-9) is much larger than that in (3-1-10), but the
resistance in (3-1-10) is over ten times that in (3-1-9) because the bias current of the
CMFB (10nA) is much smaller than that of the first stage of the core circuit (150nA). That
is, the two poles are very close to each other and appear at relatively low frequency.
Thus, the phase margin quickly degrades with the decrease of the loop gain. With one
feedback amplifier, it is hard to maintain sufficient phase margin. An extra isolated
feedback amplifier is required to achieve the required loop phase and gain [170], which
needs extra current. Using the current mirror connection, the CMFB loop stability issue is
mitigated. Moreover, this configuration wastes no power on the CMFB.
 
Fig. 3-1-5. Schematic of proposed FDDA. 
 
The proposed asymmetrical connection deteriorates the CMRR of the FDDA.
Even with extremely large input transistors and a well-matched layout, the
CMRR does not reach that of the conventional FDDA topology, which is higher than 80dB
[95]-[97]. Nevertheless, it is still better than that of the conventional three-amplifier
DC-coupled topology, and it meets the requirements of the ECG standard.
The insertion of MN3 and MN4 at the output stage stabilizes the biasing current.
Without MN3 and MN4, the current in the conventional FDDA output stage is determined
by the feedback voltage VFB generated at the output of the CMFB, which is the classic
one-stage amplifier (the five-transistor cell on the right side). This bias is sensitive
to process variation. With MN3 and MN4, the bias current is set by the fixed
VB rather than by the unpredictable VFB. In this way, VFB only regulates the output
common-mode voltage to VCM and has no effect on the bias current.
3.1.5 Noise Analysis 
The input referred noise of the FDDA can be written as 
      
      
   
[equation illegible in the source]      (3-1-11)
where      is the output equivalent resistance of the output stage, and    is the output
equivalent resistance of the first stage. In (3-1-11), we assume MN9 is the same as MN7.
Note that the 2nd and 4th terms contain the (       ) attenuation, leading to the final
approximation. When the transistors operate in the subthreshold region, the relation
between ID and VGS is given by [171]
$I_D = I_{D0}\,\dfrac{W}{L}\,e^{V_{GS}/(nV_T)}$,        (24)
in which $V_T$ represents the thermal voltage, $I_{D0}$ stands for the characteristic
current, and the parameter n is the slope factor; both $I_{D0}$ and n are process
dependent. The transconductance of a transistor in the subthreshold region can then be
expressed as
$g_m = \dfrac{\partial I_D}{\partial V_{GS}} = \dfrac{1}{nV_T}\,I_{D0}\,\dfrac{W}{L}\,e^{V_{GS}/(nV_T)} = \dfrac{I_D}{nV_T}$.    (3-1-12)
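As a quick check of the subthreshold relations, the sketch below assumes the usual exponential model I_D = I_D0 (W/L) exp(V_GS/(n V_T)) and verifies numerically that its derivative equals I_D/(n V_T). The slope factor, characteristic current and gate voltage are hypothetical values chosen for illustration; only the W/L of 5mm/2μm is taken from the text.

```python
import math

K_BOLTZMANN = 1.380649e-23     # J/K
Q_ELECTRON = 1.602176634e-19   # C

def thermal_voltage(temp_k=300.0):
    """Thermal voltage kT/q in volts."""
    return K_BOLTZMANN * temp_k / Q_ELECTRON

def subthreshold_id(v_gs, i_d0, w_over_l, n, temp_k=300.0):
    """Drain current of the assumed exponential subthreshold model."""
    return i_d0 * w_over_l * math.exp(v_gs / (n * thermal_voltage(temp_k)))

def subthreshold_gm(i_d, n, temp_k=300.0):
    """Transconductance I_D / (n * V_T) of the same model."""
    return i_d / (n * thermal_voltage(temp_k))

N_SLOPE = 1.3      # hypothetical slope factor
I_D0 = 1e-15       # hypothetical characteristic current, in amperes
W_OVER_L = 2500    # 5 mm / 2 um input transistor
V_GS = 0.3         # hypothetical gate-source voltage, in volts

i_d = subthreshold_id(V_GS, I_D0, W_OVER_L, N_SLOPE)
gm_analytic = subthreshold_gm(i_d, N_SLOPE)

# Central-difference derivative of I_D with respect to V_GS.
dv = 1e-6
gm_numeric = (subthreshold_id(V_GS + dv, I_D0, W_OVER_L, N_SLOPE)
              - subthreshold_id(V_GS - dv, I_D0, W_OVER_L, N_SLOPE)) / (2 * dv)

print(f"I_D = {i_d * 1e9:.2f} nA, gm = {gm_analytic * 1e6:.3f} uS")
```

Because gm depends only on the drain current and n in this regime, a larger W/L does not raise gm at a fixed bias current; it lowers the flicker noise and the required gate-source voltage instead.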
Considering both thermal and flicker noises, the equivalent voltage noise source seen 
from the transistor gate is defined by 
   
[equation illegible in the source]      (3-1-13)
where B is the flicker noise constant, determined purely by the process, and η is
determined by the body coefficient, the surface potential, and the source-body potential.
If the design is ideal, the current in MP1 is twice that in MN7. Assuming η and
B are independent of the transistor size, from (3-1-11), (3-1-12), and (3-1-13), the
equivalent input referred noise is obtained as
      
[equation illegible in the source]      (3-1-14)
To achieve good noise performance, the bias current and the sizes of MN7-MN10, MP1 and
MP2 should be large enough. The simulated integrated input referred noise at
different currents and transistor sizes is provided in Fig. 3-1-6. The
noise drops quickly up to 10000μm^2 and stays steady or even rises slightly beyond that
point, where the flicker noise is much lower than the thermal noise. Thus, in this design,
the size of MN7-MN10 is 5mm/2μm, and the size of MP1 and MP2 is 800μm/10μm.
Furthermore, the noise level drops as the bias current increases. However, in
ECG and EEG applications, a low bias current is preferred to achieve the narrow low-pass
cut-off frequency.
 
Fig. 3-1-6. Simulated integrated input referred noise at different transistor size and bias 
current. 
 
The load of the IA is composed of the feedback ratio capacitor C1 (as in Fig. 3-1-1), the
Miller capacitor CM, and the input ratio capacitor of the PGA, C4. If the gain of the IA
is G1, the low-pass cut-off frequency of the IA can be derived as
        *  ,(      )        -+⁄ .   (3-1-15) 
As the transconductance is proportional to the bias current, achieving a low-pass cut-off
around 100Hz requires an extremely large capacitor area if the current is set to hundreds
of nanoamperes. Also, a higher current is not desirable for wearable sensors. Thus, the
bias current of the non-inverting branch and of the inverting branch is kept at 150nA.
In summary, the large-size input transistors and load transistors improve the
CMRR, gain consistency, and noise. As the parasitic capacitance of the input transistors is
reused as part of the gain ratio capacitors, as discussed earlier, the increase in
transistor size helps to reduce the ratio capacitor in the IA. In other words, the
large-size transistors do not pose a threat to the area budget; rather, they help improve
performance. The increased leakage current due to the larger transistor size degrades the
input impedance, so pseudo-resistor biasing is not suitable. On-body DC biasing is thus
proposed to mitigate the deteriorated DC offset.
 
Fig. 3-1-7. Overview of 417μm×263μm input transistor layout. 
3.1.6 Layout of Large Size Input Transistor 
The overview of the large-size input transistors MN7-MN10 is given in Fig. 3-1-7. The
area is 417μm by 263μm. To make a regular and compact layout, 10 units of
centrosymmetric transistors are placed vertically on the left side and 16 units are placed
on the right side. As each unit contains 80 small transistors, the orientation of the units
has little effect on mismatch. The details of a unit are shown in Fig. 3-1-8, where all
transistors are deployed centrosymmetrically with equal-length metal connections.
 
Fig. 3-1-8. Symmetrically placed transistor unit.  
3.1.7 PGA 
The PGA has 4 gain settings controlled by two digital control bits, as shown in Fig. 3-1-9.
The default gain is the maximum value, which is determined by C4/C3. The tunable
capacitor can be switched among C3, 2C3, and 3C3, i.e., when switch S is closed, the gain
can be set to 1/2, 1/3, or 1/4 of the default gain. Different gain settings present
different loads to the amplifier OP2. Thus, the minimum low-pass cut-off frequency of the
PGA stage should be higher than that of the IA stage so that the bandwidth does not change
when the gain is tuned, i.e., the low-pass cut-off frequency of the PGA at the minimum gain
setting should be larger than that of the IA.
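The four gain settings can be tabulated with a few lines of Python. This is an illustrative calculation rather than a design equation from the text: it assumes that the total AFE gain is the product of the IA gain (75) and the PGA gain, that the PGA gain is C4 divided by the total feedback capacitance, and that the feedback capacitance takes the values C3, 2C3, 3C3 and 4C3. The IA gain and the maximum PGA gain of 36 are the values reported for this design.

```python
import math

IA_GAIN = 75         # IA midband gain reported for this design
PGA_MAX_GAIN = 36    # C4 / C3, the default (maximum) PGA gain

def afe_gain_db(feedback_units):
    """Total AFE gain in dB when the PGA feedback capacitance is feedback_units * C3."""
    return 20 * math.log10(IA_GAIN * PGA_MAX_GAIN / feedback_units)

# From the lowest gain (4 * C3 in the feedback path) to the highest (C3 only).
gains_db = [round(afe_gain_db(k), 1) for k in (4, 3, 2, 1)]
print(gains_db)
```

Under these assumptions the ideal settings are about 56.6dB, 59.1dB, 62.6dB and 68.6dB. The IA gain itself is only estimated by simulation because part of it comes from the parasitic capacitance, so the actual values can deviate from this ideal ladder.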
The AFE is the cascade of the IA and the PGA. The gain of the IA stage is generally higher
than that of the PGA to relax the design of OP2. In this design, the gain of the IA and the
maximum gain of the PGA are 75 and 36, respectively. As part of the gain of the IA is
contributed by the parasitic capacitance of the input transistors, the gain of the IA stage
is estimated by simulation. Although the absolute gain cannot be calculated precisely
at the design stage, the gain variation is smaller than with pure MIM capacitors, as
discussed before. With this gain distribution, the classic differential-input,
single-ended-output two-stage amplifier with Miller compensation is adopted for OP2, as
shown on the right side of Fig. 3-1-9.
To drive an on-chip module, generally an ADC with a capacitive input of
several picofarads, a 20nA bias is sufficient for the output stage of the PGA. In this
design, for testing purposes, an extra 100nA is added to drive the oscilloscope probe,
which is 10MΩ || 4pF, to avoid the use of an extra buffer.
 
Fig. 3-1-9. Topology of PGA. 
3.2 Simulation Results 
The transient, frequency, noise and current simulation results are presented in this
section. All simulations are based on post-layout schematics with extracted parasitic RC.
The transient responses of the four gain settings of the proposed AFE are given in
Fig. 3-2-1 for a 40Hz input signal of 100μV peak-to-peak. A clean sinusoidal signal with
the expected gain is observed at the output.
To evaluate the current, the transient responses of 8 duplicated AFEs at three gain
settings are simulated, as shown in Fig. 3-2-2. The average current for all
gain settings is 3.94μA, i.e. the current of one AFE is 492.5nA. The current
variation is within 1μA, i.e. 125nA for a single AFE.
 
Fig. 3-2-1. AFE transient simulation at different gain. 
 
 
Fig. 3-2-2. Transient current with 8 duplicated AFE at different gain. 
 
The simulated frequency responses of the AFE at the four gain settings are shown in
Fig. 3-2-3. The gain is given in both linear and dB formats. The mid-band gain
can be programmed to 57dB, 59dB, 63dB, and 65dB, which coincides with the results of
the transient responses in Fig. 3-2-1.
 
Fig. 3-2-3. Simulated frequency responses of the proposed AFE. 
 
A Monte Carlo simulation with 100 samples is given in Fig. 3-2-4. Both
process variation and layout mismatch are considered in the simulation. The minimum
gain is 55.5dB and the maximum gain is 57dB, i.e. a variation of 2.6%, which meets the
requirement of the IEC standard [40].
The simulated integrated noise from 0.1 to 250Hz is 2.21μVrms. Due to the flicker
noise and the band-pass selectivity of the AFE, the highest noise appears
around 0.1Hz, where it is 11.5μV^2. As the frequency increases, the noise drops
quickly, e.g., the noise at 100Hz is just 75.8nV^2, as shown in Fig. 3-2-5.
 
Fig. 3-2-4. Monte Carlo simulation of frequency response at minimum gain. 
 
 
Fig. 3-2-5. Noise simulation of the AFE. 
 
3.3 Measurement Results 
The design is fabricated in a 0.35µm CMOS process with a core area of 450µm×900µm. The
chip micrograph is shown in Fig. 3-3-1. The current consumed by the entire AFE is
499nA, of which 350nA is taken by the IA and 120nA is used by the output stage of the
PGA to drive the oscilloscope probe with its 10MΩ and 4pF load. All gain settings except
the maximum gain are verified in the measurement of frequency response,
CMRR, and gain variation, as shown in Fig. 3-3-2, Fig. 3-3-3, and Table 3-3-1,
respectively. The different gain settings have a fixed bandwidth from 0.4Hz
to 120Hz and a CMRR of 76dB. The gain variation measured on 9 chips is
within 3%, which is slightly higher than the simulation result in Fig. 3-2-4 but still much
smaller than the 10% requirement in the IEC standard. The input referred noise is
1.02µVrms integrated over the pass-band, as shown in Fig. 3-3-4. As the measured noise is
integrated from 0.4 to 120Hz, the result is close to but smaller than the simulated one,
which is integrated from 0.1 to 250Hz. Two distortion peaks occur at 50Hz and 150Hz due to
power line interference, as shown in Fig. 3-3-5. The total harmonic distortion (THD) is
smaller than 1% excluding the components caused by the power line interference.
 
Fig. 3-3-1. AFE chip micrograph. 
 
Fig. 3-3-2. Measured AFE frequency response at three different gain settings. 
 
 
Fig. 3-3-3. Tested AFE CMRR at three different gain settings. 
 
Table 3-3-1. Measured gain for the 9 AFE chips. 
Chip         1     2     3     4     5     6     7     8     9
Gain (dB)    55.6  56.0  56.8  57.0  56.4  56.0  56.3  57.0  56.1
             58.1  58.8  58.5  58.6  58.1  58.6  59.1  58.7  58.6
             61.8  62.1  63.2  62.3  61.8  62.1  62.4  63.4  62.2
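The chip-to-chip spread can be recomputed from the values in Table 3-3-1. The sketch below is a hedged illustration: the text does not define how the percentage variation is calculated, so the spread is assumed here to be (max - min)/max evaluated on the dB values, a definition that reproduces the 2.6% quoted for the simulated range of 55.5dB to 57dB. The setting labels are hypothetical names for the three table rows.

```python
# Measured gain in dB for the 9 chips, one list per gain setting (Table 3-3-1).
measured_gain_db = {
    "setting 1": [55.6, 56.0, 56.8, 57.0, 56.4, 56.0, 56.3, 57.0, 56.1],
    "setting 2": [58.1, 58.8, 58.5, 58.6, 58.1, 58.6, 59.1, 58.7, 58.6],
    "setting 3": [61.8, 62.1, 63.2, 62.3, 61.8, 62.1, 62.4, 63.4, 62.2],
}

def spread_percent(values):
    """Assumed variation metric: (max - min) / max, in percent, on dB values."""
    return 100 * (max(values) - min(values)) / max(values)

for name, values in measured_gain_db.items():
    print(f"{name}: {spread_percent(values):.2f} %")

simulated_spread = spread_percent([55.5, 57.0])   # Monte Carlo range in Fig. 3-2-4
print(f"simulated: {simulated_spread:.2f} %")
```

With this definition every measured setting stays below 3%, in agreement with the statement in the text.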
 
 
Fig. 3-3-4. Measured AFE input referred noise. 
 
 
Fig. 3-3-5. Analysis of harmonic distortion of the Proposed AFE. 
 
The noise efficiency factor (NEF) was first introduced in [171] to quantify the trade-off
between noise and power. It is defined by
$\mathrm{NEF} = V_{rms,in}\sqrt{\dfrac{2 I_{tot}}{\pi \cdot V_T \cdot 4kT \cdot BW}}$,      (3-3-1)
where $V_{rms,in}$ is the input referred RMS noise voltage; $I_{tot}$ is the total current
supplied to the OTA; $V_T$ is the thermal voltage kT/q and BW is the OTA bandwidth in
Hertz. The NEF of the proposed FDDA DC-coupled IA is 1.98, and that of the entire AFE,
counting the current to drive the oscilloscope probe, is 2.55.
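The AFE figure can be reproduced approximately from the measured quantities. The sketch below uses the standard NEF definition, NEF = V_rms,in * sqrt(2 I_tot / (π V_T 4kT BW)), with the measured noise (1.02µVrms), total current (499nA) and pass-band (0.4Hz to 120Hz). The temperature of 300K is an assumption, as the text does not state the value used.

```python
import math

K_BOLTZMANN = 1.380649e-23     # J/K
Q_ELECTRON = 1.602176634e-19   # C

def noise_efficiency_factor(v_rms_in, i_total, bandwidth_hz, temp_k=300.0):
    """NEF = V_rms,in * sqrt(2 * I_tot / (pi * V_T * 4kT * BW))."""
    v_t = K_BOLTZMANN * temp_k / Q_ELECTRON
    four_kt = 4 * K_BOLTZMANN * temp_k
    return v_rms_in * math.sqrt(2 * i_total / (math.pi * v_t * four_kt * bandwidth_hz))

# Measured values for the whole AFE; 300 K is an assumed temperature.
nef_afe = noise_efficiency_factor(v_rms_in=1.02e-6, i_total=499e-9,
                                  bandwidth_hz=120 - 0.4)
print(f"NEF of the entire AFE: {nef_afe:.2f}")
```

Under these assumptions the result is about 2.54, close to the reported 2.55; the small difference depends on the temperature and bandwidth actually used.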
Table 3-3-2. AFE performance comparison with other related works. 
 
                 JSSC        JSSC        JSSC        JSSC        TBioCAS     TBioCAS     TBioCAS     TBioCAS     This
                 2016 [66]   2018 [61]   2011 [60]   2015 [172]  2018 [173]  2017 [67]   2019 [62]   2011 [59]   Work
Current          1.19µA      0.361µA     5.3µA       31nA        160nA       8.25µA      18µA        50µA        499nA
VDD              1.2V        0.8V        2V          0.6V        2V          1.2V        5V          1.8V        1.8V
Gain             38-55dB     26-52dB     50-62dB     51-96dB     40dB        52dB        0-20dB      9.5-40dB    56-68dB
Bandwidth        0.5-150Hz   1-400Hz     0-170Hz     0.1-250Hz   0.2-200Hz   1Hz-6.5KHz  0.26-100Hz  N/A         0.4-120Hz
Input Noise      3.06µV      8.26µV      1.7µV       6.52µV      2.05µV      5µV         3.7µV       0.8µV       1.02µV
CMRR             64.9dB      66dB        105dB       55dB        65dB        65dB        70dB        82dB        76dB
THD              N/A         N/A         N/A         2.87%       1%          0.95%       N/A         N/A         1%
Area (mm^2)      0.35*       0.86*       5.2*        1.1*        0.18        0.018       1.23        6.48        0.405
Input Impedance  3.6GΩ       200GΩ       N/A         110MΩ       20MΩ        N/A         400GΩ       2GΩ         1GΩ
Process          0.13µm      0.18µm      0.5µm       65nm        0.35µm      0.13µm      0.18µm      0.18µm      0.35µm
NEF              10.6        8.43        11.7        2.64        2.26        7           N/A         12.3        2.55, 1.98
* The area is approximated from the chip micrograph
 
Compared with the state-of-the-art works in Table 3-3-2, this design presents the
best NEF, which is 1.98, and the best CMRR, 76dB, among the designs without chopper
stabilization (the two designs from the year 2011, in JSSC and TBioCAS, utilized
chopper stabilization while all others did not). Thanks to the DC-coupling, an input
impedance of 1GΩ at DC is achieved (JSSC 2016 shows 3.6GΩ using DC-coupling but
has higher power, larger noise and smaller CMRR, which are respectively 1.19×1.2μW,
3.06μV and 65dB). The chip area is as small as 0.405mm^2, benefiting from the reuse of
the parasitic capacitance, even with almost the highest gain setting.
To demonstrate the capability of the chip in acquiring ECG and EEG signals, we
conducted measurements on a human subject using the electrode configuration shown in
Fig. 3-1-4. The input electrodes are placed just 2cm apart for ECG recording. The top and
bottom ECG waveforms in Fig. 3-3-6 are obtained under resting and walking conditions,
respectively. The motion artefact during walking slightly distorts the ECG baseline but
does not saturate the AFE output. For the EEG measurement, a one-second EEG signal after an
eye blink and its FFT are presented in Fig. 3-3-7. The signal is filtered by a notch filter
cascaded with a low-pass filter with a cut-off frequency of 30Hz to remove power line
interference. Fig. 3-3-7 shows an oscillation with an energy peak at 12Hz. This coincides
with the well-known phenomenon that a dominant α-wave appears after an eye blink.
 
Fig. 3-3-6. Monitored ECG signal using the proposed AFE. 
 
Fig. 3-3-7. Acquired EEG with distinct α component after eye blink. 
 
To verify the variation of the baseline drift, a 2-hour ECG recording is conducted.
A 2-minute part of the recording is shown in Fig. 3-3-8. The baseline is
steady.
 
Fig. 3-3-8. Results of 2-hour ECG recording. 
3.4 Conclusion 
An FDDA-based DC-coupled AFE for flexible biomedical sensors is presented in this chapter.
Using an FDDA with improved power efficiency and stability, the AFE achieves significant
improvements in input impedance, CMRR and NEF, namely 1GΩ at DC, 76dB CMRR, and
an NEF of 2.55. Utilizing the parasitic capacitance reuse technique, this implementation
shows great noise/area efficiency, occupying 0.405mm^2 in a 0.35µm process. With the high
input impedance, the chip is able not only to pick up a clear ECG signal with the
electrodes only 2cm apart under both resting and walking conditions, but also to acquire
the dominant α-wave after an eye blink in EEG testing.
Chapter 4  
Personalized QRS Detection Based on One Target 
Clustering and Correlation Coefficient 
ECG gives direct and robust evidence for the diagnosis of cardiovascular diseases such as
arrhythmia, ischemia and ventricular hypertrophy. The diagnosis is derived from the
study of heart rate features, such as heart rate variability (HRV) obtained from R-R
interval information, and of ECG morphology characteristics, such as ECG pattern
classification. The accuracy of the QRS detection algorithm in a system determines the
reliability of the HRV analysis. In the
cardiac arrhythmia classification process, the R peak position from the QRS detection is a
prime input parameter. Thus, in an ECG processor, the QRS detection algorithm is the
foundation of further analysis.
QRS detection algorithms have been studied for a long time. It is worth noting that
existing algorithms try to use predefined parameters or constraints for all types of
patients. That is, they aim to be robust in handling special cases, which leads to
complicated models with a large number of parameters.
A computationally efficient personalized QRS detection algorithm for flexible
ECG sensors is presented in this chapter. The proposed user-specific QRS template
avoids the complicated models and parameters used in existing algorithms while covering
most situations in practical applications. The detection is based on the
correlation coefficient between the user-specific template extracted from the individual
user and the input ECG signal segment under detection. To reduce the computation, a novel
one-target clustering is proposed, which reduces the required loops from (K+1)K/2 to 1
compared with a K-target group clustering; meanwhile, the detection judgement is triggered
by the possible peaks to avoid a continuous-time correlation calculation, and it can be
treated as a one-point FIR filter operation followed by division by the standard deviation
of the input ECG signal segment. It achieves an average positive prediction rate of 99.39%
and a sensitivity of 99.21% evaluated on the 48 tapes of the MIT-BIH arrhythmia database.
This chapter is organized as follows. Section 4.1 presents the details of the proposed
algorithm, including one target clustering (OTC) and the detection based on Pearson
Correlation Coefficients (PCC). Section 4.2 describes the adaptive thresholding for
accuracy enhancement. Section 4.3 provides experimental results. Concluding remarks are
drawn in Section 4.4.
4.1 User Adaptive Detection 
The diagram of proposed algorithm is shown in Fig. 4-1-1. It is divided into two stages: 
template extraction stage and the detection stage (shown by the dotted and solid arrows 
respectively).  
 
Fig. 4-1-1. Diagram of proposed user adaptive QRS detection algorithm. 
 
At the beginning, the patient's ECG signal (the ECG clip marked in Fig. 4-1-1),
containing several or tens of QRS complexes, is sent to the windowed peak detector (WPD),
during which the patient is asked to sit still to avoid motion artefacts. Potential R peaks
in the ECG clip are first identified by the WPD, and then ECG segments centered at the
potential R peaks are sent to the customized OTC, which employs the PCC to calculate the
similarity between every two segments and obtain the user-specific QRS template. Since the
segments sent for clustering are QRS candidates selected by the WPD, most of them
contain only an R peak; thus the number of segments sent to the OTC is effectively reduced,
which improves the computing efficiency. Meanwhile, the simple OTC optimized for the QRS
segments further contributes to the lightweight computation. The WPD and OTC are one-time
processes, i.e., they are only activated at the beginning to generate the template. After
switching to the detection stage, the peak detector (PD) and the PCC are activated. The PD
is always on to detect any local peaks by examining each input point. Whenever a peak is
detected, the stand-by PCC compares the segment centered at the detected peak with
the stored template. If the similarity is high enough, this segment is taken as a QRS and
the peak is labeled as an R peak; otherwise the PCC discards this segment and waits for the
next comparison. That is, in practical detection, the PCC is idle most of the time and only
the simple PD requires real-time computing.
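The detection stage described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in this work: the function names `pearson` and `is_qrs` are hypothetical, the fixed threshold of 0.8 is a placeholder for the adaptive threshold of Section 4.2, and the seven-point template is synthetic data used only to exercise the code.

```python
import math

def pearson(segment, template):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(segment)
    if n == 0 or n != len(template):
        raise ValueError("segment and template must have the same non-zero length")
    mean_s = sum(segment) / n
    mean_t = sum(template) / n
    num = sum((s - mean_s) * (t - mean_t) for s, t in zip(segment, template))
    den = math.sqrt(sum((s - mean_s) ** 2 for s in segment)
                    * sum((t - mean_t) ** 2 for t in template))
    return num / den if den > 0 else 0.0

def is_qrs(signal, peak_index, template, threshold=0.8):
    """Compare the segment centred at a detected peak with the stored template."""
    half = len(template) // 2
    start = peak_index - half
    end = start + len(template)
    if start < 0 or end > len(signal):
        return False   # not enough samples around the peak
    return pearson(signal[start:end], template) >= threshold

# Synthetic data: a scaled and shifted copy of the template, then an inverted copy.
template = [0.0, 1.0, 4.0, 9.0, 4.0, 1.0, 0.0]
beat = [2.0 * x + 1.0 for x in template]
inverted = [-x for x in template]
signal = [0.0] * 5 + beat + [0.0] * 5 + inverted + [0.0] * 5

print(is_qrs(signal, 8, template))    # peak of the scaled copy
print(is_qrs(signal, 20, template))   # centre of the inverted copy
```

Because the PCC removes the mean and normalizes by the standard deviations, the scaled and shifted beat still matches the template, while the inverted shape is rejected.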
4.1.1 Peak Detection 
The data is first processed by a Savitzky–Golay (SG) filter with a span of 15 and a
degree of 6 and then smoothed by a moving sum filter with a length of 26, the same
pre-processing and peak detection method as proposed in [124]. The peak detection is given
by (4-1-1) and (4-1-2).
$\bigwedge_{k=0}^{2} S(i-k) > S(i-k-1)$            (4-1-1)
$\bigwedge_{k=0}^{2} S(i+k+1) < S(i+k)$            (4-1-2)
The continuously rising edge is defined by (4-1-1): the signal amplitudes at the three
points after i-3 satisfy S(i)>S(i-1)>S(i-2)>S(i-3). The continuously falling
edge is defined similarly by (4-1-2). If S(i) ends a continuously rising edge and starts a
continuously falling edge, then S(i) is taken as a valid peak.
Clearly, all peaks, whether P, R, T or even peaks caused by noise, would
be detected if (4-1-1) and (4-1-2) were simply applied. Thus, the windowed local maximum
given by (4-1-3) is applied to the peak detection to remove unwanted P and T peaks and
other small peaks, which simplifies the clustering.
 (  )     * (  )  (  )  (  )+    (4-1-3) 
In (4-1-3), the window is centered at  (  ) .  (  )  and  (  )  are the left and right 
boundaries. If   (  ) is the highest peak among all the peaks in the window, then it is a 
local maximum peak. The window length is chosen between the longest QRS length and
the shortest R-R interval. It is longer than the QRS complex to contain
enough waveform information, and shorter than the R-R interval to avoid
multiple R peaks in one segment, as multiple R peaks may lead to a multiple-target
issue in clustering. As the normal QRS length is from 80ms to 100ms and the normal R-R
interval is from 600ms to 1.5s, the window is set to 400ms, which covers all or most of the
QRS complex and accommodates any heart rate under 150bpm.
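The peak detector and the windowed local maximum can be sketched as follows. The code is a simplified illustration, assuming that a valid peak needs three strictly rising samples before it and three strictly falling samples after it, and that a windowed peak is one that is not lower than any other detected peak within half a window on either side. The function names and the synthetic test signal are hypothetical; the 360Hz sampling rate is that of the MIT-BIH recordings used later in this chapter.

```python
def find_peaks(s):
    """Indices i with S(i)>S(i-1)>S(i-2)>S(i-3) and S(i)>S(i+1)>S(i+2)>S(i+3)."""
    peaks = []
    for i in range(3, len(s) - 3):
        rising = all(s[i - k] > s[i - k - 1] for k in range(3))
        falling = all(s[i + k + 1] < s[i + k] for k in range(3))
        if rising and falling:
            peaks.append(i)
    return peaks

def windowed_peaks(s, peaks, window):
    """Keep a peak only if it is the highest detected peak in a window centred on it."""
    half = window // 2
    kept = []
    for p in peaks:
        neighbours = [q for q in peaks if abs(q - p) <= half]
        if all(s[p] >= s[q] for q in neighbours):
            kept.append(p)
    return kept

FS = 360                     # sampling rate in Hz
WINDOW = FS * 400 // 1000    # 400 ms window expressed in samples

# Synthetic signal: a small P-like bump at index 20 and a larger R-like peak at 60.
signal = [0.0] * 200
for offset, height in zip(range(-3, 4), (1, 2, 3, 4, 3, 2, 1)):
    signal[20 + offset] = float(height)
    signal[60 + offset] = 2.0 * height

all_peaks = find_peaks(signal)
r_candidates = windowed_peaks(signal, all_peaks, WINDOW)
print(all_peaks, r_candidates)
```

The 400ms window corresponds to 144 samples at 360Hz, and 60s/0.4s gives the 150bpm limit mentioned above: at a faster rate two R peaks could fall into one window.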
4.1.2 One Target Clustering 
In conventional clustering, the clustering loop searches for the balanced state in which
each group remains the same as in the previous loop. The nested loops lead to heavy
computation. What is worse, an initial kernel must be determined for each group.
Table 4-1-1. Pseudo code of proposed one target clustering. 
Pseudo Code of OTC
Step 1: Initialize T, GT, G, r, K, G1…K
Step 2: for i from 1 to K
            for j from i+1 to K
                if cov(Si, Sj) > r
                    add Sj to Gi, and Si to Gj
                end if
            end for
        end for
Step 3: G = the group with the most members in G1…K
Step 4: update GT with G, get T by averaging GT, empty G
Step 5: for i from 1 to K
            if cov(Si, T) > r
                add Si to G
            end if
        end for
Step 6: if G == GT then Stop, else go to Step 4
Stop: Store T
 
Note that only one target group needs to be found among the segments output by the
WPD. One target clustering (OTC) is therefore proposed to reduce the computation. The
simplified OTC is shown in Table 4-1-1 and Fig. 4-1-2. First, the template kernel T, the
temporary groups G1…K (where K is the total number of segments), the target group GT and
its updated version G are initialized to empty, and the PCC threshold r is set to a
suitable value (from 0.6 to 0.8; 0.8 is chosen in this work) that determines the similarity
among segments in each group. Then, a general clustering pass is run once, as in Step 2,
to get the initial kernel. Since we know neither the kernel of the group nor how to group
the segments, we assume each segment stands for one group and its corresponding
kernel, which guarantees lossless grouping and provides the initial kernels.
 
Fig. 4-1-2. Diagram of the proposed one target clustering. 
 
One-time clustering with K groups and corresponding kernels are conducted by 
applying the nested for loop to traverse the segment similarity with each other. Once the 
  80 
PCC, i.e. cov(Si, Sj), equals or is higher than the expected threshold value r, then the 
segments Si and Sj are added to each other‘s group as new group members. Since there is 
only one major group in the input data segments, the group with maximum number of 
group members is taken as the initial target group. Meanwhile, we get the initial target 
kernel by averaging the target group in Step 3 and Step 4.  
After obtaining the initial target group and kernel, the OTC is conducted by
iterating Steps 4 to 6. In Step 5, a single loop is sufficient to traverse the similarity
between the current kernel and each segment. In Step 6, the algorithm decides whether to
stop or to continue the iteration by checking whether the updated target group G is equal
to the last target group GT. Comparing Step 2 and Step 5, OTC requires only one loop
from 1 to K for each update, while K-target clustering needs $(K-1)/2$ loops from 1 to K.
Moreover, fewer targets mean that fewer updates are required to reach the stable state.
For the proposed OTC, several updates are enough to obtain the final kernel, which is
taken as the extracted template.
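As an illustration, the OTC steps described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation used in this work: the function names (pcc, average, one_target_clustering), the max_updates guard, and the representation of segments as equal-length lists are hypothetical, and the code follows only the prose description of Steps 2 to 6.

```python
import math

def pcc(s, t):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(s)
    mean_s, mean_t = sum(s) / n, sum(t) / n
    num = sum((a - mean_s) * (b - mean_t) for a, b in zip(s, t))
    den = math.sqrt(sum((a - mean_s) ** 2 for a in s) *
                    sum((b - mean_t) ** 2 for b in t))
    return num / den if den else 0.0

def average(segments):
    """Point-by-point average of equal-length segments (the kernel)."""
    return [sum(col) / len(segments) for col in zip(*segments)]

def one_target_clustering(segments, r=0.8, max_updates=20):
    k = len(segments)
    # Step 2: every segment seeds its own group; nested loop over all pairs.
    groups = [{i} for i in range(k)]
    for i in range(k):
        for j in range(i + 1, k):
            if pcc(segments[i], segments[j]) >= r:
                groups[i].add(j)
                groups[j].add(i)
    # Step 3: the largest group is the initial target group.
    target = max(groups, key=len)
    for _ in range(max_updates):  # max_updates is an assumed safety guard
        # Step 4: kernel = average of the current target group.
        kernel = average([segments[i] for i in sorted(target)])
        # Step 5: one loop from 1 to K against the current kernel.
        updated = {i for i in range(k) if pcc(kernel, segments[i]) >= r}
        # Step 6: stop when the target group no longer changes.
        if not updated or updated == target:
            break
        target = updated
    members = sorted(target)
    return average([segments[i] for i in members]), members

# Three similar beats and one inverted beat (toy data, not ECG samples).
demo_segments = [
    [0.0, 1.0, 5.0, 1.0, 0.0],
    [0.0, 1.1, 5.2, 0.9, 0.0],
    [0.1, 1.0, 4.8, 1.0, 0.1],
    [0.0, -1.0, -5.0, -1.0, 0.0],
]
demo_kernel, demo_members = one_target_clustering(demo_segments)
```

In this toy input, the inverted segment is excluded from the target group, which mirrors how the major beat type is separated from a PVC-like shape.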
The MIT-BIH arrhythmia database [128] is used for verification. Fig. 4-1-3 gives the
template extraction results of the proposed OTC for a 10s ECG clip from a patient with
premature ventricular contractions (PVCs). In the top plot, the deep valley is labeled
as Type 5 in the database, standing for PVC, and the narrow peak is labeled as Type 1,
standing for normal QRS. As seen from the middle plot, several abnormal wide peaks after
the PVC are detected by the WPD with a 400ms window, but the OTC correctly recognizes
the major target and extracts the 1.2s template for the normal QRS. Besides, by simply
inverting the amplitude of the clip, the PVC template can be extracted for PVC detection
for this patient, as shown in the bottom plot with the same window and template length
settings.
 
Fig. 4-1-3. Extracted normal and PVC QRS template by OTC. 
4.1.3 Correlation Coefficient 
With the user-specific template, a valid QRS can be detected by comparing the similarity
of the incoming real-time ECG segment with the extracted user-specific template via the
PCC as given by:
$r = \dfrac{\sum_{i=1}^{n}(S_i-\bar{S})(T_i-\bar{T})}{\sqrt{\sum_{i=1}^{n}(S_i-\bar{S})^{2}}\,\sqrt{\sum_{i=1}^{n}(T_i-\bar{T})^{2}}}$,      (4-1-4)
where $S_i$ is the ith point of the current peak segment S, and $T_i$ is the ith point of
the template T; n is the segment length, which is 144 for a 400ms window at a 360Hz
sampling frequency; $\bar{S}$ and $\bar{T}$ are the mean values of S and T, respectively.
The PCC indicates the linear correlation of the two samples and has a value between -1
and 1. When it is between 0.6 and 0.8, the two samples are linearly correlated; when it
is higher than 0.8, the two samples are strongly linearly correlated. Within one ECG
record, the condition of the subject does not change much, so the QRS complexes are
linearly correlated in amplitude and duration. In this application, since the template
is extracted in advance, (4-1-4) can be simplified as
$r = \sum_{i=1}^{n}\dfrac{T_i-\bar{T}}{\sqrt{\sum_{k=1}^{n}(T_k-\bar{T})^{2}}}\cdot\dfrac{S_i-\bar{S}}{\sqrt{\sum_{k=1}^{n}(S_k-\bar{S})^{2}}} = \dfrac{\sum_{i=1}^{n}C_i\,(S_i-\bar{S})}{\sigma_S}$,     (4-1-5)
where $C_i = (T_i-\bar{T})/\sqrt{\sum_{k=1}^{n}(T_k-\bar{T})^{2}}$ is fixed once the
template is known and $\sigma_S = \sqrt{\sum_{k=1}^{n}(S_k-\bar{S})^{2}}$. That is, the
PCC is calculated by applying an FIR filter with coefficients $C_i$ to the mean-removed
segment $(S_i-\bar{S})$ and then dividing the filter output by $\sigma_S$, which is the
standard deviation of the same segment up to a constant factor. During this calculation,
the mean value of the segment is removed, which relaxes the effect of baseline drift.
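The simplification in (4-1-5) can be illustrated with a short Python sketch. It is a minimal example under stated assumptions: the function names template_coefficients and simplified_pcc are hypothetical, the template coefficients are pre-computed once, and the "FIR operation" is evaluated at a single alignment, i.e., as one dot product.

```python
import math

def template_coefficients(template):
    """Pre-compute the normalized, mean-removed template (done once)."""
    mean_t = sum(template) / len(template)
    centered = [t - mean_t for t in template]
    norm = math.sqrt(sum(c * c for c in centered))
    return [c / norm for c in centered]

def simplified_pcc(segment, coeffs):
    """PCC of a segment against pre-computed template coefficients."""
    mean_s = sum(segment) / len(segment)
    centered = [s - mean_s for s in segment]          # removes baseline offset
    fir_out = sum(c * x for c, x in zip(coeffs, centered))
    norm_s = math.sqrt(sum(x * x for x in centered))  # root sum of squares
    return fir_out / norm_s if norm_s else 0.0

# Toy template and segments (not real ECG data).
toy_template = [0.0, 1.0, 5.0, 1.0, 0.0]
toy_coeffs = template_coefficients(toy_template)
scaled_and_shifted = [2.0 * t + 3.0 for t in toy_template]
inverted = [-t for t in toy_template]
```

Because the segment mean is removed before filtering, a constant baseline offset (the `+ 3.0` above) does not change the result, which is the property the text relies on for baseline drift.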
Detection results for Record 208, which contains baseline drift and PVC distortion, are
shown in Fig. 4-1-4. The PCCs of most of the normal QRS complexes are higher than 0.8.
Several QRS complexes with large shape distortion, e.g. from 12s to 14s, have PCCs
between 0.6 and 0.8. The PCCs of most of the false peak segments are lower than 0.4.
Two noticeable segments have a PCC close to 0.6 because their shapes are close to the
template. One, located around 2s, has a tiny amplitude and can be eliminated by
comparing it with the common R peak; the other is caused by the two adjacent PVCs from
6s to 7s and is hard to detect even for an expert without knowledge of the nearby
information. Except for this segment, all others are correctly detected. Moreover, by
using the PVC template extracted from the inverted signal, the algorithm can also be
used to detect the PVCs. By combining the PVC positions with the ECG R-R interval
constraints, those abnormal or false segments can be easily separated. In practice, the
PCC threshold can be set from 0.6 to 0.9, depending on the template length, to achieve a
proper detection rate. A short template requires a high PCC threshold to avoid too
loose a separation, while a long template needs a low threshold to avoid rules so strict
that correct QRS complexes are missed. As shown in the bottom plot of Fig. 4-1-3,
because of the long 1.2s template, the tails of the segments do not overlap exactly,
which leads to the small oscillation before 0.2s. This oscillation reduces the
similarity between the template and the PVC segment, so a small PCC threshold should be
used in this case. As the normal QRS template is totally different from PVCs, paced
beats, and other beat types with distinct segment shapes, the PCC is rather small when
the extracted main heartbeat template is compared with these rarely occurring beats.
That is, the proposed algorithm can separate the major heartbeats from other types based
on the segment similarity judgment, and vice versa.
 
Fig. 4-1-4. PCC values on record 208 with baseline drift and PVC distortions. 
4.2 Adaptive Thresholding 
To further improve the QRS detection accuracy, we make use of ECG signal
properties to eliminate wrongly detected peaks. These wrong peaks are mainly caused by
motion artefacts and noise. As ECG morphology varies from person to person, a fixed
set of rules is not effective in dealing with these variations. Thus, we propose an
adaptive thresholding scheme, in which the PCC reference, the R-R interval reference,
and the template are adaptively adjusted to enhance the detection accuracy.
 
Fig. 4-2-1. Diagram of adaptive thresholding. 
 
As the PCC reference, the R-R interval reference, and the template are critical to the
accuracy of the QRS detection, their updating is subject to the strictest condition,
i.e. they are updated only if the PCC is over 0.8, which means that the segment under
detection shows the highest similarity with the stored template. In this way, the
updated template is guaranteed to be a segment centered at an R peak. For the RR
interval value, we set +/-20% as the maximum variation between two adjacent intervals
[174] to remove abnormal RR values. This conservative updating strategy protects the
judgment conditions from being updated by wrong detections.
As shown in Fig. 4-2-1, the reference RR interval (TT), the reference PCC (ST), and the
stored template (C) are adaptively updated for the next detection as follows.
Step 1: TT, ST, and C are initialized from the OTC stage, where the subject is
asked to sit still. We treat the ECG signal from the OTC stage as the "ideal" ECG signal.
Thus, the extracted template, the average RR interval, and the average PCC of the
QRS complex segments detected during the OTC are respectively taken as the initial
values of C, TT, and ST. The PCCs of the last two QRS complexes and their time
information are also initialized to S1, S2, T1, and T2, respectively.
Step 2: Calculate the PCC of the current QRS complex under detection, SC.
Step 3: If SC>ST (the current PCC is greater than the stored average PCC), the
current segment is strongly correlated with the stored template, i.e., the
current segment is very likely a QRS complex. To make sure that the current
segment is indeed a QRS, we apply RR interval constraints under three cases:
Case 1: the detected RR interval falls within the +/-20% variation, i.e. 80%TT ≤ RR
interval ≤ 120%TT.
Case 2: the detected RR interval exceeds the +20% variation, i.e. RR interval >
120%TT.
Case 3: the detected RR interval is below the -20% variation, i.e. RR interval <
80%TT.
In these three cases, we use three sets of rules to determine whether the current
segment is a QRS, in Step 4.1, Step 4.2, and Step 4.3, respectively.
Step 4.1: the first case after Step 3, when SC>ST (the current PCC is greater than
the stored PCC). In this case, the current RR interval T is within the reasonable range
set by the reference value TT. The segment is reported as an R peak since it
satisfies both the similarity and the time constraints. The thresholds are
updated only when the similarity is higher than the highest setting, 0.8.
Step 4.2: the second case after Step 3. In this case, the current RR interval T is
longer than the upper limit of the reference range, (1+F)TT, where F = 20% is the
allowed variation. That means the current segment is a special one. If the similarity is
higher than the highest setting, 0.8, it is reported as an R peak, and the thresholds
are updated to follow the RR interval variation. If the similarity is less than 0.8, it
is still reported as an R peak, but no threshold update occurs, to avoid any possible
risk.
Step 4.3: the third case after Step 3. In this case, the current RR interval T is
shorter than the lower limit of the reference range, (1-F)TT. That means either the
current or the previous detection is wrong, so the decision rules are applied to remove
one of the two detections. No threshold update occurs in this case.
Step 4.4: if SC<ST, it is still possible that the current segment is a QRS,
considering the distortion caused by motion artifacts. In this case, ST is decreased
with a slope of 0.4/TT until it reaches the minimum value of 0.4.
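The decision flow of Steps 3 to 4.4 can be summarized by the following Python sketch. It is only an illustration under stated assumptions: the function names, the returned tuple layout, and the `elapsed` argument (time since the threshold started to decay) are hypothetical; the rules that resolve a Case 3 conflict and the exact update formulas for TT, ST, and C are not given in the text and are therefore left out.

```python
def classify_segment(sc, st, rr, tt, f=0.2, s_high=0.8):
    """Return (report_r_peak, update_references, step) for one candidate.

    sc: PCC of the current segment, st: reference PCC,
    rr: current RR interval, tt: reference RR interval (same time unit).
    """
    if sc <= st:
        # Step 4.4: not accepted now; ST is relaxed separately.
        return (False, False, "step 4.4")
    if rr < (1 - f) * tt:
        # Step 4.3: current or previous detection is wrong; the conflict is
        # resolved by decision rules not modelled here. No update.
        return (False, False, "step 4.3")
    step = "step 4.2" if rr > (1 + f) * tt else "step 4.1"
    # Steps 4.1 and 4.2: report an R peak; update only above the 0.8 setting.
    return (True, sc > s_high, step)

def decayed_threshold(st, elapsed, tt, floor=0.4):
    """Step 4.4: lower ST with a slope of 0.4/TT, never below 0.4."""
    return max(floor, st - (0.4 / tt) * elapsed)
```

For example, with TT = 0.8 s and ST = 0.7, a candidate with PCC 0.75 that arrives 1.2 s after the previous R peak falls under Step 4.2: it is reported, but the references are left unchanged.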
4.3 Experiment Results 
The algorithm is implemented in MATLAB, and the GUI for detection and analysis is
shown in Fig. 4-3-1. The passband of the FIR pre-filter is set to 0.7 to 40Hz, with
transition widths of 0.5Hz and 1Hz, respectively. A 2-hour recording acquired with
the proposed AFE (as in Section 3.3) is used for functional verification. As shown
in the main window, a detection label is accurately placed at each R peak of the
filtered ECG. The detection results and the filtered ECG can be saved to a text file for
further study. In the implementation, the HRV parameters are also calculated, e.g., the
estimated heart rate (HR) is 82bpm.
To evaluate the detection performance, sensitivity (SE) and positive prediction
(+P) are used. As defined by (4-3-1) and (4-3-2), they are calculated from the numbers
of false positives (FP), false negatives (FN), and true positives (TP), where an FP is a
declared QRS where there is none, an FN is a missed actual QRS event, and a TP is a
correctly detected QRS.
$\mathrm{SE}(\%) = \dfrac{TP}{TP+FN}$       (4-3-1)
$+\mathrm{P}(\%) = \dfrac{TP}{TP+FP}$       (4-3-2)
 
Fig. 4-3-1. GUI of the implemented QRS detection in MATLAB. 
 
The SE and +P for each record in the MIT-BIH arrhythmia database are summarized
in Table 4-3-1. Our algorithm accurately separates the majority heartbeat type from
all beat labels, with an average +P of 99.39% and SE of 99.21%. The lowest
performance appears at Record 203, which contains a large number of PVCs, leading to
the lowest +P and SE.
The performance of the proposed QRS detection algorithm is compared with other
algorithms in Table 4-3-2. The sensitivity and positive prediction rate are
comparable to or slightly lower (by at most 0.6 percentage points) than those of the
state-of-the-art algorithms.
Table 4-3-1. Performance Evaluation on MIT-BIH Database 
Tape Major/Total FN FP +P SE 
100 2273/2273 2 0 100% 99.91% 
101 1865/1865 6 0 100% 99.68% 
102 2183/2187 24 24 98.90% 98.90% 
103 2084/2084 4 0 100% 99.81% 
104 1380/2229 55 44 96.76% 95.98% 
105 2526/2572 84 23 99.06% 96.65% 
106 1507/2027 1 6 99.60% 99.93% 
107 2078/2137 2 1 99.95% 99.90% 
108 1746/1763 41 39 97.76% 97.65% 
109 2494/2532 11 12 99.52% 99.56% 
111 2123/2124 38 38 98.21% 98.21% 
112 2539/2539 0 0 100% 100% 
113 1795/1795 8 0 100% 99.95% 
114 1836/1879 4 1 99.95% 99.78% 
115 1953/1953 1 0 100% 99.95% 
116 2303/2412 29 0 100% 98.72% 
117 1535/1535 2 0 100% 99.87% 
118 2262/2278 6 2 99.91% 99.73% 
119 1543/1987 0 0 100% 100% 
121 1862/1863 2 1 99.95% 99.89% 
122 2476/2476 2 0 100% 99.92% 
123 1515/1518 2 0 100% 99.87% 
124 1572/1619 6 6 99.62% 99.62% 
200 1775/2601 14 24 98.66% 99.22% 
201 1765/1963 22 7 99.6% 98.74% 
202 2117/2136 1 4 99.81% 99.95% 
203 2536/2980 163 166 93.45% 93.41% 
205 2585/2656 1 1 99.96% 99.96% 
207 1457/1543 14 10 99.31% 99.04% 
208 1586/2955 22 26 98.68% 98.88% 
209 3005/3006 0 0 100% 100% 
210 2446/2640 19 12 99.50% 99.22% 
212 2748/2748 0 0 100% 100% 
213 3251/3471 48 22 99.31% 98.51% 
214 2003/2259 4 5 99.75% 99.80% 
215 3195/3363 2 4 99.87% 99.94% 
217 1542/2208 9 12 99.22% 99.42% 
219 2082/2154 1 3 99.86% 99.95% 
220 1954/2048 2 0 100% 99.90% 
221 2031/2427 5 2 99.90% 99.75% 
222 2061/2483 72 63 97.44% 97.09% 
223 2029/2589 33 34 98.39% 98.44% 
228 1688/2053 7 0 100% 99.58% 
230 2255/2256 1 0 100% 99.96% 
231 1254/1571 1 0 100% 99.92% 
232 1382/1780 4 2 99.89% 99.78% 
233 2230/3079 11 10 99.55% 99.51% 
234 2700/2753 0 0 100% 100% 
Total 99127/109369 786 604 99.39% 99.21% 
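The totals in the last row can be cross-checked against (4-3-1) and (4-3-2) with a few lines of Python. The sketch assumes that the "Major" count is the number of actual QRS events, so that TP = Major - FN; this reading is an assumption, adopted because it reproduces the reported averages. The function names are hypothetical.

```python
def sensitivity(tp, fn):
    """SE in percent, as in (4-3-1)."""
    return 100.0 * tp / (tp + fn)

def positive_prediction(tp, fp):
    """+P in percent, as in (4-3-2)."""
    return 100.0 * tp / (tp + fp)

# Totals from the last row of Table 4-3-1.
major_beats, total_fn, total_fp = 99127, 786, 604
total_tp = major_beats - total_fn   # assumed relation between Major and TP

se_total = round(sensitivity(total_tp, total_fn), 2)
pp_total = round(positive_prediction(total_tp, total_fp), 2)
```

Under this assumption the computed values agree with the reported SE of 99.21% and +P of 99.39%.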
 
Table 4-3-2. Comparison with Other Algorithms 
Method SE(%) +P(%) 
Multiscale Morphology [121] 99.81 99.80 
Quadratic Spline wavelet [175] 99.31 99.70 
Joint QRS [124] 99.64 99.81 
Event Driven [126] 97.76 98.59 
Genetic Algorithm [176] 99.60 99.51 
Proposed Algorithm 99.21 99.39 
4.4 Conclusion 
In this chapter, the details of the proposed user-adaptive QRS detection algorithm are
presented. It employs PD triggering, simplified OTC, simplified PCC, and adaptive
thresholding. The proposed OTC requires $(K-3)/2$ fewer loops than the K-target group
clustering. The simplified PCC is equivalent to a one-point FIR operation
followed by a division by the standard deviation of the triggered input segment.
Average +P and SE of the proposed algorithm are 99.39% and 99.21%, respectively. The
GUI of the algorithm is developed in the MATLAB environment, in which the pre-filtering
parameters can be set. Moreover, based on the detected QRS positions, HRV analysis is
also conducted.
 
 
 
Chapter 5  
An Event-driven Patient Specific ANN-CAC 
Deep learning methods, e.g., artificial neural networks (ANN) and convolutional neural
networks (CNN), are favoured in cardiac arrhythmia classification (CAC) for their high
accuracy. However, the scarcity of abnormal beats and the extremely unbalanced data set
limit the ANN-CAC performance. Moreover, the implementation of such methods is
challenging for wearable electrocardiogram (ECG) sensors due to the heavy computational
cost.
An event-driven ANN-CAC is presented in this chapter to address these challenges.
A continuous-in-time discrete-in-amplitude (CTDA) signal flow is adopted to reduce
the multiplication operations in the ANN-CAC. A customized three-stage pipelined
multiplier and customized finite state machines (FSMs) are adopted to enhance the
execution speed and efficiency. To deal with the unbalanced data set and the ECG
identity, a conditional grouping scheme (CGS) together with biased training (BT) is
proposed to enhance the classification accuracy. The design is verified on FPGA and
implemented in a 0.18μm CMOS process. The simulated average power is 13μW for a
patient with a heart rate of 75bpm. Verified on the MIT-BIH arrhythmia database, the
design shows over 99% classification accuracy, 97% sensitivity, and 94% positive
predictivity on the 5 types defined by the AAMI standard.
This chapter is organized as follows. In Section 5.1, the data preparation and the
details of the CGS are described. Feature selection, implementation considerations, and
the topology of the proposed CTDA patient-specific ANN-CAC are presented in Section 5.2.
The BT process is shown in Section 5.3. Implementation details, including the customized
multiplier and FSMs, are presented in Section 5.4. Benchmark results of the implemented
event-driven ANN-CAC are given in Section 5.5. Conclusions are drawn in Section
5.6.
5.1 Data Preparation 
Table 5-1-1. Heart beat label and corresponding meaning in MIT-BIH 
Symbol Label Meaning 
N 1 Normal beat 
L 2 Left bundle branch block beat
R 3 Right bundle branch block beat
a 4 Aberrated atrial premature beat
V 5 Premature ventricular contraction 
F 6 Fusion of ventricular and normal beat 
J 7 Nodal (junctional) premature beat 
A 8 Atrial premature beat 
S 9 Premature or ectopic supraventricular beat 
E 10 Ventricular escape beat 
j 11 Nodal (junctional) escape beat 
/ (P) 12 Paced beat 
Q 13 Unclassified beat 
e 34 Atrial escape beat 
f 38 Fusion of paced and normal beat 
 
Table 5-1-2. Mapping of MIT-BIH labels to AAMI types 
AAMI  Meaning                                  MIT-BIH          Redefined Label
N     Any beat not in S, V, F, and Q classes   1, 2, 3, 34, 11  [1, 0, 0, 0, 0]
S     Supraventricular ectopic beats           4, 7, 8, 9       [0, 1, 0, 0, 0]
V     Ventricular ectopic beats                5, 10            [0, 0, 1, 0, 0]
F     Fusion beats                             6                [0, 0, 0, 1, 0]
Q     Unknown beats                            12, 13, 38       [0, 0, 0, 0, 1]
 
For consistency with other works in benchmarking, the standard
MIT-BIH arrhythmia database is used for network training and performance evaluation.
There are 48 half-hour ECG records from 47 subjects [123]. Modified limb Lead
II (MLII) and modified Lead V ECG signals are stored in each record. Only the MLII data
is used in this study, to keep consistency with other works. All records are digitized
with 11-bit resolution over a 10mV range at a 360Hz sampling frequency. The entire
database provides 15 types of heartbeats. The beat labels and their corresponding
annotation meanings as defined in MIT-BIH are shown in Table 5-1-1. As recommended by
AAMI practice [144], the 4 paced records, i.e., 102, 104, 107 and 217, should be removed
from the study. AAMI does not require the Q type, i.e., paced beat (label 12),
unclassified beat (label 13), and fusion of paced and normal beat (label 38), for the
evaluation of CAC; all other types, i.e., N, S, V and F, should be evaluated from the
MIT-BIH labels. For future updates and for coverage of all situations in practice, all 5
types are considered in the proposed ANN-CAC even though the Q type is not required by
the AAMI standard, i.e., this design has five output types. The mapping of the MIT-BIH
labels to each AAMI heartbeat type and the redefined labels for classification purposes
are provided in Table 5-1-2.
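As a small illustration, the label mapping of Table 5-1-2 can be written as a lookup table in Python. The names AAMI_CLASSES, MITBIH_TO_AAMI and redefined_label are hypothetical, and the sketch assumes that label 10 (ventricular escape beat in Table 5-1-1) belongs to the V class, following the usual AAMI grouping.

```python
AAMI_CLASSES = ("N", "S", "V", "F", "Q")

# MIT-BIH numeric label -> AAMI class (after Tables 5-1-1 and 5-1-2).
MITBIH_TO_AAMI = {
    1: "N", 2: "N", 3: "N", 34: "N", 11: "N",
    4: "S", 7: "S", 8: "S", 9: "S",
    5: "V", 10: "V",   # label 10 in V is an assumption (see lead-in)
    6: "F",
    12: "Q", 13: "Q", 38: "Q",
}

def redefined_label(mitbih_label):
    """Return the one-hot label [N, S, V, F, Q] for a MIT-BIH label."""
    aami = MITBIH_TO_AAMI[mitbih_label]
    return [1 if c == aami else 0 for c in AAMI_CLASSES]
```

For instance, a premature ventricular contraction (label 5) maps to [0, 0, 1, 0, 0], which is the target vector of the five-output network.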
5.1.1 Identity of ECG 
The morphologies of the same AAMI beat type from different patients are quite
different. Take type S as an example, as shown in Fig. 5-1-1. The S-type beats from the
44 records excluding paced beats can be divided into sub-morphologies. Six of them, from
records 100, 207, 220, 118, 228 and 124, with obviously different waveforms are
presented. In record 100, the QRS complex shows symmetrical small Q and S valleys.
Except for the flat T peak, the entire waveform is similar to a normal heartbeat, while
the shapes in the other records are quite different from the normal one. The heartbeats
in records 207, 220, 118, 228 and 124 have a remarkably deep R valley, a distinct sharp
and deep S valley, a deep but wide S valley, an obvious Q valley, and a relatively short
PQ segment, respectively. It should be noted that the morphology differences of the S
type are not limited to the six illustrated records; each record has its own details.
Not only the S type but also the V type has its own properties in each record. We call
this diversity of the ECG morphology the "identity" of the ECG.
 
Fig. 5-1-1. Illustration of ECG identity. 
 
Apparently, a global classifier is not able to handle the identity well, which
leads to low classification accuracy. The joint patient-specific set [134] also
introduces interference during training. For example, in the training of the ANN-CAC
for patient 207, no S-type beat appears in the first 5 minutes. Thus, all the S-type
beats are borrowed from the commonly used beats during training. In this case, the
trained network is biased towards shapes like those of records 100 and 124, which are
totally different from those of record 207, finally leading to poor accuracy in the
detection of the S type for patient 207. In addition, the inconsistency of the
morphology under the same type among the commonly used records generally results in
error oscillations during training. Therefore, a more reasonable data grouping and
training strategy should be adopted.
5.1.2 Conditional Data Grouping Scheme (CGS) 
We propose the CGS data grouping method to address the issues related to the identity
of ECG, i.e. the data are grouped according to the heartbeat statistics of each record
in the database, so that records with too few abnormal beats of a given type are removed
from the training and classification of that type. The threshold is set at 1% of the
total beats in the record. If the number of abnormal beats of a type is less than this
criterion, the record is not considered in the group that contains the
corresponding label.
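The 1% criterion can be expressed as a short Python sketch. The function name cgs_group and the dictionary layout are hypothetical, and the comparison against the unrounded 1% value is an assumption; the beat counts in the examples are taken from Table 5-1-3.

```python
def cgs_group(counts, ratio=0.01):
    """Return the CGS group of one record from its beat counts.

    counts maps AAMI types to the number of beats in the entire record.
    A type S, V or F is kept only if it reaches `ratio` of the total beats.
    """
    total = sum(counts.values())
    kept = [t for t in ("S", "V", "F") if counts.get(t, 0) >= ratio * total]
    return tuple(["N"] + kept)

# Entire-record counts for three records of Table 5-1-3.
record_118 = {"N": 2166, "S": 96, "V": 16, "F": 0}
record_208 = {"N": 1586, "S": 2, "V": 992, "F": 373}
record_113 = {"N": 1789, "S": 6, "V": 0, "F": 0}
```

Record 118, for example, keeps the S type (96 beats) but drops the V type (16 beats, below 1% of 2278), so it lands in the (N, S) group.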
The statistics of beat types for both the entire record and the first 5 minutes of each
record are provided in Table 5-1-3. We clustered the dataset into 5 groups, i.e., (N, S),
(N, V), (N, S, V), (N, V, F), and (N). For (N), the records provide a negligible number
of abnormal beats (less than 1% of the total beats), e.g., record 113 contains a total
of 6 S-type beats out of 1795 beats, and only 3 of them occur in the first 5 minutes. If
we train the ANN-CAC using their own abnormal beats, the samples are far from enough,
while borrowing from other records makes part of the N-type beats be treated as other
types due to the ECG identity.
 
Table 5-1-3. Statistics of heartbeat number in each record and the proposed CGS 
Record  Entire Record (N S V F)  1% of Total  Total  First 5 Minutes (N S V F)  CGS
232 398 1382 0 0 18 1780 120 342 0 0 (N, S)
209 2621 383 1 0 30 3005 452 10 0 0 (N, S)
222 2274 209 0 0 25 2483 462 0 0 0 (N, S)
118 2166 96 16 0 23 2278 441 18 3 0 (N, S)
220 1954 94 0 0 20 2048 458 4 0 0 (N, S)
202 2061 55 19 1 21 2136 454 1 7 0 (N, S)
234 2700 50 3 0 28 2753 462 0 0 0 (N, S)
100 2239 33 1 0 23 2273 457 5 0 0 (N, S)
233 2230 7 831 11 31 3079 327 3 128 4 (N, V)
106 1507 0 520 0 20 2027 402 0 60 0 (N, V)
203 2529 2 444 1 30 2980 398 2 62 0 (N, V)
119 1543 0 444 0 20 1987 355 0 107 0 (N, V)
221 2031 0 396 0 24 2427 373 0 89 0 (N, V)
228 1688 3 362 0 21 2053 369 0 93 0 (N, V)
214 2003 0 256 1 23 2262 404 0 56 0 (N, V)
210 2423 22 195 10 27 2650 428 2 31 1 (N, V)
215 3195 3 164 1 34 3363 430 0 32 0 (N, V)
116 2302 1 109 0 24 2412 450 0 12 0 (N, V)
205 2571 3 71 11 27 2656 450 1 10 1 (N, V)
219 2082 7 64 1 22 2154 442 4 15 1 (N, V)
114 1820 12 43 4 19 1879 431 1 28 2 (N, V)
105 2526 0 41 0 26 2572 448 0 14 0 (N, V)
109 2492 0 38 2 25 2532 454 0 6 2 (N, V)
201 1635 128 198 2 20 1963 459 3 0 0 (N, S, V)
207 1543 107 210 0 19 1860 361 0 101 0 (N, S, V)
200 1743 30 826 2 26 2601 326 2 134 0 (N, S, V)
223 2045 73 473 14 26 2605 430 7 19 6 (N, S, V)
124 1536 31 47 5 16 1619 410 30 20 2 (N, S, V)
208 1586 2 992 373 30 2955 250 0 147 65 (N, V, F)
213 2641 28 220 362 33 3251 360 1 17 84 (N, V, F)
113 1789 6 0 0 18 1795 459 3 0 0 (N)
108 1740 4 17 2 18 1763 458 0 4 0 (N)
101 1860 3 0 0 19 1865 460 1 0 0 (N)
103 2082 2 0 0 21 2084 462 0 0 0 (N)
112 2537 2 0 0 25 2539 462 0 0 0 (N)
231 1568 1 2 0 16 1571 459 1 2 0 (N)
121 1861 1 1 0 19 1863 462 0 0 0 (N)
117 1534 1 0 0 15 1535 462 0 0 0 (N)
123 1515 0 3 0 15 1518 461 0 1 0 (N)
111 2123 0 1 0 21 2124 462 0 0 0 (N)
230 2255 0 1 0 23 2256 462 0 0 0 (N)
115 1953 0 0 0 20 1953 462 0 0 0 (N)
122 2476 0 0 0 25 2476 462 0 0 0 (N)
212 2748 0 0 0 27 2748 462 0 0 0 (N)
 
In the proposed CGS, the first four groups are considered for classification.
Each group has its own training data and classification task. For example, the type N,
V, and F samples from an individual record form the training ensemble and evaluation
set for the records in (N, V, F), which avoids interference from other records. Since
the 1% criterion is set, the considered records in each group contain at least several
tens of the corresponding beats for training. The training set comes entirely from the
patient-specific record. The same principle is applied to the other groups.
5.2 Proposed CTDA ANN-CAC 
Continuous-in-time and discrete-in-amplitude (CTDA) is a type of signal that features
both analog and digital signal properties. It is a non-uniform signal processing
concept that was first reported in 1966 but was not implemented until 1996 [151]-[153].
CTDA produces samples according to changes in the signal amplitude instead of sampling
evenly in time. It is free from aliasing compared with the Nyquist sampling scheme. With
a slowly changing amplitude, few or even no samples are generated by the CTDA scheme.
The inherent adaptive sampling significantly reduces the number of samples in the
recovery of burst signals such as ECG and music. The level-crossing analog-to-digital
converter (LC-ADC) is the circuit implementation of the CTDA signal flow. The LC-ADC
has become an excellent candidate for the design of ultralow power wearable ECG sensors
in recent years.
5.2.1 CTDA vs Nyquist Sampling 
 
Fig. 5-2-1. CTDA vs Nyquist sampling [177]. 
 
The Nyquist sampling scheme acquires samples that are evenly distributed in time
regardless of the amplitude change, which is also the basic operating principle of
general-purpose ADCs. In contrast, the CTDA sampling scheme acquires
samples that are evenly distributed in amplitude, which can be done by a level-crossing
ADC. Fig. 5-2-1 shows the two sampling schemes for an ECG signal. The Nyquist
sampling scheme treats the baseline part and the P-QRS-T segment of the ECG with the
same weight per unit length, i.e., the length of the signal decides the number of
samples. As illustrated in Fig. 5-2-1(A), 23 and 12 samples are acquired for the
baseline signal and the P-QRS-T signal, respectively, during one ECG cycle. The
sampling effort at each point is the same, but the baseline samples contribute little
useful morphology information to the classification of cardiac arrhythmia.
Different from Nyquist sampling, the CTDA scheme samples the ECG signal
based on the change of amplitude, i.e., it obtains samples evenly distributed in
amplitude. The number of samples is therefore sensitive to changes in the amplitude.
Sampling occurs whenever the signal crosses the upper or lower threshold of an amplitude
interval window. For the ECG signal shown in Fig. 5-2-1(B), 26 samples are obtained for
the ECG complex, with no samples at the baseline. Comparing Fig. 5-2-1(A) and
Fig. 5-2-1(B), 2 and 2 samples are for the P peak, 4 and 20 samples are for the QRS
complex, and 5 and 4 samples are for the T peak in the Nyquist and CTDA schemes,
respectively. That is, the sampling effort at the baseline is suppressed while more
samples are assigned to the fast-changing part of the ECG signal in CTDA. Thus, the
computations for baseline samples are saved in the ANN-CAC.
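The level-crossing principle can be illustrated with a behavioural Python sketch. It is not a model of the LC-ADC circuit: the function name level_crossing_events is hypothetical, the input is assumed to be a densely sampled list standing in for the continuous-time signal, and each event carries only a sample index and a direction.

```python
def level_crossing_events(signal, delta):
    """Return (index, direction) events, one per crossed amplitude level.

    direction is +1 for an upward crossing and -1 for a downward crossing.
    `delta` is the spacing between adjacent amplitude levels.
    """
    events = []
    level = signal[0]                     # last level that was crossed
    for i, x in enumerate(signal[1:], start=1):
        while x >= level + delta:         # moved up by at least one level
            level += delta
            events.append((i, +1))
        while x <= level - delta:         # moved down by at least one level
            level -= delta
            events.append((i, -1))
    return events

# Flat baseline, one triangular "beat", flat baseline (toy data).
toy_signal = [0.0] * 10 + [0.5, 1.0, 0.5, 0.0] + [0.0] * 10
toy_events = level_crossing_events(toy_signal, delta=0.25)
```

Uniform sampling of this toy signal yields 24 samples, most of them on the flat baseline, whereas the level-crossing sketch produces events only while the amplitude changes.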
Consider a person with a 75bpm heart rate, a 160ms P-QRS-T segment, a
1mV R peak, and S and T amplitudes at 1/5 of R, whose ECG signal is monitored with an
AFE with a gain of 60dB. The numbers of samples generated in one ECG cycle by a 250Hz
8-bit Nyquist ADC and by an LC-ADC with a 20mV level window are 200 and 140,
respectively, i.e., 1600 bits for Nyquist and 140 bits for CTDA. Moreover, just 20% of
the 1600 bits of Nyquist data are for the useful P-QRS-T segment, while almost all of
the 140 bits of CTDA data are for the rapidly changing P-QRS-T segment. The data volume
is thus reduced by about 90% using the LC-ADC.
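The numbers in this example can be reproduced with the following Python sketch. The count of 140 level crossings is obtained under the assumption that each of the R, S and T deflections crosses every 20mV level once while rising and once while falling, and that each crossing is encoded with one direction bit; this reading and the function name data_volume_example are assumptions made for illustration.

```python
def data_volume_example():
    heart_rate_bpm = 75
    cycle_ms = 60_000 // heart_rate_bpm        # 800 ms per ECG cycle
    nyquist_samples = 250 * cycle_ms // 1000   # 250 Hz sampling
    nyquist_bits = nyquist_samples * 8         # 8-bit ADC
    useful_fraction = 160 / cycle_ms           # P-QRS-T share of the cycle

    gain = 1000                                # 60 dB voltage gain
    r_mv = 1 * gain                            # 1 mV R peak after the AFE
    st_mv = r_mv // 5                          # S and T at 1/5 of R
    window_mv = 20                             # LC-ADC level window
    # Assumed: one rising and one falling crossing per level for R, S and T.
    lc_events = 2 * (r_mv // window_mv + 2 * (st_mv // window_mv))
    lc_bits = lc_events * 1                    # one direction bit per event
    reduction = 1 - lc_bits / nyquist_bits
    return nyquist_samples, nyquist_bits, useful_fraction, lc_events, reduction
```

Under these assumptions the reduction in data volume evaluates to 91.25%, consistent with the figure of about 90% quoted in the text.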
5.2.2 Processing CTDA Signal for ANN-CAC  
As shown in Fig. 5-2-2, the state-of-the-art ultralow power wearable ECG
sensor with CTDA signal flow is composed of a power management unit (PMU), an analog
frontend (AFE), an LC-ADC, an event-driven QRS detector, an encoder, and a transmitter
with an on-chip antenna [66]. The instrumentation amplifier (IA), which commonly has a
fixed gain and low noise, buffers the vulnerable on-body ECG signal (the RA and LA leads
are used for illustration) and passes its output to the following programmable gain
amplifier (PGA), which amplifies the weak ECG to a proper amplitude. The acquired analog
signal is then converted into pulses with direction and time labels in the LC-ADC. Based
on the labeled pulses from the LC-ADC, the event-driven QRS detector accurately detects
the position of the R peak and extracts the two consecutive RR intervals. Raw samples
from the output of the LC-ADC are encoded into the format preferred by the transmitter
and then wirelessly transmitted to the gateway via the on-chip power amplifier and
antenna. The ECG visualization terminal, such as a smartphone, tablet or laptop,
captures the radio signal and interacts with the user.
The proposed ANN-CAC requires only the samples from an LC-ADC and the
two adjacent RR intervals extracted by the event-driven QRS detector, rather than
complex feature extraction. The pulses with positive and negative directions from the
LC-ADC are mapped to 1s and 0s at the input of the ANN-CAC, respectively, as shown
in Fig. 5-2-3, taking the R peak as an example. With a stream of 1s and 0s and the RR
interval information, the presented ANN-CAC with CTDA signal flow can be seamlessly
integrated into an ultralow power ECG sensor.
 
Fig. 5-2-2. Topology of CTDA ECG sensor. 
 
 
Fig. 5-2-3. Mapping of LC-ADC output and adjacent RR intervals to ANN-CAC input. 
5.2.3 Topology of Proposed CTDA ANN-CAC 
Benefiting from the CTDA LC-ADC sampling scheme, the input of the ANN is a vector
of pure 1s and 0s, of which 22 bits are from the two RR intervals and 74 bits are the
mapped series from the output of the LC-ADC that represents the morphology, as
illustrated in Fig. 5-2-3. As shown in Fig. 5-2-4(A), the network size is 32×16×5. The
32 neurons in the first layer need no multiplications because each input can only be 1
or 0. Thus, the arithmetic operations before the activation can be expressed as:
$z_i = b_i + \sum_{j} w_{i,j}\,x_j, \quad x_j \in \{0, 1\}$,     (5-2-1)
where $w_{i,j}$ is the weight for the jth input to the ith neuron in the first layer,
$x_j$ represents the jth input sample, and $b_i$ and $z_i$ are respectively the bias and
the activation input of the ith neuron in the first layer.
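Because every input is either 1 or 0, the sum in (5-2-1) reduces to adding the weights of the inputs that are 1. A minimal Python sketch of this conditional accumulation is given here; the function names are hypothetical and floating-point numbers stand in for the fixed-point words of the hardware.

```python
def first_layer_preactivation(weights, bias, bits):
    """Activation input of one first-layer neuron for a binary input vector.

    Only weights whose input bit is 1 are accumulated, so no
    multiplication is required.
    """
    acc = bias
    for w, x in zip(weights, bits):
        if x:            # input bit is 1: add the weight
            acc += w     # input bit is 0: accumulator is left untouched
    return acc

def multiply_accumulate(weights, bias, bits):
    """Reference form with explicit multiplications, for comparison."""
    return bias + sum(w * x for w, x in zip(weights, bits))

toy_weights = [0.5, -0.25, 1.0]
toy_bits = [1, 0, 1]
toy_z = first_layer_preactivation(toy_weights, 0.125, toy_bits)
```

Both forms give the same result; the conditional form simply skips the work for zero inputs.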
The 16-bit weights and biases are in fixed-point format with a 1-bit sign and a 1-bit
integer part. Since the inputs and target outputs are 1s and 0s, the absolute values of
all the weights and biases after training are less than or close to 2, so a 1-bit
integer part is enough for this scenario. The minimum number of integer bits leaves
maximum room for the fractional part of the weights and biases. Although each input is
either 1 or 0, the accumulation in the first layer is over the weights and biases of the
neurons, i.e., the activations of the first layer are fractional numbers. Thus, the
operations ahead of the activation of the neurons in the second and third layers should
be in full word length:
$z_i = b_i + \sum_{j}(w_{i,j} \times a_j)$,      (5-2-2)
in which $a_j$ is the activation of the jth neuron in the previous layer, $w_{i,j}$
stands for the weight from the jth neuron in the previous layer to the ith neuron in the
current layer, and $b_i$ and $z_i$ respectively represent the corresponding neuron bias
and the input of the current neuron activation.
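The number format and the operations of the second and third layers can be sketched in Python as follows. The sketch assumes a two's-complement word with 14 fractional bits (16 bits minus the sign and integer bits) and saturation on overflow; the function names are hypothetical, the layer arithmetic is shown in floating point for readability, and the final one-hot conversion mirrors the max-out step of the implementation.

```python
FRAC_BITS = 14  # 16 bits = 1 sign bit + 1 integer bit + 14 fractional bits

def to_fixed(value):
    """Quantize a real number to a saturated 16-bit two's-complement word."""
    raw = round(value * (1 << FRAC_BITS))
    return max(-(1 << 15), min((1 << 15) - 1, raw))

def from_fixed(raw):
    """Convert a 16-bit word back to a real number."""
    return raw / (1 << FRAC_BITS)

def dense_relu(activations, weights, biases):
    """z_i = b_i + sum_j(w_ij * a_j), followed by ReLU, for one layer."""
    outputs = []
    for row, b in zip(weights, biases):
        z = b + sum(w * a for w, a in zip(row, activations))
        outputs.append(z if z >= 0 else 0.0)
    return outputs

def max_out(activations):
    """Set the position of the largest activation to 1 and the rest to 0."""
    best = max(range(len(activations)), key=activations.__getitem__)
    return [1 if i == best else 0 for i in range(len(activations))]
```

With this format the representable range is from -2 to just below 2, in steps of 2^-14, which matches the observation that the trained weights and biases stay within about ±2.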
 
Fig. 5-2-4. Topology of the proposed patient specific CTDA ANN-CAC. 
 
There are two main types of activation functions used in ANNs. One is the totally
non-linear functions, such as Sigmoid, TanH and SoftPlus. The other is the partially
linear functions, e.g., rectified linear unit (ReLU), bipolar ReLU and leaky ReLU. Each
type of activation has its own advantages and disadvantages that affect the training
process. For example, the Sigmoid suffers from the vanishing gradient issue, may
saturate the gradient, and leads to slow convergence in training. ReLU handles the
vanishing gradient well, but it may lead to dead neurons in training. However, these
disadvantages are not limitations in the implementation of the ANN once the issues are
addressed in the training stage, where the final fixed weights and biases are obtained.
For power-constrained wearable applications, the main consideration in the
implementation should be the simplicity of the function. The sigmoid function, as
defined in (5-2-3), contains division and exponential operations. To simplify the
calculations, the exponential operation can be approximated by (5-2-4) and (5-2-5)
[142]. In (5-2-5), floor(x) is the maximum integer not greater than x and delt is the
fractional part of x. Nevertheless, the sigmoid activation needs at least one FSM and
tens of register shifts for the division, plus several shift operations, one
multiplication, and one addition for the exponential approximation, which introduces a
marginal error. Different from the sigmoid and any other totally non-linear function,
the ReLU as defined in (5-2-6) requires just a MUX. Hence, ReLU is selected in the
implementation.
$\mathrm{Sigmoid}(z_i) = 1/(1 + e^{-z_i})$,      (5-2-3)
$e^{-z_i} = 2^{x}, \quad x = -z_i \log_2 e$,      (5-2-4)
$2^{x} = 2^{\mathrm{floor}(x) + delt} \approx 2^{\mathrm{floor}(x)}(1 + delt)$,     (5-2-5)
$\mathrm{ReLU}(z_i) = \begin{cases} z_i, & z_i \ge 0 \\ 0, & z_i < 0 \end{cases}$.    (5-2-6)
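To make the cost comparison concrete, the two activations can be sketched in Python. The exponential approximation shown here is a commonly used base-2 shift-and-add form, assumed to correspond to the approximation cited from [142]: the exponent is converted to base 2, the integer part becomes a shift, and the fractional part is approximated linearly. The function names are hypothetical.

```python
import math

LOG2_E = 1.4426950408889634  # log2(e), the single constant multiplication

def exp_approx(x):
    """Approximate e**x as 2**floor(y) * (1 + delt), with y = x * log2(e)."""
    y = x * LOG2_E
    k = math.floor(y)               # integer part: realized as a shift
    delt = y - k                    # fractional part, 0 <= delt < 1
    return math.ldexp(1.0 + delt, k)

def sigmoid_approx(z):
    """Sigmoid built on the approximation; still needs a division."""
    return 1.0 / (1.0 + exp_approx(-z))

def relu(z):
    """ReLU: a 2-to-1 selection driven by the sign of z."""
    return z if z >= 0 else 0.0
```

In this form the approximation error of the exponential stays within roughly 6%, while ReLU needs neither the multiplication nor the division, which is the reason given above for selecting it.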
The operating principles of the implementation are shown in Fig. 5-2-4(C). Once
the QRS detector reports an R peak, the Top FSM is activated to classify the
heartbeat centered at the previous R peak. The Top FSM then sends the "start" instruction
to the FSM Conditional Accumulating (FSM-CA), which reads the 96-bit input stored in the
first address sector of the SRAM and controls the MUX to pass either the current weight
or the last weight to the input of the adder based on the current input value. If the
input is 0, the adder's inputs and output keep their last state; otherwise, the adder
input is updated with the current weight. In this way, no dynamic power is wasted on 0
inputs. After 96 accumulations and the bias addition, the activation MUX controlled by
the FSM-CA passes the final accumulation to the SRAM if its sign bit is 0. Otherwise, it
sends 0 to the SRAM. One such cycle executes (5-2-1) and (5-2-6) for one neuron of the
first layer. When 32 cycles are done, the FSM-CA handshakes with the Top FSM and goes to
sleep. After the handshake, the Top FSM activates the FSM Normal Neuron Function
(FSM-NN). The accumulation and activation MUX in the second and third layers are the
same as in the first layer. Thus, a single adder and activation MUX are reused in all
three layers for all neurons. Different from the operation in the first layer, the
conditional MUX at the input of the adder is replaced by a multiplier to realize the
operations in (5-2-2) and (5-2-6). Similarly, a single multiplier is reused for the
second and third layers. After 21 multiplication-accumulation cycles (16 for layer 2 and
5 for layer 3), the FSM-NN informs the Top FSM to activate the final FSM Max Out
(FSM-MO), which finds the position of the maximum value among the final 5 activations of
the network and sets the value at that position to 1 and the others to 0, i.e., it
produces the redefined heartbeat type label in Table 5-1-2. Once finished, the Top FSM
sets the ANN_ready signal to 0 to indicate that the value in the 5-bit output register is
ready for reading. Then, the classifier switches to idle mode to save power. Reusing a
single multiplier and adder avoids unnecessary leakage power. Thanks to the simplified
network operations, the classification can be finished in a short time with a single
multiplier and adder. The reserved redundancy sector of the SRAM is for testing
purposes, and the entire SRAM is programmable through the on-chip SPI slave module.
It should be noted that accumulating consecutive positive or negative values
risks overflow. Thus, 3 more bits are given to the integer part in the adder, i.e., a
1-bit sign, 7-bit integer and 16-bit fraction is taken as the data format of the adder.
To match the multiplier, a 3-bit extension is added to the most significant bit (MSB) of
the multiplication result as shown in Fig. 5-2-4(B). The activation is taken from the
successive 2-bit integer and 14-bit fractional part of the accumulation result.
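The data formats above can be illustrated with a short Python sketch. It assumes two's-complement codes, that the 16-bit activation is the slice [17:2] of the 24-bit accumulator (2 integer and 14 fractional bits), and that the two dropped LSBs are truncated; the exact slice and any saturation logic are not stated in the text, so these details are assumptions.

```python
ACC_BITS = 24        # 1 sign + 7 integer + 16 fraction
ACC_FRAC_BITS = 16
ACT_FRAC_BITS = 14   # activation: 2 integer + 14 fraction


def to_acc(value):
    """Quantise a real number to the accumulator's two's-complement code."""
    code = int(round(value * (1 << ACC_FRAC_BITS)))
    lo, hi = -(1 << (ACC_BITS - 1)), (1 << (ACC_BITS - 1)) - 1
    if not lo <= code <= hi:
        raise OverflowError("value does not fit the 1.7.16 format")
    return code


def relu_activation(acc_code):
    """ReLU on the sign bit, then keep the assumed slice [17:2]."""
    if acc_code < 0:                   # sign bit set: the MUX outputs 0
        return 0
    return (acc_code >> 2) & 0xFFFF    # truncation, no saturation modelled


def activation_to_float(act_code):
    """Interpret a 16-bit activation code as a real number."""
    return act_code / (1 << ACT_FRAC_BITS)
```

For example, an accumulated value of 1.5 maps to the accumulator code 98304 and to the activation code 24576, which reads back as 1.5.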
5.3 Design Considerations 
Dynamic power dominates the total power consumption when the chip operates at high
speed (or high clock frequency). As the speed drops, the leakage contribution grows
because each classification takes longer, while the dynamic contribution stays almost
the same since the number of switching events is unchanged, i.e., a high clock frequency
is preferred in order to minimize the power. The clock frequency is limited by the
execution speed of the sub-modules, e.g., the SRAM, multiplier, and adder. Thus, the
optimization of the sub-modules is important. The adder is much faster than the
multiplier, and the SRAM is a hard IP that cannot be modified. In this design, the
default 24-bit Sklansky tree adder is used. The three-dimensional optimized (TDM)
multiplier is enhanced to operate at the same speed as the adder. Besides the operating
speed of the adder and multiplier, an efficient schedule that reduces the total number
of clock cycles is necessary. That is, the finite state machines (FSMs) that control the
parameter reading and arithmetic execution should be well designed. Since thousands of
weights need to be stored in the SRAM for processing, a customized SPI protocol with a
48-bit length for each cycle is designed.
5.3.1 SRAM 
The block diagram and operating principles of the SRAM are shown in Fig. 5-3-1. The
1024-word by 32-bit SRAM with single input and output data ports has a 10-bit address
port ADR[9:0], a 32-bit data-in port Din[31:0], a 32-bit data-out port Dout[31:0], a
memory enable port ME, a write enable port WE, a read enable port OE, and a clock port
CLK. The memory is activated only if ME is asserted. As shown in Fig. 5-3-1(B), writing
happens at the second rising edge of CLK, where ME and WE are both high. The value of Din
is written to the memory location at address 0x002. Assuming the content at address
0x008 is 0x00000000, then at the third rising clock edge, where both ME and OE are high,
the content at address 0x008 is read out to Dout.
 
Fig. 5-3-1. Operational principle of SRAM. 
 
The maximum operating frequency of the SRAM is 166MHz, with a typical average
reading power of 35μW/MHz and writing power of 38μW/MHz. Due to the sub-threshold
and junction leakage, the typical average static current is 2.75μA when ME is high. When
ME is low, the SRAM takes zero power. Thus, it is important to deassert ME whenever
there is no memory operation.
5.3.2 SPI 
The SPI is customized to a 48-bit frame length with SRAM control. As shown in Fig.
5-3-2, 32 bits of the master in slave out (MISO) signal, presented at the positive edges
of SCK, carry the SRAM output data Dout. For simplicity of testing, reading and writing
share the address represented by the first 10 bits of master out slave in (MOSI): the
reading address is the represented value, while the writing address is always higher
than the represented value by one, e.g., 10'h001 for the reading address and 10'h002 for
the writing address. In this way, each SPI cycle can both read the value written to the
SRAM during the last SPI cycle and write the value represented by the MOSI (12th to 43rd
bit) to the SRAM via the SPI slave. The last bit of MOSI controls the execution of the
classifier, e.g., the value in Fig. 5-3-2 is 1, which triggers the execution of the
classifier.
 
Fig. 5-3-2. Timing of customized SPI. 
 
During the first 11 cycles, the address is fetched by the on-chip SPI slave module,
and the writing address is calculated at the end of the 44th cycle. Two 1024-word SRAMs
are used, thus the total number of address bits is 11. The 10-bit ADR is assigned to the
corresponding SRAM by the SPI slave according to the first bit. At the beginning of the
12th cycle (12th negative edge of SCK), memory enable (ME) and read enable (OE) are
activated by the SPI slave, i.e., Dout is ready slightly after the end of the 12th cycle
(12th positive edge of SCK). Then, the 32-bit Dout is presented sequentially on MISO at
the positive edges of SCK for reading by the off-chip SPI master. The data to be written,
i.e., Din, follows the address on MOSI and is fetched by the SPI slave at each negative
edge. At the beginning of the 45th cycle, all 32 bits of Din are ready, and ME and write
enable (WE) are asserted by the SPI slave to write Din to the SRAM. The start pulse is
generated by the SPI slave based on the last valid bit on MOSI, as indicated by the red
solid circle in Fig. 5-3-2, i.e., "1" generates the pulse that triggers the
classification and "0" does nothing.
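The frame layout described above can be sketched in Python as follows. The sketch assumes that the first transmitted bit is the most significant bit of a 48-bit word, that bit 1 selects the SRAM, bits 2 to 11 carry the 10-bit address, bits 12 to 43 carry Din, bits 44 to 47 are unused and left at 0, and bit 48 is the start flag; the unused bits, the bit order and the address wrap-around are assumptions, and the function names are illustrative.

```python
def pack_mosi_frame(sram_select, address, data, start):
    """Build the 48-bit MOSI word; frame bit 47 is transmitted first."""
    if sram_select not in (0, 1) or start not in (0, 1):
        raise ValueError("sram_select and start must be 0 or 1")
    if not 0 <= address < 1024:
        raise ValueError("address must fit in 10 bits")
    if not 0 <= data < (1 << 32):
        raise ValueError("data must fit in 32 bits")
    frame = sram_select << 47      # bit 1: selects one of the two SRAMs
    frame |= address << 37         # bits 2-11: 10-bit read address
    frame |= data << 5             # bits 12-43: 32-bit Din
    frame |= start                 # bit 48: classifier start flag
    return frame                   # bits 44-47 stay 0 (assumption)


def read_write_addresses(address):
    """Read uses the transmitted address, write uses address + 1."""
    return address, (address + 1) & 0x3FF   # wrap at 1023 is an assumption
```

With this layout, programming consecutive words only requires incrementing the transmitted address by one per frame, and each frame returns the word written by the previous one.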
5.3.3 Multiplier 
The operation of an ANN is based on thousands of multiplications and accumulations.
For wearable and mobile applications, a digital multiplier array is not preferred due to
the limits on power and area. Thus, the optimization of the multiplier is critical to
the performance of an ANN implemented for wearable devices. The optimization mainly
focuses on two aspects: reducing the number of partial products (PP) [178], [179], and
shortening the critical path in the PP addition [180], [181]. In this section, the
customized TDM multiplier with a three-stage pipeline is presented [182].
To reduce the number of PPs, two encoding schemes are widely used. One is the
canonical signed digit (CSD) encoding, which transforms a zero followed by ones into a
one followed by zeros. In CSD encoding, the probability of a digit being zero is about
16% higher than in the two's complement representation. Thus, the number of partial
products can be reduced by 16%. However, CSD encoding suits only constant values, since
the encoding does not follow a uniform rule, i.e., each number must be manually encoded.
In the proposed ANN-CAC, each neuron has its own weights, thus CSD is not a good choice.
For a general-purpose multiplier, modified Booth encoding (MBE) is the best choice.
The MBE reduces the number of PP from LMR (the length of multiplier) to 
(LMR+2)/2 or (LMR +1)/2 by encoding the radix-2 operation into radix-4 [179], which can 
be summarized as 
             ,       (5-3-1) 
        ̅̅ ̅̅ ̅̅ ̅                     ̅̅ ̅̅       ,   (5-3-2) 
        ,        (5-3-3) 
    *   *  +    
   +,      (5-3-4) 
    * 
       *  +  +,      (5-3-5) 
    (       ) ((     )*  +),     (5-3-6) 
where X and Y are multiplier and multiplicand, respectively; LMD is the length of 
multiplicand;    *  + stands for concatenation of LMD copies of   ;    means the i
th
 bit 
of X;    is single Y indicator to generate    ;    is double Y indicator to generate    ; the 
i
th
 single and double Y,     and    , and the i
th
 sign extension    will be used to generate 
the i
th
 partial production     by 
    * 
      ̅̅̅̅   *  +    +          ;    (5-3-7) 
    * 
      ̅      
        +           .   (5-3-8) 
 
Fig. 5-3-3. Generated PP for 16-bit unsigned multiplicand and signed multiplier. 
 
In this design, the ReLU activation function is selected, i.e., all the
multiplications in layer 2 and layer 3 have one non-negative input, since the output of
ReLU is non-negative according to (5-2-6). Thus, an MBE scheme with an unsigned
multiplicand and a signed multiplier is designed to generate the PPs. Taking the 16-bit
length as an example, the details of the encoding are illustrated in Fig. 5-3-3. First,
the multiplier is extended by adding one 0 below the least significant bit and one sign
bit (if LMR is odd) or two sign bits (if LMR is even) above the most significant bit
(MSB). The number of PPs, NPP, is determined by
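$N_{PP} = \begin{cases} (L_{MR}+1)/2, & L_{MR}\ \text{is odd} \\ (L_{MR}+2)/2, & L_{MR}\ \text{is even} \end{cases}$.     (5-3-9)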
By moving the sign extension of the multiplicand into the encoding, the
multiplicand can be stored without its sign bit and thus gains one more bit of
resolution than the original one. Based on (5-3-1) to (5-3-8), the radix-4 Booth encoder
is shown in Fig. 5-3-4 [180].
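As a behavioural illustration of the radix-4 recoding, the Python sketch below uses the textbook Booth digit rule d_i = -2·x(2i+1) + x(2i) + x(2i-1) together with the extension rule stated above (one appended 0, and one or two sign bits). It is not a gate-level model of the encoder in Fig. 5-3-4, and the function names are illustrative.

```python
def booth_radix4_digits(x, lmr):
    """Recode a signed lmr-bit multiplier x into radix-4 digits in {-2..2}."""
    if not -(1 << (lmr - 1)) <= x < (1 << (lmr - 1)):
        raise ValueError("x does not fit in lmr bits")
    # Number of partial products after the extension: odd / even length.
    npp = (lmr + 1) // 2 if lmr % 2 else (lmr + 2) // 2

    def bit(k):
        # Index -1 is the appended 0. Python integers act as sign-extended
        # two's complement, so indices above lmr-1 return the sign bit.
        return 0 if k < 0 else (x >> k) & 1

    return [-2 * bit(2 * i + 1) + bit(2 * i) + bit(2 * i - 1)
            for i in range(npp)]


def booth_multiply(x, y, lmr):
    """Signed multiplier x times unsigned multiplicand y via Booth PPs."""
    digits = booth_radix4_digits(x, lmr)
    # Each digit selects 0, +/-Y or +/-2Y, shifted by two bits per digit.
    return sum((d * y) << (2 * i) for i, d in enumerate(digits))
```

For a 16-bit multiplier the sketch produces 9 digits, i.e., 9 partial products instead of 16, and the summed partial products equal the ordinary product.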
 
Fig. 5-3-4. Radix-4 booth encoding diagram. 
 
To shorten the PP addition, a TDM with a rearranged half-adder position is proposed
to further shorten the critical path [181], [183]. The TDM takes all the PPs as input
and sorts them in a proper order so that the early-ready signals are connected to the
slow paths of the half-adders (HA) or full-adders (FA), which shortens the critical path
as shown in Fig. 5-3-5. Assuming that the delay from inputs a and b to the outputs is 2
and the delay from the carry input, cin, to the outputs is 1, the sum output, s, and the
carry output, co, are ready at time 5. By swapping the connections of a and cin, the
ready time can be shortened to 4, as shown on the right side of Fig. 5-3-5.
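The effect of the input ordering can be reproduced with a small Python sketch. The pin delays are those quoted above; the signal arrival times (3, 2 and 2) are hypothetical values chosen so that the result matches the 5-to-4 improvement, since the actual arrival times are only given in the figure.

```python
def full_adder_ready(arrival, delay):
    """Time at which the FA outputs settle: latest (arrival + pin delay)."""
    return max(arrival[pin] + delay[pin] for pin in delay)


# Pin-to-output delays quoted in the text: a, b -> 2; cin -> 1.
PIN_DELAY = {"a": 2, "b": 2, "cin": 1}

# Assumed arrival times of the three signals (hypothetical).
signals = {"p": 3, "q": 2, "r": 2}

# Late signal p on the slow pin a.
naive = full_adder_ready(
    {"a": signals["p"], "b": signals["q"], "cin": signals["r"]}, PIN_DELAY)
# Late signal p moved to the fast pin cin.
swapped = full_adder_ready(
    {"a": signals["r"], "b": signals["q"], "cin": signals["p"]}, PIN_DELAY)


def best_assignment(times, delay):
    """Greedy ordering: pair the latest signal with the fastest pin."""
    pins = sorted(delay, key=delay.get)        # fastest pin first
    late_first = sorted(times, reverse=True)   # latest signal first
    return full_adder_ready(dict(zip(pins, late_first)), delay)
```

Here `naive` evaluates to 5 and `swapped` to 4; `best_assignment` applies the same idea automatically by giving the latest-arriving signal the fastest pin.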
 
Fig. 5-3-5. Delay reduction with rearrange of input order. 
 
 
Fig. 5-3-6. Rearrange the position of HA in TDM multiplier. 
 
As shown in Fig. 5-3-6, the TDM divides the PPs into vertical compressor slices
(VCS), and then arranges the input connection order of the HAs and FAs. Finally, the
outputs of the VCSs are added in the output carry propagate adder (CPA). Fig. 5-3-6
shows the details of VCS16. In the original TDM, the HA is placed at the beginning to
minimize the vertical delay. As the bit width increases, the horizontal delay dominates
the critical path. In this design, the HA is placed at the end as indicated by the red
arrow. According to the simulation, this is 0.8ns faster than the original position.
 
Fig. 5-3-7. Structure and pipeline data flow of the customized multiplier. 
 
Although both MBE and TDM effectively reduce the length of the critical path
of the multiplier, the multiplier is still slower than the adder. One way to accelerate
the multiplication is parallel implementation, but it increases the silicon area and
leakage power. Another way is to pipeline the multiplication at the price of additional
registers, a negligible system delay and extra design complexity. Thus, the pipeline is
adopted in this design as shown in Fig. 5-3-7.
In Fig. 5-3-7(a), the TDM is divided into three modules, one per stage, with
almost the same propagation delay, Td, by examining the delay information calculated
from the port connections and the HA and FA delays. After the division, the outputs of
the first-stage module are sorted into three groups: {s12, c12} (sums and carries to be
sent to the second module), {s13, c13} (sums and carries to be sent to the third
module), and o1, which goes directly to output o12. The outputs of the second module are
sorted in the same way. Assuming the registers pass data at the positive edge of CLK,
the operating principle of the three-stage pipeline is illustrated in Fig. 5-3-7(b). At
the first positive edge of CLK, the first valid pp_1 is passed to register array pp1. At
the second edge, the first valid output group s12_1, c12_1, c13_1, s13_1 and o1_1 from
the output ports s12, c12, c13, s13 and o1 of module one is passed to the register
arrays reg1, reg2, reg7, reg5 and o11, respectively. Meanwhile, pp_1 is passed to pp2,
and pp_2 is ready to be passed to pp1. At the next edge, pp_1, pp_2, pp_3, s12_2, c12_2,
s13_1, s13_2, c13_1, c13_2, o1_1 and o1_2 propagate to the register arrays pp3, pp2,
pp1, reg1, reg2, reg6, reg5, reg8, reg7, o12 and o11. Before this edge, the valid values
at s23 and c23 are produced by the second module with pp_1, s12_1 and c12_1 as inputs,
i.e., s23_1 and c23_1 are generated before the edge and propagate to reg3 and reg4 at
the edge. After the edge, the inputs of the third module are s13_1, c13_1, s23_1 and
c23_1. Thus, o3_1 can be generated ahead of the fourth edge. Meanwhile, o2_1 and o1_1
are ready at register arrays o21 and o12. That is, the first result is generated at the
fourth positive CLK edge by passing the contents of register arrays o12 and o21 and the
wires o3 to the output register. From the above analysis, the throughput for continuous
multiplications is 3 times that of the TDM without pipeline. The system delay introduced
by the pipeline is 3 clock cycles.
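The throughput gain and the fill latency can be checked with a few lines of Python. The sketch assumes that the pipelined multiplier is clocked with period Td, that the unpipelined multiplier needs 3·Td per multiplication, and that register overhead is ignored; the function names are illustrative.

```python
def unpipelined_time(n_mult, td):
    """Each multiplication occupies the full 3*Td path before the next starts."""
    return n_mult * 3 * td


def pipelined_time(n_mult, td, stages=3):
    """One result per Td once the pipeline is full, plus the fill latency."""
    return (n_mult + stages) * td


def speedup(n_mult, stages=3):
    """Ratio of unpipelined to pipelined time for n_mult multiplications."""
    return unpipelined_time(n_mult, 1.0) / pipelined_time(n_mult, 1.0, stages)
```

Under these assumptions the speedup is 3N/(N+3), which approaches 3 only for long runs of continuous multiplications; for a single isolated multiplication the pipeline is slower because of the 3-cycle latency.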
The simulation result shows that the designed three-stage pipelined TDM multiplier
can operate at over 100MHz and takes 0.050mm² of area in a 0.18μm process.
5.3.4 Top FSM 
The Top FSM controls the execution of each sub-module. The transition diagram is
presented in Fig. 5-3-8. After reset, the Top FSM stays in the idle state until the
start pulse from the SPI is encountered. The start pulse moves the FSM to the state
start1. In start1, ANN_ready is set to 1 to indicate that the ANN is executing;
meanwhile, start is set to 1 to trigger the FSM for conditional accumulation (FSM-CA) of
the first layer. The state then transfers unconditionally from start1 to stop1. In
stop1, if ready1=1 (ready1 is generated by the FSM-CA when the execution of the first
layer is finished), the FSM moves to start2 to trigger the FSM for normal neuron
(FSM-NN) of the second layer; otherwise, it stays in stop1. On receiving ready2=1 from
the FSM-NN of the second layer in stop2, the Top FSM knows that the second-layer
calculations are done and moves to start3 to trigger the FSM-NN for the third layer.
Similarly, the FSM for final max out (FSM-MO) is started on receiving ready3=1 in stop3.
Once ready4 is received from the FSM-MO, the FSM moves to ready and ANN_ready is set
to 0 to tell the external module that the ANN-CAC is in the idle state.
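The transition sequence described above can be summarised as a behavioural Python sketch. The state names start4 and stop4 for the max-out stage and the return from ready to idle are assumptions, since the text does not name them; the authoritative diagram is Fig. 5-3-8.

```python
# "start4"/"stop4" are assumed names for the max-out stage.
NEXT_ON_READY = {"stop1": "start2", "stop2": "start3",
                 "stop3": "start4", "stop4": "ready"}
READY_SIGNAL = {"stop1": "ready1", "stop2": "ready2",
                "stop3": "ready3", "stop4": "ready4"}


def top_fsm_step(state, inputs):
    """Return (next_state, ann_ready) for one clock cycle.

    inputs is a dict of 1-bit signals, e.g. {"start": 1} or {"ready1": 1}.
    """
    if state == "idle":
        nxt = "start1" if inputs.get("start") else "idle"
    elif state.startswith("start"):
        nxt = "stop" + state[-1]            # unconditional transition
    elif state in NEXT_ON_READY:
        done = inputs.get(READY_SIGNAL[state])
        nxt = NEXT_ON_READY[state] if done else state
    else:                                   # "ready" -> "idle" (assumed)
        nxt = "idle"
    # ANN_ready is 1 while the ANN is executing and 0 otherwise.
    ann_ready = 0 if nxt in ("idle", "ready") else 1
    return nxt, ann_ready
```

Each stopN state simply waits for the readyN handshake of the sub-FSM it started, so only one sub-FSM is active at any time.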
 
Fig. 5-3-8. Transition diagram for top FSM. 
5.3.5 FSM for Conditional Accumulation (FSM-CA) 
The FSM-CA serves the 32 neurons of the first layer, which need no multiplications.
As discussed in Section 5.2.3, the 96-bit input is stored in the first three addresses
of the 32-bit SRAM, and the 16-bit weights and biases are programmed sequentially as
shown on the right side of Fig. 5-3-9. Since the SRAM is 32 bits wide, each SRAM address
stores two weights for better utilization of the memory space. Because the number of
biases is much smaller than the number of weights (in this design, there are 3664
weights but just 53 biases), each bias is stored alone in one SRAM address to simplify
the control logic. Thus, each SRAM read requires at most two weight accumulations.
Accordingly, the ANN-CAC input register input[95:0] is circularly shifted by two bits
whenever the SRAM address is increased by 1 after each two-weight read, as illustrated
in Fig. 5-3-9. In this way, the two weights at the current address always correspond to
the two inputs currently in input[95:94]. In other words, the corresponding inputs (the
accumulation condition) are always presented at the fixed register bits input[95:94],
enabling the reuse of the conditioning circuits. According to the value of input[95:94],
the operations can be divided into 4 cases: Case 1) input[95:94]=2'b00, the two inputs
are zeros and neither accumulation nor SRAM reading is needed; Case 2)
input[95:94]=2'b01, the weight in Dout[15:0] (the low 16 bits of the SRAM output) should
be accumulated; Case 3) input[95:94]=2'b10, the other weight in Dout[31:16] (the high 16
bits of the SRAM output) should be accumulated; Case 4) input[95:94]=2'b11, both weights
should be accumulated and two additions are required. To control such a data flow, a
customized FSM-CA is proposed in Fig. 5-3-10.
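The four cases and the two-bit circular shift can be modelled behaviourally in Python. The sketch below treats the weights as plain integers and represents each SRAM word as a (high, low) pair; the function name and the returned read counter are illustrative additions, not part of the design.

```python
def conditional_accumulate(input_bits, packed_weights, bias):
    """First-layer neuron: add the weights whose binary input is 1, then bias.

    input_bits     -- 96-bit integer; bits 95 and 94 are examined first
    packed_weights -- 48 words, each (high_weight, low_weight)
    Returns (accumulated value, number of SRAM weight reads needed).
    """
    if len(packed_weights) != 48:
        raise ValueError("expected 48 packed weight words")
    mask = (1 << 96) - 1
    acc, reads, reg = 0, 0, input_bits & mask
    for high_w, low_w in packed_weights:
        cond = (reg >> 94) & 0b11          # input[95:94]
        if cond:                           # cases 2-4: the word is read
            reads += 1
            if cond & 0b10:                # input[95] = 1 -> Dout[31:16]
                acc += high_w
            if cond & 0b01:                # input[94] = 1 -> Dout[15:0]
                acc += low_w
        # Case 1 (cond == 0): no SRAM read and no addition.
        reg = ((reg << 2) | (reg >> 94)) & mask   # rotate left by 2 bits
    # After 48 rotations the register holds the original input again.
    return acc + bias, reads
```

Because a word is only read when at least one of its two inputs is 1, the number of reads and additions scales with the number of non-zero inputs rather than with the full input length.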
 
Fig. 5-3-9. Data flow of FSM-CA. 
In the FSM-CA, there are three counters. Counter1 records how many inputs have been
fetched. Counter2 indicates how many input shifts have occurred. Counter3 records how
many neurons have been executed. The signal start1=1 from the Top FSM moves the FSM-CA
from state idle to sr_in to prepare for reading the SRAM. Then, during the following
three clock cycles in state r_in, in[95:0] at the first three addresses of the SRAM is
passed to the input register input[95:0]. The 4 chains judge1-> add1, judge1-> add2,
judge1-> add3, and judge1-> add4-> add5 execute the 4 different operation cases based on
input[95:94]. Counter2 is always increased by 1 in state judge1 to record one more input
shift. When counter2 reaches 48, the accumulations for the 96 inputs are done and,
meanwhile, the input has been shifted back to its initial value for the next
accumulation cycle. Then, at the end of the 4 chains, the state goes to rb to accumulate
the bias. Otherwise, the state returns to judge1 for the next input shift and operation.
 
Fig. 5-3-10. Transition diagram for FSM-CA. 
 
In judge1-> add1, there is no SRAM reading or addition, i.e., no dynamic power is
spent on accumulation because both inputs are zero. In judge1-> add2 and judge1-> add3,
one SRAM read and one accumulation are required. In judge1-> add4-> add5, one SRAM read
and two accumulations are needed. In rb, the bias for the current neuron is accumulated;
meanwhile, counter2 is reset to zero for the accumulations of the next neuron, and
counter3 is increased by 1 to indicate that one more neuron is done. In judge2, if
counter3 equals 32, all 32 neurons are done and the FSM-CA goes to ready1 to set the
signal ready1=1, informing the Top FSM that the FSM-CA is done; otherwise, it returns to
judge1 for the conditional accumulation of the next neuron.
5.3.6 FSM for Normal Neural Operation (FSM-NN) 
Layer 2 and layer 3 use the same FSM for the normal neural operation in (5-2-2),
but with different parameters since the numbers of neurons in layer 2 and layer 3 are
different. The FSM-NN involves the three-stage pipelined multiplier, so the pipeline
latency of the multiplication should be considered. Taking layer 2 (16 neurons) as an
example, the transition diagram of the FSM-NN is shown in Fig. 5-3-11.
The counters, i.e., counter1, counter2, counter3, and counter4, control the signal
flow. Each multiplication increases counter1 by 1 until it reaches 4, which means the
multiplication pipeline delay has passed. Counter2 records the number of weights that
have been read. When counter2 reaches 32, the next value from the SRAM is the bias, which
is not passed to the multiplier. Counter3 records how many valid
multiplication-accumulations have been executed. Due to the pipeline latency, counter2
reaches 32 earlier than counter3, i.e., the final multiplication-accumulation comes
after the bias accumulation. Thus, when counter3 reaches 32, all operations for one
neuron are done. Counter4 records how many neurons have been executed. Each address
stores two weights with the same memory layout as in Fig. 5-3-9, i.e., each SRAM read
gives two weights.
 
Fig. 5-3-11. Transition diagram for FSM-NN. 
 
For the execution of the first neuron in layer 2, the FSM first runs the loop
rds->rd1->rd2->mp1->add1->rds 16 times, after which counter2 equals 32 but counter3
equals 28 due to the delay of the multiplier pipeline. As counter2 has reached 32, the
next SRAM read returns the bias, thus the FSM goes through rd1->rd2->mp2->rb2->add3 to
read the bias and add it to the intermediate multiplication-accumulation result. At this
point, counter4 is 1 as this is the first neuron, thus the FSM returns to rds to
continue reading the weights of the next neuron. Due to the pipeline delay, the output
of the multiplier still belongs to the last four weights of the current neuron until
counter3 reaches 32. Then, the FSM takes the branch rb1->add2->rds to write back the
result and reset the parameters.
The other neurons execute the same procedure as the first neuron except the last
one, which loops through mp3->add4->mp3 four times without reading anything but keeps
multiplying and accumulating to get the final result. Then, it goes to rb3 to generate
the handshake pulse and write back the result.
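The counter relationship caused by the pipeline lag can be illustrated with a small Python model. It issues one multiplication per step (the FSM issues two per loop pass, which gives the same counts) and assumes that each product becomes available four issue slots later, matching the limit of counter1; the names are illustrative.

```python
from collections import deque


def mac_with_lag(weights, activations, bias, lag=4):
    """Multiply-accumulate where each product appears `lag` slots late."""
    pipe = deque([None] * lag)     # products still inside the multiplier
    acc = 0
    issued = done = 0              # play the roles of counter2 and counter3
    trace = []
    for w, a in zip(weights, activations):
        pipe.append(w * a)
        issued += 1
        out = pipe.popleft()
        if out is not None:        # a valid product left the pipeline
            acc += out
            done += 1
        trace.append((issued, done))
    acc += bias                    # bias is added before the tail products
    while pipe:                    # drain: accumulate without new reads
        out = pipe.popleft()
        if out is not None:
            acc += out
            done += 1
    return acc, done, trace
```

With 32 weights, the last trace entry is (32, 28): all weights have been read while only 28 products have been accumulated, and the remaining four arrive during the drain phase after the bias has been added.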
5.4 Training of ANN-CAC 
The training is conducted on a desktop with an i7-4790 CPU. Thus, the resolution of
the multiplier and adder is not limited. The quadratic cost, C, is used in each epoch
update, as defined by
$C = \dfrac{1}{2n} \sum \sum (y_i - a_i^L)^2$,      (5-4-1)
where n is the number of samples used in each mini-batch, $y_i$ is the expected value in
the i-th position of the sample label, $a_i^L$ is the activation of the i-th neuron of
the output layer, the first sum is over the n samples and the second sum is over the
output neurons. It evaluates the mean square error between the expected label and the
activation of the output neurons. Hence, the training drives the output toward the
redefined label by minimizing the cost. The value at the corresponding label position
gets closer to 1 while the others get closer to 0; accordingly, the max out module finds
the position of the maximum value in the output activations to obtain the classified
heartbeat label.
A regularization factor, λ, is introduced in the training process to reduce
overfitting [184]. The regularization technique is defined in (5-4-2), which gives the
regularized cost, Cr, by including the weight decay. In (5-4-2), the sum is over the
weights, w. Meanwhile, a learning rate, η, is adopted to guarantee the optimization of
the backpropagation.
$C_r = C + \dfrac{\lambda}{2n} \sum_{w} w^2$      (5-4-2)
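The two costs can be written as a short Python sketch. It assumes the common normalization by 2n for both the quadratic term and the weight-decay term; if the original definitions use a different constant, only the scale factors change. The function names are illustrative.

```python
def quadratic_cost(labels, outputs):
    """C = 1/(2n) * sum over samples and output neurons of (y - a)^2."""
    n = len(labels)
    total = sum((y - a) ** 2
                for ys, acts in zip(labels, outputs)
                for y, a in zip(ys, acts))
    return total / (2 * n)


def regularized_cost(labels, outputs, weights, lam):
    """Cr = C + lam/(2n) * sum of squared weights (L2 weight decay)."""
    n = len(labels)
    decay = lam / (2 * n) * sum(w * w for w in weights)
    return quadratic_cost(labels, outputs) + decay
```

For two samples with labels [1, 0] and [0, 1] and outputs [1, 0] and [0, 0], the quadratic cost is 0.25; adding weights [1, 2] with a factor of 0.5 raises the regularized cost to 0.875.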
5.4.1 Backpropagation 
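Assume that the input of the j-th neuron in layer l has a small change $\Delta z_j^l$,
and that the cost variation caused by $\Delta z_j^l$ is $\Delta C(\Delta z_j^l)$, which
can be expressed through the partial derivative of the cost (or the observed error at
this neuron, $\delta_j^l$) with respect to $z_j^l$ as:
$\Delta C(\Delta z_j^l) = (\partial C / \partial z_j^l)\, \Delta z_j^l = \delta_j^l\, \Delta z_j^l$.     (5-4-3)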
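Applying the chain rule, $\delta_j^l$ can be rewritten as:
$\delta_j^l = \sum_m \left( \dfrac{\partial C}{\partial z_m^{l+1}} \right) \left( \dfrac{\partial z_m^{l+1}}{\partial a_j^l} \right) \left( \dfrac{\partial a_j^l}{\partial z_j^l} \right) = f'(z_j^l) \sum_m \delta_m^{l+1} w_{mj}^{l+1}$.  (5-4-4)
In (5-4-4), the sum is over the neurons in layer l+1, $f'$ is the derivative of the
activation function, and $w_{mj}^{l+1}$ is the weight from the j-th neuron in layer l to
the m-th neuron in layer l+1, i.e., the observed error can be computed layer by layer,
from the last layer to the first. Since the calculation proceeds from the output to the
input, the propagation is called backpropagation [185]. The relation in (5-4-4) holds
only for non-output layers. For the output layer, the error is defined by
$\delta_j^L = f'(z_j^L) (\partial C / \partial a_j^L)$.      (5-4-5)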
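Noting the relation between the neuron input and the weights and bias defined in
(5-2-2), the gradients of the cost with respect to the bias and the weight are
respectively given as
$\partial C / \partial b_j^l = (\partial C / \partial z_j^l)(\partial z_j^l / \partial b_j^l) = \delta_j^l$,    (5-4-6)
$\partial C / \partial w_{jk}^l = (\partial C / \partial z_j^l)(\partial z_j^l / \partial w_{jk}^l) = a_k^{l-1} \delta_j^l$.   (5-4-7)
The updates of the bias, $\Delta b_j^l$, and of the weight, $\Delta w_{jk}^l$, can be
respectively expressed as (5-4-8) and (5-4-9) by applying the learning rate according to
the gradient descent rule.
$\Delta b_j^l = -\eta (\partial C / \partial b_j^l) = -\eta\, \delta_j^l$       (5-4-8)
$\Delta w_{jk}^l = -\eta (\partial C / \partial w_{jk}^l) = -\eta\, a_k^{l-1} \delta_j^l$       (5-4-9)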
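The backpropagation steps can be sketched in plain Python for a small fully connected ReLU network. The sketch uses the single-sample quadratic cost C = 0.5·Σ(y − a)², omits the regularization term, and uses illustrative names; it is a minimal illustration of the relations above rather than the training code used in this work.

```python
def relu(z):
    return z if z > 0 else 0.0


def relu_prime(z):
    return 1.0 if z > 0 else 0.0


def forward(x, layers):
    """layers is a list of (W, b); returns neuron inputs zs and activations."""
    zs, acts = [], [x]
    for W, b in layers:
        z = [sum(w * a for w, a in zip(row, acts[-1])) + bj
             for row, bj in zip(W, b)]
        zs.append(z)
        acts.append([relu(v) for v in z])
    return zs, acts


def backprop(x, y, layers):
    """Gradients (dC/dW, dC/db) per layer for C = 0.5 * sum((y - a)^2)."""
    zs, acts = forward(x, layers)
    # Output layer error: delta = f'(z) * dC/da, with dC/da = a - y.
    delta = [relu_prime(z) * (a - t)
             for z, a, t in zip(zs[-1], acts[-1], y)]
    grads = []
    for l in range(len(layers) - 1, -1, -1):
        grad_b = delta[:]                                   # dC/db_j = delta_j
        grad_W = [[d * a for a in acts[l]] for d in delta]  # a_k * delta_j
        grads.append((grad_W, grad_b))
        if l > 0:
            W = layers[l][0]
            # Non-output layer error: f'(z_j) * sum_m delta_m * w_mj.
            delta = [relu_prime(zs[l - 1][j])
                     * sum(delta[m] * W[m][j] for m in range(len(W)))
                     for j in range(len(zs[l - 1]))]
    return grads[::-1]


def sgd_step(layers, grads, eta):
    """Apply the updates -eta * dC/dw and -eta * dC/db to every parameter."""
    return [([[w - eta * g for w, g in zip(rw, rg)] for rw, rg in zip(W, gW)],
             [b - eta * g for b, g in zip(bs, gb)])
            for (W, bs), (gW, gb) in zip(layers, grads)]
```

A finite-difference check on a few weights is a convenient way to confirm that the analytic gradients agree with the cost function before running a long training session.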
5.4.2 Biased Training 
Each record of the MIT-BIH database contains only a small portion of abnormal
heartbeats (types S, V, and F). Normal mini-batch sampling treats all samples with the
same probability, i.e., the proportion of the different types in the obtained mini-batch
is the same as in the entire training set. Thus, sampling the unbalanced training data
with even probability produces unbalanced mini-batches for training (the evenly sampled
training). Since the N type dominates the unbalanced data, the trained weights and
biases are significantly affected by the N type beats but are barely affected by the
abnormal types. Hence, the biased training (BT) process is proposed to address the
issue. The details of the BT are provided in Table 5-4-1.
Table 5-4-1. Details of the proposed biased training. 
Stochastic gradient descent algorithm with biased sampling update
Required: Learning rate η, regularization factor λ, beat type grouping, mini-batch size m,
total training samples ts, and number of training epochs ne.
Required: Randomly initialize weights $w_{jk}^l$ and biases $b_j^l$ with Gaussian
distribution and normalize them.

     while target training accuracy not reached do:
                Updating times T ← floor(ts/m).
                for 1 to T do:
                        Obtain m samples with same beat type proportion.
                        Calculate inputs and activations of each neuron
                        according to (5-2-2) and (5-2-6) by forward propagation.
                        Get cost based on (5-4-1) and (5-4-2).
                        Derive output observed error $\delta_j^L$ from (5-4-5), and the
                        updating amounts for biases and weights of output layer,
                        $\Delta b_j^l$ and $\Delta w_{jk}^l$, from (5-4-6) to (5-4-9).
                        Backpropagate $\delta_j^l$ for non-output layers based on (5-4-4).
                        Update $b_j^l \leftarrow b_j^l + \Delta b_j^l$ and $w_{jk}^l \leftarrow w_{jk}^l + \Delta w_{jk}^l$.
                 end for
        end while
 
As shown in the first step of the for loop, the proposed BT guarantees that each beat
type takes the same proportion of the m samples. For example, in our CGS, record 208
belongs to (N, V, F). First, the samples in record 208 are divided into three subsets with
individual N, V, and F labels. When collecting the m samples, e.g., m is 12, the biased
training takes 4 samples from each subset to form the balanced m samples for
updating. The comparison of CGS-BT with the conventional cross-validation scheme based
evenly sampled training (CV-ET) during 200 training epochs is shown in Fig. 5-4-1. For
both methods, η and λ are selected as 0.001 and 0.0001, respectively. Two notable
conclusions can be drawn. First, the CGS-BT converges faster and shows a lower cost
than the CV-ET on the training dataset. Second, with the CGS-BT the cost on the
evaluation dataset follows the trend on the training dataset, whereas in the CV-ET the
training convergence helps the evaluation result little and the evaluation cost even
oscillates across the whole 200 epochs. Thus, in cooperation with the CGS, the BT handles
the imbalanced database well and shows excellent performance.
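The balanced sampling step of the BT can be sketched in Python as follows. The sketch assumes that m is a multiple of the number of beat types in the record and that a subset smaller than its share is sampled with replacement; the text does not state how small subsets are handled, and the function name is illustrative.

```python
import random


def biased_minibatch(samples_by_type, m, rng=random):
    """Draw m samples with an equal share from every beat type present.

    samples_by_type maps a beat label (e.g. "N", "V", "F") to a list of
    samples. Returns a shuffled list of (sample, label) pairs.
    """
    labels = sorted(samples_by_type)
    if m % len(labels):
        raise ValueError("m must be a multiple of the number of beat types")
    per_type = m // len(labels)
    batch = []
    for label in labels:
        pool = samples_by_type[label]
        if len(pool) >= per_type:
            picked = rng.sample(pool, per_type)        # without replacement
        else:
            # Assumption: tiny subsets are sampled with replacement.
            picked = [rng.choice(pool) for _ in range(per_type)]
        batch.extend((sample, label) for sample in picked)
    rng.shuffle(batch)
    return batch
```

Passing a seeded `random.Random` instance as `rng` makes the mini-batches reproducible, which is convenient when comparing training runs.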
 
Fig. 5-4-1. Comparison between conventional training and the proposed biased training. 
 
5.5 Experiment Results 
During the experiments, all patients in groups (N, V), (N, S), (N, S, V) and (N, V, F)
are evaluated. Considering the ECG identity, each patient uses their own ECG data for
both training and evaluation without borrowing from others. A randomly selected 70% of
the samples of a record is taken as the training data set of the patient, and the
remaining 30% of the samples are used for evaluation. Taking patient 209 as an example,
the record contains 2621 N type and 383 S type beats. Thus, the training samples consist
of 1835 N type and 268 S type beats that are randomly chosen from all the N type and S
type beats, respectively. The remaining 30% of the samples, with 786 N type and 115 S
type beats, are used for evaluation.
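The per-type 70/30 split can be reproduced with a few lines of Python. The sketch assumes that the training count of each type is obtained by rounding 70% of the type's beat count to the nearest integer; the beats themselves are placeholders and the function name is illustrative.

```python
import random


def split_per_type(beats_by_type, train_fraction=0.7, seed=0):
    """Randomly split each beat type into training and evaluation parts."""
    rng = random.Random(seed)
    train, evaluate = {}, {}
    for label, beats in beats_by_type.items():
        shuffled = beats[:]
        rng.shuffle(shuffled)
        # Rounding to the nearest integer is an assumed rule.
        n_train = round(train_fraction * len(shuffled))
        train[label] = shuffled[:n_train]
        evaluate[label] = shuffled[n_train:]
    return train, evaluate


# Beat counts of patient 209 quoted in the text; the beats are placeholders.
record_209 = {"N": list(range(2621)), "S": list(range(383))}
train, evaluate = split_per_type(record_209)
print({k: len(v) for k, v in train.items()})      # {'N': 1835, 'S': 268}
print({k: len(v) for k, v in evaluate.items()})   # {'N': 786, 'S': 115}
```

Splitting each type separately keeps the rare abnormal beats represented in both the training and the evaluation sets.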
5.5.1 Verification on FPGA 
A Pynq-Z2 board with a Xilinx Artix-7 family FPGA is the target for the verification,
as shown in Fig. 5-5-1. The Zynq-7020 chip contains an Artix-7 FPGA and a dual-core ARM
Cortex-A9. In this verification, it is programmed via the JTAG protocol, where the ARM
core is bypassed, as circled by the yellow box. There are in total 4 different types of
beats to be classified in the MIT-BIH, i.e., types N, S, V, and F. The 4 LEDs and buttons
in the bottom yellow box respectively indicate the classified type and trigger the
classification. According to the labels in the database, the ECG segments from the
specific patient are stored in the corresponding memory sectors N, S, V, and F to serve
as the input of the implemented classifier. The trained weights and biases for the
patient are programmed via the SPI master and slave to the two instantiated 1K×32-bit
SRAM modules, i.e., exactly the same as in the ASIC implementation.
The 4 buttons sequentially transfer the input from the corresponding 4 memory
sectors to the classifier and trigger the classification, as in the bottom yellow box of
Fig. 5-5-1. That is, the LED above a button is expected to light up when that button is
pressed. For example, to verify the accuracy of Type N detection, the number of times
each LED lights up is recorded while the first button is pressed repeatedly. Once the
classification of Type N is finished, the other buttons are pressed in the same way to
obtain the confusion matrix of the classification for the patient. When the current
patient is done, the specific weights, biases, and ECG segments of the next patient are
reprogrammed. The final confusion matrix is the same as that from the simulations in
Modelsim and MATLAB, which will be discussed in Section 5.5.3.
 
Fig. 5-5-1. Implementation verified on Pynq-Z2 board. 
 
The Artix-7 FPGA on the Zynq-7020 has 13.3K logic slices, 630KB of RAM and 220
DSP slices. In this design, only the logic slices are used to keep the implementation
the same as the ASIC implementation, where the customized multiplier is used for the
arithmetic operations. As shown in Fig. 5-5-2, 5269 look-up tables (LUTs), 1024
LUT-based RAMs (LUTRAM), 1331 flip-flops (FF), and 3 buffer gates are utilized for this
implementation. It takes only roughly 10% of the logic slices thanks to the single
multiplier and adder. The power consumption at 2.5MHz is shown in Fig. 5-5-3, where the
classification dynamic power is about 3mW, which is much smaller than the device static
power. The clock speed can be increased up to 125MHz.
 
Fig. 5-5-2. Resource utilization on Artix-7 FPGA verification.
 
Fig. 5-5-3. Power statistic for the FPGA implementation at 2.5MHz. 
5.5.2 ASIC Implementation 
The design is also implemented in a 0.18µm CMOS process, where a 64Kb SRAM is used
for the storage of weights, biases, inputs, and activations. To shorten the paths, two
separate 32Kb SRAM blocks are placed symmetrically, as shown on the left side of Fig.
5-5-4. The area is 0.9246mm², of which approximately 2/3 is taken by the SRAM and 1/3 by
the FSMs, multiplier, adder, and SPI slave module.
 
Fig. 5-5-4. Layout and performance summary of the proposed CTDA ANN-CAC. 
 
The synthesis, placement, and routing target operation at up to 50MHz over a VDD range 
of 1.8V to 3.3V with a minimum slack of 100ps. With a single multiplier and adder, each 
classification takes 6298 clock cycles. The proposed 96×32×16×5 ANN-CAC requires 3717 
additions, 592 multiplications, 3 cycles to fetch the input, and 40 cycles to compute the 
max-out layer, i.e., in theory a minimum of 4352 clock cycles would suffice if each 
multiplication and addition took one clock cycle. That is, there is still 30.9% room to 
reduce the clock cycles for better speed and energy efficiency.  
Nevertheless, the implemented design classifies each beat in 629.8ms down to 
251.92µs as the operating frequency goes from 10KHz to 25MHz, as illustrated by the 
black line in Fig. 5-5-5. 
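
The cycle budget above can be checked with a few lines of arithmetic. The sketch below is only an illustration: the per-layer breakdown (one conditional accumulation per first-layer weight, one multiplication and one addition per weight in the later layers, and one addition per bias) is an assumption that happens to reproduce the 3717 additions and 592 multiplications quoted in the text, and all function and variable names are hypothetical.

```python
# Operation and cycle budget for the 96x32x16x5 ANN-CAC (illustrative sketch).
LAYERS = [96, 32, 16, 5]          # input bits, hidden 1, hidden 2, output
MEASURED_CYCLES = 6298            # cycles per classification quoted in the text
FETCH_CYCLES = 3                  # cycles to fetch the input (from the text)
MAXOUT_CYCLES = 40                # cycles for the max-out layer (from the text)


def count_operations(layers):
    """Return (additions, multiplications) under the assumed breakdown."""
    weights = [a * b for a, b in zip(layers[:-1], layers[1:])]
    biases = sum(layers[1:])
    # Assumption: the first layer has binary inputs, so its weights are only
    # conditionally accumulated; each later weight needs one multiplication.
    multiplications = sum(weights[1:])
    additions = sum(weights) + biases
    return additions, multiplications


def classification_time(cycles, clock_hz):
    """Time in seconds to run `cycles` clock cycles at `clock_hz`."""
    return cycles / clock_hz


additions, multiplications = count_operations(LAYERS)
ideal_cycles = additions + multiplications + FETCH_CYCLES + MAXOUT_CYCLES
headroom = 1 - ideal_cycles / MEASURED_CYCLES

print(additions, multiplications, ideal_cycles)
print(f"headroom: {headroom:.1%}")
for clock_hz in (10e3, 25e6, 50e6):
    micros = classification_time(MEASURED_CYCLES, clock_hz) * 1e6
    print(f"{clock_hz:>12.0f} Hz -> {micros:.2f} us")
```

Under these assumptions the same 6298 cycles take about 126µs at 50MHz, which matches the classification time quoted elsewhere in this chapter.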
 
Fig. 5-5-5. Simulated classification speed and power peaks at different clock frequencies. 
 
Table 5-5-1. Current summary at 75bpm heart rate. 
Frequency (MHz)   0.01    0.05    0.1     0.5     1       10      25 
Current (μA)      15.51   8.993   8.18    7.526   7.444   7.371   7.366 
 
Since the circuits and clock tree are fixed to meet the maximum 50MHz 
frequency and the CAC goes to sleep after the task, a longer processing time means more 
leakage but has no effect on the average dynamic power, since the total number of clock 
cycles per beat is the same. As shown in Table 5-5-1, the average current at 75bpm heart 
rate decreases as the clock frequency increases. From 10KHz to 500KHz, the current 
reduction is significant, i.e., about 50%. Beyond 500KHz, the reduction is small. It should 
be noted that the power peak increases almost linearly with the clock frequency, as shown 
by the red line in Fig. 5-5-5. A high power peak requires a power management unit (PMU) 
with strong instantaneous load capacity. Thus, 500KHz is recommended for tasks that 
are not time sensitive, while 50MHz is the choice for applications that need each 
classification to finish within 126µs. With a fixed architecture and almost the same 
arithmetic operations, the energy consumed on the detection of different types is very 
close, as shown in Fig. 5-5-6. The top plot shows that at 25MHz the current for the 
different types is around 7.35µA and the energy per clock cycle is close to 935pJ. The 
bottom plot shows that at 10KHz the current and energy per clock cycle are approximately 
15.51µA and 1.97nJ. These power and energy results are for a 3.3V supply; they decrease 
with the supply voltage, e.g., at 1.8V the energy per cycle is around 510pJ at 25MHz. 
With the CTDA signal flow, the remarkable reduction of the network size and arithmetic 
operations enables a micropower ANN-CAC. 
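
The trend in Table 5-5-1 is consistent with a simple duty-cycle picture: a frequency-independent part (sleep current plus the fixed dynamic charge per beat) and an extra current that flows only while the classifier is awake. The sketch below fits these two terms to the 10KHz and 25MHz entries and compares the prediction with the remaining entries. The two-term model and all names are assumptions made for illustration; they are not taken from the design itself.

```python
# Two-term duty-cycle model for the average current in Table 5-5-1 (sketch).
CYCLES_PER_BEAT = 6298            # clock cycles per classification (from the text)
BEAT_PERIOD_S = 60.0 / 75.0       # one beat at 75 bpm

# Table 5-5-1: clock frequency in Hz -> average current in uA.
TABLE = {
    10e3: 15.51, 50e3: 8.993, 100e3: 8.18, 500e3: 7.526,
    1e6: 7.444, 10e6: 7.371, 25e6: 7.366,
}


def awake_fraction(clock_hz):
    """Fraction of each beat period during which the classifier is running."""
    return CYCLES_PER_BEAT / clock_hz / BEAT_PERIOD_S


def fit_two_points(f_low, f_high):
    """Solve I = i_fixed + i_awake * awake_fraction(f) from two table entries."""
    d_low, d_high = awake_fraction(f_low), awake_fraction(f_high)
    i_awake = (TABLE[f_low] - TABLE[f_high]) / (d_low - d_high)
    i_fixed = TABLE[f_high] - i_awake * d_high
    return i_fixed, i_awake


i_fixed, i_awake = fit_two_points(10e3, 25e6)
errors = {}
for clock_hz, measured in TABLE.items():
    predicted = i_fixed + i_awake * awake_fraction(clock_hz)
    errors[clock_hz] = predicted - measured
    print(f"{clock_hz:>10.0f} Hz  table {measured:6.3f} uA  model {predicted:6.3f} uA")
```

Under this assumed model the awake-only term is roughly 10µA. At 500KHz the classifier is already awake for less than 2% of each beat, which is why raising the clock further brings little additional saving.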
 
Fig. 5-5-6. Energy summary at 25MHz and 10KHz for 75bpm heart rate. 
5.5.3 Benchmarking 
Following the CGS divisions in Table 5-1-3, the records in the none (N) groups are taken 
into consideration. During training, randomly selected 70% and 30% of the samples from 
each record are used for training and verification, respectively. The classification 
confusion matrix is given in Table 5-5-2, and the accuracy of the proposed classifier on 
each individual record is summarized in Fig. 5-5-7. Thanks to the CGS, BT, and 
the plentiful P-QRS-T complex details in the CTDA signal flow, the ACC, SE, and +P for 
Type N are over 99.3%. The ACCS, SES, +PS, ACCV, SEV, and +PV reach 99.70%, 97.75%, 
94.39%, 99.68%, 98.67%, and 98.07%, respectively. Even for Type F, the ACC and +P reach 
98.87% and 87.80%, although the SE is only 49.03%. 
Table 5-5-2. Classification confusion matrix based on simulation and FPGA verification. 
                       Detected Label 
Ground Truth       N        S        V        F 
N              59874      150      104       75 
S                 57     2609        3        0 
V                 67        5     6805       20 
F                 23        0       27      684 
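
The per-class figures quoted for Types S and V can be recomputed from Table 5-5-2. The sketch below assumes the usual one-vs-rest definitions (ACC = (TP+TN)/total, SE = TP/(TP+FN), +P = TP/(TP+FP)); the function name and data layout are hypothetical. Type F is left out of the loop, since the row as printed here does not reproduce the SE quoted in the text.

```python
# Per-class ACC, SE and +P from the confusion matrix of Table 5-5-2 (sketch).
LABELS = ["N", "S", "V", "F"]
# Rows are ground truth, columns are detected labels, as printed in the table.
MATRIX = [
    [59874, 150, 104, 75],
    [57, 2609, 3, 0],
    [67, 5, 6805, 20],
    [23, 0, 27, 684],
]


def one_vs_rest_metrics(matrix, index):
    """Return (ACC, SE, +P) in percent for one class against all others."""
    total = sum(sum(row) for row in matrix)
    tp = matrix[index][index]
    fn = sum(matrix[index]) - tp                  # missed beats of this class
    fp = sum(row[index] for row in matrix) - tp   # other beats given this label
    tn = total - tp - fn - fp
    acc = 100.0 * (tp + tn) / total
    se = 100.0 * tp / (tp + fn)
    ppv = 100.0 * tp / (tp + fp)
    return acc, se, ppv


for label in ("S", "V"):
    acc, se, ppv = one_vs_rest_metrics(MATRIX, LABELS.index(label))
    print(f"Type {label}: ACC={acc:.2f}%  SE={se:.2f}%  +P={ppv:.2f}%")
```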
 
 
Fig. 5-5-7. Classification accuracy details on each record in the none (N) groups. 
 
Table 5-5-3. Classification accuracy comparison. 
                               Type V                  Type S 
Method                   ACC    SE     +P        ACC    SE     +P 
Hu et al. [145]          94.8   78.9   75.8      N/A    N/A    N/A 
Chaza et al. [147]       93.6   78.9   76.0      N/A    N/A    N/A 
Jiang et al. [148]       98.1   86.6   93.3      96.6   50.6   67.9 
Ince et al. [149]        97.6   83.4   87.4      96.1   81.8   63.4 
Kiranyaz et al. [134]    98.6   95.0   89.5      96.4   64.6   62.1 
This Work                99.6   98.6   98.0      99.7   97.7   94.3 
 
Table 5-5-4. Performance comparison with other implementations. 
                    TCASI        VLSI         JSSC         TCAD         This 
                    2017 [141]   2016 [140]   2014 [139]   2018 [142]   Work 
Process             0.13μm       65nm         0.18μm       40nm         0.18μm 
Frequency (Hz)      4K           10K          250-500      1M           10K-50M 
Area (mm²)          1.21         0.112        N/A          0.135        0.925 
VDD (V)             0.9          1            0.5          1            1.8 
ACCN                (a)          86%          (b)          95.84%       99.32% 
ACCS                (a)          N/A          (b)          89.69%       99.70% 
ACCV                (a)          N/A          (b)          78.67%       99.68% 
ACCF                (a)          N/A          (b)          N/A          98.87% 
Energy (pJ/cycle)   96           278          1740-870     94           1075-510 
Class Number        2            2            N/A          3            4 
(a) Doesn't follow AAMI. (b) Just differentiates P, QRS, and T. 
 
As shown in Table 5-5-3, with the help of CTDA, CGS, and BT, this work shows 
the best classification accuracy compared with other deep learning based classifiers. 
More importantly, those deep learning methods all run on a workstation or PC, whereas 
the proposed implementation consumes an average power of 13.34μW with a 
classification time of 126μs at 1.8V VDD and a 50MHz clock. 
Compared with other implemented classifiers, this design presents the best 
classification accuracy with the largest number of classes and acceptable energy 
consumption, as summarized in Table 5-5-4. 
5.6 Conclusions 
In this chapter, an ultralow power CTDA based ANN-CAC is presented. Considering the 
rarity of abnormal beats and the resulting unbalanced dataset, CGS-BT is proposed to 
improve the accuracy. The CTDA signal flow significantly simplifies the ANN topology and 
the arithmetic operations, especially in the first layer, where only conditional 
accumulations are needed. For ultralow power implementation, a customized three-stage 
pipelined TDM multiplier is proposed to replace the multiplier arrays. Besides the 
multiplier, other on-chip modules, e.g., the SPI slave and FSMs, are customized to 
cooperate with the SRAM and multiplier to achieve excellent execution efficiency.  
The classification performance of the proposed ANN-CAC is verified by both 
simulation and FPGA. The classification accuracies for Types N, S, V, and F are 
99.32%, 99.70%, 99.68%, and 98.87%, respectively. The simulated power of the ASIC 
implementation is 13.34μW with a classification time as short as 126μs. Compared with 
the state-of-the-art pure software implementations, this design shows comparable accuracy 
while requiring no CPU or GPU. Compared with other hardware CACs, this design shows 
the best classification accuracy and is capable of classifying all four types of beats at 
the cost of slightly higher power.  
Chapter 6  
Conclusions and Future Works 
6.1 Fulfilled Objectives and Future Works 
For the low-cost flexible ECG patches, belts, and other types of wireless wearable ECG 
sensors, power is the main constraint due to the compact size (or limited battery). For 
ambulatory ECG recording, high input impedance is necessary to handle the motion 
artefact for better diagnostic yield. Besides power and input impedance, the International 
Standards set the requirements on gain variation, CMRR, and noise for ambulatory ECG 
systems.  
  In this research, a novel DC-coupled analog frontend based on an FDDA is proposed 
to achieve high input impedance and high CMRR. In the conventional three-amplifier 
DC-coupled AFE, both common mode and differential mode signals are amplified in the two 
branches, which is the reason for the poor CMRR. We propose to replace the two branches 
with an FDDA to suppress the common mode amplification, leading to a high CMRR of 
76dB while maintaining an input impedance of 1GΩ. To improve the noise performance 
and area utilization, the parasitic capacitance of the large input transistor is reused as 
part of the gain ratio capacitor, leading to an excellent noise performance of 1.02μVrms 
with a noise efficiency factor (NEF) of 2.55 and a smaller area of 0.405mm². An on-body 
DC bias is proposed to stabilize the DC bias while handling the DC offset. Verified in a 
0.35μm CMOS process, the designed AFE can pick up a clear ECG signal with an electrode 
distance as close as 2cm under both static and walking conditions. It also records an EEG 
signal that shows clear oscillations at around 10Hz, i.e., the α-wave observed 
after eye blinking during EEG testing. Performance testing results show that the power 
of the AFE is 900nW under 1.8V VDD, and the gain variation is smaller than 3% with a 
programmable gain of 56dB to 68dB and a fixed bandwidth from 0.4Hz to 120Hz.  
High input impedance helps to handle the motion artefact by relaxing the 
impact of the skin-electrode contact impedance variation (potential divider effect). Note 
that the contact impedance variation is not the only source of motion artefact; the skin 
potential variation caused by the deformation of the skin during body movement is 
another. No research has been reported on how to address the skin potential variation, 
and the proposed AFE design does not address it either. Thus, in the future, circuits that 
are tolerant to the skin potential variation should be studied. The designed AFE shows 
1GΩ input impedance at DC, which can be further improved. A possible solution is to 
combine impedance boosting with the DC-coupled IA.  
Existing QRS detectors can easily achieve over 99% detection accuracy on 
the MIT-BIH database. In practice, the detection accuracy is much lower than the 
reported results because of noise-corrupted ECG and differences in the ECG morphology 
of each individual. Current research on QRS detection for wearable applications focuses 
on improving the detector's power efficiency. However, complex handcrafted 
models and parameters are required due to the diversity of the subjects. User-specific 
methods, which target each individual, have become the mainstream in ECG 
classification but are rarely studied in QRS detection.  
A personalized QRS detection method is proposed in this study to cover the ECG 
diversity. We propose to generate an ECG template for each individual, which can be 
done at first use by recording a 20-second ECG segment. One target clustering is 
applied to the segment to extract the user-specific ECG template. After the extraction, 
the segment under detection is compared against the user-specific template by 
calculating the Pearson correlation coefficient (PCC). Whenever the PCC surpasses a 
pre-set threshold, a QRS complex centred at the R peak is generated. To enhance the 
accuracy, adaptive thresholding is also proposed for real-time updating of the RR 
interval, QRS template, and threshold.  
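
The template comparison described above can be sketched as a sliding-window Pearson correlation. This is a minimal illustration, not the thesis implementation: the fixed threshold of 0.8, the assumption that the template is centred on the R peak, the skip-ahead after a detection, and all names are hypothetical, and neither the one target clustering step nor the adaptive updating is modelled.

```python
# Template matching with the Pearson correlation coefficient (illustrative sketch).
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    denom = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denom == 0.0:
        return 0.0  # a flat window carries no shape information
    return sum(a * b for a, b in zip(dx, dy)) / denom


def detect_qrs(signal, template, threshold=0.8):
    """Return the centre indices of windows whose PCC exceeds the threshold.

    `threshold` is a hypothetical fixed value; the adaptive update of the
    threshold, RR interval and template is not modelled here.
    """
    width = len(template)
    half = width // 2
    detections = []
    start = 0
    while start + width <= len(signal):
        if pearson(signal[start:start + width], template) > threshold:
            detections.append(start + half)  # report the window centre
            start += width                   # skip past this complex
        else:
            start += 1
    return detections


# Synthetic example: a triangular "QRS" placed twice on a flat baseline,
# the first time with twice the amplitude of the template.
template = [0.0, 0.2, 1.0, 0.2, 0.0]
signal = [0.0] * 10 + [0.0, 0.4, 2.0, 0.4, 0.0] + [0.0] * 10 + template + [0.0] * 5
print(detect_qrs(signal, template))
```

Because the PCC is normalised, the complex with doubled amplitude is matched just as well as the one with the template's own amplitude, which is the property that makes a shape-based template robust to gain changes.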
Evaluated on the 48 records of the MIT-BIH database, 99127 beats of the majority 
(normal) type are detected out of a total of 109369 beats. The total numbers of false 
positives and false negatives are 786 and 604, respectively. An average sensitivity of 
99.21% and an average positive prediction of 99.39% are achieved. Compared with other 
algorithms, the proposed one shows comparable performance. In addition, it is capable of 
separating the majority QRS type from other types, which is not available in existing 
algorithms. 
The proposed personalized QRS detection is implemented in software only. In the 
future, an ASIC implementation can be considered. In addition, it would be of interest to 
explore a CTDA version of the algorithm to obtain better energy efficiency. 
The CAC gives patients direct information on their cardiac conditions, 
enabling a smart ECG sensor. Currently, the design of high-performance CACs is a hot 
research topic. There are two main branches. The first is the low power hardware 
implementation of CAC algorithms with a linear model or simple machine learning method, 
which offers limited detection accuracy but suits wearable applications because of its 
ultralow power consumption. The second is patient-specific algorithms using neural 
network based models, which achieve the best detection accuracy but require 
more computational power, such as a CPU or GPU.  
With the introduction of the CTDA signal flow, the use of an LC-ADC to generate 
event-based samples is becoming more popular in ultralow power ECG sensor research. 
Unlike evenly distributed Nyquist sampling, the number of samples generated by an 
LC-ADC is proportional to the signal slope. This is preferable for processing ECG 
signals, since fewer samples are generated on the baseline while more details are 
available for the useful P-wave, QRS complex, and T-wave.  
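
The slope dependence can be illustrated with a small behavioural model of level-crossing sampling. The sketch below is an assumption-laden illustration, not the LC-ADC circuit: a finely sampled list stands in for the continuous input, `lsb` is a hypothetical level spacing, and one event is emitted for every quantisation level the input crosses.

```python
# Level-crossing sampling of a uniformly sampled waveform (illustrative sketch).

def level_crossing_events(samples, lsb):
    """Return (index, direction) events, one per quantisation level crossed.

    `samples` stands in for the continuous input and `lsb` is the level
    spacing; both are hypothetical. direction is +1 for an upward crossing
    and -1 for a downward one.
    """
    events = []
    level = round(samples[0] / lsb)          # tracking level, in units of lsb
    for index, value in enumerate(samples[1:], start=1):
        while value >= (level + 1) * lsb:    # input rose past the next level
            level += 1
            events.append((index, +1))
        while value <= (level - 1) * lsb:    # input fell past the previous level
            level -= 1
            events.append((index, -1))
    return events


# Flat baseline followed by a steep triangular pulse and another flat stretch.
baseline = [0.0] * 20
pulse = [0.25 * k for k in range(1, 9)] + [0.25 * k for k in range(7, -1, -1)]
events = level_crossing_events(baseline + pulse + baseline, lsb=0.5)
print(len(events), "events, all between indices", events[0][0], "and", events[-1][0])
```

In this toy input the 40 baseline samples produce no events at all, while every event falls inside the 16-sample pulse, which mirrors the behaviour described for the baseline versus the P-QRS-T segments.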
In this study, a CTDA ANN-CAC is proposed to reduce the hardware cost, and 
thus the power consumption, while achieving classification performance comparable with 
the software implementations. The CTDA brings two main benefits. First, the total 
number of input samples for the ANN-CAC is reduced, which dramatically simplifies the 
ANN structure. Second, since the input samples in a CTDA data stream are pure “1” and 
“0”, the multiplications of the first layer can be removed. A fully connected three-layer 
ANN with a structure of 32×16×5 and 96-bit input samples is trained. For such a structure, 
each classification requires 3717 additions and 592 multiplications. Besides the 
simplification of the ANN-CAC structure, imbalanced training samples are also considered 
to improve the classification accuracy. According to the statistics of the MIT-BIH 
database, the records are divided into groups with sufficient abnormal beats. Then the 
biased training method is proposed to cooperate with the grouping scheme for quicker 
training convergence.   
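
The removal of first-layer multiplications can be illustrated in a few lines. The sketch below compares a conditional-accumulation first layer with an ordinary multiply-accumulate reference and checks that they agree for binary inputs. The random integer weights, the 96-input, 32-neuron sizes used here, and all names are placeholders for illustration; ReLU is used because it is the activation adopted in this work.

```python
# First-layer evaluation with binary inputs: additions only (illustrative sketch).
import random


def first_layer_conditional(bits, weights, biases):
    """Accumulate weights[j][i] only where bits[i] == 1, then apply ReLU."""
    outputs = []
    for row, bias in zip(weights, biases):
        acc = bias
        for bit, weight in zip(bits, row):
            if bit:                  # conditional accumulation, no multiplier
                acc += weight
        outputs.append(max(acc, 0))  # ReLU
    return outputs


def first_layer_multiply(bits, weights, biases):
    """Reference version using ordinary multiply-accumulate."""
    return [max(bias + sum(b * w for b, w in zip(bits, row)), 0)
            for row, bias in zip(weights, biases)]


# Hypothetical integer weights and a random 96-bit input, for illustration only.
rng = random.Random(0)
N_INPUTS, N_NEURONS = 96, 32
weights = [[rng.randint(-128, 127) for _ in range(N_INPUTS)] for _ in range(N_NEURONS)]
biases = [rng.randint(-128, 127) for _ in range(N_NEURONS)]
bits = [rng.randint(0, 1) for _ in range(N_INPUTS)]

conditional = first_layer_conditional(bits, weights, biases)
assert conditional == first_layer_multiply(bits, weights, biases)
print("conditional accumulation matches multiply-accumulate")
```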
Several considerations are applied in the hardware implementation of the CTDA 
ANN-CAC. Firstly, ReLU activation is adopted to simplify the circuits, as this simple 
activation function achieves inference performance similar to that of complex activation 
functions with proper training. With ReLU, the outputs of the neurons are non-negative, 
so the customized Booth encoding saves 1 bit in the design of the multiplier. Because the 
number of arithmetic operations is fixed, more multipliers and adders result in more area 
and leakage power. By sharing one multiplier and one adder, we minimize the area 
while guaranteeing a classification time within 1ms. For the arithmetic operations, 16-bit 
fixed-point multiplication and 24-bit accumulation are used to guarantee the accuracy and 
avoid overflow of the accumulation. To resolve the speed limitation of the multiplier, 
three-dimensional reduction (TDM) for optimizing the addition of partial products and 
three-stage pipelining are used in the multiplier design, which results in a multiplier 
with a speed similar to that of the adder. With the customized multiplier, the 
corresponding control logic for the different layers is developed. 
The classification accuracy is verified by simulation in MATLAB, 
implementation in FPGA, and simulation of the ASIC in a 0.18μm process. Equivalent 
results are obtained. Evaluated on the 44 records of the MIT-BIH database, the average 
ACC, SE, +P, and FPR for Types N, S, V, and F are comparable to the current 
state-of-the-art. Compared with the state-of-the-art software implementations, the design 
shows the best performance except for Type F.  
Implemented on the Artix-7 FPGA, the design takes less than 10% of the 
total FPGA resources. The estimated power for each classification is about 3mW. 
Implemented in a 0.18μm CMOS process, the design can operate from 10KHz to 
50MHz at 1.8V to 3.3V. The maximum classification speed for a patient with 75bpm 
heart rate is within 128μs. The average power can be as low as 13.34μW. Compared with 
other implementations, this design shows the best classification performance with slightly 
higher power.  
The CTDA ANN-CAC has been verified in simulation and FPGA. However, 
pruning is not conducted during the implementation. It is possible to reduce the power 
further by pruning and sub-threshold techniques.  
For now, the three modules are independent of each other. In the next stage, all 
three modules will be implemented on a single chip to form a smart ECG-on-Chip.  
6.2 Research Contributions 
This research has made several contributions to the building blocks of flexible/wearable 
ECG sensors. For the ECG analog front-end circuit, we demonstrated a DC-coupled 
FDDA analog front-end that achieves 1GΩ input impedance, 76dB CMRR, an excellent 
noise performance of 1.02μVrms with a noise efficiency factor (NEF) of 2.55, a small area 
of 0.405mm², a tuneable gain of 56-68dB, and 900nW of power. The high input impedance 
and high CMRR are the result of the proposed DC-coupled FDDA circuit topology. The 
good noise performance and small area are achieved by the proposed parasitic capacitor 
sharing scheme. The proposed on-body biasing scheme stabilizes the DC offset caused by 
the skin-electrode interface and eliminates the need for a driven-right-leg circuit, leading 
to significant power saving. The AFE achieves the best NEF among all 
existing state-of-the-art ECG AFEs.  
For the QRS detector, we proposed the personalized ECG template based 
detection method, which is capable of tracking the variation in ECG morphology from 
person to person. This feature enables better detection accuracy and is more robust in 
dealing with people of different race, gender, age, and weight. The proposed one target 
clustering method reduces the computational load of personalized template generation, 
which makes it suitable for embedding into flexible ECG sensors.  
For the cardiac arrhythmia classifier, we introduced the continuous-in-time 
discrete-in-amplitude signal flow to machine learning, which reduces the number of 
arithmetic operations by more than 100 times compared with a Nyquist sampling based 
scheme. The use of a level-crossing ADC to convert the ECG signal into a 
continuous-in-time discrete-in-amplitude signal results in over 90% input sample 
reduction. Based on a simple ANN, we are able to achieve more than 98% classification 
accuracy for all four types of arrhythmias at a power consumption of 13.34μW, the best 
among all state-of-the-art implementations of cardiac arrhythmia classifiers.  
 
Bibliography 
[1]. S. Mendis, P. Puska and B. Norrving, “Global atlas on cardiovascular disease 
prevention and control,” World Health Organization, 2011. 
[2]. “Cardiovascular diseases (CVDs),” World Health Organization. [Online]. 
Available: https://www.who.int/en/news-room/fact-sheets/detail/cardiovascular-diseases-(cvds). 
[3]. E. J. Benjamin, P. Muntner, A. Alonso, M. S. Bittencourt, C. W. Callaway, 
A. P. Carson et al., “Heart disease and stroke statistics - 2019 update: A 
report from the American Heart Association,” Circulation, vol. 139, no. 10, 
pp. e56–e528, 2019. 
[4]. “2019 report on heart, stroke and vascular cognitive impairment,” [Online]. 
Available: https://www.heartandstroke.ca/-/media/pdf-files/canada/2019-report/heartandstrokereport2019.ashx. 
[5]. L. Sörnmo and P. Laguna, Bioelectrical Signal Processing in Cardiac and 
Neurological Applications. Elsevier Academic Press, 2005. 
[6]. “ECGpedia,” Available: http://en.ecgpedia.org/wiki/Main_Page. 
[7]. A. L. Bui and G. C. Fonarow, “Home monitoring for heart failure 
management,” Journal of the American College of Cardiology, vol. 59, no. 2, 
Jan. 2012. 
[8]. “Holter monitor,” Available: https://en.wikipedia.org/wiki/Holter_monitor. 
[9]. “ZioXT,” Available: https://www.irhythmtech.com/products-services/zio-xt. 
[10]. D. H. Kim, N. S. Lu, R. Ma et al., “Epidermal Electronics,” Science, vol. 
333, pp. 838-843, Aug. 2011. 
[11]. “A flexible device to monitor heart health,” Available: 
https://www.asianscientist.com/2018/08/tech/bioelectronic-mesh-heart-health/. 
[12]. D. Siegel and S. Shivakumar, “The flexible electronics opportunity,” National 
Academies Press, 2014, ISBN: 978-0-309-30591-4. 
[13]. C. J. Deepu, X. Y. Zhang, C. H. Heng, and Y. Lian, “A 3-Lead ECG-on-Chip 
with Joint QRS Detection and Lossless Compression for Wearable Sensors,” 
IEEE Transactions on Circuits and Systems II, vol. 63, no. 12, pp. 1151-1155, 
Dec. 2016. 
[14]. I. Kim, R. Lobo, J. Homer and Y. A. Bhagat, “Multimodal analog front-end for 
wearable bio-sensors,” 2015 IEEE SENSORS, Busan, 2015, pp. 1-4. 
[15]. Y. Zhao, Z. Shang and Y. Lian, “A 2.55 NEF 76dB CMRR DC-Coupled Fully 
Differential Difference Amplifier Based Analog Front-end for Wearable 
Biomedical Sensors,” IEEE Transactions on Biomedical Circuits and Systems. 
[Early Access] DOI: 10.1109/TBCAS.2019.2924416. 
[16]. Y. Zhao, Z. Shang and Y. Lian, “User Adaptive QRS Detection Based on One 
Target Clustering and Correlation Coefficient,” 2018 IEEE Biomedical 
Circuits and Systems Conference (BioCAS), Cleveland, OH, 2018, pp. 1-4. 
[17]. J. G. Webster, Ed., Medical Instrumentation: Application and Design, 3rd ed., 
New York: Wiley, 1998. 
[18]. J. G. Webster, “Reducing motion artifacts and interference in biopotential 
recording,” IEEE Transactions on Biomedical Engineering, vol. 31, pp. 823–826, 
1984. 
[19]. T. Degen and H. Jäckel, “Continuous Monitoring of Electrode-Skin 
Impedance Mismatch During Bioelectric Recordings,” IEEE Transactions on 
Biomedical Engineering, vol. 55, no. 6, pp. 1711-1715, June 2008. 
[20]. J. C. Huhta and J. G. Webster, “60-Hz interference in electrocardiography,” 
IEEE Transactions on Biomedical Engineering, vol. 20, pp. 91–101, 1973. 
[21]. D.M.D. Ribeiro, L.S. Fu, L.A.D. Carlos and J.P.S. Cunha, “A Novel Dry 
Active Biosignal Electrode Based on an Hybrid Organic-Inorganic Interface 
Material,” IEEE Sensors Journal, vol. 11, no. 10, pp. 2241-2245, Oct. 2011. 
[22]. A. Pantelopoulos and N. G. Bourbakis, “A survey on wearable sensor-based 
systems for health monitoring and prognosis,” IEEE Transactions on Systems, 
Man, and Cybernetics, Part C (Applications and Reviews), vol. 40, no. 1, pp. 
1-12, Jan. 2010. 
[23]. D. Jabaudon, J. Sztajzel, K. Sievert and T. Landis, “Usefulness of ambulatory 
7-day ECG monitoring for the detection of atrial fibrillation and flutter after 
acute stroke and transient ischemic attack,” Stroke, vol. 35, pp. 1647–1651, 
Jul. 2004. 
[24]. H.-C. Shih, C.-M. Shih, K.-Y. Chou, S.-J. Lin, U.-Y. Su and R. A. Gerhardt, 
“Mechanism of degradation of AgCl coating on biopotential sensors,” Journal 
of Biomedical Materials Research, Part A, vol. 82A, no. 4, pp. 872–883, Mar. 
2007. 
[25]. D.M.D. Ribeiro, L.S. Fu, L.A.D. Carlos and J.P.S. Cunha, “A Novel Dry 
Active Biosignal Electrode Based on an Hybrid Organic-Inorganic Interface 
Material,” IEEE Sensors Journal, vol. 11, no. 10, pp. 2241-2245, Oct. 2011. 
[26]. W. Uter and H. J. Schwanitz, “Contact dermatitis from propylene glycol in 
ECG electrode gel,” Contact Dermatitis, vol. 34, pp. 108–115, Mar. 1996. 
[27]. T. C. Ferree, P. Luu, G. S. Russell and D. M. Tucker, “Scalp electrode 
impedance, infection risk, and EEG data quality,” Clinical Neurophysiology: 
official journal of the International Federation of Clinical Neurophysiology, 
vol. 112, pp. 536–544, Mar. 2001. 
[28]. W. Pei et al., “Skin-Potential Variation Insensitive Dry Electrodes for 
ECG Recording,” IEEE Transactions on Biomedical Engineering, vol. 64, no. 
2, pp. 463-470, Feb. 2017. 
[29]. A. Gruetzmann, S. Hansen and J. Müller, “Novel Dry Electrodes for ECG 
Monitoring,” Physiological Measurement, vol. 28, no. 11, p. 1375, 2007. 
[30]. J. Y. Baek, J. H. An, J. M. Choi, K. S. Park and S. H. Lee, “Flexible Polymeric 
Dry Electrodes for The Long-Term Monitoring of ECG,” Sensors and 
Actuators A: Physical, vol. 143, no. 2, pp. 423-429, 2008. 
[31]. T. Matsuo, K. Iinuma, and M. Esashi, “A Barium-titanate-Ceramics Capacitive-
Type EEG Electrode,” IEEE Transactions on Biomedical Engineering, vol. BME-20, 
no. 4, pp. 299-300, July 1973. 
[32]. G. Peng and M. F. Bocko, “A low noise, non-contact capacitive cardiac sensor,” 
2012 Annual International Conference of the IEEE Engineering in Medicine 
and Biology Society, San Diego, CA, 2012, pp. 4994-4997. 
[33]. Y. M. Chi, T. P. Jung and G. Cauwenberghs, “Dry-Contact and Noncontact 
Biopotential Electrodes: Methodological Review,” IEEE Reviews in 
Biomedical Engineering, vol. 3, pp. 106-119, 2010. 
[34]. R. E. Mason and I. Likar, “A new system of multiple lead exercise 
electrocardiography,” American Heart Journal, vol. 71, pp. 196-205, 1966. 
[35]. H. de Talhouet and J. G. Webster, “The origin of skin-stretch-caused motion 
artifacts under electrodes,” Physiological Measurement, vol. 17, no. 2, p. 81, 
1996. 
[36]. N. V. Thakor and J. G. Webster, “The origin of skin potential and its 
variations,” Proceedings of Annual Conference on Engineering in Medicine 
and Biology, vol. 20, 1978, p. 212. 
[37]. M. J. Burke and D. T. Gleeson, “A micropower dry-electrode ECG 
preamplifier,” IEEE Transactions on Biomedical Engineering, vol. 47, no. 2, 
pp. 155-162, Feb. 2000. 
[38]. A. C. Metting van Rijn, A. Peper and C. A. Grimbergen, “High-quality 
recording of bioelectric events Part 1 Interference reduction, theory and 
practice,” Medical and Biological Engineering and Computing, vol. 28, pp. 
389–397, Sept. 1990. 
[39]. R. Pallas-Areny and J. G. Webster, “Common mode rejection ratio for 
cascaded differential amplifier stages,” IEEE Transactions on Instrumentation 
and Measurement, vol. 40, no. 4, pp. 677-681, Aug. 1991. 
[40]. BSI, “Medical electrical equipment - Part 2-47: Particular requirements for 
the safety, including essential performance, of ambulatory electrocardiographic 
systems,” IEC 60601-2-47:2001, Nov. 2001. 
[41]. AAMI, “Medical electrical equipment - Part 2-47: Particular requirements for 
the safety, including essential performance, of ambulatory electrocardiographic 
systems,” ANSI/AAMI EC38:2007, Dec. 2007. 
[42]. A. L. Mansano, Y. Li, S. Bagga and W. A. Serdijn, “An Autonomous Wireless 
Sensor Node With Asynchronous ECG Monitoring in 0.18μm CMOS,” IEEE 
Transactions on Biomedical Circuits and Systems, vol. 10, no. 3, pp. 602-611, 
June 2016. 
[43]. W. M. Leach, “Fundamentals of low-noise analog circuit design,” Proceedings 
of the IEEE, vol. 82, no. 10, pp. 1515-1538, Oct. 1994. 
[44]. K. A. Ng and P. K. Chan, “A CMOS analog front-end IC for portable 
EEG/ECG monitoring applications,” IEEE Transactions on Circuits and 
Systems I: Regular Papers, vol. 52, no. 11, pp. 2335-2347, Nov. 2005. 
[45]. L. Fay, V. Misra and R. Sarpeshkar, “A Micropower Electrocardiogram 
Amplifier,” IEEE Transactions on Biomedical Circuits and Systems, vol. 3, no. 
5, pp. 312-320, Oct. 2009. 
[46]. B. B. Winter and J. G. Webster, “Reduction of interference due to common 
mode voltage in biopotential amplifiers,” IEEE Transactions on Biomedical 
Engineering, vol. BME-30, no. 1, pp. 58–62, Jan. 1983. 
[47]. T. Delbruck and C. A. Mead, “Adaptive photoreceptor with wide dynamic 
range,” Proceedings of IEEE International Symposium on Circuits and 
Systems - ISCAS '94, London, 1994, pp. 339-342 vol.4. 
[48]. A. P. Chandran, K. Najafi and K. D. Wise, “A new dc baseline stabilization 
scheme for neural recording microprobes,” Proceedings of the First Joint 
BMES/EMBS Conference. 1999 IEEE Engineering in Medicine and Biology 
21st Annual Conference and the 1999 Annual Fall Meeting of the Biomedical 
Engineering Society (Cat. N, Atlanta, GA, 1999, pp. 386 vol.1-. 
[49]. P. Mohseni and K. Najafi, “A low power fully integrated bandpass operational 
amplifier for biomedical neural recording applications,” Proceedings of the 
Second Joint 24th Annual Conference and the Annual Fall Meeting of the 
Biomedical Engineering Society] [Engineering in Medicine and Biology, 2002, 
pp. 2111-2112 vol.3. 
[50]. R. H. Olsson, M. N. Gulari and K. D. Wise, “Silicon neural recording arrays 
with on-chip electronics for in-vivo data acquisition,” 2nd Annual 
International IEEE-EMBS Special Topic Conference on Microtechnologies in 
Medicine and Biology. Proceedings (Cat. No.02EX578), Madison, WI, 2002, 
pp. 237-240. 
[51]. R. R. Harrison and C. Charles, “A low-power low-noise CMOS amplifier for 
neural recording applications,” IEEE Journal of Solid-State Circuits, vol. 38, 
no. 6, pp. 958-965, June 2003. 
[52]. Xiaoyang Zhang, “Ultra low power circuits for wearable biomedical sensors,” 
Ph.D Theses (Open), Available: 
http://scholarbank.nus.edu.sg/handle/10635/119211 
[53]. T. Farouk, M. Elkhatib and M. Dessouky, “A 1V low-power low-noise 
biopotential amplifier based on flipped voltage follower,” 2015 IEEE 
International Conference on Electronics, Circuits, and Systems (ICECS), 
Cairo, 2015, pp. 539-542. 
[54]. R. H. Olsson, D. L. Buhl, A. M. Sirota, G. Buzsaki and K. D. Wise, “Band-
tunable and multiplexed integrated circuits for simultaneous recording and 
stimulation with microelectrode arrays,” IEEE Transactions on Biomedical 
Engineering, vol. 52, no. 7, pp. 1303-1311, July 2005. 
[55]. W. Wattanapanitch, M. Fee and R. Sarpeshkar, “An Energy-Efficient 
Micropower Neural Recording Amplifier,” IEEE Transactions on Biomedical 
Circuits and Systems, vol. 1, no. 2, pp. 136-147, June 2007. 
[56]. X. Zou, X. Xu, L. Yao and Y. Lian, “A 1-V 450-nW Fully Integrated 
Programmable Biomedical Sensor Interface Chip,” IEEE Journal of Solid-
State Circuits, vol. 44, no. 4, pp. 1067-1077, April 2009. 
[57]. E. M. Spinelli, R. Pallas-Areny and M. A. Mayosky, “AC-coupled front-end 
for biopotential measurements,” IEEE Transactions on Biomedical 
Engineering, vol. 50, no. 3, pp. 391-395, March 2003. 
[58]. Y. M. Chi, C. Maier and G. Cauwenberghs, “Ultra-High Input Impedance, 
Low Noise Integrated Amplifier for Noncontact Biopotential Sensing,” IEEE 
Journal on Emerging and Selected Topics in Circuits and Systems, vol. 1, no. 
4, pp. 526-535, Dec. 2011. 
[59]. J. Xu, R. F. Yazicioglu, B. Grundlehner, P. Harpe, K. A. A. Makinwa and C. 
Van Hoof, “A 160μW 8-Channel Active Electrode System for EEG 
Monitoring,” IEEE Transactions on Biomedical Circuits and Systems, vol. 5, 
no. 6, pp. 555-567, Dec. 2011. 
[60]. Q. Fan, F. Sebastiano, J. H. Huijsing and K. A. A. Makinwa, “A 1.8 μW 60 
nV/√Hz Capacitively-Coupled Chopper Instrumentation Amplifier in 65 nm 
CMOS for Wireless Sensor Nodes,” IEEE Journal of Solid-State Circuits, vol. 
46, no. 7, pp. 1534-1543, July 2011. 
[61]. J. Lee, G. Lee, H. Kim and S. Cho, “An Ultra-High Input Impedance Analog 
Front End Using Self-Calibrated Positive Feedback,” IEEE Journal of Solid-
State Circuits, vol. 53, no. 8, pp. 2252-2262, Aug. 2018. 
[62]. M. Chen, H. Chun, I. D. C. Miller, T. Torfs, Q. Lin, C. Van Hoof, G. Wang, Y. 
Lian and N. V. Helleputte, “A 400 GΩ Input-Impedance Active Electrode for 
Non-Contact Capacitively Coupled ECG Acquisition With Large Linear-Input-
Range and High CM-Interference-Tolerance,” IEEE Transactions on 
Biomedical Circuits and Systems, vol. 13, no. 2, pp. 376-386, April 2019. 
[63]. G. Peng and M. F. Bocko, “A low noise, non-contact capacitive cardiac sensor,” 
2012 Annual International Conference of the IEEE Engineering in Medicine 
and Biology Society, San Diego, CA, 2012, pp. 4994-4997. 
[64]. D. Svärd, A. Cichocki and A. Alvandpour, “Design and evaluation of a 
capacitively coupled sensor readout circuit, toward contact-less ECG and EEG,” 
2010 Biomedical Circuits and Systems Conference (BioCAS), Paphos, 2010, 
pp. 302-305. 
[65]. Xiong Zhou, Qiang Li, S. Kilsgaard, F. Moradi, S. L. Kappel and P. Kidmose, 
“A wearable ear-EEG recording system based on dry-contact active electrodes,” 
2016 IEEE Symposium on VLSI Circuits (VLSI-Circuits), Honolulu, HI, 2016, 
pp. 1-2. 
[66]. X. Zhang, Z. Zhang, Y. Li, C. Liu, Y. X. Guo and Y. Lian, “A 2.89μW Dry-
Electrode Enabled Clockless Wireless ECG SoC for Wearable Applications,” 
IEEE Journal of Solid-State Circuits, vol. 51, no. 10, pp. 2287-2298, Oct. 
2016. 
[67]. A. Bagheri, M. T. Salam, J. L. Perez Velazquez and R. Genov, "Low-
frequency noise and offset rejection in DC-coupled neural amplifiers: a review
and digitally-assisted design tutorial," IEEE Transactions on Biomedical
Circuits and Systems, vol. 11, no. 1, pp. 161-176, Feb. 2017.
[68]. B. Lee and M. Ghovanloo, "An adaptive averaging low noise front-end for
central and peripheral nerve recording," IEEE Transactions on Circuits and
Systems II: Express Briefs, vol. 65, no. 7, pp. 839-843, July 2018.
[69]. R. Muller, S. Gambini and J. M. Rabaey, "A 0.013mm2 5μW DC-coupled
neural signal acquisition IC with 0.5V supply," IEEE Journal of Solid-State
Circuits, vol. 47, no. 1, pp. 232-243, Jan. 2012.
[70]. E. M. Spinelli, N. Martinez, M. A. Mayosky and R. Pallas-Areny, "A novel
fully differential biopotential amplifier with DC suppression," IEEE
Transactions on Biomedical Engineering, vol. 51, no. 8, pp. 1444-1448, Aug.
2004.
[71]. T. Degen and H. Jackel, "A pseudo differential amplifier for bioelectric events
with DC-offset compensation using two-wired amplifying electrodes," IEEE
Transactions on Biomedical Engineering, vol. 53, no. 2, pp. 300-310, Feb.
2006.
[72]. B. Gosselin, M. Sawan and C. A. Chapman, "A Low-Power Integrated
Bioamplifier With Active Low-Frequency Suppression," IEEE Transactions
on Biomedical Circuits and Systems, vol. 1, no. 3, pp. 184-192, Sept. 2007.
[73]. R. F. Yazicioglu, P. Merken, R. Puers and C. Van Hoof, "A 200uW Eight-
Channel EEG Acquisition ASIC for Ambulatory EEG Systems," IEEE Journal
of Solid-State Circuits, vol. 43, no. 12, pp. 3025-3038, Dec. 2008.
[74]. R. Muller, S. Gambini and J. M. Rabaey, "A 0.013mm2, 5uW, DC-Coupled
Neural Signal Acquisition IC With 0.5 V Supply," IEEE Journal of Solid-State
Circuits, vol. 47, no. 1, pp. 232-243, Jan. 2012.
[75]. W. Biederman et al., "A Fully-Integrated, Miniaturized (0.125 mm²) 10.5 µW
Wireless Neural Sensor," IEEE Journal of Solid-State Circuits, vol. 48, no. 4,
pp. 960-970, April 2013.
[76]. P. R. Gray, D. Senderowicz and D. G. Messerschmitt, "A low-noise chopper-
stabilized differential switched-capacitor filtering technique," IEEE Journal of
Solid-State Circuits, vol. 16, no. 6, pp. 708-715, Dec. 1981.
[77]. E. A. Vittoz, "MOS transistors operated in the lateral bipolar mode and their 
application in CMOS technology," IEEE Journal of Solid-State Circuits, vol. 
18, no. 3, pp. 273-279, June 1983. 
[78]. C. Enz, "Analysis of low-frequency noise reduction by autozero technique,"
Electronics Letters, vol. 20, no. 23, pp. 959-960, November 8 1984.
[79]. C. C. Enz, E. A. Vittoz and F. Krummenacher, "A CMOS chopper amplifier,"
IEEE Journal of Solid-State Circuits, vol. 22, no. 3, pp. 335-342, Jun 1987.
[80]. Qiuting Huang and C. Menolfi, "A 200 nV offset 6.5 nV/√Hz noise PSD 5.6
kHz chopper instrumentation amplifier in 1 μm digital CMOS," 2001 IEEE
International Solid-State Circuits Conference. Digest of Technical Papers.
ISSCC (Cat. No.01CH37177), San Francisco, CA, USA, 2001, pp. 362-363.
[81]. A. Uranga, X. Navarro and N. Barniol, "Integrated CMOS amplifier for ENG 
signal recording," IEEE Transactions on Biomedical Engineering, vol. 51, no. 
12, pp. 2188-2194, Dec. 2004. 
[82]. R. F. Yazicioglu, P. Merken, R. Puers and C. Van Hoof, "A 60uW 60 nV/√Hz
Readout Front-End for Portable Biopotential Acquisition Systems," IEEE
Journal of Solid-State Circuits, vol. 42, no. 5, pp. 1100-1110, May 2007.
[83]. T. Denison, K. Consoer, W. Santa, A. T. Avestruz, J. Cooley and A. Kelly, "A
2 μW 100 nV/√Hz Chopper-Stabilized Instrumentation Amplifier for Chronic
Measurement of Neural Field Potentials," IEEE Journal of Solid-State
Circuits, vol. 42, no. 12, pp. 2934-2945, Dec. 2007.
[84]. R. Wu, K. A. A. Makinwa and J. H. Huijsing, "A Chopper Current-Feedback
Instrumentation Amplifier With a 1mHz 1/f Noise Corner and an AC-Coupled
Ripple Reduction Loop," IEEE Journal of Solid-State Circuits, vol. 44, no. 12,
pp. 3232-3243, Dec. 2009.
[85]. R. Wu, J. H. Huijsing and K. A. A. Makinwa, "A Current-Feedback
Instrumentation Amplifier With a Gain Error Reduction Loop and 0.06%
Untrimmed Gain Error," IEEE Journal of Solid-State Circuits, vol. 46, no. 12,
pp. 2794-2806, Dec. 2011.
[86]. R. F. Yazicioglu, S. Kim, T. Torfs, H. Kim and C. Van Hoof, "A 30uW
Analog Signal Processor ASIC for Portable Biopotential Signal Monitoring,"
IEEE Journal of Solid-State Circuits, vol. 46, no. 1, pp. 209-223, Jan. 2011.
[87]. V. Balasubramanian, P. F. Ruedi, Y. Temiz, A. Ferretti, C. Guiducci and C. C.
Enz, "A 0.18um Biosensor Front-End Based on 1/f Noise, Distortion
Cancelation and Chopper Stabilization Techniques," IEEE Transactions on
Biomedical Circuits and Systems, vol. 7, no. 5, pp. 660-673, Oct. 2013.
[88]. N. Verma, A. Shoeb, J. Bohorquez, J. Dawson, J. Guttag and A. P.
Chandrakasan, "A Micro-Power EEG Acquisition SoC With Integrated Feature
Extraction Processor for a Chronic Seizure Detection System," IEEE Journal
of Solid-State Circuits, vol. 45, no. 4, pp. 804-816, April 2010.
[89]. G. Nicollini and C. Guardiani, "A 3.3-V 800-nVrms noise, gain-programmable
CMOS microphone preamplifier design using yield modeling technique,"
IEEE Journal of Solid-State Circuits, vol. 28, no. 8, pp. 915-921, Aug 1993.
[90]. P. K. Chan, K. A. Ng and X. L. Zhang, "A CMOS chopper-stabilized
differential difference amplifier for biomedical integrated circuits," The 47th
Midwest Symposium on Circuits and Systems, 2004, pp. iii-33-6 vol.3.
[91]. W. S. Wang, Z. C. Wu, H. Y. Huang and C. H. Luo, "Low-Power Instrumental
Amplifier for Portable ECG," 2009 IEEE Circuits and Systems International
Conference on Testing and Diagnosis, Chengdu, 2009, pp. 1-4.
[92]. D. Han and Y. Zheng, "A chopper stabilized instrumentation amplifier with
dual DC cancellation servo loops for biomedical applications," 2012 IEEE
Asian Solid State Circuits Conference (A-SSCC), Kobe, 2012, pp. 49-52.
[93]. J. Zhang, S. C. Chan and L. Wang, "A 1.8 µW area-efficient bio-potential
amplifier with 90 dB DC offset suppression," 2014 IEEE 57th International
Midwest Symposium on Circuits and Systems (MWSCAS), College Station, TX,
2014, pp. 286-289.
[94]. Y. C. Huang, T. S. Yang, S. H. Hsu, X. Z. Chen and J. C. Chiou, "A novel
pseudo resistor structure for biomedical front-end amplifiers," 2015 37th
Annual International Conference of the IEEE Engineering in Medicine and
Biology Society (EMBC), Milan, 2015, pp. 2713-2716.
[95]. S. Wu, P. Huang, T. Huang, K. Chen, J. Chiou, K. Chen, C. Chiu, H. Tong, C.
Chuang and W. Hwang, "Energy-efficient low-noise 16-channel analog-front-
end circuit for bio-potential acquisition," Technical Papers of 2014
International Symposium on VLSI Design, Automation and Test, Hsinchu,
2014, pp. 1-4.
[96]. T. Liu, I. Wang, P. Yen, S. Lin, K. Lin, J. Jheng, D. Chang and C. Lin,
"CMOS-based biomolecular diagnosis platform," 2017 IEEE 12th
International Conference on Nano/Micro Engineered and Molecular Systems 
(NEMS), Los Angeles, CA, 2017, pp. 96-100. 
[97]. W. Wang, Z. Wu, H. Huang and C. Luo, "Low-Power Instrumental Amplifier
for Portable ECG," 2009 IEEE Circuits and Systems International Conference
on Testing and Diagnosis, Chengdu, 2009, pp. 1-4.
[98]. L. Sornmo, O. Pahlm and M. Nygards, "Adaptive QRS Detection: A Study of
Performance," IEEE Transactions on Biomedical Engineering, vol. BME-32,
no. 6, pp. 392-401, June 1985.
[99]. P. S. Hamilton and W. J. Tompkins, "Quantitative Investigation of QRS
Detection Rules Using the MIT/BIH Arrhythmia Database," IEEE
Transactions on Biomedical Engineering, vol. BME-33, no. 12, pp. 1157-
1165, Dec. 1986.
[100]. B.-U. Kohler, C. Hennig and R. Orglmeister, "The principles of software QRS
detection," IEEE Engineering in Medicine and Biology Magazine, vol. 21, no.
1, pp. 42-57, Jan.-Feb. 2002.
[101]. M. Okada, "A Digital Filter for the QRS Complex Detection," IEEE
Transactions on Biomedical Engineering, vol. BME-26, no. 12, pp. 700-703,
Dec. 1979.
[102]. M. L. Ahlstrom and W. J. Tompkins, "Automated High-Speed Analysis of
Holter Tapes with Microcomputers," IEEE Transactions on Biomedical
Engineering, vol. BME-30, no. 10, pp. 651-657, Oct. 1983.
[103]. W. P. Holsinger, K. M. Kempner and M. H. Miller, "A QRS Preprocessor 
Based on Digital Differentiation," IEEE Transactions on Biomedical 
Engineering, vol. BME-18, no. 3, pp. 212-217, May 1971. 
[104]. S. Suppappola and Ying Sun, "Nonlinear transforms of ECG signals for digital 
QRS detection: a quantitative analysis," IEEE Transactions on Biomedical 
Engineering, vol. 41, no. 4, pp. 397-400, April 1994. 
[105]. S. Mallat and W. L. Hwang, "Singularity detection and processing with 
wavelets," IEEE Transactions on Information Theory, vol. 38, no. 2, pp. 617-
643, March 1992. 
[106]. M. Bahoura, M. Hassani, and M. Hubin, "DSP implementation of wavelet
transform for real time ECG wave forms detection and heart rate analysis,"
Computer Methods and Programs in Biomedicine, vol. 52, no. 1, pp. 35-44,
1997.
[107]. S. Kadambe, R. Murray and G. F. Boudreaux-Bartels, "Wavelet transform-
based QRS complex detector," IEEE Transactions on Biomedical Engineering,
vol. 46, no. 7, pp. 838-848, July 1999.
[108]. Cuiwei Li, Chongxun Zheng and Changfeng Tai, "Detection of ECG
characteristic points using wavelet transforms," IEEE Transactions on
Biomedical Engineering, vol. 42, no. 1, pp. 21-28, Jan. 1995.
[109]. V. X. Afonso, W. J. Tompkins, T. Q. Nguyen and Shen Luo, "ECG beat
detection using filter banks," IEEE Transactions on Biomedical Engineering,
vol. 46, no. 2, pp. 192-202, Feb. 1999.
[110]. C. May, N. Hubing, and A. W. Hahn, "Wavelet transforms for
electrocardiogram processing," Biomedical Sciences Instrumentation, vol. 33,
pp. 1-6, 1997.
[111]. E. B. Mazomenos et al., "A Low-Complexity ECG Feature Extraction
Algorithm for Mobile Healthcare Applications," IEEE Journal of Biomedical
and Health Informatics, vol. 17, no. 2, pp. 459-469, March 2013.
[112]. Q. Xue, Y. H. Hu and W. J. Tompkins, "Neural-network-based adaptive
matched filtering for QRS detection," IEEE Transactions on Biomedical
Engineering, vol. 39, no. 4, pp. 317-329, April 1992.
[113]. Z. Zhao et al., "Noise Rejection for Wearable ECGs Using Modified
Frequency Slice Wavelet Transform and Convolutional Neural Networks,"
IEEE Access, vol. 7, pp. 34060-34067, 2019.
[114]. R. Silipo and C. Marchesi, "Artificial neural networks for automatic ECG
analysis," IEEE Transactions on Signal Processing, vol. 46, no. 5, pp. 1417-
1425, May 1998.
[115]. J. Kim and P. Mazumder, "Energy-Efficient Hardware Architecture of Self-
Organizing Map for ECG Clustering in 65-nm CMOS," IEEE Transactions on
Circuits and Systems II: Express Briefs, vol. 64, no. 9, pp. 1097-1101, Sept. 
2017. 
[116]. S. Ansari, J. Gryak and K. Najarian, "Noise Detection in Electrocardiography
Signal for Robust Heart Rate Variability Analysis: A Deep Learning
Approach," 2018 40th Annual International Conference of the IEEE
Engineering in Medicine and Biology Society (EMBC), Honolulu, HI, 2018,
pp. 5632-5635.
[117]. D. A. Coast, R. M. Stern, G. G. Cano and S. A. Briller, "An approach to
cardiac arrhythmia analysis using hidden Markov models," IEEE Transactions
on Biomedical Engineering, vol. 37, no. 9, pp. 826-836, Sept. 1990.
[118]. R. Poli, S. Cagnoni and G. Valli, "Genetic design of optimum linear and
nonlinear QRS detectors," IEEE Transactions on Biomedical Engineering, vol.
42, no. 11, pp. 1137-1141, Nov. 1995.
[119]. Zhou Song-Kai, Wang Jian-Tao and Xu Jun-Rong, "The real-time detection of
QRS-complex using the envelope of ECG," Proceedings of the Annual
International Conference of the IEEE Engineering in Medicine and Biology
Society, New Orleans, LA, USA, 1988, pp. 38 vol.1-.
[120]. K. Kumari, S. S. Sahu and R. K. Sinha, "R Peak Detection using Empirical
Mode Decomposition with Shannon Energy Envelope," 2018 Second
International Conference on Inventive Communication and Computational
Technologies (ICICCT), Coimbatore, 2018, pp. 550-554.
[121]. F. Zhang and Y. Lian, "QRS Detection Based on Multiscale Mathematical
Morphology for Wearable ECG Devices in Body Area Networks," IEEE
Transactions on Biomedical Circuits and Systems, vol. 3, no. 4, pp. 220-228,
Aug. 2009.
[122]. M. I. Khan, M. B. Hossain and A. F. M. N. Uddin, "Performance analysis of
modified zero crossing counts method for heart arrhythmias detection and
implementation in HDL," 2013 International Conference on Informatics,
Electronics and Vision (ICIEV), Dhaka, 2013, pp. 1-6.
[123]. G. B. Moody and R. G. Mark, "The impact of the MIT-BIH Arrhythmia
Database," IEEE Engineering in Medicine and Biology Magazine, vol. 20, no.
3, pp. 45-50, May-June 2001.
[124]. C. J. Deepu and Y. Lian, "A Joint QRS Detection and Data Compression
Scheme for Wearable Sensors," IEEE Transactions on Biomedical
Engineering, vol. 62, no. 1, pp. 165-175, Jan. 2015.
[125]. T. Tekeste, H. Saleh, B. Mohammad and M. Ismail, "Ultra-Low Power QRS 
Detection and ECG Compression Architecture for IoT Healthcare Devices," 
IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 66, no. 2, 
pp. 669-679, Feb. 2019. 
[126]. X. Zhang and Y. Lian, "A 300-mV 220-nW Event-Driven ADC With Real-
Time QRS Detection for Wearable ECG Sensors," IEEE Transactions on
Biomedical Circuits and Systems, vol. 8, no. 6, pp. 834-843, Dec. 2014.
[127]. T. Tekeste, H. Saleh, B. Mohammad, A. Khandoker and M. Ismail, "A Nano-
Watt ECG Feature Extraction Engine in 65-nm Technology," IEEE
Transactions on Circuits and Systems II: Express Briefs, vol. 65, no. 8, pp.
1099-1103, Aug. 2018.
[128]. T. Tekeste, H. Saleh, B. Mohammad, A. Khandoker, H. Jelinek and M. Ismail,
"A Nanowatt Real-Time Cardiac Autonomic Neuropathy Detector," IEEE
Transactions on Biomedical Circuits and Systems, vol. 12, no. 4, pp. 739-750,
Aug. 2018.
[129]. X. Liu, Y. Zheng, M. W. Phyu, F. N. Endru, V. Navaneethan and B. Zhao, "An
Ultra-Low Power ECG Acquisition and Monitoring ASIC System for WBAN
Applications," IEEE Journal on Emerging and Selected Topics in Circuits and
Systems, vol. 2, no. 1, pp. 60-70, March 2012.
[130]. C. J. Deepu, X. Y. Xu, D. L. T. Wong, C. H. Heng and Y. Lian, "A 2.3μW
ECG-On-Chip for Wireless Wearable Sensors," IEEE Transactions on Circuits
and Systems II: Express Briefs, vol. 65, no. 10, pp. 1385-1389, Oct. 2018.
[131]. S. A. H. Sabzevari, N. Ravanshad and H. Rezaee-Dehsorkh, "An Ultra-Low-
Power QRS-Detection System Based on Level-Crossing Sampling," Electrical
Engineering (ICEE), Iranian Conference on, Mashhad, 2018, pp. 1456-1461.
[132]. R. Xiao, M. Li, M. Law, P. Mak and R. P. Martins, "Ultra-low power QRS
detection using adaptive thresholding based on forward search interval
technique," 2017 International Conference on Electron Devices and Solid-
State Circuits (EDSSC), Hsinchu, 2017, pp. 1-2.
[133]. Y. Zou, J. Han, X. Weng and X. Zeng, "An Ultra-Low Power QRS Complex 
Detection Algorithm Based on Down-Sampling Wavelet Transform," IEEE 
Signal Processing Letters, vol. 20, no. 5, pp. 515-518, May 2013. 
[134]. S. Kiranyaz, T. Ince and M. Gabbouj, "Real-Time Patient-Specific ECG
Classification by 1-D Convolutional Neural Networks," IEEE Transactions on
Biomedical Engineering, vol. 63, no. 3, pp. 664-675, March 2016.
[135]. J. Paalasmaa, H. Toivonen and M. Partinen, "Adaptive Heartbeat Modeling for
Beat-to-Beat Heart Rate Measurement in Ballistocardiograms," IEEE Journal
of Biomedical and Health Informatics, vol. 19, no. 6, pp. 1945-1952, Nov.
2015.
[136]. H. Kim, R. F. Yazicioglu, P. Merken, C. Van Hoof and H. Yoo, "ECG Signal
Compression and Classification Algorithm With Quad Level Vector for ECG
Holter System," IEEE Transactions on Information Technology in
Biomedicine, vol. 14, no. 1, pp. 93-100, Jan. 2010.
[137]. H. Kim, R. F. Yazicioglu, T. Torfs, P. Merken, H. Yoo and C. Van Hoof, "A
low power ECG signal processor for ambulatory arrhythmia monitoring
system," 2010 Symposium on VLSI Circuits, Honolulu, HI, 2010, pp. 19-20.
[138]. X. Liu et al., "A 457 nW Near-Threshold Cognitive Multi-Functional ECG
Processor for Long-Term Cardiac Monitoring," IEEE Journal of Solid-State
Circuits, vol. 49, no. 11, pp. 2422-2434, Nov. 2014.
[139]. S. Hsu, Y. Ho, P. Chang, C. Su and C. Lee, "A 48.6-to-105.2 µW Machine
Learning Assisted Cardiac Sensor SoC for Mobile Healthcare Applications,"
IEEE Journal of Solid-State Circuits, vol. 49, no. 4, pp. 801-811, April 2014.
[140]. N. Bayasi, T. Tekeste, H. Saleh, B. Mohammad, A. Khandoker and M. Ismail,
"Low-Power ECG-Based Processor for Predicting Ventricular Arrhythmia,"
IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 24,
no. 5, pp. 1962-1974, May 2016.
[141]. M. Yasin, T. Tekeste, H. Saleh, B. Mohammad, O. Sinanoglu and M. Ismail,
"Ultra-Low Power, Secure IoT Platform for Predicting Cardiovascular
Diseases," IEEE Transactions on Circuits and Systems I: Regular Papers, vol.
64, no. 9, pp. 2624-2637, Sept. 2017.
[142]. Y. Xu, Z. Chen, F. Li and J. Meng, "A granular resampling method and
adaptive speculative mechanism based energy-efficient architecture for multi-
class heartbeat classification," IEEE Transactions on Computer-Aided Design
of Integrated Circuits and Systems. [Early Access]
[143]. S. K. Jain and B. Bhaumik, "An Energy Efficient ECG Signal Processor
Detecting Cardiovascular Diseases on Smartphone," IEEE Transactions on
Biomedical Circuits and Systems, vol. 11, no. 2, pp. 314-323, April 2017.
[144]. Testing And Reporting Performance Results Of Cardiac Rhythm And ST 
Segment Measurement Algorithms, ANSI/AAMI Standard EC57, 2012. 
[145]. Yu Hen Hu, S. Palreddy and W. J. Tompkins, "A patient-adaptable ECG beat
classifier using a mixture of experts approach," IEEE Transactions on
Biomedical Engineering, vol. 44, no. 9, pp. 891-900, Sept. 1997.
[146]. Philip de Chazal, M. O'Dwyer and R. B. Reilly, "Automatic classification of
heartbeats using ECG morphology and heartbeat interval features," IEEE
Transactions on Biomedical Engineering, vol. 51, no. 7, pp. 1196-1206, July
2004.
[147]. P. de Chazal and R. B. Reilly, "A Patient-Adapting Heartbeat Classifier Using
ECG Morphology and Heartbeat Interval Features," IEEE Transactions on
Biomedical Engineering, vol. 53, no. 12, pp. 2535-2543, Dec. 2006.
[148]. W. Jiang and S. G. Kong, "Block-Based Neural Networks for Personalized
ECG Signal Classification," IEEE Transactions on Neural Networks, vol. 18,
no. 6, pp. 1750-1761, Nov. 2007.
[149]. T. Ince, S. Kiranyaz and M. Gabbouj, "A Generic and Robust System for
Automated Patient-Specific Classification of ECG Signals," IEEE
Transactions on Biomedical Engineering, vol. 56, no. 5, pp. 1415-1426, May
2009.
[150]. J. Mark and T. Todd, "A Nonuniform Sampling Approach to Data
Compression," IEEE Transactions on Communications, vol. 29, no. 1, pp. 24-
32, Jan 1981.
[151]. F. J. Beutler, "Error-Free Recovery of Signals from Irregularly Spaced
Samples," SIAM Review, vol. 8, no. 3, pp. 328-335, 1966.
[152]. H. Inose, T. Aoki and K. Watanabe, "Asynchronous delta-modulation system,"
Electronics Letters, vol. 2, no. 3, pp. 95-96, March 1966.
[153]. N. Sayiner, H. V. Sorensen and T. R. Viswanathan, "A level-crossing sampling
scheme for A/D conversion," IEEE Transactions on Circuits and Systems II:
Analog and Digital Signal Processing, vol. 43, no. 4, pp. 335-339, Apr 1996.
[154]. K. M. Guan and A. C. Singer, "Opportunistic Sampling of Bursty Signals by
Level-Crossing - an Information Theoretical Approach," 2007 41st Annual
Conference on Information Sciences and Systems, Baltimore, MD, 2007, pp.
701-707.
[155]. M. Trakimas and S. R. Sonkusale, "An Adaptive Resolution Asynchronous
ADC Architecture for Data Compression in Energy Constrained Sensing
Applications," IEEE Transactions on Circuits and Systems I: Regular Papers,
vol. 58, no. 5, pp. 921-934, May 2011.
[156]. W. Kuang, "Correction and Comment on 'An Adaptive Resolution
Asynchronous ADC Architecture for Data Compression in Energy Constrained
Sensing Applications'," IEEE Transactions on Circuits and Systems I: Regular
Papers, vol. 60, no. 4, pp. 1097-1099, April 2013.
[157]. F. Akopyan, R. Manohar and A. B. Apsel, "A level-crossing flash
asynchronous analog-to-digital converter," 12th IEEE International
Symposium on Asynchronous Circuits and Systems (ASYNC'06), Grenoble,
2006, pp. 11 pp.-22.
[158]. P. W. Jungwirth and A. D. Poularikas, "Improved Sayiner level crossing
ADC," Proceedings of the Thirty-Sixth Southeastern Symposium on System
Theory, 2004, pp. 379-383.
[159]. K. M. Guan and A. C. Singer, "Sequential placement of reference levels in a
level-crossing analog-to-digital converter," 2008 42nd Annual Conference on
Information Sciences and Systems, Princeton, NJ, 2008, pp. 464-469.
[160]. K. Kozmin, J. Johansson and J. Delsing, "Level-Crossing ADC Performance
Evaluation Toward Ultrasound Application," IEEE Transactions on Circuits
and Systems I: Regular Papers, vol. 56, no. 8, pp. 1708-1719, Aug. 2009.
[161]. E. Allier, G. Sicard, L. Fesquet and M. Renaudin, "A new class of
asynchronous A/D converters based on time quantization," Proceedings of
Ninth International Symposium on Asynchronous Circuits and Systems, 2003,
pp. 196-205.
[162]. Y. Li, D. Zhao and W. A. Serdijn, "A Sub-Microwatt Asynchronous Level-
Crossing ADC for Biomedical Applications," IEEE Transactions on
Biomedical Circuits and Systems, vol. 7, no. 2, pp. 149-157, April 2013.
[163]. Y. Li, A. L. Mansano, Y. Yuan, D. Zhao and W. A. Serdijn, "An ECG
Recording Front-End With Continuous-Time Level-Crossing Sampling," IEEE
Transactions on Biomedical Circuits and Systems, vol. 8, no. 5, pp. 626-635,
Oct. 2014.
[164]. C. Weltin-Wu and Y. Tsividis, "An Event-driven Clockless Level-Crossing
ADC With Signal-Dependent Adaptive Resolution," IEEE Journal of Solid-
State Circuits, vol. 48, no. 9, pp. 2180-2190, Sept. 2013.
[165]. B. Schell and Y. Tsividis, "A Continuous-Time ADC/DSP/DAC System With
No Clock and With Activity-Dependent Power Dissipation," IEEE Journal of
Solid-State Circuits, vol. 43, no. 11, pp. 2472-2481, Nov. 2008.
[166]. T. Marisa et al., "Pseudo Asynchronous Level Crossing ADC for ECG Signal
Acquisition," IEEE Transactions on Biomedical Circuits and Systems, vol. 11,
no. 2, pp. 267-278, April 2017.
[167]. Y. Hou, K. Yousef, M. Atef, G. Wang and Y. Lian, "A 1-to-1-kHz, 4.2-to-544-
nW, Multi-Level Comparator Based Level-Crossing ADC for IoT
Applications," IEEE Transactions on Circuits and Systems II: Express Briefs,
vol. 65, no. 10, pp. 1390-1394, Oct. 2018.
[168]. Y. Hou et al., "A 61-nW Level-Crossing ADC With Adaptive Sampling for 
Biomedical Applications," IEEE Transactions on Circuits and Systems II: 
Express Briefs, vol. 66, no. 1, pp. 56-60, Jan. 2019. 
[169]. E. Sackinger and W. Guggenbuhl, "A versatile building block: the CMOS 
differential difference amplifier," IEEE Journal of Solid-State Circuits, vol. 22, 
no. 2, pp. 287-294, April 1987. 
[170]. J. F. Duque-Carrillo, G. Torelli, R. Perez-Aloe, J. M. Valverde and F. 
Maloberti, "Fully differential basic building blocks based on fully differential 
difference amplifiers with unity-gain difference feedback," IEEE Transactions 
on Circuits and Systems I: Fundamental Theory and Applications, vol. 42, no. 
3, pp. 190-192, March 1995. 
[171]. E. Vittoz and J. Fellrath, "CMOS analog integrated circuits based on weak
inversion operations," IEEE Journal of Solid-State Circuits, vol. 12, no. 3, pp.
224-231, June 1977.
[172]. Y. Chen, D. Jeon, Y. Lee, Y. Kim, Z. Foo, I. Lee, N. B. Langhals, G. Kruger,
H. Oral, O. Berenfeld, Z. Zhang, D. Blaauw and D. Sylvester, "An Injectable 64
nW ECG mixed-signal SoC in 65 nm for arrhythmia monitoring," IEEE
Journal of Solid-State Circuits, vol. 50, no. 1, pp. 375-390, Jan. 2015.
[173]. J. Zhang, H. Zhang, Q. Sun and R. Zhang, "A low-noise, low-power amplifier 
with current-reused OTA for ECG recordings," IEEE Transactions on 
Biomedical Circuits and Systems, vol. 12, no. 3, pp. 700-708, June 2018. 
[174]. M. Flethøj, J. K. Kanters, P. J. Pedersen, M. M. Haugaard, H. Carstensen, L. H.
Olsen and R. Buhl, "Appropriate threshold levels of cardiac beat-to-beat variation
in semi-automatic analysis of equine ECG recordings," BMC Veterinary
Research, vol. 12, no. 1, p. 266, Nov. 2016.
[175]. C.-I. Ieong, P.-I. Mak, C.-P. Lam, C. Dong, M.-I. Vai, P.-U. Mak, S.-H. Pun,
F. Wan, and R. P. Martins, "A 0.83-uW QRS detection processor using
quadratic spline wavelet transform for wireless ECG acquisition in 0.35-um
CMOS," IEEE Transactions on Biomedical Circuits and Systems, vol. 6, no. 6,
pp. 586–595, Dec. 2012.
[176]. R. Poli, S. Cagnoni, and G. Valli, "Genetic design of optimum linear and
nonlinear QRS detectors," IEEE Transactions on Biomedical Engineering, vol.
42, no. 11, pp. 1137–1141, Nov. 1995.
[177]. Y. Zhao, S. Lin, Z. Shang and Y. Lian, "Classification of Cardiac Arrhythmias 
Based on Artificial Neural Networks and Continuous-in-Time Discrete-in-
Amplitude Signal Flow," 2019 IEEE International Conference on Artificial 
Intelligence Circuits and Systems (AICAS), Hsinchu, Taiwan, 2019, pp. 175-
178. 
[178]. C. Fan et al., "An Improved Signed Digit Representation Approach for 
Constant Vector Multiplication," IEEE Transactions on Circuits and Systems 
II: Express Briefs, vol. 63, no. 10, pp. 999-1003, Oct. 2016. 
[179]. Jung-Yup Kang and J.-L. Gaudiot, "A Simple High-Speed Multiplier Design,"
IEEE Transactions on Computers, vol. 55, no. 10, pp. 1253-1258, Oct. 2006.
[180]. J. Fadavi-Ardekani, "M*N Booth encoded multiplier generator using 
optimized Wallace trees," IEEE Transactions on Very Large Scale Integration 
(VLSI) Systems, vol. 1, no. 2, pp. 120-125, June 1993. 
[181]. V. G. Oklobdzija, D. Villeger and S. S. Liu, "A method for speed optimized 
partial product reduction and generation of fast parallel multipliers using an 
algorithmic approach," IEEE Transactions on Computers, vol. 45, no. 3, pp. 
294-306, March 1996. 
[182]. Y. Zhao, Z. Shang and Y. Lian, "Auto Generation of High-Performance Fixed-
Point Multiplier for Artificial Neural Networks," 2019 IEEE International 
Conference on Artificial Intelligence Circuits and Systems (AICAS), Hsinchu, 
Taiwan, 2019, pp. 1-5. 
[183]. N. Weste and D. Harris, CMOS VLSI Design: A Circuits and Systems
Perspective, 4th ed. USA: Addison-Wesley, 2010.
[184]. A. Krogh and J. A. Hertz, "A simple weight decay can improve generalization,"
Advances in Neural Information Processing Systems, Denver, Colorado, USA,
Dec. 1991, pp. 950-957.
[185]. D. E. Rumelhart, G. E. Hinton, and R. J. Williams, "Learning representations
by back-propagating errors," Nature, vol. 323, no. 6088, pp. 533-536, Oct.
1986.
