The electrocardiogram (ECG) has been in use for over 100 years and remains
the most widely performed diagnostic test to characterize cardiac structure and
electrical activity. We hypothesized that parallel advances in computing power,
innovations in machine learning algorithms, and availability of large-scale
digitized ECG data would enable extending the utility of the ECG beyond its
current limitations, while at the same time preserving interpretability, which
is fundamental to medical decision-making. We identified 36,186 ECGs from the
UCSF database that 1) were in normal sinus rhythm and 2) would enable training
of specific models for estimation of cardiac structure or function or detection
of disease. We derived a novel model for ECG segmentation using convolutional
neural networks (CNN) and Hidden Markov Models (HMM) and evaluated its output
by comparing electrical interval estimates to 141,864 measurements from the
clinical workflow. We built a 725-element patient-level ECG profile using
downsampled segmentation data and trained machine learning models to estimate
left ventricular mass, left atrial volume, and mitral annulus e' velocity, and to detect and
track four diseases: pulmonary arterial hypertension (PAH), hypertrophic
cardiomyopathy (HCM), cardiac amyloid (CA), and mitral valve prolapse (MVP).
CNN-HMM derived ECG segmentation agreed with clinical estimates, with median
absolute deviations (MAD) as a fraction of observed value of 0.6% for heart
rate and 4% for QT interval. Patient-level ECG profiles enabled quantitative
estimates of left ventricular mass and mitral annulus e' velocity with good
discrimination in binary classification models of left ventricular hypertrophy
and diastolic function. AUROC for the disease detection models ranged from 0.94
to 0.77, the lowest being for MVP. Top-ranked variables for all models included known ECG
characteristics along with novel predictors of these traits/diseases.

Comment: 13 pages, 6 figures, 1 Table + Supplemen
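
The agreement metric reported above (median absolute deviation as a fraction of observed value) can be illustrated with a short sketch. This is one plausible reading of the metric, not necessarily the exact computation used in the study: it assumes the deviation for each ECG is |model estimate - clinical value| / clinical value, and that the median is taken over all paired measurements. The function name `mad_fraction` and the sample QT values are hypothetical.

```python
from statistics import median


def mad_fraction(estimates, observed):
    """Median of |estimate - observed| / observed over paired measurements.

    Assumption: the clinical (observed) value is the denominator, so the
    result is a unitless fraction (multiply by 100 for a percentage).
    """
    if len(estimates) != len(observed):
        raise ValueError("estimates and observed must have the same length")
    if not estimates:
        raise ValueError("at least one paired measurement is required")

    fractions = []
    for est, obs in zip(estimates, observed):
        if obs == 0:
            raise ValueError("observed values must be non-zero")
        fractions.append(abs(est - obs) / abs(obs))
    return median(fractions)


# Hypothetical QT intervals in milliseconds (illustrative values only).
model_qt = [402, 388, 415, 430]
clinical_qt = [400, 400, 400, 400]
print(f"MAD as fraction of observed: {mad_fraction(model_qt, clinical_qt):.4f}")
```

Using the median rather than the mean keeps a handful of badly segmented beats from dominating the summary, which matters when comparing against a large number of routine clinical measurements.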