3 research outputs found
Impact of ECG data format on the performance of machine learning models for the prediction of myocardial infarction
Background: We aim to determine which electrocardiogram (ECG) data format is optimal for ML modelling in the context of myocardial infarction prediction. We also address the auxiliary objective of evaluating the viability of using digitised ECG signals for ML modelling. Methods: Two ECG arrangements, displaying 10 s and 2.5 s of data for each lead, were used. For each arrangement, conservative and speculative data cohorts were generated from the PTB-XL dataset. All ECGs were represented in three data formats: Signal ECGs, Image ECGs, and Extracted Signal ECGs, with 8358 and 11,621 ECGs in the conservative and speculative cohorts, respectively. ML models were trained using the three data formats in both data cohorts. Results: For ECGs that contained 10 s of data, Signal and Extracted Signal ECGs were optimal and statistically similar, with AUCs [95% CI] of 0.971 [0.961, 0.981] and 0.974 [0.965, 0.984], respectively, for the conservative cohort, and 0.931 [0.918, 0.945] and 0.919 [0.903, 0.934], respectively, for the speculative cohort. For ECGs that contained 2.5 s of data, the Image ECG format was optimal, with AUCs of 0.960 [0.948, 0.973] and 0.903 [0.886, 0.920] for the conservative and speculative cohorts, respectively. Conclusion: When available, Signal ECG data should be preferred for ML modelling. If it is not available, the optimal format depends on the data arrangement within the ECG: if the Image ECG contains 10 s of data for each lead, the Extracted Signal ECG is optimal; if it contains only 2.5 s, the Image ECG is optimal for ML performance.
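The results above are reported as AUCs with 95% confidence intervals. The abstract does not say how the intervals were computed, so the following is only a minimal sketch that assumes a percentile bootstrap, one common choice. The function names `roc_auc` and `bootstrap_auc_ci`, the 0/1 label encoding, and the toy labels and scores are illustrative assumptions and are not taken from the paper.

```python
import random


def roc_auc(labels, scores):
    """AUC as the probability that a positive case outscores a negative one.

    Assumes labels are 0 (no MI) or 1 (MI); ties count as half a win.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (assumed method, see lead-in)."""
    rng = random.Random(seed)
    n = len(labels)
    aucs = []
    while len(aucs) < n_resamples:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        # Skip resamples that contain only one class: AUC is undefined there.
        if 0 < sum(ys) < n:
            aucs.append(roc_auc(ys, [scores[i] for i in idx]))
    aucs.sort()
    lower = aucs[int((alpha / 2) * n_resamples)]
    upper = aucs[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return lower, upper


# Toy data, invented for illustration only.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)
ci_low, ci_high = bootstrap_auc_ci(labels, scores, n_resamples=200)
```

Under this reading, two formats would be called statistically similar when their intervals overlap substantially, as with the Signal and Extracted Signal ECGs in the conservative cohort; a formal comparison would need a paired test, which this sketch does not implement.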
ECG digital signal extraction using image processing techniques (original title: Extração de sinal digital de ECG utilizando técnicas de processamento de imagens)
Undergraduate final project (Trabalho de Conclusão de Curso), Universidade de Brasília, Faculdade UnB Gama, Electronic Engineering, 2021.
According to the World Health Organization (WHO), cardiovascular diseases are the leading cause of death worldwide. Fast and accurate diagnosis of these diseases is very important for treating patients, and analysis of the electrocardiogram (ECG) has been one of the most widely used diagnostic tools since its invention. In addition, specialists need to consult these records from time to time. However, most existing ECG exams are still available only in printed form, which makes it difficult to preserve, analyze, and share patients’ clinical information. A tool capable of obtaining the ECG signal from a digital image would therefore be very useful for health clinics and hospitals. This work proposes a computational tool that extracts the digital signal from digital images containing the ECG leads, developed in Python with open libraries such as OpenCV, SciPy, and Pandas and with digital image processing techniques. The main goal is to obtain a one-dimensional digital signal, containing time and amplitude, from digitized images of ECG leads. A method is also proposed for identifying the QRS complexes, and consequently the individual's heart rate, using a modified version of the Pan-Tompkins algorithm. Validation tests were performed on a total of 180 images obtained from the online PTB Diagnostic ECG Database using the online tool PhysioBank ATM. Comparing the original signal with the extracted signal, the proposed algorithm obtained an average linear correlation coefficient of 0.88 and a mean absolute error of 0.0446 mV, and it identified the individuals' heart rate with an average percentage error of 1.91% (0.68% when 5 outliers are disregarded). Taking the average percentage error of 0.68% as the basis, the algorithm achieved an accuracy of 99.32% in detecting heart rate, which is comparable to (or even better than) results reported in the literature.
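The validation above rests on three quantities: the linear correlation coefficient and the mean absolute error between the original and extracted signals, and the percentage error of the detected heart rate. A minimal sketch of how such metrics can be computed follows. It is not the author's code: the function names, the assumption that both signals are already aligned and sampled at the same instants, and the sample values are all hypothetical.

```python
import math


def pearson_r(x, y):
    """Linear (Pearson) correlation between two equal-length signals."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("signals must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def mean_absolute_error(x, y):
    """Mean absolute amplitude difference, in the signals' unit (e.g. mV)."""
    if len(x) != len(y) or not x:
        raise ValueError("signals must be non-empty and of equal length")
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)


def heart_rate_bpm(r_peak_times_s):
    """Heart rate from R-peak times in seconds, via the mean RR interval."""
    rr = [t2 - t1 for t1, t2 in zip(r_peak_times_s, r_peak_times_s[1:])]
    if not rr:
        raise ValueError("at least two R peaks are required")
    return 60.0 / (sum(rr) / len(rr))


def percentage_error(reference, estimate):
    """Absolute error of the estimate relative to the reference, in percent."""
    return abs(estimate - reference) / reference * 100.0


# Hypothetical samples (mV) and R-peak times (s), for illustration only.
original = [0.00, 0.10, 0.90, 0.20, 0.00]
extracted = [0.02, 0.12, 0.85, 0.18, 0.01]
r = pearson_r(original, extracted)
mae_mv = mean_absolute_error(original, extracted)
hr_error = percentage_error(heart_rate_bpm([0.0, 0.8, 1.6, 2.4]),
                            heart_rate_bpm([0.0, 0.81, 1.62, 2.43]))
```

The abstract's accuracy figure is the complement of the percentage error: 100% minus 0.68% gives 99.32%. The R-peak times fed to `heart_rate_bpm` would come from a QRS detector such as the modified Pan-Tompkins algorithm mentioned above, which this sketch does not implement.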