    Impact of ECG data format on the performance of machine learning models for the prediction of myocardial infarction

    Background: We aim to determine which electrocardiogram (ECG) data format is optimal for machine learning (ML) modelling in the context of myocardial infarction prediction. A secondary objective is to evaluate the viability of using digitised ECG signals for ML modelling. Methods: Two ECG arrangements were used, displaying 10 s and 2.5 s of data for each lead. For each arrangement, conservative and speculative data cohorts were generated from the PTB-XL dataset, containing 8,358 and 11,621 ECGs, respectively. All ECGs were represented in three data formats: Signal ECGs, Image ECGs, and Extracted Signal ECGs. ML models were trained on each of the three data formats in both cohorts. Results: For ECGs containing 10 s of data, Signal and Extracted Signal ECGs were optimal and statistically similar, with AUCs [95% CI] of 0.971 [0.961, 0.981] and 0.974 [0.965, 0.984], respectively, for the conservative cohort, and 0.931 [0.918, 0.945] and 0.919 [0.903, 0.934], respectively, for the speculative cohort. For ECGs containing 2.5 s of data, the Image ECG format was optimal, with AUCs of 0.960 [0.948, 0.973] and 0.903 [0.886, 0.920] for the conservative and speculative cohorts, respectively. Conclusion: When available, Signal ECG data should be preferred for ML modelling. If not, the optimal format depends on the data arrangement within the ECG: if the Image ECG contains 10 s of data for each lead, the Extracted Signal ECG is optimal; if it contains only 2.5 s, the Image ECG is optimal for ML performance.
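The abstract reports each AUC with a 95% confidence interval but does not say how the intervals were computed. As an illustration only, the sketch below uses a percentile bootstrap, which is one common choice and is an assumption here, not the authors' stated method. The function names `auc` and `bootstrap_auc_ci` are hypothetical, and labels are assumed to be coded 0 (no infarction) and 1 (infarction).

```python
import random


def auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Return (auc, lower, upper) from a percentile bootstrap over examples.

    Assumption: this resampling scheme is illustrative; the source abstract
    does not specify how its confidence intervals were obtained.
    """
    point = auc(labels, scores)  # also validates that both classes are present
    rng = random.Random(seed)
    n = len(labels)
    samples = []
    while len(samples) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # skip resamples that contain only one class
        samples.append(auc(ys, [scores[i] for i in idx]))
    samples.sort()
    lower = samples[int(round(alpha / 2 * n_boot))]
    upper = samples[min(n_boot - 1, int(round((1 - alpha / 2) * n_boot)) - 1)]
    return point, lower, upper
```

With real data, `labels` would hold the ground-truth diagnoses for a held-out test set and `scores` the model's predicted probabilities. The pairwise AUC above is quadratic in the number of examples, so a rank-based implementation would be preferable for cohorts of the size described.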

    Digital ECG signal extraction using image processing techniques (Extração de sinal digital de ECG utilizando técnicas de processamento de imagens)

    Undergraduate thesis (Trabalho de Conclusão de Curso), Universidade de Brasília, Faculdade UnB Gama, Engenharia Eletrônica (Electronic Engineering), 2021. According to the World Health Organization (WHO), cardiovascular diseases are the leading cause of death worldwide. Fast, accurate diagnosis of these diseases is very important for patient treatment, and analysis of the electrocardiogram (ECG) has been one of the most widely used diagnostic tools since its invention. These records also need to be consulted periodically by specialists. However, most existing ECG records are still available only in printed form, which makes it difficult to preserve, analyze, and share patients' clinical information. A tool capable of recovering the ECG signal from a digital image would therefore be very useful to clinics and hospitals. This work proposes a computational tool that extracts the digital signal from digital images containing the ECG leads, developed in Python with open libraries such as OpenCV, SciPy and Pandas and digital image processing techniques. The main goal is to obtain a one-dimensional digital signal, containing time and amplitude, from digitized images of ECG leads. A method is also proposed for identifying the QRS complexes, and consequently the individual's heart rate, using a modified version of the Pan-Tompkins algorithm. Validation tests were performed on a total of 180 images obtained from the online PTB Diagnostic ECG Database using the online tool PhysioBank ATM. Comparing the original signal with the extracted signal, the proposed algorithm obtained a mean linear correlation coefficient of 0.88 and a mean absolute error of 0.0446 mV, and identified heart rate with a mean percentage error of 1.91% (0.68% disregarding 5 outliers). Based on the mean percentage error of 0.68%, the algorithm achieved an accuracy of 99.32% in detecting heart rate, comparable to (or even better than) results reported in the literature.
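The validation figures in this abstract (linear correlation, mean absolute error in mV, and heart-rate percentage error) can be illustrated with a short sketch. This is not the thesis's code: the function names are hypothetical, the two signals are assumed to be already aligned and sampled at the same instants, and the R-peak times are assumed to come from some QRS detector (for example a Pan-Tompkins-style one), which is not implemented here. The reported accuracy of 99.32% corresponds to 100% minus the 0.68% mean percentage error.

```python
import math


def pearson(x, y):
    """Pearson linear correlation coefficient between two equal-length signals."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def mean_absolute_error(x, y):
    """Mean absolute difference, in the signals' own unit (mV in the abstract)."""
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)


def heart_rate_bpm(r_peak_times_s):
    """Heart rate in beats per minute from R-peak times given in seconds."""
    intervals = [b - a for a, b in zip(r_peak_times_s, r_peak_times_s[1:])]
    return 60.0 / (sum(intervals) / len(intervals))


def percent_error(reference, estimate):
    """Absolute percentage error of an estimate relative to a reference value."""
    return abs(estimate - reference) / abs(reference) * 100.0
```

In use, `pearson` and `mean_absolute_error` would compare the original database signal with the signal extracted from its image, and `percent_error` would compare the heart rates computed from each.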