224 research outputs found

    Deep Learning in Cardiology

    The medical field is creating large amounts of data that physicians are unable to decipher and use efficiently. Moreover, rule-based expert systems are inefficient at solving complicated medical tasks or at creating insights from big data. Deep learning has emerged as a more accurate and effective technology for a wide range of medical problems such as diagnosis, prediction, and intervention. Deep learning is a representation learning method consisting of layers that transform the data non-linearly, thus revealing hierarchical relationships and structures. In this review we survey deep learning application papers that use structured data, signal, and imaging modalities from cardiology. We discuss the advantages and limitations of applying deep learning in cardiology, which also apply to medicine in general, and propose certain directions as the most viable for clinical use.
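    As a minimal sketch of what "layers that transform the data non-linearly" means in practice, the toy network below stacks dense layers with non-linear activations over a vector of structured clinical features. Every size and name here is an illustrative assumption, not taken from the review.

```python
# Toy feedforward network: each hidden layer applies a non-linear transform,
# building progressively more abstract representations of the input features.
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(32,)),             # e.g., 32 structured clinical features
    layers.Dense(64, activation="relu"),   # first non-linear transformation
    layers.Dense(32, activation="relu"),   # deeper, more abstract representation
    layers.Dense(1, activation="sigmoid"), # e.g., probability of a diagnosis
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["AUC"])
model.summary()
```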

    Clinical applications of artificial intelligence in cardiology on the verge of the decade

    Artificial intelligence (AI) has been hailed as the fourth industrial revolution, and its influence on people's lives is increasing. Research on AI applications in medicine is progressing rapidly. This revolution promises more precise diagnoses, streamlined workflows, increased accessibility to healthcare services, and new insights into ever-growing population-wide datasets. While some applications have already found their way into contemporary patient care, we are still in the early days of the AI era in medicine. Despite the popularity of these new technologies, many practitioners lack an understanding of AI methods, their benefits, and their pitfalls. This review aims to provide information about the general concepts of machine learning (ML), with a special focus on the applications of such techniques in cardiovascular medicine. It also sets out current trends in research related to medical applications of AI. Along with new possibilities, new threats arise; acknowledging and understanding them is as important as understanding the ML methodology itself. Therefore, attention is also paid to current opinions and guidelines regarding the validation and safety of AI-powered tools.

    Deep Risk Prediction and Embedding of Patient Data: Application to Acute Gastrointestinal Bleeding

    Acute gastrointestinal bleeding is a common and costly condition, accounting for over 2.2 million hospital days and 19.2 billion dollars of medical charges annually. Risk stratification is a critical part of the initial assessment of patients with acute gastrointestinal bleeding. Although all national and international guidelines recommend the use of risk-assessment scoring systems, they are not commonly used in practice, have sub-optimal performance, may be applied incorrectly, and are not easily updated. With the advent of widespread electronic health record adoption, longitudinal clinical data captured during the clinical encounter are now available. However, these data are often noisy, sparse, and heterogeneous. Unsupervised machine learning algorithms may be able to identify structure within electronic health record data while accounting for key issues with the data generation process: measurements missing not at random and information captured in unstructured clinical note text. Deep learning tools can create electronic health record-based models that perform better than clinical risk scores for gastrointestinal bleeding and are well suited to learning from new data. Furthermore, these models can be used to predict risk trajectories over time, leveraging the longitudinal nature of the electronic health record. The foundation of creating relevant tools is the definition of a relevant outcome measure; in acute gastrointestinal bleeding, a composite outcome of red blood cell transfusion, hemostatic intervention, and all-cause 30-day mortality is a relevant, actionable outcome that reflects the need for hospital-based intervention. However, epidemiological trends may affect the relevance and effectiveness of the outcome measure when applied across multiple settings and patient populations. Understanding the trends in practice, potential areas of disparities, and the value proposition for using risk stratification in patients presenting to the Emergency Department with acute gastrointestinal bleeding is important for understanding how best to implement a robust, generalizable risk stratification tool. Key findings include a decrease in the rate of red blood cell transfusion since 2014 and disparities in access to upper endoscopy for patients with upper gastrointestinal bleeding by race/ethnicity across urban and rural hospitals. Projected accumulated savings from consistent implementation of risk stratification tools for upper gastrointestinal bleeding total approximately $1 billion 5 years after implementation. Most current risk scores were designed for use based on the location of the bleeding source: upper or lower gastrointestinal tract. However, the location of the bleeding source is not always clear at presentation. I develop and validate electronic health record-based deep learning and machine learning tools for patients presenting with symptoms of acute gastrointestinal bleeding (e.g., hematemesis, melena, hematochezia), which is more relevant and useful in clinical practice. I show that they outperform the leading clinical risk scores for upper and lower gastrointestinal bleeding, the Glasgow-Blatchford score and the Oakland score. While the best-performing gradient boosted decision tree model has overall performance equivalent to the fully connected feedforward neural network model, at the very low risk threshold of 99% sensitivity the deep learning model identifies more very low risk patients.
    Using another deep learning model that can model longitudinal risk, a long short-term memory recurrent neural network, the need for red blood cell transfusion can be predicted at every 4-hour interval in the first 24 hours of the intensive care unit stay for high-risk patients with acute gastrointestinal bleeding. Finally, for implementation it is important to find patients with symptoms of acute gastrointestinal bleeding in real time and to characterize patients by risk using available data in the electronic health record. A decision rule-based electronic health record phenotype has performance equivalent, as measured by positive predictive value, to deep learning and natural language processing-based models, and after live implementation it appears to have increased the use of the Acute Gastrointestinal Bleeding Clinical Care pathway. Patients with acute gastrointestinal bleeding but with other groups of disease concepts can be differentiated by directly mapping unstructured clinical text to a common ontology and treating the vector of concepts as signals on a knowledge graph; these patients can be differentiated using unbalanced diffusion earth mover's distances on the graph. For electronic health record data with data missing not at random, MURAL, an unsupervised random forest-based method, handles data with missing values and generates visualizations that characterize patients with gastrointestinal bleeding. This thesis forms a basis for understanding the potential of machine learning and deep learning tools to characterize risk for patients with acute gastrointestinal bleeding. In the future, these tools may be critical in implementing integrated risk assessment to keep low-risk patients out of the hospital and to guide resuscitation and timely endoscopic procedures for patients at higher risk of clinical decompensation.
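    The 99%-sensitivity triage idea above can be made concrete with a small sketch: pick the decision threshold at which a model still catches at least 99% of patients with the composite outcome, then treat everyone scored below it as very low risk. This is a generic illustration with simulated scores, not the thesis code; all names are hypothetical.

```python
# Minimal sketch: choose the threshold where sensitivity (TPR) first reaches
# 99%, then flag patients scoring below it as very low risk.
import numpy as np
from sklearn.metrics import roc_curve

rng = np.random.default_rng(0)
y_true = rng.binomial(1, 0.10, 5000)                                # ~10% outcomes (toy)
y_score = np.clip(0.5 * y_true + rng.normal(0.3, 0.2, 5000), 0, 1)  # toy risk scores

fpr, tpr, thresholds = roc_curve(y_true, y_score)
idx = np.argmax(tpr >= 0.99)         # first (highest) threshold with >= 99% sensitivity
threshold = thresholds[idx]
very_low_risk = y_score < threshold  # patients triaged as very low risk
print(f"threshold={threshold:.3f}, sensitivity={tpr[idx]:.3f}, "
      f"{very_low_risk.mean():.1%} of patients below threshold")
```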

    The Quality Application of Deep Learning in Clinical Outcome Predictions Using Electronic Health Record Data: A Systematic Review

    Introduction: The Electronic Health Record (EHR) is a significant source of medical data that can be used to develop predictive models with therapeutically useful outcomes. Predictive modelling using EHR data has been increasingly utilized in healthcare, achieving outstanding performance and improving healthcare outcomes. Objectives: The main goal of this review is to examine the different deep learning approaches and techniques applied to EHR data processing. Methods: The PubMed database was searched for potentially pertinent articles that used deep learning on EHR data. Using EHR data, we assessed and summarized deep learning performance in a number of clinical applications that focus on predicting specific clinical outcomes, and we compared the results with those of conventional machine learning models. Results: A total of 57 papers were selected for this study. Five clinical outcome predictions were identified: illness (n=33), intervention (n=6), mortality (n=5), hospital readmission (n=7), and duration of stay (n=1). The majority of studies (39 of 57) used structured EHR data. RNNs were the most frequently used deep learning models (LSTM: 17 studies; GRU: 6 studies). The analysis shows that deep learning models have excelled when applied to a variety of clinical outcome predictions. While the application of deep learning to EHR data has advanced rapidly, it is crucial that these models remain reliable, offering critical insights to assist clinicians in making informed decisions. Conclusions: The findings demonstrate that deep learning can outperform classic machine learning techniques, since it has the advantage of utilizing extensive and sophisticated datasets, such as the longitudinal data found in EHRs. We think that deep learning will keep expanding because it has been quite successful in enhancing healthcare outcomes using EHR data.

    Explainable artificial intelligence model to predict acute critical illness from electronic health records

    We developed an explainable artificial intelligence (AI) early warning score (xAI-EWS) system for the early detection of acute critical illness. While maintaining high predictive performance, our system explains to the clinician which relevant electronic health record (EHR) data the prediction is grounded in. Acute critical illness is often preceded by deterioration of routinely measured clinical parameters, e.g., blood pressure and heart rate. Early clinical prediction is typically based on manually calculated screening metrics that simply weigh these parameters, such as Early Warning Scores (EWS). The predictive performance of EWSs involves a tradeoff between sensitivity and specificity that can lead to negative outcomes for the patient. Previous work on EHR-trained AI systems offers promising results, with high levels of predictive performance for the early, real-time prediction of acute critical illness. However, without insight into the complex decisions made by such a system, clinical translation is hindered. In this letter, we present our xAI-EWS system, which potentiates clinical translation by accompanying each prediction with information on the EHR data that explain it.
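    As a generic illustration of grounding a prediction in the EHR features that drive it, the sketch below computes SHAP values for a gradient boosting classifier on synthetic data. This is a stand-in for the idea only, not the authors' xAI-EWS implementation, and every name in it is assumed.

```python
# Generic feature-attribution sketch: SHAP values indicate how much each
# feature pushed an individual prediction up or down.
import shap
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

X, y = make_classification(n_samples=1000, n_features=8, random_state=0)
model = GradientBoostingClassifier(random_state=0).fit(X, y)

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X[:5])  # per-patient, per-feature contributions
print(shap_values[0])                       # attributions for the first prediction
```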

    A hybrid approach combining the Cox model with deep learning-based survival analysis to improve the performance of cardiovascular disease risk assessment models: a study using linked healthcare and environmental data

    Doctoral dissertation, Department of Biomedical Sciences, College of Medicine, Seoul National University Graduate School, August 2020 (advisor: Sang Min Park). Background and aims: The contribution of different cardiovascular disease (CVD) risk factors to risk evaluation and predictive modeling for incident CVD is often debated. Also, despite the increasing availability of relevant data sources, it is uncertain to what extent data on CVD risk factors from multiple data categories should be collected for comprehensive risk assessment and predictive modeling of CVD risk using survival analysis. This study aimed to evaluate the contribution of different data categories, derived from integrated data on healthcare and environmental exposure, to risk evaluation and prediction models for CVD risk, using deep learning-based survival analysis combined with Cox proportional hazards regression as well as Cox proportional hazards regression alone. Methods: Information on a comprehensive list of CVD risk factors was collected from systematic reviews of the variables included in conventional CVD risk assessment tools and observational studies in medical literature databases (PubMed and Embase). Each risk factor was screened for availability in the National Health Insurance Service-National Sample Cohort (NHIS-NSC), linked via residential area code to environmental exposure data on cumulative particulate matter and urban green space. Individual records of 137,249 patients aged 40 years or older who underwent the biennial national health screening between 2009 and 2010 without a previous history of CVD were followed up for incident CVD events from January 1, 2011 to December 31, 2013 in the NHIS-NSC with data linkage to environmental exposure. Statistics-based variable selection methods were implemented as follows: statistical significance, the subset with the minimum (best) Akaike Information Criterion (AIC), the variables selected by regularized Cox proportional hazards regression with an elastic net penalty, and finally a variable set that meets all the criteria of the abovementioned statistical methods. Prediction models using a Cox proportional hazards deep neural network (DeepSurv) and Cox proportional hazards regression were constructed on the training set (80% of the total sample) using input feature sets selected with the abovementioned strategies, progressively adding input features by data category to examine the relative contribution of each data type to predictive performance for CVD risk. Performance of the DeepSurv and Cox proportional hazards regression models for CVD risk was evaluated on the test set (20% of the total sample) with Uno's concordance statistic (C-index), an up-to-date evaluation metric for survival models with right-censored data. Results: After the comprehensive review, data synthesis, and availability check, a total of 31 risk factors in the categories of sociodemographics, clinical laboratory tests and measurements, lifestyle behavior, family history, underlying medical conditions, dental health, medication, and environmental exposure were identified in the NHIS-NSC linked to environmental exposure data.
    Among the models constructed with different variable selection methods, using the statistically significant variables for DeepSurv (Uno's C-index: 0.7069) and all of the variables for Cox proportional hazards regression (Uno's C-index: 0.7052) showed improved predictive performance for CVD risk, a statistically significant increase (p-value for the difference in Uno's C-index: <0.0001 for both comparisons) compared to the models with basic clinical factors (age, sex, and body mass index). When all variables, or the statistically significant variables, in each data category from sociodemographics to environmental exposure were progressively added as input features to DeepSurv and Cox proportional hazards regression for predictive modeling of CVD risk, the DeepSurv model with the statistically significant variables from the sociodemographic, clinical laboratory test and measurement, and lifestyle behavior categories showed notable performance, outperforming the Cox proportional hazards regression model with statistically significant variables added up to the medication category. Extensive data linkage to environmental exposure data on cumulative particulate matter and urban green space offered only marginal improvement in the predictive performance of the DeepSurv and Cox proportional hazards regression models for CVD risk. Conclusion: To obtain the best predictive performance of the DeepSurv model for CVD risk with a minimum number of input features, information on sociodemographics, clinical laboratory tests and measurements, and lifestyle behavior should be collected first and used as input features in the NHIS-NSC. Also, the overall performance of DeepSurv for CVD risk assessment improved with a hybrid approach that uses the statistically significant variables from Cox proportional hazards regression as input features. When all the data categories in the NHIS-NSC linked to environmental exposure data are available, progressively adding the variables in each data category could incrementally increase the predictive performance of the DeepSurv model for CVD risk with the hybrid approach. Data linkage to environmental exposure via residential area code in the NHIS-NSC offered only marginally improved performance for CVD risk in both the DeepSurv model with the hybrid approach and the Cox proportional hazards regression model.
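    A minimal sketch of the hybrid selection step described above: fit a Cox proportional hazards model, keep the statistically significant variables, and pass them on as the input features of a DeepSurv-style network. The data file and column names are hypothetical, and the DeepSurv hand-off is indicated only in a comment.

```python
# Hybrid approach sketch: Cox regression screens variables; the significant
# ones become the input features for a DeepSurv-style neural network.
import pandas as pd
from lifelines import CoxPHFitter

df = pd.read_csv("nhis_nsc_features.csv")  # hypothetical cohort extract

cph = CoxPHFitter()
cph.fit(df, duration_col="followup_years", event_col="cvd_event")

# Keep variables significant at p < 0.05 in the Cox model.
significant = cph.summary.index[cph.summary["p"] < 0.05].tolist()
print("Selected input features:", significant)

# df[significant] would then feed a Cox proportional hazards deep neural
# network (e.g., pycox's DeepSurv-style CoxPH model), evaluated with Uno's
# C-index (e.g., sksurv.metrics.concordance_index_ipcw) on a held-out set.
```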

    A CNN-LSTM for predicting mortality in the ICU

    Accurate mortality prediction is crucial to healthcare, as it provides an empirical risk estimate for prognostic decision making, patient stratification, and hospital benchmarking. The prediction methods currently in practice are severity-of-disease scoring systems that usually involve a fixed set of admission attributes and summarized physiological data. These systems are prone to bias and require substantial manual effort, which necessitates an updated approach that can account for most of these shortcomings. Clinical observation notes record highly subjective data on the patient that can possibly facilitate higher discrimination. Moreover, deep learning models can automatically extract and select features without human input. This thesis investigates the potential of combining a deep learning model with notes to predict mortality with higher accuracy. A custom architecture, called CNN-LSTM, is conceptualized for mapping the multiple notes compiled during a hospital stay to a mortality outcome. It employs both convolutional and recurrent layers, with the former capturing semantic relationships within individual notes independently and the latter capturing temporal relationships between notes over a hospital stay. This approach is compared to three severity-of-disease scoring systems in a case study on the MIMIC-III dataset. Experiments are set up to assess the CNN-LSTM for predicting mortality using only the notes from the first 24, 12, and 48 hours of a patient's stay. The model is trained using K-fold cross-validation with k=5, and the mortality probability calculated by the three severity scores on the held-out set is used as the baseline. The CNN-LSTM outperforms the baseline in all experiments, which serves as a proof of concept of how notes and deep learning can improve outcome prediction.
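    A minimal sketch of such a two-level architecture, assuming integer-encoded notes of fixed shape (notes per stay x tokens per note); all sizes are illustrative assumptions, not the thesis implementation.

```python
# Two-level CNN-LSTM sketch: a Conv1D encoder is applied to each note
# independently (via TimeDistributed), then an LSTM reads the sequence of
# note vectors to capture temporal structure across the stay.
from tensorflow.keras import layers, models

MAX_NOTES, MAX_TOKENS, VOCAB, EMB = 30, 500, 20000, 100

notes_in = layers.Input(shape=(MAX_NOTES, MAX_TOKENS), dtype="int32")
emb = layers.Embedding(VOCAB, EMB)(notes_in)                           # (notes, tokens, emb)
conv = layers.TimeDistributed(layers.Conv1D(64, 3, activation="relu"))(emb)
note_vecs = layers.TimeDistributed(layers.GlobalMaxPooling1D())(conv)  # one vector per note
stay_vec = layers.LSTM(64)(note_vecs)                                  # temporal relations across notes
out = layers.Dense(1, activation="sigmoid")(stay_vec)                  # mortality probability

model = models.Model(notes_in, out)
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["AUC"])
model.summary()
```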

    Contributions of machine learning techniques to cardiology: prediction of restenosis after coronary stent implantation

    Background: Few topics are as current as the capacity of today's technology to develop the same abilities as human beings, even in medicine. This ability of machines or computer systems to simulate human intelligence processes is what we now know as artificial intelligence. One of the fields of artificial intelligence with the greatest application in medicine today is prediction, recommendation, and diagnosis, where machine learning techniques are applied. There is also growing interest in precision medicine, where machine learning techniques can offer individualized medical care to each patient. Percutaneous coronary intervention (PCI) with stenting has become standard practice in the revascularization of coronary vessels with significant obstructive atherosclerotic disease. PCI is also the gold-standard treatment in patients with acute myocardial infarction, reducing rates of death and recurrent ischemia compared with medical therapy. The long-term success of the procedure is limited by in-stent restenosis, a pathological process that causes recurrent arterial narrowing at the PCI site. Identifying which patients will develop restenosis is an important clinical challenge, since it can present as a new acute myocardial infarction or force a new revascularization of the affected vessel, and recurrent restenosis represents a therapeutic challenge. Objectives: After reviewing artificial intelligence techniques applied to medicine and, in greater depth, machine learning techniques applied to cardiology, the main objective of this doctoral thesis was to develop a machine learning model to predict restenosis in patients with acute myocardial infarction undergoing PCI with stent implantation. Secondary objectives were to compare the machine learning model with the classic restenosis risk scores used to date, and to develop software that easily brings this contribution into daily clinical practice. To develop an easily applicable model, we made our predictions without any variables beyond those obtained in routine practice. Material: The dataset, obtained from the GRACIA-3 trial, consisted of 263 patients with demographic, clinical, and angiographic characteristics; 23 of them presented restenosis 12 months after stent implantation. All development was done in Python, using cloud computing, specifically AWS (Amazon Web Services). Methods: We used a methodology suited to small, imbalanced datasets, in which the nested cross-validation scheme was important, as was the use of precision-recall (PR) curves, in addition to ROC curves, to interpret the models. The most common algorithms in the literature were trained in order to choose the one with the best performance. Results: The best-performing model was built with an extremely randomized trees classifier, which significantly outperformed, with an area under the ROC curve of 0.77, the three classic clinical scores: PRESTO-1 (0.58), PRESTO-2 (0.58), and TLR (0.62).
    The precision-recall curves offered a more accurate picture of the performance of the extremely randomized trees model, showing an efficient algorithm (0.96) for non-restenosis, with high recall and high precision. At a threshold considered optimal, out of 1,000 patients undergoing stent implantation, our machine learning model would correctly predict 181 (18%) more cases than the best classic risk score (TLR). The most important variables, ranked by their contribution to the predictions, were diabetes, coronary disease in two or more vessels, post-PCI TIMI flow, abnormal platelets, post-PCI thrombus, and abnormal cholesterol. Finally, a calculator was developed to bring the model into clinical practice. The calculator estimates each patient's individual risk and places the patient in a risk zone, helping the physician decide on appropriate follow-up. Conclusions: Applied immediately after stent implantation, a machine learning model differentiates patients who will and will not develop restenosis better than the current classic discriminators.
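    A minimal sketch of the evaluation approach described above: an extremely randomized trees classifier tuned and assessed with nested cross-validation on a small, imbalanced synthetic dataset (shaped like the 23/263 outcome rate). Hyperparameters and scoring choices are illustrative assumptions, not the thesis settings.

```python
# Nested cross-validation sketch for an extremely randomized trees classifier
# on a small, imbalanced dataset; PR-oriented scoring is used for tuning.
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score

# Synthetic stand-in: 263 patients, ~9% restenosis events.
X, y = make_classification(n_samples=263, n_features=20, weights=[0.91],
                           random_state=0)

param_grid = {"n_estimators": [200, 500], "max_depth": [None, 5]}
inner = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)  # tuning folds
outer = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)  # evaluation folds

search = GridSearchCV(ExtraTreesClassifier(random_state=0), param_grid,
                      scoring="average_precision", cv=inner)
scores = cross_val_score(search, X, y, scoring="roc_auc", cv=outer)
print(f"Nested-CV ROC AUC: {scores.mean():.2f} +/- {scores.std():.2f}")
```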

    Comparison of Machine Learning Techniques for Mortality Prediction in a Prospective Cohort of Older Adults

    As global demographics change, ageing is a global phenomenon of increasing interest in our modern and rapidly changing society. Thus, the application of proper prognostic indices in clinical decisions regarding mortality prediction has assumed significant importance for personalized risk management (i.e., identifying patients who are at high or low risk of death) and for helping to ensure effective healthcare services for patients. Consequently, prognostic modelling expressed as all-cause mortality prediction is an important step for effective patient management. Machine learning has the potential to transform prognostic modelling. In this paper, results on the development of machine learning models for all-cause mortality prediction in a cohort of healthy older adults are reported. The models are based on features covering anthropometric variables, physical and lab examinations, questionnaires, and lifestyles, as well as wearable data collected in free-living settings, obtained for the "Healthy Ageing Initiative" study conducted on 2291 recruited participants. Several machine learning techniques, including feature engineering, feature selection, data augmentation, and resampling, were investigated for this purpose. A detailed empirical comparison of the impact of the different techniques is presented and discussed. The achieved performances were also compared with a standard epidemiological model. This investigation showed that, for the dataset under consideration, the best results were achieved with Random Under-Sampling in conjunction with Random Forest (either with or without probability calibration). However, while including probability calibration slightly reduced the average performance, it increased the model robustness, as indicated by the lower 95% confidence intervals. The analysis showed that machine learning models could provide results comparable to standard epidemiological models while being completely data-driven and disease-agnostic, demonstrating the opportunity to build machine learning models on health record data for research and clinical practice. However, further testing is required to significantly improve the model performance and its robustness.
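    A minimal sketch of the best-performing combination reported above: random under-sampling feeding a random forest, with optional probability calibration wrapped around the pipeline. Synthetic data stands in for the cohort, and all settings are illustrative assumptions.

```python
# Random under-sampling + random forest, wrapped in probability calibration.
from imblearn.pipeline import Pipeline
from imblearn.under_sampling import RandomUnderSampler
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the cohort: 2291 participants, rare outcome.
X, y = make_classification(n_samples=2291, n_features=25, weights=[0.95],
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

pipe = Pipeline([
    ("undersample", RandomUnderSampler(random_state=0)),  # rebalance the classes
    ("forest", RandomForestClassifier(n_estimators=300, random_state=0)),
])
model = CalibratedClassifierCV(pipe, method="isotonic", cv=5)  # calibrate probabilities
model.fit(X_tr, y_tr)
print("Test ROC AUC:", round(roc_auc_score(y_te, model.predict_proba(X_te)[:, 1]), 3))
```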

    Predictive model for acute myocardial infarction in working-age population: a machine learning approach

    Cardiovascular diseases are the leading cause of mortality in Latin America; in particular, acute myocardial infarction (AMI) is the primary cause of atherosclerotic cardiovascular morbidity. This study aims to develop a predictive model for the probability of AMI occurrence in the working-age population, based on atherogenic indices, paraclinical variables, and anthropometric measures. The research conducted a cross-sectional study involving 427 workers aged 40 years or older in Popayán, Colombia. Out of this population, 202 individuals were screened, with a 95% confidence interval and a 5% error margin. Epidemiological, anthropometric, and paraclinical data were collected. A binary logistic regression model was employed to identify variables directly associated with the probability of AMI. Predictive classification models were generated using the statistical software JASP and the programming language Python. During the training stage, JASP produced a model with an accuracy of 87.5%, while Python generated a model with an accuracy of 90.2%. In the validation stage, JASP achieved an accuracy of 93%, and Python reached 95%. These results establish an effective model for predicting the probability of AMI in the working-age population.
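    A minimal sketch of the Python modelling step described above: a binary logistic regression over a handful of illustrative predictors with a train/validation split and accuracy reporting. The file and column names are hypothetical, not those of the study.

```python
# Binary logistic regression sketch for AMI probability in a worker cohort.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

df = pd.read_csv("workers_cohort.csv")                   # hypothetical extract
X = df[["atherogenic_index", "waist_cm", "ldl_mg_dl"]]   # illustrative predictors
y = df["ami_event"]                                      # 1 = AMI, 0 = no AMI

X_tr, X_va, y_tr, y_va = train_test_split(X, y, stratify=y, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print(f"Validation accuracy: {accuracy_score(y_va, model.predict(X_va)):.1%}")
```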