33 research outputs found

    Deep learning for electronic health records: risk prediction, explainability, and uncertainty

    Background: Risk models are essential for care planning and disease prevention. The unsatisfactory performance of the established clinical models has raised broad awareness and concerns. An accurate, explainable, and reliable risk model is highly beneficial but remains a challenge. Objective: This thesis aims to develop deep learning models that can make more accurate risk predictions with the provision of uncertainty estimation and the ability to provide medical explanations using a large and representative electronic health records (EHR) dataset. Methods: We investigated three directions in this thesis: risk prediction, explainability, and uncertainty estimation. For risk prediction, we investigated deep learning tools that can incorporate minimally processed EHR for modelling and comprehensively compared them with the established machine learning and clinical models. Additionally, post-hoc explanations were applied to deep learning models for medical information retrieval, and we specifically looked into explanations in risk association and counterfactual reasoning. Uncertainty estimation was qualitatively investigated using probabilistic modelling techniques. Our analyses relied on Clinical Practice Research Datalink, which contains anonymised EHR collected from primary care, secondary care, and death registration and is representative of the UK population. Results: We introduced a deep learning model, named BEHRT, that can incorporate minimally processed EHR for risk prediction. Without expert engagement, it learned meaningful representations that can automatically cluster highly correlated diseases. Compared to the established machine learning and clinical models that relied on expert-selected predictors, our proposed deep learning model showed superior performance on a wide range of risk prediction tasks and highlighted the necessity of recalibration when applying a risk model to a population with severe prior distribution shifts, and the importance of regular model updating to preserve the model’s discrimination performance under temporal data shifts. Additionally, we showed that the deep learning model explanation is an excellent tool for discovering risk factors. By explaining the deep learning model, we not only identified factors that were highly consistent with the established evidence but also those that have not been considered in expert-driven studies. Furthermore, the deep learning model also captured the interplay between risk and treated risk and the differential association of medications across different years, which would be difficult if the temporal context was not included in the modelling. Besides the explanations in terms of association, we introduced a framework that can achieve accurate risk prediction while enabling counterfactual reasoning under hypothetical interventions. This offers counterfactual explanations that could help clinicians select the patients who will benefit the most. We demonstrated the benefit of the proposed framework using two exemplary case studies. Furthermore, transforming a deterministic deep learning model into a probabilistic one yields predictions with an uncertainty range. We showed that such information has many potential implications in practice, such as quantifying the confidence of a decision, indicating data insufficiency, distinguishing the correct and incorrect predictions, and indicating risk associations. Conclusions: Deep learning models led to substantially improved performance for risk prediction. Uncertainty estimation can quantify the confidence of risk prediction to further inform clinical decision-making. Deep learning model explanation can generate hypotheses to guide medical research and provide counterfactual analysis to assist clinical decision-making. This encouraging evidence supports the great potential of incorporating deep learning methods into electronic health records to inform a wide range of health applications such as care planning, disease prevention, and medical study design.
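
The recalibration point in this abstract can be illustrated with a small sketch. It is not the thesis's method: it applies the standard prior-shift (base-rate) correction, which rescales predicted odds by the ratio of target to source prevalence odds. The function name `recalibrate_for_prior_shift` and the example prevalences are hypothetical.

```python
def recalibrate_for_prior_shift(p: float, source_prev: float, target_prev: float) -> float:
    """Adjust a predicted risk p for a change in outcome prevalence.

    Assumes only the outcome prevalence differs between the two populations
    (prior shift). This is an illustrative assumption, not the thesis's method.
    """
    for name, value in (("p", p), ("source_prev", source_prev), ("target_prev", target_prev)):
        if not 0.0 < value < 1.0:
            raise ValueError(f"{name} must lie strictly between 0 and 1")
    odds = p / (1.0 - p)
    source_odds = source_prev / (1.0 - source_prev)
    target_odds = target_prev / (1.0 - target_prev)
    # Rescale the predicted odds by the ratio of prevalence odds.
    adjusted_odds = odds * target_odds / source_odds
    return adjusted_odds / (1.0 + adjusted_odds)


# Hypothetical numbers: a model developed where prevalence is 50%,
# applied to a population where prevalence is 10%.
adjusted = recalibrate_for_prior_shift(0.5, source_prev=0.5, target_prev=0.1)
unchanged = recalibrate_for_prior_shift(0.3, source_prev=0.2, target_prev=0.2)
```

When the source and target prevalences are equal the prediction is returned unchanged, which is a quick sanity check on the correction.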

    Impact of heterogeneous reuse on heritage value under the perspective of scale subdivision - Two modern theatres in Nanjing as examples

    Heterogeneous reuse is a type of reuse where the demand for new functional space differs significantly from the supply of the original space. It causes changes in the building morphology at all scale levels, which in turn has an impact on heritage value. Heterogeneous reuse is prevalent in the conservation of modern Chinese architecture. This article analyses the mechanism of heritage value change under different intervention methods, taking the Shengli Theatre and the Dahua Theatre in Nanjing as examples. Firstly, the value-bearing areas are identified by overall value assessment; secondly, the value of each value-bearing area at each scale level is determined by applying the theory of scale subdivision; thirdly, the influences of different interventions on the value of the overall building and of the value-bearing parts are calculated by comparing the values before and after conservation and renovation. This research reveals that heterogeneous reuse often leads to a decline in the emotional and cultural value of built heritage, but enhances the use value. The overall value may also increase if the reuse is carried out in appropriate ways. This article expands the potential of typomorphology in the study of heterogeneous reuse. By integrating scale subdivision and value assessment, the specific heritage values at each scale level can be expressed in fine detail, and the effectiveness of various interventions can be reasonably evaluated.

    GPNet: Simplifying Graph Neural Networks via Multi-channel Geometric Polynomials

    Graph Neural Networks (GNNs) are a promising deep learning approach for tackling many real-world problems on graph-structured data. However, these models usually have at least one of four fundamental limitations: over-smoothing, over-fitting, difficulty of training, and a strong homophily assumption. For example, Simple Graph Convolution (SGC) is known to suffer from the first and fourth limitations. To tackle these limitations, we identify a set of key designs including (D1) dilated convolution, (D2) multi-channel learning, (D3) self-attention score, and (D4) sign factor to boost learning from different types (i.e. homophily and heterophily) and scales (i.e. small, medium, and large) of networks, and combine them into a graph neural network, GPNet, a simple and efficient one-layer model. We theoretically analyze the model and show that it can approximate various graph filters by adjusting the self-attention score and sign factor. Experiments show that GPNet consistently outperforms baselines in terms of average rank, average accuracy, complexity, and parameters on semi-supervised and full-supervised tasks, and achieves competitive performance compared to state-of-the-art models on the inductive learning task.
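
The over-smoothing limitation attributed to SGC in this abstract can be seen in a small sketch. This is not GPNet: it only implements the parameter-free propagation step of SGC, repeated multiplication by S = D^{-1/2}(A + I)D^{-1/2}, on a hypothetical three-node path graph. The helper names and the `spread` diagnostic are assumptions made for illustration.

```python
import math


def normalized_adjacency(edges, n):
    """Return S = D^{-1/2} (A + I) D^{-1/2} as a dense matrix, plus the degrees."""
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # self-loops
    for i, j in edges:
        a[i][j] = a[j][i] = 1.0
    deg = [sum(row) for row in a]  # degrees include the self-loop
    s = [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]
    return s, deg


def propagate(s, x, k):
    """Apply x <- S x a total of k times (the propagation part of SGC)."""
    n = len(x)
    for _ in range(k):
        x = [sum(s[i][j] * x[j] for j in range(n)) for i in range(n)]
    return x


def spread(x, deg):
    """Range of degree-rescaled features; near zero means nodes look alike."""
    scaled = [xi / math.sqrt(d) for xi, d in zip(x, deg)]
    return max(scaled) - min(scaled)


# Hypothetical toy graph: a path 0 - 1 - 2 with a one-hot feature on node 0.
s, deg = normalized_adjacency([(0, 1), (1, 2)], 3)
x0 = [1.0, 0.0, 0.0]
spread_k1 = spread(propagate(s, x0, 1), deg)
spread_k50 = spread(propagate(s, x0, 50), deg)
```

After many propagation steps the features differ only through node degree, so `spread_k50` is far smaller than `spread_k1`; this loss of node-specific signal is what over-smoothing refers to.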

    A comparative study of model-centric and data-centric approaches in the development of cardiovascular disease risk prediction models in the UK Biobank

    Aims: A diverse set of factors influence cardiovascular diseases (CVDs), but a systematic investigation of the interplay between these determinants and the contribution of each to CVD incidence prediction is largely missing from the literature. In this study, we leverage one of the most comprehensive biobanks worldwide, the UK Biobank, to investigate the contribution of different risk factor categories to more accurate incidence predictions in the overall population, by sex, different age groups, and ethnicity. Methods and results: The investigated categories include the history of medical events, behavioural factors, socioeconomic factors, environmental factors, and measurements. We included data from a cohort of 405 257 participants aged 37–73 years and trained various machine learning and deep learning models on different subsets of risk factors to predict CVD incidence. Each of the models was trained on the complete set of predictors and subsets where each category was excluded. The results were benchmarked against QRISK3. The findings highlight that (i) leveraging a more comprehensive medical history substantially improves model performance. Relative to QRISK3, the best performing models improved the discrimination by 3.78% and improved precision by 1.80%. (ii) Both model- and data-centric approaches are necessary to improve predictive performance. The benefits of using a comprehensive history of diseases were far more pronounced when a neural sequence model, BEHRT, was used. This highlights the importance of the temporality of medical events that existing clinical risk models fail to capture. (iii) Besides the history of diseases, socioeconomic factors and measurements had small but significant independent contributions to the predictive performance. Conclusion: These findings emphasize the need for considering broad determinants and novel modelling approaches to enhance CVD incidence prediction.

    How much lowering of blood pressure is required to prevent cardiovascular disease in patients with and without previous cardiovascular disease?

    Purpose of Review: To review the recent large-scale randomised evidence on pharmacologic reduction in blood pressure for the primary and secondary prevention of cardiovascular disease. Recent Findings: Based on findings of the meta-analysis of individual participant-level data from 48 randomised clinical trials involving 344,716 participants with a mean age of 65 years, the relative reduction in the risk of developing major cardiovascular events was proportional to the magnitude of achieved reduction in blood pressure. For each 5-mmHg reduction in systolic blood pressure, the risk of developing cardiovascular events fell by 10% (hazard ratio [HR], 0.90; 95% confidence interval [CI], 0.88 to 0.92). When participants were stratified by their history of cardiovascular disease, the HRs (95% CI) in those with and without previous cardiovascular disease were 0.89 (0.86 to 0.92) and 0.91 (0.89 to 0.94), respectively, with no significant heterogeneity in these effects (adjusted P for interaction = 1.0). When these patient groups were further stratified by their baseline systolic blood pressure in increments of 10 mmHg from  Summary: Pharmacologic lowering of blood pressure was effective in preventing major cardiovascular disease events in people both with and without previous cardiovascular disease, and this effect was not modified by their baseline blood pressure level. Treatment effects were shown to be proportional to the intensity of blood pressure reduction, but even modest blood pressure reduction, on average, can lead to meaningful gains in the prevention of incident or recurrent cardiovascular disease.
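
As a worked illustration of the proportionality described in this abstract, the sketch below rescales the reported hazard ratio of 0.90 per 5 mmHg to other reductions. It assumes a log-linear dose-response (the log hazard ratio scales linearly with the reduction), which is an assumption of this sketch rather than a result quoted from the review; the function name is hypothetical.

```python
import math


def scaled_hazard_ratio(hr_per_step: float, step_mmhg: float, reduction_mmhg: float) -> float:
    """Rescale a hazard ratio reported per `step_mmhg` to another reduction.

    Assumes log(HR) is linear in the blood pressure reduction
    (an assumption of this sketch, not a finding quoted from the review).
    """
    if hr_per_step <= 0 or step_mmhg <= 0:
        raise ValueError("hazard ratio and step must be positive")
    return math.exp(math.log(hr_per_step) * reduction_mmhg / step_mmhg)


# Reported value: HR 0.90 per 5 mmHg reduction in systolic blood pressure.
hr_10 = scaled_hazard_ratio(0.90, 5.0, 10.0)  # equals 0.90 squared
hr_5 = scaled_hazard_ratio(0.90, 5.0, 5.0)    # recovers the reported value
```

Under this assumption a 10 mmHg reduction corresponds to a hazard ratio of 0.90 × 0.90 = 0.81, that is, roughly a 19% relative risk reduction.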

    Systolic blood pressure, chronic obstructive pulmonary disease and cardiovascular risk

    Objective: In individuals with complex underlying health problems, the association between systolic blood pressure (SBP) and cardiovascular disease is less well recognised. The association between SBP and risk of cardiovascular events in patients with chronic obstructive pulmonary disease (COPD) was investigated. Methods and analysis: In this cohort study, 39 602 individuals with a diagnosis of COPD aged 55–90 years between 1990 and 2009 were identified from validated electronic health records (EHR) in the UK. The association between SBP and risk of cardiovascular end points (composite of ischaemic heart disease, heart failure, stroke and cardiovascular death) was analysed using a deep learning approach. Results: In the selected cohort (46.5% women, median age 69 years), 10 987 cardiovascular events were observed over a median follow-up period of 3.9 years. The association between SBP and risk of cardiovascular end points was found to be monotonic; the lowest SBP exposure group of <120 mm Hg presented the nadir of risk. With respect to reference SBP (between 120 and 129 mm Hg), adjusted risk ratios for the primary outcome were 0.99 (95% CI 0.93 to 1.05) for SBP of <120 mm Hg, 1.02 (0.97 to 1.07) for SBP between 130 and 139 mm Hg, 1.07 (1.01 to 1.12) for SBP between 140 and 149 mm Hg, 1.11 (1.05 to 1.17) for SBP between 150 and 159 mm Hg and 1.16 (1.10 to 1.22) for SBP ≥160 mm Hg. Conclusion: Using deep learning for modelling EHR, we identified a monotonic association between SBP and risk of cardiovascular events in patients with COPD. Data availability statement: Data may be obtained from a third party and are not publicly available. More details of the data and data sharing are found on the CPRD website (https://www.cprd.com). Targeted-BEHRT source code can be found on the Deep Medicine GitHub site (https://github.com/deepmedicine/Targeted-BEHRT). Example code for conducting an observational study on mock data and estimating risk ratio can also be found in this code repository.
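
For readers unfamiliar with the risk ratios quoted in this abstract, the sketch below computes a crude risk ratio with a 95% confidence interval from a two-by-two table, using the usual standard error on the log scale. It is not the adjusted, deep-learning-based estimator used in the study and is not taken from the Targeted-BEHRT repository; the function name and the counts are made up for illustration.

```python
import math


def risk_ratio_with_ci(events_exposed, n_exposed, events_reference, n_reference, z=1.96):
    """Crude (unadjusted) risk ratio with a Wald confidence interval on the log scale.

    Illustrative only: the study reports adjusted risk ratios from a
    deep learning model, which this simple estimator does not reproduce.
    """
    risk_exposed = events_exposed / n_exposed
    risk_reference = events_reference / n_reference
    rr = risk_exposed / risk_reference
    # Standard error of log(RR) for two independent binomial samples.
    se_log_rr = math.sqrt(
        1 / events_exposed - 1 / n_exposed + 1 / events_reference - 1 / n_reference
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Made-up counts: 30 events among 100 exposed, 20 events among 100 reference.
rr, lower, upper = risk_ratio_with_ci(30, 100, 20, 100)
```

With these made-up counts the point estimate is 1.5 and the interval spans 1, so the crude data alone would not distinguish the two groups.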

    Systolic blood pressure and cardiovascular risk in patients with diabetes: a prospective cohort study

    Background: Whether the association between systolic blood pressure (SBP) and risk of cardiovascular disease is monotonic or whether there is a nadir of optimal blood pressure remains controversial. We investigated the association between SBP and cardiovascular events in patients with diabetes across the full spectrum of SBP. Methods: A cohort of 49 000 individuals with diabetes aged 50 to 90 years between 1990 and 2005 was identified from linked electronic health records in the United Kingdom. Associations between SBP and cardiovascular outcomes (ischemic heart disease, heart failure, stroke, and cardiovascular death) were analyzed using a deep learning approach. Results: Over a median follow-up of 7.3 years, 16 378 cardiovascular events were observed. The relationship between SBP and cardiovascular events followed a monotonic pattern, with the group with the lowest baseline SBP of <120 mm Hg exhibiting the lowest risk of cardiovascular events. In comparison to the reference group with the lowest SBP (<120 mm Hg), the adjusted risk ratio for cardiovascular disease was 1.03 (95% CI, 0.97–1.10) for SBP between 120 and 129 mm Hg, 1.05 (0.99–1.11) for SBP between 130 and 139 mm Hg, 1.08 (1.01–1.15) for SBP between 140 and 149 mm Hg, 1.12 (1.03–1.20) for SBP between 150 and 159 mm Hg, and 1.19 (1.09–1.28) for SBP ≥160 mm Hg. Conclusions: Using deep learning modeling, we found a monotonic relationship between SBP and risk of cardiovascular outcomes in patients with diabetes, without evidence of a J-shaped relationship.