316 research outputs found

    Prediction of atrial fibrillation and stroke using machine learning models in UK Biobank

    Objective: Atrial fibrillation (AF) is the most common cardiac arrhythmia, and it is associated with an increased risk of ischemic stroke, which is underestimated because AF can be asymptomatic. The aim of this study was to develop optimal ML models for prediction of AF in the general population and, secondly, of ischemic stroke in AF patients. Methods: To develop ML models for prediction of 1) AF in the general population and 2) ischemic stroke in patients with AF, we constructed XGBoost, LightGBM, Random Forest, Deep Neural Network, Support Vector Machine and Lasso penalised logistic regression models using UK Biobank's extensive real-world clinical data, questionnaires, and biochemical and genetic data, and compared their predictive performance. Ranking and contribution of the different features were assessed by SHapley Additive exPlanations (SHAP) analysis. The clinical tool CHA2DS2-VASc for prediction of ischemic stroke among AF patients was compared with the best performing ML model. Findings: The best performing model for AF prediction was LightGBM, with an area under the receiver operating characteristic curve (AUROC) of 0.729 (95% confidence interval (CI): 0.719, 0.738). The best performing model for ischemic stroke prediction in AF patients was XGBoost, with an AUROC of 0.631 (95% CI: 0.604, 0.657). The improvement in AUROC of the XGBoost model over CHA2DS2-VASc was statistically significant based on DeLong's test (p-value = 2.20E-06). In addition, the SHAP analysis showed that several peripheral blood biomarkers (e.g. creatinine, glycated haemoglobin, monocytes), which are not considered by CHA2DS2-VASc, were associated with ischemic stroke. Implications: The best performing ML models presented have the potential for clinical use, but further validation in independent studies is required. Our results endorse the incorporation of some routinely measured blood biomarkers for ischemic stroke prediction in AF patients.
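
The abstract above compares models by AUROC. As a reading aid, here is a minimal Python sketch of what that metric measures: the probability that a randomly chosen case receives a higher predicted score than a randomly chosen non-case, with ties counted as one half. The function name `auroc` and the toy labels and scores are assumptions made for illustration; this is not the study's code or data, and it does not implement DeLong's test.

```python
def auroc(labels, scores):
    """Return the AUROC computed from pairwise comparisons.

    labels: iterable of 0/1 outcomes (1 = case, 0 = non-case).
    scores: iterable of predicted risks, same length as labels.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one case and one non-case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # case ranked above non-case
            elif p == n:
                wins += 0.5   # tie counts as half a win
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data (hypothetical): two non-cases, two cases.
    print(auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

The quadratic pairwise loop is chosen for readability; for cohort-sized data a rank-based computation gives the same value far more efficiently.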

    Applications of machine and deep learning to thyroid cytology and histopathology: a review.

    This review synthesises past research into how machine and deep learning can improve the cyto- and histopathology processing pipelines for thyroid cancer diagnosis. The current gold-standard preoperative technique of fine-needle aspiration cytology has high interobserver variability, often returns indeterminate samples and cannot reliably identify some pathologies; histopathology analysis addresses these issues to an extent, but it requires surgical resection of the suspicious lesions so cannot influence preoperative decisions. Motivated by these issues, as well as by the chronic shortage of trained pathologists, much research has been conducted into how artificial intelligence could improve current pipelines and reduce the pressure on clinicians. Many past studies have indicated the significant potential of automated image analysis in classifying thyroid lesions, particularly for those of papillary thyroid carcinoma, but these have generally been retrospective, so questions remain about both the practical efficacy of these automated tools and the realities of integrating them into clinical workflows. Furthermore, the nature of thyroid lesion classification is significantly more nuanced in practice than many current studies have addressed, and this, along with the heterogeneous nature of processing pipelines in different laboratories, means that no solution has proven itself robust enough for clinical adoption. There are, therefore, multiple avenues for future research: examine the practical implementation of these algorithms as pathologist decision-support systems; improve interpretability, which is necessary for developing trust with clinicians and regulators; and investigate multiclassification on diverse multicentre datasets, aiming for methods that demonstrate high performance in a process- and equipment-agnostic manner

    COVID-19 susceptibility variants associate with blood clots, thrombophlebitis and circulatory diseases.

    Epidemiological studies suggest that individuals with comorbid conditions, including diabetes and chronic lung, inflammatory and vascular disease, are at higher risk of adverse COVID-19 outcomes. Genome-wide association studies have identified several loci associated with increased susceptibility and severity for COVID-19. However, it is not clear whether these associations are genetically determined or not. We used a Phenome-Wide Association (PheWAS) approach to investigate the role of genetically determined COVID-19 susceptibility on disease related outcomes. PheWAS analyses were performed in order to identify traits and diseases related to COVID-19 susceptibility and severity, evaluated through a predictive COVID-19 risk score. We utilised phenotypic data in up to 400,000 individuals from the UK Biobank, including Hospital Episode Statistics and General Practice data. We identified a spectrum of associations between both genetically determined COVID-19 susceptibility and severity with a number of traits. COVID-19 risk was associated with increased risk for phlebitis and thrombophlebitis (OR = 1.11, p = 5.36e-08). We also identified significant signals between COVID-19 susceptibility with blood clots in the leg (OR = 1.1, p = 1.66e-16) and with increased risk for blood clots in the lung (OR = 1.12, p = 1.45e-10). Our study identifies significant association of genetically determined COVID-19 with increased blood clot events in leg and lungs. The reported associations between both COVID-19 susceptibility and severity and other diseases add to the identification and stratification of individuals at increased risk, adverse outcomes and long-term effects.

    No Clinically Relevant Effect of Heart Rate Increase and Heart Rate Recovery During Exercise on Cardiovascular Disease: A Mendelian Randomization Analysis

    Background: Reduced heart rate (HR) increase (HRI), recovery (HRR), and higher resting HR are associated with cardiovascular (CV) disease, but causal inferences have not been deduced. We investigated causal effects of HRI, HRR, and resting HR on CV risk, all-cause mortality (ACM), atrial fibrillation (AF), coronary artery disease (CAD), and ischemic stroke (IS) using Mendelian Randomization. Methods: 11 variants for HRI, 11 for HRR, and two sets of 46 and 414 variants for resting HR were obtained from four genome-wide association studies (GWASs) on UK Biobank. We performed a lookup on GWASs for CV risk and ACM in UK Biobank (N = 375,367, 5.4% cases and N = 393,165, 4.4% cases, respectively). For CAD, AF, and IS, we used publicly available summary statistics. We used a random-effects inverse-variance weighted (IVW) method and sensitivity analyses to estimate causality. Results: IVW showed a nominally significant effect of HRI on CV events (odds ratio [OR] = 1.0012, P = 4.11 × 10⁻²) and on CAD and AF. Regarding HRR, IVW was not significant for any outcome. The IVW method indicated statistically significant associations of resting HR with AF (OR = 0.9825, P = 9.8 × 10⁻⁶), supported by all sensitivity analyses, and a nominally significant association with IS (OR = 0.9926, P = 9.82 × 10⁻³). Conclusion: Our findings suggest no strong evidence of an association between HRI and HRR and any outcome and confirm prior work reporting a highly significant effect of resting HR on AF. Future research is required to explore HRI and HRR associations further using more powerful predictors, when available.
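
The abstract above relies on a random-effects inverse-variance weighted (IVW) estimate. The sketch below shows one common form of that estimator on summary statistics, assuming uncorrelated variants and a multiplicative random-effects model in which the standard error is inflated only when heterogeneity exceeds what chance would produce. The function name `ivw_random_effects` and the example numbers are hypothetical; this is not the authors' pipeline or data.

```python
import math


def ivw_random_effects(beta_exposure, beta_outcome, se_outcome):
    """IVW causal estimate with a multiplicative random-effects standard error.

    beta_exposure: per-variant effects on the exposure.
    beta_outcome:  per-variant effects on the outcome (log-odds for binary traits).
    se_outcome:    standard errors of beta_outcome.
    Returns (estimate, standard_error).
    """
    if not (len(beta_exposure) == len(beta_outcome) == len(se_outcome)):
        raise ValueError("inputs must have the same length")
    if len(beta_exposure) < 2:
        raise ValueError("need at least two variants")

    # Per-variant Wald ratios and their inverse-variance weights.
    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]
    weights = [(bx / se) ** 2 for bx, se in zip(beta_exposure, se_outcome)]

    total_weight = sum(weights)
    estimate = sum(w * r for w, r in zip(weights, ratios)) / total_weight
    se_fixed = math.sqrt(1.0 / total_weight)

    # Cochran's Q measures heterogeneity between the variant-specific ratios.
    q_stat = sum(w * (r - estimate) ** 2 for w, r in zip(weights, ratios))
    phi = q_stat / (len(ratios) - 1)

    # Multiplicative random effects: never shrink below the fixed-effect SE.
    se_random = se_fixed * math.sqrt(max(1.0, phi))
    return estimate, se_random


if __name__ == "__main__":
    # Hypothetical summary statistics for three variants.
    est, se = ivw_random_effects([0.1, 0.2, 0.4], [0.05, 0.1, 0.2], [0.01, 0.01, 0.02])
    print(f"log-OR per unit exposure: {est:.4f} (SE {se:.4f}), OR = {math.exp(est):.4f}")
```

For a binary outcome the estimate is on the log-odds scale, so exponentiating it gives an odds ratio of the kind reported in the abstract.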

    The role of thyroid function in borderline personality disorder and schizophrenia: a Mendelian Randomisation study.

    BACKGROUND: Genome-wide association studies have reported a genetic overlap between borderline personality disorder (BPD) and schizophrenia (SCZ). Epidemiologically, the direction and causality of the association between thyroid function and risk of BPD and SCZ are unclear. We aim to test whether genetically predicted variations in TSH and FT4 levels or hypothyroidism are associated with the risk of BPD and SCZ. METHODS: We employed Mendelian Randomisation (MR) analyses using genetic instruments associated with TSH and FT4 levels as well as hypothyroidism to examine the effects of genetically predicted thyroid function on BPD and SCZ risk. Bidirectional MR analyses were employed to investigate a potential reverse causal association. RESULTS: Genetically predicted higher FT4 was not associated with the risk of BPD (OR: 1.18; P = 0.60, IVW) or the risk of SCZ (OR: 0.93; P = 0.19, IVW). Genetically predicted higher TSH was not associated with the risk of BPD (OR: 1.11; P = 0.51, IVW) or SCZ (OR: 0.98; P = 0.55, IVW). Genetically predicted hypothyroidism was not associated with BPD or SCZ. We found no evidence for a reverse causal effect of BPD or SCZ on thyroid function. CONCLUSIONS: We report evidence for a null association between genetically predicted FT4, TSH or hypothyroidism and BPD or SCZ risk. There was no evidence for reverse causality.

    Identification and analysis of individuals who deviate from their genetically-predicted phenotype

    This is the final version. Available from Public Library of Science via the DOI in this record. Data Availability: The research utilised data from the UK Biobank resource carried out under UK Biobank application number 9072. UK Biobank protocols were approved by the National Research Ethics Service Committee. Individual-level data cannot be shared publicly because of data access policies of the UK Biobank. Data are available from the UK Biobank for researchers who meet the criteria for access to datasets to UK Biobank (http://www.ukbiobank.ac.uk). The weights used to calculate the polygenic score for height are available in Table C in S1 Data. The weights used to calculate the polygenic score for LDL-cholesterol, calculated in a meta-analysis excluding UK Biobank, are available from the Global Lipids Genetics Consortium at https://csg.sph.umich.edu/willer/public/glgc-lipids2021/. Findings from genome-wide association studies have facilitated the generation of genetic predictors for many common human phenotypes. Stratifying individuals misaligned to a genetic predictor based on common variants may be important for follow-up studies that aim to identify alternative causal factors. Using genome-wide imputed genetic data, we aimed to classify 158,951 unrelated individuals from the UK Biobank as either concordant or deviating from two well-measured phenotypes. We first applied our methods to standing height: our primary analysis classified 244 individuals (0.15%) as misaligned to their genetically predicted height. We show that these individuals are enriched for self-reporting being shorter or taller than average at age 10, diagnosed congenital malformations, and rare loss-of-function variants in genes previously catalogued as causal for growth disorders. Secondly, we apply our methods to LDL cholesterol (LDL-C). We classified 156 (0.12%) individuals as misaligned to their genetically predicted LDL-C and show that these individuals were enriched for both clinically actionable cardiovascular risk factors and rare genetic variants in genes previously shown to be involved in metabolic processes. Individuals whose LDL-C was higher than expected based on the genetic predictor were also at higher risk of developing coronary artery disease and type-two diabetes, even after adjustment for measured LDL-C, BMI and age, suggesting upward deviation from genetically predicted LDL-C is indicative of generally poor health. Our results remained broadly consistent when performing sensitivity analysis based on a variety of parametric and non-parametric methods to define individuals deviating from polygenic expectation. Our analyses demonstrate the potential importance of quantitatively identifying individuals for further follow-up based on deviation from genetic predictions. Innovative Medicines Initiative 2 Joint Undertaking; Academy of Medical Sciences; Medical Research Council (MRC); Australian Research Council (ARC

    A saturated map of common genetic variants associated with human height

    Common single-nucleotide polymorphisms (SNPs) are predicted to collectively explain 40-50% of phenotypic variation in human height, but identifying the specific variants and associated regions requires huge sample sizes(1). Here, using data from a genome-wide association study of 5.4 million individuals of diverse ancestries, we show that 12,111 independent SNPs that are significantly associated with height account for nearly all of the common SNP-based heritability. These SNPs are clustered within 7,209 non-overlapping genomic segments with a mean size of around 90 kb, covering about 21% of the genome. The density of independent associations varies across the genome and the regions of increased density are enriched for biologically relevant genes. In out-of-sample estimation and prediction, the 12,111 SNPs (or all SNPs in the HapMap 3 panel(2)) account for 40% (45%) of phenotypic variance in populations of European ancestry but only around 10-20% (14-24%) in populations of other ancestries. Effect sizes, associated regions and gene prioritization are similar across ancestries, indicating that reduced prediction accuracy is likely to be explained by linkage disequilibrium and differences in allele frequency within associated regions. Finally, we show that the relevant biological pathways are detectable with smaller sample sizes than are needed to implicate causal genes and variants. Overall, this study provides a comprehensive map of specific genomic regions that contain the vast majority of common height-associated variants. Although this map is saturated for populations of European ancestry, further research is needed to achieve equivalent saturation in other ancestries. A large genome-wide association study of more than 5 million individuals reveals that 12,111 single-nucleotide polymorphisms account for nearly all the heritability of height attributable to common genetic variants.