37 research outputs found

    Identifying patients with undiagnosed small intestinal neuroendocrine tumours in primary care using statistical and machine learning: model development and validation study

    Background: Neuroendocrine tumours (NETs) are increasing in incidence and are often diagnosed at advanced stages. Individuals may experience years of diagnostic delay, particularly when the tumour arises from the small intestine (SI). Clinical prediction models could present novel opportunities for case finding in primary care. Methods: An open cohort of adults (18+ years) contributing data to the Optimum Patient Care Research Database between 1st Jan 2000 and 30th March 2023 was identified. This database collects de-identified data from general practices in the UK. Model development approaches comprised logistic regression, penalised regression, and XGBoost. Performance (discrimination and calibration) was assessed using internal-external cross-validation. Decision analysis curves compared clinical utility. Results: Of 11.7 million individuals, 382 had recorded SI NET diagnoses (0.003%). The XGBoost model had the highest AUC (0.869, 95% confidence interval [CI]: 0.841–0.898) but was mildly miscalibrated (slope 1.165, 95% CI: 1.088–1.243; calibration-in-the-large 0.010, 95% CI: −0.164 to 0.185). Clinical utility was similar across all models. Discussion: Multivariable prediction models may have clinical utility in identifying individuals with undiagnosed SI NETs using information in their primary care records. Further evaluation including external validation and health economics modelling may identify cost-effective strategies for case finding for this uncommon tumour.

    Widespread genomic influences on phenotype in Dravet syndrome, a ‘monogenic’ condition

    Dravet syndrome is an archetypal rare severe epilepsy, considered “monogenic”, typically caused by loss-of-function SCN1A variants. Despite a recognisable core phenotype, its marked phenotypic heterogeneity is incompletely explained by differences in the causal SCN1A variant or clinical factors. In 34 adults with SCN1A-related Dravet syndrome, we show additional genomic variation beyond SCN1A contributes to phenotype and its diversity, with an excess of rare variants in epilepsy-related genes as a set and examples of blended phenotypes, including one individual with an ultra-rare DEPDC5 variant and focal cortical dysplasia. Polygenic risk scores for intelligence are lower, and for longevity, higher, in Dravet syndrome than in epilepsy controls. The causal, major-effect, SCN1A variant may need to act against a broadly compromised genomic background to generate the full Dravet syndrome phenotype, whilst genomic resilience may help to ameliorate the risk of premature mortality in adult Dravet syndrome survivors

    Supplementary data for "The diagnostic odyssey in children and adolescents with X-linked hypophosphataemia: population-based, case-control study"

    This study explored the recording of clinical features and the diagnostic odyssey of children and adolescents with X-linked hypophosphataemia in primary care electronic healthcare records in the United Kingdom.

    A machine learning algorithm for the detection of paroxysmal nocturnal haemoglobinuria (PNH) in UK primary care electronic health records

    Background: Paroxysmal Nocturnal Haemoglobinuria (PNH) is an ultra-rare, acquired disorder that is challenging to diagnose due to varied symptoms, heterogeneous patient presentations, and lack of awareness among healthcare professionals. This leads to frequent misdiagnosis and delays in diagnosis. This study evaluated the feasibility of a machine learning model to identify undiagnosed PNH patients using structured electronic health records. Methods: The study used data from the Optimum Patient Care Research Database, which contains electronic health records from general practitioner (GP) practices across the United Kingdom. PNH patients were identified by the presence, and control patients by the absence, of a PNH diagnosis code in their records. Clinical features (symptoms, diagnoses, healthcare utilisation) from 131 patients in the PNH group and 593,838 patients in the control group were inputted to a tree-based XGBoost machine learning model to classify patients as either “positive” or “negative” for PNH suspicion. The algorithm was finalised after additional exclusions and inclusions were applied. Performance was assessed using positive predictive value (PPV), recall and specificity. As the sample used to develop the algorithm was not representative of the true population prevalence, PPV was additionally adjusted to reflect performance in the wider population. Results: Of all the patients in the PNH group, 27% were classified as positive (recall), and 99.99% of the control group were classified as negative (specificity). Of all the patients classified as positive, 60.4% had a diagnosis of PNH in their record (PPV). The PPV adjusted for the population prevalence of PNH was 19.59%, suggesting nearly 1 in 5 patients flagged may warrant further PNH investigation. The key clinical features in the model were aplastic anaemia, pancytopenia, haemolytic anaemia, myelodysplastic syndrome, and Budd-Chiari syndrome. Conclusion: This is the first study to combine clinical understanding of PNH with machine learning, demonstrating the ability to discriminate between PNH and control patients in retrospective electronic health records. With further investigation and validation, this algorithm could be deployed on live health data, potentially leading to earlier diagnosis for patients who currently experience long diagnostic delays or remain undiagnosed.
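
The abstract above reports a PPV adjusted for population prevalence but does not state the prevalence used or the exact adjustment method. As a rough illustration only, the sketch below applies the standard Bayes' rule relationship between sensitivity (recall), specificity, and prevalence. The function name and the prevalence values are hypothetical, and because the recall and specificity are taken from the abstract in rounded form, the output is not expected to reproduce the reported 19.59%.

```python
def prevalence_adjusted_ppv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Return the PPV implied by Bayes' rule for a given population prevalence.

    This is the textbook formula, not necessarily the method used in the study.
    """
    for value in (sensitivity, specificity, prevalence):
        if not 0.0 <= value <= 1.0:
            raise ValueError("all inputs must be proportions between 0 and 1")
    true_positive_rate = sensitivity * prevalence
    false_positive_rate = (1.0 - specificity) * (1.0 - prevalence)
    flagged = true_positive_rate + false_positive_rate
    if flagged == 0.0:
        raise ValueError("no individuals would be flagged; PPV is undefined")
    return true_positive_rate / flagged


# Rounded recall and specificity as quoted in the abstract.
recall = 0.27
specificity = 0.9999

# Hypothetical prevalences (assumptions, not from the source) to show how
# strongly PPV depends on how rare the condition is.
for assumed_prevalence in (1e-5, 5e-5, 1e-4):
    ppv = prevalence_adjusted_ppv(recall, specificity, assumed_prevalence)
    print(f"prevalence={assumed_prevalence:.0e}  PPV={ppv:.3f}")
```

With the specificity held fixed, a lower assumed prevalence gives a lower PPV, which is why a PPV measured in an enriched development sample overstates performance in the general population.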

    Whole genome sequences discriminate hereditary hemorrhagic telangiectasia phenotypes by non-HHT deleterious DNA variation

    The abnormal vascular structures of hereditary hemorrhagic telangiectasia (HHT) often cause severe anemia due to recurrent hemorrhage, but HHT causal genes do not predict the severity of hematological complications. We tested for chance inheritance and clinical associations of rare deleterious variants in which loss-of-function causes bleeding or hemolytic disorders in the general population. In double-blinded analyses, all 104 patients with HHT from a single reference center recruited to the 100 000 Genomes Project were categorized on new MALO (more/as-expected/less/opposite) sub-phenotype severity scales, and whole genome sequencing data were tested for high impact variants in 75 HHT-independent genes encoding coagulation factors, or platelet, hemoglobin, erythrocyte enzyme, and erythrocyte membrane constituents. Rare variants (all gnomAD allele frequencies <0.003) were identified in 56 (75%) of these 75 HHT-unrelated genes. Deleteriousness assignments by Combined Annotation Dependent Depletion (CADD) scores >15 were supported by gene-level mutation significance cutoff scores. CADD >15 variants were identified in 38/104 (36.5%) patients with HHT, found for 1 in 10 patients within platelet genes; 1 in 8 within coagulation genes; and 1 in 4 within erythrocyte hemolytic genes. In blinded analyses, patients with greater hemorrhagic severity that had been attributed solely to HHT vessels had more CADD-deleterious variants in platelet (Spearman ρ = 0.25; P = .008) and coagulation (Spearman ρ = 0.21; P = .024) genes. However, the HHT cohort had 60% fewer deleterious variants in platelet and coagulation genes than expected (Mann-Whitney test P = .021). In conclusion, patients with HHT commonly have rare variants in genes of relevance to their phenotype, offering new therapeutic targets and opportunities for informed, personalized medicine strategies.