69 research outputs found

    Large Language Models to Identify Social Determinants of Health in Electronic Health Records

    Social determinants of health (SDoH) have an important impact on patient outcomes but are incompletely collected from electronic health records (EHRs). This study examined the ability of large language models to extract SDoH from free text in EHRs, where they are most commonly documented, and explored the role of synthetic clinical text for improving the extraction of these scarcely documented, yet extremely valuable, clinical data. A total of 800 patient notes were annotated for SDoH categories, and several transformer-based models were evaluated. The study also experimented with synthetic data generation and assessed algorithmic bias. Our best-performing models were fine-tuned Flan-T5 XL (macro-F1 0.71) for any SDoH, and Flan-T5 XXL (macro-F1 0.70). The benefit of augmenting fine-tuning with synthetic data varied across model architecture and size, with smaller Flan-T5 models (base and large) showing the greatest improvements in performance (delta F1 +0.12 to +0.23). Model performance was similar on the in-hospital system dataset but worse on the MIMIC-III dataset. Our best-performing fine-tuned models outperformed zero- and few-shot performance of ChatGPT-family models for both tasks. These fine-tuned models were less likely than ChatGPT to change their prediction when race/ethnicity and gender descriptors were added to the text, suggesting less algorithmic bias (p<0.05). At the patient level, our models identified 93.8% of patients with adverse SDoH, while ICD-10 codes captured 2.0%. Our method can effectively extract SDoH information from clinic notes, performing better than GPT models in zero- and few-shot settings. These models could enhance real-world evidence on SDoH and aid in identifying patients needing social support. Comment: 38 pages, 5 figures, 5 tables in main, submitted for review

    Cytotoxic mAb from Rheumatic Carditis Recognizes Heart Valves and Laminin

    Anti-streptococcal antibodies cross-reactive with N-acetyl-β-D-glucosamine (GlcNAc) and myosin are present in the sera of patients with rheumatic fever (RF). However, their role in tissue injury is not clear. In this study, we show that anti-GlcNAc/anti-myosin mAb 3.B6 from a rheumatic carditis patient was cytotoxic for human endothelial cell lines and reacted with human valvular endothelium and underlying basement membrane. Reactivity of mAb 3.B6 with the valve was inhibited by human cardiac myosin > laminin > GlcNAc. The mAb 3.B6 epitopes were localized in fragments of human cardiac myosin, including heavy meromyosin (HMM), the S1 subfragment, and two light meromyosin (LMM) peptides containing amino acid sequences KEALISSLTRGKLTYTQQ (LMM 1) and SERVQLLHSQNTSLINQK (LMM 33). A novel feature of mAb 3.B6 was its reactivity with the extracellular matrix protein laminin, which may explain its reactivity with the valve surface. A laminin A-chain peptide (HTQNT) that includes homology to LMM 33 inhibited the reactivity of mAb 3.B6 with human valve. These data support the hypothesis that cross-reactive antibodies in rheumatic carditis cause injury at the endothelium and underlying matrix of the valve.

    Ancient genomes reveal complex patterns of population movement, interaction, and replacement in sub-Saharan Africa

    Africa hosts the greatest human genetic diversity globally, but legacies of ancient population interactions and dispersals across the continent remain understudied. Here, we report genome-wide data from 20 ancient sub-Saharan African individuals, including the first reported ancient DNA from the DRC, Uganda, and Botswana. These data demonstrate the contraction of diverse, once contiguous hunter-gatherer populations, and suggest the resistance to interaction with incoming pastoralists of delayed-return foragers in aquatic environments. We refine models for the spread of food producers into eastern and southern Africa, demonstrating more complex trajectories of admixture than previously suggested. In Botswana, we show that Bantu ancestry post-dates admixture between pastoralists and foragers, suggesting an earlier spread of pastoralism than farming to southern Africa. Our findings demonstrate how processes of migration and admixture have markedly reshaped the genetic map of sub-Saharan Africa in the past few millennia and highlight the utility of combined archaeological and archaeogenetic approaches

    Clinical research without consent in adults in the emergency setting: a review of patient and public views

    Background: In emergency research, obtaining informed consent can be problematic. Research to develop and improve treatments for patients admitted to hospital with life-threatening and debilitating conditions is much needed, yet the issue of research without consent (RWC) raises concerns about unethical practices and the loss of individual autonomy. Consistent with the policy and practice turn towards greater patient and public involvement in health care decisions, in the US, Canada and EU, guidelines and legislation implemented to protect patients and facilitate acute research with adults who are unable to give consent have been developed with little involvement of the lay public. This paper reviews research examining public opinion regarding RWC for research in emergency situations, and whether the rules and regulations permitting research of this kind are in accordance with the views of those who ultimately may be the most affected. Methods: Seven electronic databases were searched: Medline, Embase, CINAHL, Cochrane Database of Systematic Reviews, Philosopher's Index, Age Info, PsychInfo, Sociological Abstracts and Web of Science. Only those articles pertaining to the views of the public in the US, Canada and EU member states were included. Opinion pieces and those not published in English were excluded. Results: Considering the wealth of literature on the perspectives of professionals, there was relatively little information about public attitudes. Twelve studies employing a range of research methods were identified. In five of the six questionnaire surveys around half the sample did not agree generally with RWC, though paradoxically, a higher percentage would personally take part in such a study. Unfortunately, most of the studies were not designed to investigate individuals' views in any depth. There also appears to be a level of mistrust of medical research, and some patients were more likely to accept an experimental treatment 'outside' of a research protocol. Conclusion: There are too few data to evaluate whether the rules and regulations permitting RWC protect, or are acceptable to, the public. However, any attempts to engage the public should take place in the context of findings from further basic research to attend to the apparently paradoxical findings of some of the current surveys.

    Mortality Among Adults With Cancer Undergoing Chemotherapy or Immunotherapy and Infected With COVID-19

    Importance: Large cohorts of patients with active cancers and COVID-19 infection are needed to provide evidence of the association of recent cancer treatment and cancer type with COVID-19 mortality. // Objective: To evaluate whether systemic anticancer treatments (SACTs), tumor subtypes, patient demographic characteristics (age and sex), and comorbidities are associated with COVID-19 mortality. // Design, Setting, and Participants: The UK Coronavirus Cancer Monitoring Project (UKCCMP) is a prospective cohort study conducted at 69 UK cancer hospitals among adult patients (≥18 years) with an active cancer and a clinical diagnosis of COVID-19. Patients registered from March 18 to August 1, 2020, were included in this analysis. // Exposures: SACT, tumor subtype, patient demographic characteristics (eg, age, sex, body mass index, race and ethnicity, smoking history), and comorbidities were investigated. // Main Outcomes and Measures: The primary end point was all-cause mortality within the primary hospitalization. // Results: Overall, 2515 of 2786 patients registered during the study period were included; 1464 (58%) were men; and the median (IQR) age was 72 (62-80) years. The mortality rate was 38% (966 patients). The data suggest an association between higher mortality in patients with hematological malignant neoplasms irrespective of recent SACT, particularly in those with acute leukemias or myelodysplastic syndrome (OR, 2.16; 95% CI, 1.30-3.60) and myeloma or plasmacytoma (OR, 1.53; 95% CI, 1.04-2.26). Lung cancer was also significantly associated with higher COVID-19–related mortality (OR, 1.58; 95% CI, 1.11-2.25). No association between higher mortality and receiving chemotherapy in the 4 weeks before COVID-19 diagnosis was observed after correcting for the crucial confounders of age, sex, and comorbidities. 
An association between lower mortality and receiving immunotherapy in the 4 weeks before COVID-19 diagnosis was observed (immunotherapy vs no cancer therapy: OR, 0.52; 95% CI, 0.31-0.86). // Conclusions and Relevance: The findings of this study of patients with active cancer suggest that recent SACT is not associated with inferior outcomes from COVID-19 infection. This has relevance for the care of patients with cancer requiring treatment, particularly in countries experiencing an increase in COVID-19 case numbers. Important differences in outcomes among patients with hematological and lung cancers were observed

    SARS-CoV-2 susceptibility and COVID-19 disease severity are associated with genetic variants affecting gene expression in a variety of tissues

    Variability in SARS-CoV-2 susceptibility and COVID-19 disease severity between individuals is partly due to genetic factors. Here, we identify 4 genomic loci with suggestive associations for SARS-CoV-2 susceptibility and 19 for COVID-19 disease severity. Four of these 23 loci likely have an ethnicity-specific component. Genome-wide association study (GWAS) signals in 11 loci colocalize with expression quantitative trait loci (eQTLs) associated with the expression of 20 genes in 62 tissues/cell types (range: 1:43 tissues/gene), including lung, brain, heart, muscle, and skin as well as the digestive system and immune system. We perform genetic fine mapping to compute 99% credible SNP sets, which identify 10 GWAS loci that have eight or fewer SNPs in the credible set, including three loci with one single likely causal SNP. Our study suggests that the diverse symptoms and disease severity of COVID-19 observed between individuals is associated with variants across the genome, affecting gene expression levels in a wide variety of tissue types

    Deep Phenotyping of Post-infectious Myalgic Encephalomyelitis/Chronic Fatigue Syndrome

    Post-infectious myalgic encephalomyelitis/chronic fatigue syndrome (PI-ME/CFS) is a disabling disorder, yet the clinical phenotype is poorly defined, the pathophysiology is unknown, and no disease-modifying treatments are available. We used rigorous criteria to recruit PI-ME/CFS participants with matched controls to conduct deep phenotyping. Among the many physical and cognitive complaints, one defining feature of PI-ME/CFS was an alteration of effort preference, rather than physical or central fatigue, due to dysfunction of integrative brain regions potentially associated with central catechol pathway dysregulation, with consequences on autonomic functioning and physical conditioning. Immune profiling suggested chronic antigenic stimulation with increase in naïve and decrease in switched memory B-cells. Alterations in gene expression profiles of peripheral blood mononuclear cells and metabolic pathways were consistent with cellular phenotypic studies and demonstrated differences according to sex. Together these clinical abnormalities and biomarker differences provide unique insight into the underlying pathophysiology of PI-ME/CFS, which may guide future intervention

    DataSHIELD: taking the analysis to the data, not the data to the analysis

    Research in modern biomedicine and social science requires sample sizes so large that they can often only be achieved through a pooled co-analysis of data from several studies. But the pooling of information from individuals in a central database that may be queried by researchers raises important ethico-legal questions and can be controversial. In the UK this has been highlighted by recent debate and controversy relating to the UK's proposed 'care.data' initiative, and these issues reflect important societal and professional concerns about privacy, confidentiality and intellectual property. DataSHIELD provides a novel technological solution that can circumvent some of the most basic challenges in facilitating the access of researchers and other healthcare professionals to individual-level data. Commands are sent from a central analysis computer (AC) to several data computers (DCs) storing the data to be co-analysed. The data sets are analysed simultaneously but in parallel. The separate parallelized analyses are linked by non-disclosive summary statistics and commands transmitted back and forth between the DCs and the AC. This paper describes the technical implementation of DataSHIELD using a modified R statistical environment linked to an Opal database deployed behind the computer firewall of each DC. Analysis is controlled through a standard R environment at the AC. Based on this Opal/R implementation, DataSHIELD is currently used by the Healthy Obese Project and the Environmental Core Project (BioSHaRE-EU) for the federated analysis of 10 data sets across eight European countries, and this illustrates the opportunities and challenges presented by the DataSHIELD approach. 
DataSHIELD facilitates important research in settings where: (i) a co-analysis of individual-level data from several studies is scientifically necessary but governance restrictions prohibit the release or sharing of some of the required data, and/or render data access unacceptably slow; (ii) a research group (e.g. in a developing nation) is particularly vulnerable to loss of intellectual property: the researchers want to fully share the information held in their data with national and international collaborators, but do not wish to hand over the physical data themselves; and (iii) a data set is to be included in an individual-level co-analysis but the physical size of the data precludes direct transfer to a new site for analysis.
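
The analysis pattern described in the DataSHIELD abstract, where a central analysis computer combines non-disclosive summary statistics returned by each data computer, can be illustrated with a minimal Python sketch. This is not DataSHIELD's actual implementation, which the abstract says is built on R and Opal. The names `DataComputer`, `Summary`, `pooled_mean_and_variance` and the `min_cell` disclosure threshold are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Summary:
    """Aggregate statistics a site is willing to release (no individual rows)."""
    n: int
    total: float
    total_sq: float


class DataComputer:
    """Hypothetical stand-in for one study site; raw values never leave it."""

    def __init__(self, values: Sequence[float], min_cell: int = 5):
        self._values = list(values)
        self._min_cell = min_cell  # assumed disclosure threshold, not from the source

    def summarise(self) -> Summary:
        # Refuse to answer when the group is small enough to risk disclosure.
        if len(self._values) < self._min_cell:
            raise ValueError("too few records to release a summary")
        return Summary(
            n=len(self._values),
            total=sum(self._values),
            total_sq=sum(v * v for v in self._values),
        )


def pooled_mean_and_variance(sites: List[DataComputer]) -> Tuple[float, float]:
    """Analysis-computer side: combine per-site summaries into pooled estimates."""
    summaries = [site.summarise() for site in sites]
    n = sum(s.n for s in summaries)
    total = sum(s.total for s in summaries)
    total_sq = sum(s.total_sq for s in summaries)
    mean = total / n
    # Sample variance from sufficient statistics: (sum(x^2) - n * mean^2) / (n - 1).
    variance = (total_sq - n * mean * mean) / (n - 1)
    return mean, variance
```

Calling `pooled_mean_and_variance` on a list of `DataComputer` objects gives the same mean and sample variance as pooling the raw values in one place, although only three numbers per site are exchanged.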