69 research outputs found

    In the Name of Fairness: Assessing the Bias in Clinical Record De-identification

    Data sharing is crucial for open science and reproducible research, but the legal sharing of clinical data requires the removal of protected health information from electronic health records. This process, known as de-identification, is often achieved through the use of machine learning algorithms by many commercial and open-source systems. While these systems have shown compelling results on average, the variation in their performance across different demographic groups has not been thoroughly examined. In this work, we investigate the bias of de-identification systems on names in clinical notes via a large-scale empirical analysis. To achieve this, we create 16 name sets that vary along four demographic dimensions: gender, race, name popularity, and the decade of popularity. We insert these names into 100 manually curated clinical templates and evaluate the performance of nine public and private de-identification methods. Our findings reveal that there are statistically significant performance gaps along a majority of the demographic dimensions in most methods. We further illustrate that de-identification quality is affected by polysemy in names, gender context, and clinical note characteristics. To mitigate the identified gaps, we propose a simple and method-agnostic solution by fine-tuning de-identification methods with clinical context and diverse names. Overall, it is imperative to address the bias in existing methods immediately so that downstream stakeholders can build high-quality systems to serve all demographic parties fairly. Comment: Accepted by FAccT 2023; updated appendix with the de-identification performance of GPT-
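    The template-based evaluation described in this abstract can be pictured with a minimal sketch. Everything below is an assumption made for illustration: the templates, the name groups, and `toy_deidentify` (a stand-in for a real de-identification system) are hypothetical and are not taken from the paper. The sketch inserts each name into each template and reports, per group, the fraction of inserted names that the system removed.

```python
from collections import defaultdict

# Hypothetical templates and name sets, invented for illustration only.
TEMPLATES = [
    "Patient {name} was admitted with chest pain.",
    "Discussed discharge plan with {name} and family.",
]
NAME_SETS = {
    "group_a": ["Alice Smith", "May Brown"],
    "group_b": ["Darnell Young", "Keisha Hill"],
}


def toy_deidentify(text):
    """Stand-in for a real system: masks only names from a fixed list."""
    known_names = ["Alice Smith", "Darnell Young", "Keisha Hill"]
    for name in known_names:
        text = text.replace(name, "[NAME]")
    return text


def recall_by_group(templates, name_sets, deidentify):
    """Per group, the fraction of inserted names absent from the output."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for group, names in name_sets.items():
        for name in names:
            for template in templates:
                note = template.format(name=name)
                redacted = deidentify(note)
                totals[group] += 1
                if name not in redacted:
                    hits[group] += 1
    return {group: hits[group] / totals[group] for group in totals}


result = recall_by_group(TEMPLATES, NAME_SETS, toy_deidentify)
print(result)
```

    A gap between groups in this per-group recall is the kind of disparity the abstract refers to; the paper's own metrics and significance tests may differ from this simplified measure.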

    The eICU Collaborative Research Database, a freely available multi-center database for critical care research

    Critical care patients are monitored closely through the course of their illness. As a result of this monitoring, large amounts of data are routinely collected for these patients. Philips Healthcare has developed a telehealth system, the eICU Program, which leverages these data to support management of critically ill patients. Here we describe the eICU Collaborative Research Database, a multi-center intensive care unit (ICU) database with high granularity data for over 200,000 admissions to ICUs monitored by eICU Programs across the United States. The database is deidentified, and includes vital sign measurements, care plan documentation, severity of illness measures, diagnosis information, treatment information, and more. Data are publicly available after registration, including completion of a training course in research with human subjects and signing of a data use agreement mandating responsible handling of the data and adhering to the principle of collaborative research. The freely available nature of the data will support a number of applications including the development of machine learning algorithms, decision support tools, and clinical research.

    Datathons and Software to Promote Reproducible Research

    Background: Datathons facilitate collaboration between clinicians, statisticians, and data scientists in order to answer important clinical questions. Previous datathons have resulted in numerous publications of interest to the critical care community and serve as a viable model for interdisciplinary collaboration. Objective: We report on an open-source software called Chatto that was created by members of our group, in the context of the second international Critical Care Datathon, held in September 2015. Methods: Datathon participants formed teams to discuss potential research questions and the methods required to address them. They were provided with the Chatto suite of tools to facilitate their teamwork. Each multidisciplinary team spent the next 2 days with clinicians working alongside data scientists to write code, extract and analyze data, and reformulate their queries in real time as needed. All projects were then presented on the last day of the datathon to a panel of judges that consisted of clinicians and scientists. Results: Use of Chatto was particularly effective in the datathon setting, enabling teams to reduce the time spent configuring their research environments to just a few minutes, a process that would normally take hours to days. Chatto continued to serve as a useful research tool after the conclusion of the datathon. Conclusions: This suite of tools fulfills two purposes: (1) facilitation of interdisciplinary teamwork through archiving and version control of datasets, analytical code, and team discussions, and (2) advancement of research reproducibility by functioning postpublication as an online environment in which independent investigators can rerun or modify analyses with relative ease. With the introduction of Chatto, we hope to solve a variety of challenges presented by collaborative data mining projects while improving research reproducibility.

    The association between the neutrophil-to-lymphocyte ratio and mortality in critical illness: an observational cohort study

    Introduction: The neutrophil-to-lymphocyte ratio (NLR) is a biological marker that has been shown to be associated with outcomes in patients with a number of different malignancies. The objective of this study was to assess the relationship between NLR and mortality in a population of adult critically ill patients. Methods: We performed an observational cohort study of unselected intensive care unit (ICU) patients based on records in a large clinical database. We computed individual patient NLR and categorized patients by quartile of this ratio. The association of NLR quartiles and 28-day mortality was assessed using multivariable logistic regression. Secondary outcomes included mortality in the ICU, in-hospital mortality and 1-year mortality. An a priori subgroup analysis of patients with versus without sepsis was performed to assess any differences in the relationship between the NLR and outcomes in these cohorts. Results: A total of 5,056 patients were included. Their 28-day mortality rate was 19%. The median age of the cohort was 65 years, and 47% were female. The median NLR for the entire cohort was 8.9 (interquartile range, 4.99 to 16.21). Following multivariable adjustments, there was a stepwise increase in mortality with increasing quartiles of NLR (first quartile: reference category; second quartile odds ratio (OR) = 1.32; 95% confidence interval (CI), 1.03 to 1.71; third quartile OR = 1.43; 95% CI, 1.12 to 1.83; fourth quartile OR = 1.71; 95% CI, 1.35 to 2.16). A similar stepwise relationship was identified in the subgroup of patients who presented without sepsis. The NLR was not associated with 28-day mortality in patients with sepsis. Increasing quartile of NLR was statistically significantly associated with the secondary outcomes. Conclusion: The NLR is associated with outcomes in unselected critically ill patients. In patients with sepsis, there was no statistically significant relationship between NLR and mortality. Further investigation is required to increase understanding of the pathophysiology of this relationship and to validate these findings with data collected prospectively. National Institutes of Health (U.S.) (Grant R01 EB017205-01A1)
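    The quartile-based analysis in this abstract can be illustrated with a minimal sketch. It assumes made-up inputs and swaps in a crude (unadjusted) odds ratio for the multivariable logistic regression the study actually used, so it will not reproduce the reported adjusted estimates. The function names `nlr`, `assign_quartiles`, and `crude_odds_ratio` are hypothetical.

```python
import statistics


def nlr(neutrophils, lymphocytes):
    """Neutrophil-to-lymphocyte ratio from two counts in the same units."""
    return neutrophils / lymphocytes


def assign_quartiles(values):
    """Return a quartile index (1 to 4) for each value.

    Cut points come from statistics.quantiles with n=4, which uses the
    'exclusive' method by default.
    """
    q1, q2, q3 = statistics.quantiles(values, n=4)

    def quartile(value):
        if value <= q1:
            return 1
        if value <= q2:
            return 2
        if value <= q3:
            return 3
        return 4

    return [quartile(v) for v in values]


def crude_odds_ratio(deaths_group, n_group, deaths_ref, n_ref):
    """Unadjusted odds ratio of death in a group versus a reference group.

    This is a simplification: the study adjusted for covariates with
    multivariable logistic regression, which this function does not do.
    """
    odds_group = deaths_group / (n_group - deaths_group)
    odds_ref = deaths_ref / (n_ref - deaths_ref)
    return odds_group / odds_ref


# Illustrative, made-up NLR values (not study data).
example_nlr = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
example_quartiles = assign_quartiles(example_nlr)
print(example_quartiles)
```

    In the study, the first quartile served as the reference category; in this sketch the same role is played by the `deaths_ref` and `n_ref` arguments.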

    Evolution of long-term vaccine-induced and hybrid immunity in healthcare workers after different COVID-19 vaccine regimens

    BACKGROUND: Both infection and vaccination, alone or in combination, generate antibody and T cell responses against severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). However, the maintenance of such responses, and hence protection from disease, requires careful characterization. In a large prospective study of UK healthcare workers (HCWs) (Protective Immunity from T Cells in Healthcare Workers [PITCH], within the larger SARS-CoV-2 Immunity and Reinfection Evaluation [SIREN] study), we previously observed that prior infection strongly affected subsequent cellular and humoral immunity induced after long and short dosing intervals of BNT162b2 (Pfizer/BioNTech) vaccination. METHODS: Here, we report longer follow-up of 684 HCWs in this cohort over 6-9 months following two doses of BNT162b2 or AZD1222 (Oxford/AstraZeneca) vaccination and up to 6 months following a subsequent mRNA booster vaccination. FINDINGS: We make three observations: first, the dynamics of humoral and cellular responses differ; binding and neutralizing antibodies declined, whereas T and memory B cell responses were maintained after the second vaccine dose. Second, vaccine boosting restored immunoglobulin (Ig) G levels; broadened neutralizing activity against variants of concern, including Omicron BA.1, BA.2, and BA.5; and boosted T cell responses above the 6-month level after dose 2. Third, prior infection maintained its impact driving larger and broader T cell responses compared with never-infected people, a feature maintained until 6 months after the third dose. CONCLUSIONS: Broadly cross-reactive T cell responses are well maintained over time, especially in those with combined vaccine and infection-induced immunity ("hybrid" immunity), and may contribute to continued protection against severe disease.

    Surface Generated Acoustic Wave Biosensors for the Detection of Pathogens: A Review

    This review presents a deep insight into Surface Generated Acoustic Wave (SGAW) technology for biosensing applications, based on more than 40 years of technological and scientific developments. In the last 20 years, SGAWs have been attracting the attention of the biochemical scientific community, because some of these devices - Shear Horizontal Surface Acoustic Wave (SH-SAW), Surface Transverse Wave (STW), Love Wave (LW), Flexural Plate Wave (FPW), Shear Horizontal Acoustic Plate Mode (SH-APM) and Layered Guided Acoustic Plate Mode (LG-APM) - have demonstrated a high sensitivity in the detection of biorelevant molecules in liquid media. In addition, complementary efforts to improve the sensing films have been made during these years. All these developments have been made with the aim of achieving, in the future, a highly sensitive, low cost, small size, multi-channel, portable, reliable and commercially established SGAW biosensor. A setup with these features could significantly contribute to future developments in the health, food and environmental industries. The second purpose of this work is to describe the state of the art of SGAW biosensors for the detection of pathogens, a topic of extreme importance for human health. Finally, the review discusses the commercial availability, trends and future challenges of SGAW biosensors for such applications.

    A communal catalogue reveals Earth’s multiscale microbial diversity

    Our growing awareness of the microbial world’s importance and diversity contrasts starkly with our limited understanding of its fundamental structure. Despite recent advances in DNA sequencing, a lack of standardized protocols and common analytical frameworks impedes comparisons among studies, hindering the development of global inferences about microbial life on Earth. Here we present a meta-analysis of microbial community samples collected by hundreds of researchers for the Earth Microbiome Project. Coordinated protocols and new analytical methods, particularly the use of exact sequences instead of clustered operational taxonomic units, enable bacterial and archaeal ribosomal RNA gene sequences to be followed across multiple studies and allow us to explore patterns of diversity at an unprecedented scale. The result is both a reference database giving global context to DNA sequence data and a framework for incorporating data from future studies, fostering increasingly complete characterization of Earth’s microbial diversity.


    Health, education, and social care provision after diagnosis of childhood visual disability

    Aim: To investigate the health, education, and social care provision for children newly diagnosed with visual disability. Method: This was a national prospective study, the British Childhood Visual Impairment and Blindness Study 2 (BCVIS2), ascertaining new diagnoses of visual impairment or severe visual impairment and blindness (SVIBL), or equivalent vision. Data collection was performed by managing clinicians up to 1-year follow-up, and included health and developmental needs, and health, education, and social care provision. Results: BCVIS2 identified 784 children newly diagnosed with visual impairment/SVIBL (313 with visual impairment, 471 with SVIBL). Most children had associated systemic disorders (559 [71%], 167 [54%] with visual impairment, and 392 [84%] with SVIBL). Care from multidisciplinary teams was provided for 549 children (70%). Two-thirds (515) had not received an Education, Health, and Care Plan (EHCP). Fewer children with visual impairment had seen a specialist teacher (SVIBL 35%, visual impairment 28%, χ² p < 0.001), or had an EHCP (11% vs 7%, χ² p < 0.01). Interpretation: Families need additional support from managing clinicians to access recommended complex interventions such as the use of multidisciplinary teams and educational support. This need is pressing, as the population of children with visual impairment/SVIBL is expected to grow in size and complexity. This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.

    Finishing the euchromatic sequence of the human genome

    The sequence of the human genome encodes the genetic instructions for human physiology, as well as rich information about human evolution. In 2001, the International Human Genome Sequencing Consortium reported a draft sequence of the euchromatic portion of the human genome. Since then, the international collaboration has worked to convert this draft into a genome sequence with high accuracy and nearly complete coverage. Here, we report the result of this finishing process. The current genome sequence (Build 35) contains 2.85 billion nucleotides interrupted by only 341 gaps. It covers ∼99% of the euchromatic genome and is accurate to an error rate of ∼1 event per 100,000 bases. Many of the remaining euchromatic gaps are associated with segmental duplications and will require focused work with new methods. The near-complete sequence, the first for a vertebrate, greatly improves the precision of biological analyses of the human genome including studies of gene number, birth and death. Notably, the human genome seems to encode only 20,000-25,000 protein-coding genes. The genome sequence reported here should serve as a firm foundation for biomedical research in the decades ahead.