54 research outputs found
ssROC: Semi-Supervised ROC Analysis for Reliable and Streamlined Evaluation of Phenotyping Algorithms
High-throughput phenotyping will accelerate the use of
electronic health records (EHRs) for translational research. A critical
roadblock is the extensive medical supervision required for phenotyping
algorithm (PA) estimation and evaluation. To address this challenge, numerous
weakly-supervised learning methods have been proposed to estimate PAs. However,
there is a paucity of methods for reliably evaluating the predictive
performance of PAs when a very small proportion of the data is labeled. To fill
this gap, we introduce a semi-supervised approach (ssROC) for estimation of the
receiver operating characteristic (ROC) parameters of PAs (e.g., sensitivity,
specificity).
ssROC uses a small labeled dataset to
nonparametrically impute missing labels. The imputations are then used for ROC
parameter estimation to yield more precise estimates of PA performance relative
to classical supervised ROC analysis (supROC) using only labeled data. We
evaluated ssROC through in-depth simulation studies and an extensive evaluation
of eight PAs from Mass General Brigham.
In both simulated and real data, ssROC produced ROC
parameter estimates with significantly lower variance than supROC for a given
amount of labeled data. For the eight PAs, our results illustrate that ssROC
achieves similar precision to supROC, but with approximately 60% of the amount
of labeled data on average.
ssROC enables precise evaluation of PA performance to
increase trust in observational health research without demanding large volumes
of labeled data. ssROC is also easily implementable in open-source
software.
When used in conjunction with weakly-supervised PAs,
ssROC facilitates the reliable and streamlined phenotyping necessary for
EHR-based research
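The imputation idea in the ssROC abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the nonparametric imputation is a Gaussian-kernel (Nadaraya-Watson) smoother of the labels on the PA score, and the function names, the fixed bandwidth, and the simulated data are all hypothetical.

```python
import math
import random


def kernel_impute(scores_labeled, labels, bandwidth):
    """Return s -> estimated P(Y=1 | S=s) by Nadaraya-Watson smoothing.

    Assumption: a Gaussian kernel with a fixed, hand-picked bandwidth.
    """
    def m_hat(s):
        weights = [math.exp(-0.5 * ((s - si) / bandwidth) ** 2)
                   for si in scores_labeled]
        total = sum(weights)
        if total == 0.0:  # s is far from every labeled score
            return sum(labels) / len(labels)
        return sum(w * y for w, y in zip(weights, labels)) / total
    return m_hat


def sup_roc_point(scores, labels, cutoff):
    """Supervised (TPR, FPR) at one cutoff, using labeled data only."""
    pos = sum(labels)
    neg = len(labels) - pos
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= cutoff)
    return tp / pos, fp / neg


def ss_roc_point(all_scores, m_hat, cutoff):
    """Semi-supervised (TPR, FPR): imputed probabilities replace labels."""
    probs = [m_hat(s) for s in all_scores]
    tpr = sum(p for s, p in zip(all_scores, probs) if s >= cutoff) / sum(probs)
    fpr = (sum(1.0 - p for s, p in zip(all_scores, probs) if s >= cutoff)
           / sum(1.0 - p for p in probs))
    return tpr, fpr


# Simulated toy data (hypothetical): 2000 patients, only the first 100 labeled.
random.seed(0)
truth = [1 if random.random() < 0.3 else 0 for _ in range(2000)]
scores = [min(1.0, max(0.0, random.gauss(0.7 if y else 0.3, 0.15)))
          for y in truth]
labeled_scores, labeled_y = scores[:100], truth[:100]

m_hat = kernel_impute(labeled_scores, labeled_y, bandwidth=0.05)
print("supROC (TPR, FPR):", sup_roc_point(labeled_scores, labeled_y, 0.5))
print("ssROC  (TPR, FPR):", ss_roc_point(scores, m_hat, 0.5))
```

Both estimators target the same quantities; the semi-supervised version averages over every scored patient rather than only the labeled ones, which is the intuition behind the variance reduction reported in the abstract.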
TAXN: Translate Align Extract Normalize, a multilingual extraction tool for clinical texts
Several studies have shown that about 80% of the medical information in an electronic health record is only available through unstructured data. Resources such as medical terminologies in languages other than English are limited, which constrains NLP tools. We propose to leverage English-based resources in other languages using a combination of translation, word alignment, entity extraction, and term normalization (TAXN). We implement this extraction pipeline in an open-source library called "medkit". We demonstrate the value of this approach through a specific use case: enriching a phenotypic dictionary for post-acute sequelae in COVID-19 (PASC). TAXN proved efficient at proposing new synonyms of UMLS terms using a corpus of 70 articles in French, with 356 terms enriched with at least one validated new synonym. This study was based on freely available deep learning models
Hospitalizations Associated With Mental Health Conditions Among Adolescents in the US and France During the COVID-19 Pandemic
Importance
The COVID-19 pandemic has been associated with an increase in mental health diagnoses among adolescents, though the extent of the increase, particularly for severe cases requiring hospitalization, has not been well characterized. Large-scale federated informatics approaches provide the ability to efficiently and securely query health care data sets to assess and monitor hospitalization patterns for mental health conditions among adolescents.
Objective
To estimate changes in the proportion of hospitalizations associated with mental health conditions among adolescents following onset of the COVID-19 pandemic.
Design, Setting, and Participants
This retrospective, multisite cohort study of adolescents 11 to 17 years of age who were hospitalized with at least 1 mental health condition diagnosis between February 1, 2019, and April 30, 2021, used patient-level data from electronic health records of 8 children's hospitals in the US and France.
Main Outcomes and Measures
Change in the monthly proportion of mental health condition-associated hospitalizations between the prepandemic (February 1, 2019, to March 31, 2020) and pandemic (April 1, 2020, to April 30, 2021) periods using interrupted time series analysis.
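An interrupted time series analysis of this kind is often fit as a segmented regression. The sketch below assumes the common form y_t = b0 + b1*t + b2*post_t + b3*(t - t0)*post_t, fit by ordinary least squares; the abstract does not state the study's actual model specification, and the function names and the synthetic series are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_its(y, first_post_index):
    """OLS fit of the assumed segmented model.

    Returns [b0, b1, b2, b3]: baseline level, pre-period slope,
    level change at the interruption, and slope change afterwards.
    """
    rows = []
    for t in range(len(y)):
        post = 1.0 if t >= first_post_index else 0.0
        rows.append([1.0, float(t), post, post * (t - first_post_index)])
    k = 4
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear(xtx, xty)


# Synthetic monthly proportions (%), NOT study data: 14 prepandemic months
# (Feb 2019 to Mar 2020) followed by 13 pandemic months (Apr 2020 to Apr 2021).
series = [30.0 + 0.1 * t + (2.0 + 0.5 * (t - 14) if t >= 14 else 0.0)
          for t in range(27)]
b0, b1, b2, b3 = fit_its(series, first_post_index=14)
print(f"level change: {b2:.3f}, slope change per month: {b3:.3f}")
```

In this parameterization, b2 captures an immediate shift at the onset of the pandemic period and b3 captures the change in the monthly trend.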
Results
There were 9696 adolescents hospitalized with a mental health condition during the prepandemic period (5966 [61.5%] female) and 11,101 during the pandemic period (7603 [68.5%] female). The mean (SD) age in the prepandemic cohort was 14.6 (1.9) years and in the pandemic cohort, 14.7 (1.8) years. The most prevalent diagnoses during the pandemic were anxiety (6066 [57.4%]), depression (5065 [48.0%]), and suicidality or self-injury (4673 [44.2%]). There was an increase in the proportions of monthly hospitalizations during the pandemic for anxiety (0.55%; 95% CI, 0.26%-0.84%), depression (0.50%; 95% CI, 0.19%-0.79%), and suicidality or self-injury (0.38%; 95% CI, 0.08%-0.68%). There was an estimated 0.60% increase (95% CI, 0.31%-0.89%) overall in the monthly proportion of mental health-associated hospitalizations following onset of the pandemic compared with the prepandemic period.
Conclusions and Relevance
In this cohort study, onset of the COVID-19 pandemic was associated with increased hospitalizations with mental health diagnoses among adolescents. These findings support the need for greater resources within children's hospitals to care for adolescents with mental health conditions during the pandemic and beyond.
Ms Hutch is supported by grant NLM 5T32LM012203-05 from the National Library of Medicine. Dr Aronow is supported by U24 HL148865 from the National Heart, Lung, and Blood Institute (NHLBI), NIH. Dr Cai is supported by R01 HL089778 from the NHLBI, NIH. Dr Hanauer is supported by UL1TR002240 from the National Center for Advancing Translational Sciences (NCATS), NIH. Dr Luo is supported by U01TR003528 from the NCATS, NIH, and 1R01LM013337 from the National Library of Medicine. Dr Sanchez-Pinto is supported by R01HD105939 from the National Institute of Child Health and Human Development, NIH. Dr South is supported by K23HL148394 and L40HL148910 from the NHLBI, NIH, and UL1TR001420 from the NCATS, NIH. Dr Visweswaran is supported by UL1TR001857 from the NCATS, NIH. Dr Xia is supported by R01NS098023 and R01NS124882 from the National Institute of Neurological Disorders and Stroke, NIH.
Gutiérrez-Sacristán, A.; Serret-Larmande, A.; Hutch, MR.; Sáez Silvestre, C.; Aronow, BJ.; Bhatnagar, S.; Bonzel, C.... (2022). Hospitalizations Associated With Mental Health Conditions Among Adolescents in the US and France During the COVID-19 Pandemic. JAMA Network Open. 5(12):1-12. https://doi.org/10.1001/jamanetworkopen.2022.4654811251
Improved Appropriateness of Advanced Diagnostic Imaging After Implementation of Clinical Decision Support Mechanism
The Protecting Access to Medicare Act (PAMA) mandates clinical decision support mechanism (CDSM) consultation for all advanced imaging. There are a growing number of studies examining the association of CDSM use with imaging appropriateness, but a paucity of multicenter data. This observational study evaluates the association between changes in advanced imaging appropriateness scores and increasing provider exposure to CDSM. Each provider's first 200 consecutive anonymized requisitions for advanced imaging (CT, MRI, ultrasound, nuclear medicine) using a single CDSM (CareSelect, Change Healthcare) between January 1, 2017 and December 31, 2019 were collected from 288 US institutions. Changes in imaging requisition proportions among four appropriateness categories ("usually appropriate" [green], "may be appropriate" [yellow], "usually not appropriate" [red], and unmapped [gray]) were evaluated in relation to the chronological order of the requisition for each provider and total provider exposure to CDSM using logistic regression fits and Wald tests. The number of providers and requisitions included was 244,158 and 7,345,437, respectively. For 10,123 providers with ≥ 200 requisitions (2,024,600 total requisitions), the fraction of green, yellow, and red requisitions among the last 10 requisitions changed by +3.0% (95% CI +2.6% to +3.4%), -0.8% (95% CI -1.1% to -0.5%), and -3.0% (95% CI -3.3% to -2.7%) in comparison with the first 10, respectively. Providers with > 190 requisitions had 8.5% (95% CI 6.3% to 10.7%) more green requisitions, 2.3% (95% CI 0.7% to 3.9%) fewer yellow requisitions, and 0.5% (95% CI -1.0% to 2.0%) fewer red requisitions (not statistically significant) relative to providers with ≤ 10 requisitions. Increasing provider exposure to CDSM is associated with improved appropriateness scores for advanced imaging requisitions
Generate Analysis-Ready Data for Real-world Evidence: Tutorial for Harnessing Electronic Health Records With Advanced Informatic Technologies
Although randomized controlled trials (RCTs) are the gold standard for establishing the efficacy and safety of a medical treatment, real-world evidence (RWE) generated from real-world data has been vital in postapproval monitoring and is being promoted for the regulatory process of experimental therapies. An emerging source of real-world data is electronic health records (EHRs), which contain detailed information on patient care in both structured (eg, diagnosis codes) and unstructured (eg, clinical notes and images) forms. Despite the granularity of the data available in EHRs, the critical variables required to reliably assess the relationship between a treatment and clinical outcome are challenging to extract. To address this fundamental challenge and accelerate the reliable use of EHRs for RWE, we introduce an integrated data curation and modeling pipeline consisting of 4 modules that leverage recent advances in natural language processing, computational phenotyping, and causal modeling techniques with noisy data. Module 1 consists of techniques for data harmonization. We use natural language processing to recognize clinical variables from RCT design documents and map the extracted variables to EHR features with description matching and knowledge networks. Module 2 then develops techniques for cohort construction using advanced phenotyping algorithms to both identify patients with diseases of interest and define the treatment arms. Module 3 introduces methods for variable curation, including a list of existing tools to extract baseline variables from different sources (eg, codified, free text, and medical imaging) and end points of various types (eg, death, binary, temporal, and numerical). Finally, module 4 presents validation and robust modeling methods, and we propose a strategy to create gold-standard labels for EHR variables of interest to validate data curation quality and perform subsequent causal modeling for RWE. 
In addition to the workflow proposed in our pipeline, we also develop a reporting guideline for RWE that covers the necessary information to facilitate transparent reporting and reproducibility of results. Moreover, our pipeline is highly data driven, enhancing study data with a rich variety of publicly available information and knowledge sources. We also showcase our pipeline and provide guidance on the deployment of relevant tools by revisiting the emulation of the Clinical Outcomes of Surgical Therapy Study Group Trial on laparoscopy-assisted colectomy versus open colectomy in patients with early-stage colon cancer. We also draw on existing literature on EHR emulation of RCTs together with our own studies with the Mass General Brigham EHR
Heterogeneous associations between interleukin-6 receptor variants and phenotypes across ancestries and implications for therapy.
The Phenome-Wide Association Study (PheWAS) is increasingly used to broadly screen for potential treatment effects, e.g., IL6R variant as a proxy for IL6R antagonists. This approach offers an opportunity to address the limited power in clinical trials to study differential treatment effects across patient subgroups. However, limited methods exist to efficiently test for differences across subgroups in the thousands of multiple comparisons generated as part of a PheWAS. In this study, we developed an approach that maximizes the power to test for heterogeneous genotype-phenotype associations and applied this approach to an IL6R PheWAS among individuals of African (AFR) and European (EUR) ancestries. We identified 29 traits with differences in IL6R variant-phenotype associations, including a lower risk of type 2 diabetes in AFR (OR 0.96) vs EUR (OR 1.0, p-value for heterogeneity = 8.5 × 10⁻³), and higher white blood cell count (p-value for heterogeneity = 8.5 × 10⁻¹³¹). These data suggest a more salutary effect of IL6R blockade for T2D among individuals of AFR vs EUR ancestry and provide data to inform ongoing clinical trials targeting IL6 for an expanding number of conditions. Moreover, the method to test for heterogeneity of associations can be applied broadly to other large-scale genotype-phenotype screens in diverse populations
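For orientation, the simplest per-trait heterogeneity test between two ancestry groups is a two-sample z-test on the difference in log odds ratios. The sketch below shows that textbook baseline only; it is not the power-maximizing approach the abstract refers to, which is not described here. The odds ratios come from the abstract, while the standard errors and the number of traits are hypothetical placeholders.

```python
import math


def heterogeneity_z_test(or_a, se_log_or_a, or_b, se_log_or_b):
    """Two-sided z-test for a difference between two log odds ratios.

    Assumes the two estimates come from independent samples and are
    approximately normal on the log scale.
    """
    diff = math.log(or_a) - math.log(or_b)
    se = math.sqrt(se_log_or_a ** 2 + se_log_or_b ** 2)
    z = diff / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return z, p


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


# ORs from the abstract (type 2 diabetes: AFR 0.96, EUR 1.0).
# The standard errors and trait count below are HYPOTHETICAL placeholders.
z, p = heterogeneity_z_test(0.96, 0.01, 1.0, 0.01)
threshold = bonferroni_threshold(0.05, n_tests=1000)
print(f"z = {z:.2f}, p = {p:.2e}, passes Bonferroni: {p < threshold}")
```

Testing every trait this way and correcting for thousands of comparisons is conservative, which is the power limitation the abstract says the proposed approach is designed to address.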
Acute respiratory distress syndrome after SARS-CoV-2 infection on young adult population: International observational federated study based on electronic health records through the 4CE consortium.
Purpose
In young adults (18 to 49 years old), investigation of acute respiratory distress syndrome (ARDS) after severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection has been limited. We evaluated the risk factors and outcomes of ARDS following infection with SARS-CoV-2 in a young adult population.
Methods
A retrospective cohort study was conducted between January 1, 2020, and February 28, 2021, using patient-level electronic health records (EHR) across 241 United States hospitals and 43 European hospitals participating in the Consortium for Clinical Characterization of COVID-19 by EHR (4CE). To identify the risk factors associated with ARDS, we compared young patients with and without ARDS through a federated analysis. We further compared the outcomes between young and old patients with ARDS.
Results
Among the 75,377 hospitalized patients with positive SARS-CoV-2 PCR, 1001 young adults presented with ARDS (7.8% of young hospitalized adults). Their mortality rate at 90 days was 16.2%, and their complication rate for infection was similar to that of older adults with ARDS. Peptic ulcer disease, paralysis, obesity, congestive heart failure, valvular disease, diabetes, chronic pulmonary disease, and liver disease were associated with a higher risk of ARDS. We described a high prevalence of obesity (53%), hypertension (38%, although not significantly associated with ARDS), and diabetes (32%).
Conclusion
Through an innovative method, a large international cohort of young adults developing ARDS after SARS-CoV-2 infection has been gathered. It demonstrated the poor outcomes of this population and the associated risk factors
Proportion and associated risk ratio of complication classes for the young compared to old adult with ARDS.
Name, city, country, number of hospitals per HS, number of beds and inpatient discharges/year per HS.
- …