
54 research outputs found

ssROC: Semi-Supervised ROC Analysis for Reliable and Streamlined Evaluation of Phenotyping Algorithms

Author: Bonzel Clara-Lea
Gao Jianhui
Gronsbell Jessica
Hong Chuan
Varghese Paul
Zakir Karim
Publication date: 16/06/2023

Objective: High-throughput phenotyping will accelerate the use of electronic health records (EHRs) for translational research. A critical roadblock is the extensive medical supervision required for phenotyping algorithm (PA) estimation and evaluation. To address this challenge, numerous weakly-supervised learning methods have been proposed to estimate PAs. However, there is a paucity of methods for reliably evaluating the predictive performance of PAs when a very small proportion of the data is labeled. To fill this gap, we introduce a semi-supervised approach (ssROC) for estimation of the receiver operating characteristic (ROC) parameters of PAs (e.g., sensitivity, specificity).

Materials and Methods: ssROC uses a small labeled dataset to nonparametrically impute missing labels. The imputations are then used for ROC parameter estimation to yield more precise estimates of PA performance relative to classical supervised ROC analysis (supROC) using only labeled data. We evaluated ssROC through in-depth simulation studies and an extensive evaluation of eight PAs from Mass General Brigham.

Results: In both simulated and real data, ssROC produced ROC parameter estimates with significantly lower variance than supROC for a given amount of labeled data. For the eight PAs, our results illustrate that ssROC achieves similar precision to supROC, but with approximately 60% of the amount of labeled data on average.

Discussion: ssROC enables precise evaluation of PA performance to increase trust in observational health research without demanding large volumes of labeled data. ssROC is also easily implementable in open-source R software.

Conclusion: When used in conjunction with weakly-supervised PAs, ssROC facilitates the reliable and streamlined phenotyping necessary for EHR-based research.
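The abstract describes imputing missing labels nonparametrically from a small labeled set and then estimating ROC parameters from the imputations. The sketch below illustrates that general idea only; it is not the authors' implementation. The Gaussian-kernel (Nadaraya-Watson) smoother, the bandwidth, the function names, and the synthetic Beta-distributed scores are all assumptions introduced for illustration.

```python
import numpy as np

def impute_labels(s_lab, y_lab, s_all, bandwidth):
    """Kernel-smoothed estimate of P(Y = 1 | score), evaluated at s_all.

    Assumption: a Gaussian-kernel Nadaraya-Watson smoother stands in for
    the unspecified nonparametric imputation step.
    """
    diffs = (s_all[:, None] - s_lab[None, :]) / bandwidth
    weights = np.exp(-0.5 * diffs ** 2)
    return (weights @ y_lab) / weights.sum(axis=1)

def semi_supervised_roc_point(s_lab, y_lab, s_unlab, cutoff, bandwidth=0.1):
    """TPR and FPR at `cutoff`, computed from imputed labels on all scores."""
    s_all = np.concatenate([s_lab, s_unlab])
    m = impute_labels(s_lab, y_lab, s_all, bandwidth)
    flagged = s_all >= cutoff
    tpr = np.sum(m * flagged) / np.sum(m)
    fpr = np.sum((1.0 - m) * flagged) / np.sum(1.0 - m)
    return tpr, fpr

def supervised_roc_point(s_lab, y_lab, cutoff):
    """Classical TPR and FPR at `cutoff`, using labeled data only."""
    flagged = s_lab >= cutoff
    tpr = np.sum(y_lab * flagged) / np.sum(y_lab)
    fpr = np.sum((1 - y_lab) * flagged) / np.sum(1 - y_lab)
    return tpr, fpr

# Synthetic data (hypothetical): scores in [0, 1], higher for true cases.
rng = np.random.default_rng(0)

def simulate(n):
    y = rng.binomial(1, 0.3, n)
    s = np.where(y == 1, rng.beta(4, 2, n), rng.beta(2, 4, n))
    return s, y

s_lab, y_lab = simulate(200)     # small labeled set
s_unlab, _ = simulate(5000)      # large unlabeled set, labels discarded

tpr_ss, fpr_ss = semi_supervised_roc_point(s_lab, y_lab, s_unlab, cutoff=0.5)
tpr_sup, fpr_sup = supervised_roc_point(s_lab, y_lab, cutoff=0.5)
print(f"semi-supervised: TPR={tpr_ss:.3f} FPR={fpr_ss:.3f}")
print(f"supervised:      TPR={tpr_sup:.3f} FPR={fpr_sup:.3f}")
```

Repeating the simulation over many seeds and comparing the spread of the two estimators is one way to examine the variance reduction the abstract reports.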

arXiv.org e-Print Archive

TAXN: Translate Align Extract Normalize, a multilingual extraction tool for clinical texts

Author: Arias Camila
Birot Olivier
Bonzel Clara-Lea
Cai Tianxi
Coulet Adrien
Han Larry
Huynh Kim, Tam
Lerner Ivan
Neuraz Antoine
Publication venue: HAL CCSD
Publication date: 08/07/2023

Several studies have shown that about 80% of the medical information in an electronic health record is only available through unstructured data. Resources such as medical terminologies are limited in languages other than English, which constrains NLP tools. We propose to leverage English-based resources in other languages using a combination of translation, word alignment, entity extraction and term normalization (TAXN). We implement this extraction pipeline in an open-source library called "medkit". We demonstrate the value of this approach through a specific use case: enriching a phenotypic dictionary for post-acute sequelae of COVID-19 (PASC). TAXN proved efficient at proposing new synonyms of UMLS terms using a corpus of 70 articles in French, with 356 terms enriched with at least one validated new synonym. This study was based on freely available deep-learning models.

INRIA a CCSD electronic archive server

Hospitalizations Associated With Mental Health Conditions Among Adolescents in the US and France During the COVID-19 Pandemic

Author: Aronow Bruce J.
Bhatnagar Surbhi
Bonzel Clara-Lea
Cai Tianxi
Devkota Bastal
Gutiérrez-Sacristán Alba
Hanauer David A.
Hutch Meghan R.
Luo Yuan
Moal Bertrand
Mohseni Ahooyi Taha
Njoroge Wanjiku F. M.
Serret-Larmande Arnaud
Sáez Silvestre Carlos
Will Loh Ne Hooi
Publication venue: American Medical Association
Publication date: 13/12/2022

Importance: The COVID-19 pandemic has been associated with an increase in mental health diagnoses among adolescents, though the extent of the increase, particularly for severe cases requiring hospitalization, has not been well characterized. Large-scale federated informatics approaches provide the ability to efficiently and securely query health care data sets to assess and monitor hospitalization patterns for mental health conditions among adolescents.

Objective: To estimate changes in the proportion of hospitalizations associated with mental health conditions among adolescents following onset of the COVID-19 pandemic.

Design, Setting, and Participants: This retrospective, multisite cohort study of adolescents 11 to 17 years of age who were hospitalized with at least 1 mental health condition diagnosis between February 1, 2019, and April 30, 2021, used patient-level data from electronic health records of 8 children's hospitals in the US and France.

Main Outcomes and Measures: Change in the monthly proportion of mental health condition-associated hospitalizations between the prepandemic (February 1, 2019, to March 31, 2020) and pandemic (April 1, 2020, to April 30, 2021) periods using interrupted time series analysis.

Results: There were 9696 adolescents hospitalized with a mental health condition during the prepandemic period (5966 [61.5%] female) and 11,101 during the pandemic period (7603 [68.5%] female). The mean (SD) age in the prepandemic cohort was 14.6 (1.9) years and in the pandemic cohort, 14.7 (1.8) years. The most prevalent diagnoses during the pandemic were anxiety (6066 [57.4%]), depression (5065 [48.0%]), and suicidality or self-injury (4673 [44.2%]). There was an increase in the proportions of monthly hospitalizations during the pandemic for anxiety (0.55%; 95% CI, 0.26%-0.84%), depression (0.50%; 95% CI, 0.19%-0.79%), and suicidality or self-injury (0.38%; 95% CI, 0.08%-0.68%). There was an estimated 0.60% increase (95% CI, 0.31%-0.89%) overall in the monthly proportion of mental health-associated hospitalizations following onset of the pandemic compared with the prepandemic period.

Conclusions and Relevance: In this cohort study, onset of the COVID-19 pandemic was associated with increased hospitalizations with mental health diagnoses among adolescents. These findings support the need for greater resources within children's hospitals to care for adolescents with mental health conditions during the pandemic and beyond.

Ms Hutch is supported by grant NLM 5T32LM012203-05 from the National Library of Medicine. Dr Aronow is supported by U24 HL148865 from the National Heart, Lung, and Blood Institute (NHLBI), NIH. Dr Cai is supported by R01 HL089778 from the NHLBI, NIH. Dr Hanauer is supported by UL1TR002240 from the National Center for Advancing Translational Sciences (NCATS), NIH. Dr Luo is supported by U01TR003528 from the NCATS, NIH, and 1R01LM013337 from the National Library of Medicine. Dr Sanchez-Pinto is supported by R01HD105939 from the National Institute of Child Health and Human Development, NIH. Dr South is supported by K23HL148394 and L40HL148910 from the NHLBI, NIH, and UL1TR001420 from the NCATS, NIH. Dr Visweswaran is supported by UL1TR001857 from the NCATS, NIH. Dr Xia is supported by R01NS098023 and R01NS124882 from the National Institute of Neurological Disorders and Stroke, NIH.

Gutiérrez-Sacristán, A.; Serret-Larmande, A.; Hutch, MR.; Sáez Silvestre, C.; Aronow, BJ.; Bhatnagar, S.; Bonzel, C.... (2022). Hospitalizations Associated With Mental Health Conditions Among Adolescents in the US and France During the COVID-19 Pandemic. JAMA Network Open. 5(12):1-12. https://doi.org/10.1001/jamanetworkopen.2022.46548
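The study above compares prepandemic and pandemic periods using interrupted time series analysis. As a rough illustration of that technique, the sketch below fits a basic segmented regression by ordinary least squares. It is not the authors' model: the model form, the function name `fit_its`, and the noise-free synthetic series are assumptions. Only the 14 prepandemic and 13 pandemic months come from the dates in the abstract.

```python
import numpy as np

def fit_its(y, n_pre):
    """Segmented regression: y = b0 + b1*t + b2*post + b3*t_since + error.

    `post` marks months at or after the interruption; `t_since` counts
    months elapsed since the interruption (0 before it).
    """
    y = np.asarray(y, dtype=float)
    t = np.arange(len(y), dtype=float)
    post = (t >= n_pre).astype(float)
    t_since = np.where(post == 1.0, t - n_pre, 0.0)
    X = np.column_stack([np.ones_like(t), t, post, t_since])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    names = ["intercept", "pre_slope", "level_change", "slope_change"]
    return dict(zip(names, coef))

# Hypothetical monthly proportions (percent), NOT study data:
# 14 prepandemic months followed by 13 pandemic months.
t = np.arange(27)
y = 30.0 + 0.1 * t + 2.0 * (t >= 14)

fit = fit_its(y, n_pre=14)
for name, value in fit.items():
    print(f"{name}: {value:.3f}")
```

A real analysis would also need standard errors and some handling of autocorrelation and seasonality, which this sketch omits.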

RiuNet


Improved Appropriateness of Advanced Diagnostic Imaging After Implementation of Clinical Decision Support Mechanism

Author: Anderson Dan
Bonzel Clara-Lea
Cai Tianxi
Chepelev Leonid L
Gold Benjamin
Lindaman Jared
Mahoney Mary C
Mitsouras Dimitrios
Mogel Greg
Rybicki Jr Frank
Rybicki Frank J
Sheikh Adnan
Uyeda Jennifer W
Wang Xuan
Publication venue: eScholarship, University of California
Publication date: 01/04/2021

The Protecting Access to Medicare Act (PAMA) mandates clinical decision support mechanism (CDSM) consultation for all advanced imaging. There are a growing number of studies examining the association of CDSM use with imaging appropriateness, but a paucity of multicenter data. This observational study evaluates the association between changes in advanced imaging appropriateness scores and increasing provider exposure to CDSM. Each provider's first 200 consecutive anonymized requisitions for advanced imaging (CT, MRI, ultrasound, nuclear medicine) using a single CDSM (CareSelect, Change Healthcare) between January 1, 2017 and December 31, 2019 were collected from 288 US institutions. Changes in imaging requisition proportions among four appropriateness categories ("usually appropriate" [green], "may be appropriate" [yellow], "usually not appropriate" [red], and unmapped [gray]) were evaluated in relation to the chronological order of the requisition for each provider and total provider exposure to CDSM using logistic regression fits and Wald tests. The number of providers and requisitions included was 244,158 and 7,345,437, respectively. For 10,123 providers with ≥ 200 requisitions (2,024,600 total requisitions), the fraction of green, yellow, and red requisitions among the last 10 requisitions changed by +3.0% (95% confidence interval +2.6% to +3.4%), -0.8% (95% CI -0.5% to -1.1%), and -3.0% (95% CI -3.3% to -2.7%) in comparison with the first 10, respectively. Providers with > 190 requisitions had 8.5% (95% CI 6.3% to 10.7%) more green requisitions, 2.3% (0.7% to 3.9%) fewer yellow requisitions, and 0.5% (95% CI -1.0% to 2.0%) fewer red (not statistically significant) requisitions relative to providers with ≤ 10 requisitions. Increasing provider exposure to CDSM is associated with improved appropriateness scores for advanced imaging requisitions.

eScholarship - University of California

Generate Analysis-Ready Data for Real-world Evidence: Tutorial for Harnessing Electronic Health Records With Advanced Informatic Technologies

Author: Brett K Beaulieu-Jones
Chuan Hong
Clara-Lea Bonzel
Griffin M Weber
Jessica Gronsbell
Jue Hou
Jun Wen
Kai-Li Liaw
Katherine Liao
Qingyi Zeng
Rachel Zhao
Shuyan Sabrina Wan
Sinian Zhang
Thomas Jemielita
Tianrun Cai
Tianxi Cai
Vidul Ayakulangara Panickan
Yucong Lin
Publication venue: JMIR Publications
Publication date: 01/05/2023

Although randomized controlled trials (RCTs) are the gold standard for establishing the efficacy and safety of a medical treatment, real-world evidence (RWE) generated from real-world data has been vital in postapproval monitoring and is being promoted for the regulatory process of experimental therapies. An emerging source of real-world data is electronic health records (EHRs), which contain detailed information on patient care in both structured (eg, diagnosis codes) and unstructured (eg, clinical notes and images) forms. Despite the granularity of the data available in EHRs, the critical variables required to reliably assess the relationship between a treatment and clinical outcome are challenging to extract. To address this fundamental challenge and accelerate the reliable use of EHRs for RWE, we introduce an integrated data curation and modeling pipeline consisting of 4 modules that leverage recent advances in natural language processing, computational phenotyping, and causal modeling techniques with noisy data. Module 1 consists of techniques for data harmonization. We use natural language processing to recognize clinical variables from RCT design documents and map the extracted variables to EHR features with description matching and knowledge networks. Module 2 then develops techniques for cohort construction using advanced phenotyping algorithms to both identify patients with diseases of interest and define the treatment arms. Module 3 introduces methods for variable curation, including a list of existing tools to extract baseline variables from different sources (eg, codified, free text, and medical imaging) and end points of various types (eg, death, binary, temporal, and numerical). Finally, module 4 presents validation and robust modeling methods, and we propose a strategy to create gold-standard labels for EHR variables of interest to validate data curation quality and perform subsequent causal modeling for RWE. 
In addition to the workflow proposed in our pipeline, we also develop a reporting guideline for RWE that covers the necessary information to facilitate transparent reporting and reproducibility of results. Moreover, our pipeline is highly data driven, enhancing study data with a rich variety of publicly available information and knowledge sources. We also showcase our pipeline and provide guidance on the deployment of relevant tools by revisiting the emulation of the Clinical Outcomes of Surgical Therapy Study Group Trial on laparoscopy-assisted colectomy versus open colectomy in patients with early-stage colon cancer. We also draw on existing literature on EHR emulation of RCTs together with our own studies with the Mass General Brigham EHR.

Directory of Open Access Journals

Heterogeneous associations between interleukin-6 receptor variants and phenotypes across ancestries and implications for therapy.

Author: Chuan Hong
Clara-Lea Bonzel
Harrison Zhang
Isabelle-Emmanuella Nogues
J. Michael Gaziano
Jing Cui
Katherine P. Liao
Kelly Cho
Kumar Dahal
Lauren Costa
Molei Liu
Seoyoung C. Kim
Tianxi Cai
Tony Chen
VA Million Veteran Program
Xin Xiong
Xuan Wang
Yin Xia
Yuk-Lam Ho
Publication venue: eScholarship, University of California
Publication date: 01/04/2024

The Phenome-Wide Association Study (PheWAS) is increasingly used to broadly screen for potential treatment effects, e.g., IL6R variant as a proxy for IL6R antagonists. This approach offers an opportunity to address the limited power in clinical trials to study differential treatment effects across patient subgroups. However, limited methods exist to efficiently test for differences across subgroups in the thousands of multiple comparisons generated as part of a PheWAS. In this study, we developed an approach that maximizes the power to test for heterogeneous genotype-phenotype associations and applied this approach to an IL6R PheWAS among individuals of African (AFR) and European (EUR) ancestries. We identified 29 traits with differences in IL6R variant-phenotype associations, including a lower risk of type 2 diabetes in AFR (OR 0.96) vs EUR (OR 1.0, p-value for heterogeneity = 8.5 × 10^-3), and higher white blood cell count (p-value for heterogeneity = 8.5 × 10^-131). These data suggest a more salutary effect of IL6R blockade for T2D among individuals of AFR vs EUR ancestry and provide data to inform ongoing clinical trials targeting IL6 for an expanding number of conditions. Moreover, the method to test for heterogeneity of associations can be applied broadly to other large-scale genotype-phenotype screens in diverse populations.
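To make the idea of a "p-value for heterogeneity" concrete, the sketch below runs a standard two-group Wald test on the difference between two log odds ratios. This is a textbook test, not the power-maximizing approach the abstract describes, and the odds ratios and standard errors in the example are hypothetical placeholders rather than values from the study.

```python
import math

def heterogeneity_z_test(log_or_a, se_a, log_or_b, se_b):
    """Two-sided Wald test that two independent log odds ratios are equal."""
    z = (log_or_a - log_or_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    # Two-sided p-value under a standard normal reference distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p

# Hypothetical inputs (not study data): OR 0.90 in group A, OR 1.00 in
# group B, each with an assumed standard error of 0.02 on the log scale.
z, p = heterogeneity_z_test(math.log(0.90), 0.02, math.log(1.00), 0.02)
print(f"z = {z:.2f}, p = {p:.2g}")
```

In a PheWAS setting a test like this would be repeated across thousands of traits, so the resulting p-values would need a multiple-comparison correction.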

Directory of Open Access Journals

eScholarship - University of California

Acute respiratory distress syndrome after SARS-CoV-2 infection on young adult population: International observational federated study based on electronic health records through the 4CE consortium.

Purpose: In young adults (18 to 49 years old), investigation of the acute respiratory distress syndrome (ARDS) after severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection has been limited. We evaluated the risk factors and outcomes of ARDS following infection with SARS-CoV-2 in a young adult population.

Methods: A retrospective cohort study was conducted between January 1st, 2020 and February 28th, 2021 using patient-level electronic health records (EHR), across 241 United States hospitals and 43 European hospitals participating in the Consortium for Clinical Characterization of COVID-19 by EHR (4CE). To identify the risk factors associated with ARDS, we compared young patients with and without ARDS through a federated analysis. We further compared the outcomes between young and old patients with ARDS.

Results: Among the 75,377 hospitalized patients with positive SARS-CoV-2 PCR, 1001 young adults presented with ARDS (7.8% of young hospitalized adults). Their mortality rate at 90 days was 16.2%, and their complication rate for infection was similar to that of older adults with ARDS. Peptic ulcer disease, paralysis, obesity, congestive heart failure, valvular disease, diabetes, chronic pulmonary disease and liver disease were associated with a higher risk of ARDS. We described a high prevalence of obesity (53%), hypertension (38%, although not significantly associated with ARDS), and diabetes (32%).

Conclusion: Through an innovative method, a large international cohort of young adults developing ARDS after SARS-CoV-2 infection was gathered. The study demonstrated the poor outcomes of this population and the associated risk factors.

Directory of Open Access Journals

INRIA a CCSD electronic archive server

PubMed Central

eScholarship - University of California

Oskar Bordeaux