27 research outputs found
Contextual Variation of Clinical Notes induced by EHR Migration
The structure and semantics of clinical notes vary considerably across different Electronic Health Record (EHR) systems, sites, and institutions. Such heterogeneity hampers the portability of natural language processing (NLP) models in extracting information from the text for clinical research or practice. In this study, we evaluate the contextual variation of clinical notes by measuring the semantic and syntactic similarity of the notes of two sets of physicians comprising four medical specialties across EHR migrations at two Mayo Clinic sites. We find significant semantic and syntactic variation imposed by the context of the EHR system and between medical specialties, whereas only minor variation is caused by variation of spatial context across sites. Our findings suggest that clinical language models need to account for process differences at the specialty sublanguage level to be generalizable.
Digital Solutions Observed in Clinical Trials: A Formative Feasibility Scoping Review
Growing digital access accelerates digital transformation of clinical trials where digital solutions (DSs) are increasingly and widely leveraged for improving trial efficiency, effectiveness, and accessibility. Many factors impact DS success including technology barriers, privacy concerns, or user engagement activities. It is unclear how those factors are considered or reported in the literature. Here, we perform a formative feasibility scoping review to identify gaps impacting DS quality and reproducibility in trials. Articles containing digital terms published in English from 2009 to 2022 were collected (n=4,167). 130 articles published between 2016 and 2022 were randomly selected for full-text review. Eligible articles (n=100) were sorted into four identified categories: 16% Education, 59% Intervention, 8% Patient, 17% Treatment. Initial findings about DS trends and reporting practices inform protocol development for a large-scale study urging the generation of fundamental knowledge on reporting standardization, best practice guidelines, and evaluation methodologies related to DS for clinical trials
Considerations for Quality Control Monitoring of Machine Learning Models in Clinical Practice
Integrating machine learning (ML) models into clinical practice presents a challenge of maintaining their efficacy over time. While existing literature offers valuable strategies for detecting declining model performance, there is a need to document the broader challenges and solutions associated with the real-world development and integration of model monitoring solutions. This work details the development and use of a platform for monitoring the performance of a production-level ML model operating in Mayo Clinic. In this paper, we aimed to provide a series of considerations and guidelines necessary for integrating such a platform into a team's technical infrastructure and workflow. We have documented our experiences with this integration process, discussed the broader challenges encountered with real-world implementation and maintenance, and included the source code for the platform. Our monitoring platform was built as an R shiny application, developed and implemented over the course of 6 months. The platform has been used and maintained for 2 years and is still in use as of July 2023. The considerations necessary for the implementation of the monitoring platform center around 4 pillars: feasibility (what resources can be used for platform development?); design (through what statistics or models will the model be monitored, and how will these results be efficiently displayed to the end user?); implementation (how will this platform be built, and where will it exist within the IT ecosystem?); and policy (based on monitoring feedback, when and what actions will be taken to fix problems, and how will these problems be translated to clinical staff?). While much of the literature surrounding ML performance monitoring emphasizes methodological approaches for capturing changes in performance, there remains a battery of other challenges and considerations that must be addressed for successful real-world implementation.
Dementia Prediction in Older Adults Using Sex-Specific Health Trajectory Clustering
With an increasing number of people living with dementia, the problem of late diagnosis significantly impacts a person's quality of life, while early signs of dementia may provide useful insights to facilitate better treatment plans. With time, this progressive neurodegenerative syndrome could progress from mild cognitive impairment to dementia. A pattern of health conditions can be characterized in an unsupervised manner to help predict this progress. As a significant extension to our previous work with a streaming clustering model, we consider additional information for predicting dementia onset. With empirical observations, we discover the importance of examining sex and age to predict dementia onset. To this end, we propose a sex-specific model with an age constraint for predicting dementia onset and validate the effectiveness of our models using data from the Mayo Clinic Study of Aging (MCSA). The proposed sex-specific models for older adult populations (>=65 years of age) outperformed the previous models with F-scores of 77% and 78% for male-specific and female-specific models, respectively. Our experiments of sex-specific temporal clustering of features in older adults demonstrate the potential of more personalized models for early alerts of dementia.
Robotic proctectomy for rectal cancer: analysis of 71 patients from a single institution
BACKGROUND: Despite increasing use of robotic surgery for rectal cancer, few series have been published from the practice of generalizable US surgeons.
METHODS: A retrospective chart review was performed for 71 consecutive patients who underwent robotic low anterior resection (LAR) or abdominoperineal resection (APR) for rectal adenocarcinoma between 2010 and 2014.
RESULTS: 46 LARs (65%) and 25 APRs (35%) were identified. Median procedure time was 219 minutes (IQR 184–275) and mean blood loss 164.9 cc (SD 155.9 cc). Radial margin was negative in 70/71 (99%) patients. Total mesorectal excision integrity was complete/near complete in 38/39 (97%) of graded specimens. A mean of 16.8 (SD ± 8.9) lymph nodes were retrieved. At median follow‐up of 21.9 months, there were no local recurrences.
CONCLUSIONS: Robotic proctectomy for rectal cancer was introduced into typical colorectal surgery practice by a single surgeon, with a low conversion rate, low complication rate, and satisfactory oncologic outcomes.
Peer Reviewed
https://deepblue.lib.umich.edu/bitstream/2027.42/139933/1/rcs1841_am.pdf
https://deepblue.lib.umich.edu/bitstream/2027.42/139933/2/rcs1841.pd
Characterizing Performance Gaps of a Code-Based Dementia Algorithm in a Population-Based Cohort of Cognitive Aging
BACKGROUND: Multiple algorithms with variable performance have been developed to identify dementia using combinations of billing codes and medication data that are widely available from electronic health records (EHR). If the characteristics of misclassified patients are clearly identified, modifying existing algorithms to improve performance may be possible.
OBJECTIVE: To examine the performance of a code-based algorithm to identify dementia cases in the population-based Mayo Clinic Study of Aging (MCSA) where dementia diagnosis (i.e., reference standard) is actively assessed through routine follow-up and describe the characteristics of persons incorrectly categorized.
METHODS: There were 5,316 participants (age at baseline (mean (SD)): 73.3 (9.68) years; 50.7% male) without dementia at baseline and available EHR data. ICD-9/10 codes and prescription medications for dementia were extracted between baseline and one year after an MCSA dementia diagnosis or last follow-up. Fisher's exact or Kruskal-Wallis tests were used to compare characteristics between groups.
RESULTS: Algorithm sensitivity and specificity were 0.70 (95% CI: 0.67, 0.74) and 0.95 (95% CI: 0.95, 0.96). False positives (i.e., participants falsely diagnosed with dementia by the algorithm) were older, with higher Charlson comorbidity index, more likely to have mild cognitive impairment (MCI), and longer follow-up (versus true negatives). False negatives (versus true positives) were older, more likely to have MCI, or have more functional limitations.
CONCLUSIONS: We observed a moderate-high performance of the code-based diagnosis method against the population-based MCSA reference standard dementia diagnosis. Older participants and those with MCI at baseline were more likely to be misclassified.
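As a reading aid for the sensitivity and specificity figures reported above, the following minimal Python sketch shows how the two measures are computed from confusion-matrix counts. The function name and the counts are hypothetical and chosen only for illustration; they are not the study's data, and the confidence intervals reported in the abstract are not reproduced here.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    tp/fn: reference-standard dementia cases the algorithm flagged / missed.
    tn/fp: reference-standard non-cases the algorithm cleared / flagged.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one case and one non-case")
    sensitivity = tp / (tp + fn)  # share of true cases that were detected
    specificity = tn / (tn + fp)  # share of non-cases correctly cleared
    return sensitivity, specificity


# Hypothetical counts for illustration only (not taken from the study).
sens, spec = sensitivity_specificity(tp=70, fn=30, tn=950, fp=50)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

In this framing, the false negatives described in the abstract are the `fn` cases that lower sensitivity, and the false positives are the `fp` cases that lower specificity.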
Automatic Uncovering of Patient Primary Concerns in Portal Messages Using a Fusion Framework of Pretrained Language Models
OBJECTIVES: The surge in patient portal messages (PPMs), with increasing needs and workloads for efficient PPM triage in healthcare settings, has spurred the exploration of AI-driven solutions to streamline healthcare workflow processes and ensure timely responses that satisfy patients' healthcare needs. However, there has been less focus on isolating and understanding patient primary concerns in PPMs, a practice that holds the potential to yield more nuanced insights and enhance the quality of healthcare delivery and patient-centered care.
MATERIALS AND METHODS: We propose a fusion framework to leverage pretrained language models (LMs) with different language advantages via a Convolutional Neural Network for precise identification of patient primary concerns via multi-class classification. We examined 3 traditional machine learning models, 9 BERT-based language models, 6 fusion models, and 2 ensemble models.
RESULTS: The outcomes of our experimentation underscore the superior performance achieved by BERT-based models in comparison to traditional machine learning models. Remarkably, our fusion model emerges as the top-performing solution, delivering a notably improved accuracy score of 77.67 ± 2.74% and an F1 score of 74.37 ± 3.70% in macro-average.
DISCUSSION: This study highlights the feasibility and effectiveness of multi-class classification for patient primary concern detection and the proposed fusion framework for enhancing primary concern detection.
CONCLUSIONS: The use of multi-class classification enhanced by a fusion of multiple pretrained LMs not only improves the accuracy and efficiency of patient primary concern identification in PPMs but also aids in managing the rising volume of PPMs in healthcare, ensuring critical patient communications are addressed promptly and accurately.
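The abstract above reports an F1 score "in macro-average" for a multi-class task. As a minimal sketch of what that metric means, the Python function below computes per-class F1 and averages it with equal weight per class. The function name and the label values are hypothetical, used only for illustration; the authors' actual evaluation code and class set are not described in the abstract.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores over all observed labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical concern labels for illustration only.
truth = ["medication", "medication", "appointment", "symptom"]
preds = ["medication", "appointment", "appointment", "symptom"]
print(round(macro_f1(truth, preds), 4))
```

Because every class contributes equally, macro-averaged F1 penalizes poor performance on rare concern categories more than accuracy does, which is one plausible reason for reporting both.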
Acquisition of a Lexicon for Family History Information: Bidirectional Encoder Representations From Transformers-Assisted Sublanguage Analysis
BACKGROUND: A patient's family history (FH) information significantly influences downstream clinical care. Despite this importance, there is no standardized method to capture FH information in electronic health records and a substantial portion of FH information is frequently embedded in clinical notes. This renders FH information difficult to use in downstream data analytics or clinical decision support applications. To address this issue, a natural language processing system capable of extracting and normalizing FH information can be used.
OBJECTIVE: In this study, we aimed to construct an FH lexical resource for information extraction and normalization.
METHODS: We exploited a transformer-based method to construct an FH lexical resource leveraging a corpus consisting of clinical notes generated as part of primary care. The usability of the lexicon was demonstrated through the development of a rule-based FH system that extracts FH entities and relations as specified in previous FH challenges. We also experimented with a deep learning-based FH system for FH information extraction. Previous FH challenge data sets were used for evaluation.
RESULTS: The resulting lexicon contains 33,603 lexicon entries normalized to 6,408 concept unique identifiers of the Unified Medical Language System and 15,126 codes of the Systematized Nomenclature of Medicine Clinical Terms, with an average of 5.4 variants per concept. The performance evaluation demonstrated that the rule-based FH system achieved reasonable performance. Combining the rule-based FH system with a state-of-the-art deep learning-based FH system can improve the recall of FH information evaluated using the BioCreative/N2C2 FH challenge data set, with F1 scores that varied but remained comparable.
CONCLUSIONS: The resulting lexicon and rule-based FH system are freely available through the Open Health Natural Language Processing GitHub.