56 research outputs found
Harnessing Uncertainty in Radiotherapy Auto-Segmentation Quality Assurance
One of the key contributions of this study is the reappropriation of standard DL outputs as a quality indicator to identify cases that clinicians should review further. The authors achieve this by applying an empirically derived threshold to the softmax output of their DL network, computing the mean of the thresholded score map (termed the HiS metric), and correlating it with standard geometric quality indices. When juxtaposed with mean entropy, a commonly used measure of model output uncertainty, HiS consistently demonstrated a stronger correlation with the geometric indices, suggesting its superior ability to stratify cases needing additional review. We applaud the authors' efforts for their novel contributions and would like to note some potential caveats that could pave the way for future research directions.
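The thresholded-mean idea described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the score map is a flat list of per-voxel foreground probabilities, that "thresholded" means keeping only values at or above the threshold, and that entropy is the binary entropy averaged over all voxels. The function names, the 0.5 threshold, and the example values are hypothetical.

```python
import math


def his_metric(probs, threshold=0.5):
    """Mean of the probabilities that survive thresholding (assumed reading of HiS)."""
    kept = [p for p in probs if p >= threshold]
    # No voxel passes the threshold: report 0.0 rather than dividing by zero.
    return sum(kept) / len(kept) if kept else 0.0


def mean_entropy(probs, eps=1e-12):
    """Average per-voxel binary entropy (in nats) of the foreground probability."""
    total = 0.0
    count = 0
    for p in probs:
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= p * math.log(p) + (1.0 - p) * math.log(1.0 - p)
        count += 1
    return total / count if count else 0.0


# Hypothetical score maps: one confident prediction, one borderline prediction.
confident = [0.99, 0.98, 0.01, 0.02]
borderline = [0.60, 0.55, 0.45, 0.40]

print(his_metric(confident), mean_entropy(confident))
print(his_metric(borderline), mean_entropy(borderline))
```

Under these assumptions, a confident prediction yields a higher thresholded mean and a lower mean entropy than a borderline one, which is the kind of signal that could be correlated with geometric indices to flag cases for review.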
Evolving Horizons in Radiation Therapy Auto-Contouring: Distilling Insights, Embracing Data-Centric Frameworks, and Moving Beyond Geometric Quantification
Historically, clinician-derived contouring of tumors and healthy tissues has been crucial for radiation therapy (RT) planning. In recent years, advances in artificial intelligence (AI), predominantly in deep learning (DL), have rapidly improved automated contouring for RT applications, particularly for routine organs-at-risk.1, 2, 3 Despite research efforts actively promoting its broader acceptance, clinical adoption of auto-contouring is not yet standard practice. Notably, within several AI communities, there has been growing enthusiasm to shift from conventional “model-centric” AI approaches (ie, improving a model while keeping the data fixed), to “data-centric” AI approaches (ie, improving the data while keeping a model fixed).4 Although balancing both approaches is typically ideal for crafting the optimal solution for specific-use cases, most research in RT auto-contouring has prioritized algorithmic modifications aimed at enhancing quantitative contouring performance based on geometric (ie, structural overlap) indices5—a clear testament to the “model-centric” AI paradigm. In this editorial, aimed at clinician end-users and multidisciplinary research teams, we harmonize key insights in contemporary RT auto-contouring algorithmic development to promote the adoption of data-centric AI frameworks for impactful future research directions that would further facilitate clinical acceptance. Of note, the discussion herein draws primarily from literature related to head and neck cancer (HNC), showcasing it as a representative example of a complex disease site. However, these insights apply broadly to auto-contouring across disease sites
Large Language Models to Identify Social Determinants of Health in Electronic Health Records
Social determinants of health (SDoH) have an important impact on patient
outcomes but are incompletely collected from the electronic health records
(EHR). This study researched the ability of large language models to extract
SDoH from free text in EHRs, where they are most commonly documented, and
explored the role of synthetic clinical text for improving the extraction of
these scarcely documented, yet extremely valuable, clinical data. 800 patient
notes were annotated for SDoH categories, and several transformer-based models
were evaluated. The study also experimented with synthetic data generation and
assessed for algorithmic bias. Our best-performing models were fine-tuned
Flan-T5 XL (macro-F1 0.71) for any SDoH, and Flan-T5 XXL (macro-F1 0.70). The
benefit of augmenting fine-tuning with synthetic data varied across model
architecture and size, with smaller Flan-T5 models (base and large) showing the
greatest improvements in performance (delta F1 +0.12 to +0.23). Model
performance was similar on the in-hospital system dataset but worse on the
MIMIC-III dataset. Our best-performing fine-tuned models outperformed zero- and
few-shot performance of ChatGPT-family models for both tasks. These fine-tuned
models were less likely than ChatGPT to change their prediction when
race/ethnicity and gender descriptors were added to the text, suggesting less
algorithmic bias (p<0.05). At the patient level, our models identified 93.8% of
patients with adverse SDoH, while ICD-10 codes captured 2.0%. Our method can
effectively extract SDoH information from clinic notes, performing better
than GPT zero- and few-shot settings. These models could enhance
real-world evidence on SDoH and aid in identifying patients needing social
support.
Comment: 38 pages, 5 figures, 5 tables in main, submitted for review
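For readers unfamiliar with the macro-F1 scores quoted in this abstract, the following minimal sketch shows how the metric is computed: F1 is calculated per class and then averaged with equal weight per class. It assumes a simplified single-label setting, which may differ from the study's actual task formulation. The label names and example data are hypothetical.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0.0 when the class is never hit.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical gold and predicted labels for four sentences.
gold = ["housing", "none", "employment", "none"]
pred = ["housing", "none", "none", "none"]

print(macro_f1(gold, pred))
```

Because every class counts equally, a rarely documented category that the model misses pulls the score down as much as a common one, which is why macro-averaging suits sparsely documented labels.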
The impact of responding to patient messages with large language model assistance
Documentation burden is a major contributor to clinician burnout, which is
rising nationally and is an urgent threat to our ability to care for patients.
Artificial intelligence (AI) chatbots, such as ChatGPT, could reduce clinician
burden by assisting with documentation. Although many hospitals are actively
integrating such systems into electronic medical record systems, AI chatbots'
utility and impact on clinical decision-making have not been studied for this
intended use. We are the first to examine the utility of large language models
in assisting clinicians in drafting responses to patient questions. In our two-stage
cross-sectional study, 6 oncologists responded to 100 realistic synthetic
cancer patient scenarios and portal messages developed to reflect common
medical situations, first manually, then with AI assistance.
We find that AI-assisted responses were longer and less readable, but provided
acceptable drafts without edits 58% of the time. AI assistance improved efficiency
77% of the time, with low harm risk (82% safe). However, 7.7% of unedited AI responses
could cause severe harm. In 31% of cases, physicians thought AI drafts were
human-written. AI assistance led to more patient education recommendations and
fewer clinical actions than manual responses. Results show promise for AI to
improve clinician efficiency and patient care through assisting documentation,
if used judiciously. Monitoring model outputs and human-AI interaction remains
crucial for safe implementation.
Comment: 4 figures and tables in main, submitted for review
Tumor Angiogenesis Phenotyping by Nanoparticle-facilitated Magnetic Resonance and Near-infrared Fluorescence Molecular Imaging
One of the challenges of tailored antiangiogenic therapy is the ability to adequately monitor the angiogenic activity of a malignancy in response to treatment. The αvβ3 integrin, highly overexpressed on newly formed tumor vessels, has been successfully used as a target for Arg-Gly-Asp (RGD)-functionalized nanoparticle contrast agents. In the present study, an RGD-functionalized nanocarrier was used to image ongoing angiogenesis in two different xenograft tumor models with varying intensities of angiogenesis (LS174T > EW7). To that end, iron oxide nanocrystals were included in the core of the nanoparticles to provide contrast for T2*-weighted magnetic resonance imaging (MRI), whereas the fluorophore Cy7 was attached to the surface to enable near-infrared fluorescence (NIRF) imaging. The mouse tumor models were used to test the potential of the nanoparticle probe in combination with dual modality imaging for in vivo detection of tumor angiogenesis. Pre-contrast and post-contrast images (4 hours) were acquired on a 9.4-T MRI system and revealed significant differences in the nanoparticle accumulation patterns between the two tumor models. In the case of the highly vascularized LS174T tumors, the accumulation was more confined to the periphery of the tumors, where angiogenesis is predominantly occurring. NIRF imaging revealed significant differences in accumulation kinetics between the models. In conclusion, this technology can serve as an in vivo biomarker for antiangiogenesis treatment and angiogenesis phenotyping.
Heavy element production in a compact object merger observed by JWST
The mergers of binary compact objects such as neutron stars and black holes are of central interest to several areas of astrophysics, including as the progenitors of gamma-ray bursts (GRBs) 1, sources of high-frequency gravitational waves (GWs) 2 and likely production sites for heavy-element nucleosynthesis by means of rapid neutron capture (the r-process) 3. Here we present observations of the exceptionally bright GRB 230307A. We show that GRB 230307A belongs to the class of long-duration GRBs associated with compact object mergers 4–6 and contains a kilonova similar to AT2017gfo, associated with the GW merger GW170817 (refs. 7–12). We obtained James Webb Space Telescope (JWST) mid-infrared imaging and spectroscopy 29 and 61 days after the burst. The spectroscopy shows an emission line at 2.15 microns, which we interpret as tellurium (atomic mass A = 130) and a very red source, emitting most of its light in the mid-infrared owing to the production of lanthanides. These observations demonstrate that nucleosynthesis in GRBs can create r-process elements across a broad atomic mass range and play a central role in heavy-element nucleosynthesis across the Universe
Artificial intelligence for clinical oncology
Clinical oncology is experiencing rapid growth in data that are collected to enhance cancer care. With recent advances in the field of Artificial Intelligence (AI), there is now a computational basis to integrate and synthesize this growing body of multi-dimensional data, deduce patterns, and predict outcomes to improve shared patient and clinician decision-making. While there is high potential, significant challenges remain. In this perspective, we propose a pathway of clinical, cancer care touchpoints for narrow-task AI applications and review a selection of applications. We describe the challenges faced in the clinical translation of AI and propose solutions. We also suggest paths forward in weaving AI into individualized patient care, with an emphasis on clinical validity, utility, and usability. By illuminating these issues in the context of current AI applications for clinical oncology, we hope to help advance meaningful investigations that will ultimately translate to real-world clinical use
SegmentationReview: A Slicer3D extension for fast review of AI-generated segmentations
SegmentationReview is a package developed in Python for fast review and editing of biomedical image segmentations. Biomedical imaging segmentation quality assessment is a crucial part of the development of medical artificial intelligence (AI) algorithms but is time-consuming and labor-intensive. SegmentationReview has several components that facilitate efficient segmentation review, including automated importing of lists of images and segmentations into Slicer3D, a user-friendly graphical user interface for reviewing and assessing the quality of the segmentation, and automated tabular data-saving. The package has been tested and released as an open-source extension for Slicer3D. It enables fast, user-friendly review and editing for biomedical image segmentations.
Randomized clinical trials of machine learning interventions in health care: a systematic review
Importance: Despite the potential of machine learning to improve multiple aspects of patient care, barriers to clinical adoption remain. Randomized clinical trials (RCTs) are often a prerequisite to large-scale clinical adoption of an intervention, and important questions remain regarding how machine learning interventions are being incorporated into clinical trials in health care. Objective: To systematically examine the design, reporting standards, risk of bias, and inclusivity of RCTs for medical machine learning interventions. Evidence Review: In this systematic review, the Cochrane Library, Google Scholar, Ovid Embase, Ovid MEDLINE, PubMed, Scopus, and Web of Science Core Collection online databases were searched and citation chasing was done to find relevant articles published from the inception of each database to October 15, 2021. Search terms for machine learning, clinical decision-making, and RCTs were used. Exclusion criteria included implementation of a non-RCT design, absence of original data, and evaluation of nonclinical interventions. Data were extracted from published articles. Trial characteristics, including primary intervention, demographics, adherence to the CONSORT-AI reporting guideline, and Cochrane risk of bias were analyzed. Findings: Literature search yielded 19737 articles, of which 41 RCTs involved a median of 294 participants (range, 17-2488 participants). A total of 16 RCTs (39%) were published in 2021, 21 (51%) were conducted at single sites, and 15 (37%) involved endoscopy. No trials adhered to all CONSORT-AI standards. Common reasons for nonadherence were not assessing poor-quality or unavailable input data (38 trials [93%]), not analyzing performance errors (38 [93%]), and not including a statement regarding code or algorithm availability (37 [90%]). Overall risk of bias was high in 7 trials (17%).
Of 11 trials (27%) that reported race and ethnicity data, the median proportion of participants from underrepresented minority groups was 21% (range, 0%-51%). Conclusions and Relevance: This systematic review found that despite the large number of medical machine learning-based algorithms in development, few RCTs for these technologies have been conducted. Among published RCTs, there was high variability in adherence to reporting standards and risk of bias and a lack of participants from underrepresented minority groups. These findings merit attention and should be considered in future RCT design and reporting.
This study was supported by grants K23-DK125718 (Dr Shung) and K08-DE030216 (Dr Kann) from the National Institutes of Health, grant T32GM007753 from the National Institute of General Medical Sciences (Ms Plana), and grant F30-CA260780 from the National Cancer Institute (Ms Plana).