Text2Cohort: Facilitating Intuitive Access to Biomedical Data with Natural Language Cohort Discovery
The Imaging Data Commons (IDC) is a cloud-based database that provides
researchers with open access to cancer imaging data, with the goal of
facilitating collaboration. However, cohort discovery within the IDC database
has a significant technical learning curve. Recently, large language models
(LLM) have demonstrated exceptional utility for natural language processing
tasks. We developed Text2Cohort, an LLM-powered toolkit to facilitate
user-friendly natural language cohort discovery in the IDC. Our method
translates user input into IDC queries using grounding techniques and returns
the query's response. We evaluate Text2Cohort on 50 natural language inputs,
from information extraction to cohort discovery. Our toolkit successfully
generated responses with an 88% accuracy and 0.94 F1 score. We demonstrate that
Text2Cohort can enable researchers to discover and curate cohorts on IDC with
high levels of accuracy using natural language in a more intuitive and
user-friendly way.
Comment: 5 pages, 3 figures, 2 tables
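The grounding step can be pictured with a small sketch: embed schema context in the prompt, ask the LLM for a query, and run the result. The schema snippet, table name, and `llm_complete` callable below are all invented for illustration and are not the actual Text2Cohort or IDC API.

```python
# Hypothetical sketch of grounded natural-language-to-query translation.
# The table/column names and the llm_complete interface are assumptions.

SCHEMA_CONTEXT = (
    "Table dicom_metadata columns: PatientID, Modality, "
    "BodyPartExamined, collection_id"
)

def build_grounded_prompt(user_request: str) -> str:
    """Combine schema grounding with the user's natural language request."""
    return (
        f"{SCHEMA_CONTEXT}\n"
        f"Write a SQL query answering: {user_request}\n"
        "Return only the SQL."
    )

def text_to_cohort(user_request: str, llm_complete) -> str:
    """llm_complete is any callable LLM interface (an assumption here)."""
    return llm_complete(build_grounded_prompt(user_request))

# Example with a stub standing in for a real LLM:
stub = lambda prompt: (
    "SELECT DISTINCT PatientID FROM dicom_metadata WHERE Modality = 'MR'"
)
print(text_to_cohort("Find all patients with MRI scans", stub))
```

Grounding the prompt in the schema is what lets a general-purpose LLM emit queries that match the database's actual columns.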
SegViz: A federated-learning based framework for multi-organ segmentation on heterogeneous data sets with partial annotations
Segmentation is one of the most fundamental tasks in deep learning for medical
imaging, owing to its many downstream clinical applications. However,
generating manual annotations for medical images is time-consuming, requires
high skill, and is expensive, especially for 3D images. One potential
solution is to aggregate knowledge from partially annotated datasets from
multiple groups to collaboratively train global models using Federated
Learning. To this end, we propose SegViz, a federated learning-based framework
to train a segmentation model from distributed non-i.i.d. datasets with partial
annotations. The performance of SegViz was compared against training individual
models separately on each dataset as well as centrally aggregating all the
datasets in one place and training a single model. The SegViz framework using
FedBN as the aggregation strategy demonstrated excellent performance on the
external BTCV set with dice scores of 0.93, 0.83, 0.55, and 0.75 for
segmentation of liver, spleen, pancreas, and kidneys, respectively,
significantly better (except for the spleen) than the baseline models' dice
scores of 0.87, 0.83, 0.42, and 0.48. In contrast, the central aggregation
model performed significantly worse on the test dataset, with dice scores of
0.65, 0, 0.55, and 0.68. Our results demonstrate the
potential of the SegViz framework to train multi-task models from distributed
datasets with partial labels. All our implementations are open-source and
available at https://anonymous.4open.science/r/SegViz-B74
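The FedBN aggregation strategy mentioned above can be sketched as follows: average all client weights except batch-norm parameters, which each client keeps local. The parameter-naming convention (keys containing "bn") is an assumption of this sketch, not SegViz's actual code.

```python
import numpy as np

# Minimal sketch of FedBN-style aggregation: federated averaging over
# shared (non-batch-norm) parameters, with batch-norm parameters kept
# local per client. Identifying BN layers by "bn" in the key is an
# illustrative assumption.

def fedbn_aggregate(client_states):
    """Return per-client states after one FedBN aggregation round.

    client_states: list of dicts mapping parameter name -> np.ndarray.
    """
    shared_keys = [k for k in client_states[0] if "bn" not in k]
    # Federated average over the shared (non-BN) parameters.
    avg = {
        k: np.mean([s[k] for s in client_states], axis=0)
        for k in shared_keys
    }
    # Each client receives the averaged shared weights but keeps its
    # own batch-norm parameters, preserving per-site statistics.
    return [{**state, **avg} for state in client_states]

clients = [
    {"conv.weight": np.ones((2, 2)), "bn.weight": np.full((2,), 0.1)},
    {"conv.weight": np.full((2, 2), 3.0), "bn.weight": np.full((2,), 0.9)},
]
new_states = fedbn_aggregate(clients)
# conv.weight is averaged across clients; bn.weight stays per-client
```

Keeping batch-norm local is what lets FedBN tolerate the non-i.i.d. intensity distributions of data from different sites.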
A framework for dynamically training and adapting deep reinforcement learning models to different, low-compute, and continuously changing radiology deployment environments
While Deep Reinforcement Learning has been widely researched in medical
imaging, the training and deployment of these models usually require powerful
GPUs. Because imaging environments evolve rapidly and data can be generated on
edge devices, these algorithms must continually learn and adapt to changing
environments while running on low-compute devices. To this end, we developed
three image coreset algorithms to compress and denoise medical images for
selective experience replay-based lifelong reinforcement learning. We
implemented neighborhood averaging coreset, neighborhood sensitivity-based
sampling coreset, and maximum entropy coreset on full-body DIXON water and
DIXON fat MRI images. All three coresets produced 27x compression with
excellent performance in localizing five anatomical landmarks: left knee, right
trochanter, left kidney, spleen, and lung across both imaging environments.
The maximum entropy coreset achieved the lowest average distance error,
outperforming the conventional lifelong learning framework.
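As a rough illustration of the neighborhood-averaging idea, mean-pooling non-overlapping 3x3x3 blocks of a 3D volume yields exactly the 27x compression reported above; the paper's actual coreset algorithms may differ in their details.

```python
import numpy as np

# Illustrative sketch only: replace each non-overlapping 3x3x3
# neighborhood of a 3D volume with its mean, compressing (and mildly
# denoising) the volume by a factor of 3^3 = 27.

def neighborhood_average(volume: np.ndarray, k: int = 3) -> np.ndarray:
    """Downsample a 3D volume by averaging non-overlapping k^3 blocks."""
    d, h, w = (s // k for s in volume.shape)
    trimmed = volume[: d * k, : h * k, : w * k]   # drop ragged edges
    blocks = trimmed.reshape(d, k, h, k, w, k)
    return blocks.mean(axis=(1, 3, 5))

vol = np.arange(6 * 6 * 6, dtype=float).reshape(6, 6, 6)
small = neighborhood_average(vol)
print(small.shape)             # → (2, 2, 2)
print(vol.size // small.size)  # → 27
```

Averaging rather than subsampling keeps each retained voxel informed by its whole neighborhood, which is the denoising half of the compress-and-denoise goal.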
Multi-environment lifelong deep reinforcement learning for medical imaging
Deep reinforcement learning (DRL) is increasingly being explored in medical
imaging. However, the environments for medical imaging tasks are constantly
evolving in terms of imaging orientations, imaging sequences, and pathologies.
To that end, we developed a Lifelong DRL framework, SERIL to continually learn
new tasks in changing imaging environments without catastrophic forgetting.
SERIL was developed using selective experience replay based lifelong learning
technique for the localization of five anatomical landmarks in brain MRI on a
sequence of twenty-four different imaging environments. The performance of
SERIL was compared against two baseline setups: MERT (multi-environment,
best case) and SERT (single-environment, worst case). SERIL achieved a low
average distance, in pixels, from the desired landmark across all 120 tasks
relative to both MERT and SERT, demonstrating strong potential for
continuously learning multiple tasks across dynamically changing imaging
environments.
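Selective experience replay, the mechanism SERIL builds on, can be sketched as a priority-filtered buffer of past-environment transitions mixed into each new training batch; the priority threshold, selection rule, and buffer size here are illustrative assumptions, not SERIL's exact design.

```python
import random

# Sketch of selective experience replay for lifelong RL: retain only
# high-priority transitions from earlier environments and rehearse
# them alongside new experience to resist catastrophic forgetting.

class SelectiveReplayBuffer:
    def __init__(self, capacity: int = 100):
        self.capacity = capacity
        self.buffer = []

    def maybe_store(self, transition, priority: float) -> None:
        # "Selective": keep only high-priority transitions, e.g. large
        # reward or TD error (the priority metric is an assumption).
        if priority > 0.5:
            self.buffer.append(transition)
            self.buffer = self.buffer[-self.capacity:]

    def mixed_batch(self, new_transitions, n_replay: int = 2):
        # Mix a few replayed old-environment transitions into the batch.
        replay = random.sample(self.buffer, min(n_replay, len(self.buffer)))
        return list(new_transitions) + replay

buf = SelectiveReplayBuffer()
for i in range(10):
    buf.maybe_store(("env0", i), priority=i / 10)  # keeps i = 6..9
batch = buf.mixed_batch([("env1", 0), ("env1", 1)])
```

Because only a selected subset of old experience is stored, memory stays bounded even as the number of environments grows.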
ISLE: An Intelligent Streaming Framework for High-Throughput AI Inference in Medical Imaging
As the adoption of Artificial Intelligence (AI) systems within the clinical
environment grows, limitations in bandwidth and compute can create
communication bottlenecks when streaming imaging data, leading to delays in
patient care and increased cost. As a result, healthcare providers and AI
vendors require ever-greater computational infrastructure, dramatically
increasing costs. To that end, we developed ISLE, an intelligent streaming
framework for high-throughput, compute- and bandwidth-optimized, and
cost-effective AI inference for clinical decision making at scale. In our
experiments, ISLE on average reduced data transmission by 98.02% and decoding
time by 98.09%, while increasing throughput by 2,730%. We show that ISLE
results in faster turnaround times and a reduced overall cost of data
transmission and compute, without negatively impacting clinical decision
making using AI systems.
Comment: 5 pages, 3 figures, 3 tables
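The abstract does not describe ISLE's internals, so the following is only a generic illustration of the kind of bandwidth accounting behind such savings: stream a reduced representation for inference instead of the full study and compare bytes on the wire. The function names and the naive stride-based reduction are invented for this sketch.

```python
import numpy as np

# Illustration only (not ISLE's actual mechanism): quantify the
# transmission savings from streaming a reduced representation of an
# imaging volume rather than the full-resolution study.

def bytes_on_wire(array: np.ndarray) -> int:
    return array.nbytes

def downsample(volume: np.ndarray, factor: int = 4) -> np.ndarray:
    """Naive stride-based reduction standing in for a smarter codec."""
    return volume[::factor, ::factor, ::factor]

full = np.zeros((64, 64, 64), dtype=np.int16)
reduced = downsample(full)
savings = 1 - bytes_on_wire(reduced) / bytes_on_wire(full)
print(f"transmission reduced by {savings:.2%}")  # → 98.44%
```

Reducing bytes sent also shrinks decoding work on the receiving end, which is why transmission and decoding savings tend to move together.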