5 research outputs found
Med-Flamingo: a Multimodal Medical Few-shot Learner
Medicine, by its nature, is a multifaceted domain that requires the synthesis
of information across various modalities. Medical generative vision-language
models (VLMs) make a first step in this direction and promise many exciting
clinical applications. However, existing models typically have to be fine-tuned
on sizeable downstream datasets, which poses a significant limitation because in
many medical applications data is scarce, necessitating models that are capable
of learning from few examples in real time. Here we propose Med-Flamingo, a
multimodal few-shot learner adapted to the medical domain. Based on
OpenFlamingo-9B, we continue pre-training on paired and interleaved medical
image-text data from publications and textbooks. Med-Flamingo unlocks few-shot
generative medical visual question answering (VQA) abilities, which we evaluate
on several datasets including a novel challenging open-ended VQA dataset of
visual USMLE-style problems. Furthermore, we conduct the first human evaluation
for generative medical VQA where physicians review the problems and blinded
generations in an interactive app. Med-Flamingo improves performance in
generative medical VQA by up to 20% in clinician ratings and, for the first
time, enables multimodal medical few-shot adaptations, such as rationale generation. We
release our model, code, and evaluation app under
https://github.com/snap-stanford/med-flamingo.
Comment: Preprint
Almanac: Retrieval-Augmented Language Models for Clinical Medicine
Large language models have recently demonstrated impressive zero-shot
capabilities in a variety of natural language tasks such as summarization,
dialogue generation, and question-answering. Despite many promising
applications in clinical medicine, adoption of these models in real-world
settings has been largely limited by their tendency to generate incorrect and
sometimes even toxic statements. In this study, we develop Almanac, a large
language model framework augmented with retrieval capabilities for medical
guideline and treatment recommendations. Performance on a novel dataset of
clinical scenarios (n = 130) evaluated by a panel of 5 board-certified and
resident physicians demonstrates significant increases in factuality (mean of
18% at p-value < 0.05) across all specialties, with improvements in
completeness and safety. Our results demonstrate the potential for large
language models to be effective tools in the clinical decision-making process,
while also emphasizing the importance of careful testing and deployment to
mitigate their shortcomings.
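
The retrieval-augmented setup described in the Almanac abstract can be illustrated with a minimal sketch. The abstract does not specify the retriever, the index, or the prompt format, so everything here is an assumption for illustration: the token-overlap scorer, the function names (`retrieve`, `build_prompt`), and the placeholder guideline strings are hypothetical and are not taken from the paper.

```python
import re
from collections import Counter

# Placeholder passages standing in for a guideline corpus (not real guidance).
GUIDELINES = [
    "Placeholder guideline text about anticoagulation in atrial fibrillation.",
    "Placeholder guideline text about blood pressure targets in hypertension.",
    "Placeholder guideline text about statin therapy for high cholesterol.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def overlap_score(query, passage):
    """Count tokens shared by query and passage (a stand-in for a real retriever)."""
    q = Counter(tokenize(query))
    p = Counter(tokenize(passage))
    return sum(min(q[t], p[t]) for t in q)


def retrieve(query, passages, k=2):
    """Return the k passages with the highest overlap score."""
    ranked = sorted(passages, key=lambda p: overlap_score(query, p), reverse=True)
    return ranked[:k]


def build_prompt(query, passages, k=2):
    """Prepend the retrieved passages to the question before calling a language model."""
    context = "\n".join(f"- {p}" for p in retrieve(query, passages, k))
    return (
        "Answer using only the sources below.\n"
        f"Sources:\n{context}\n"
        f"Question: {query}"
    )


if __name__ == "__main__":
    print(build_prompt("What are the blood pressure targets in hypertension?", GUIDELINES, k=1))
```

The point of the design is that the model is asked to ground its answer in retrieved text rather than in its parameters alone; a real system would likely replace the overlap scorer with a dense or hybrid retriever.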
A Generalizable Deep Learning System for Cardiac MRI
Cardiac MRI allows for a comprehensive assessment of myocardial structure,
function, and tissue characteristics. Here we describe a foundational vision
system for cardiac MRI, capable of representing the breadth of human
cardiovascular disease and health. Our deep learning model is trained via
self-supervised contrastive learning, by which visual concepts in cine-sequence
cardiac MRI scans are learned from the raw text of the accompanying radiology
reports. We train and evaluate our model on data from four large academic
clinical institutions in the United States. We additionally showcase the
performance of our models on the UK BioBank, and two additional publicly
available external datasets. We explore emergent zero-shot capabilities of our
system and demonstrate remarkable performance across a range of tasks,
including the problem of left ventricular ejection fraction regression and the
diagnosis of 35 different conditions such as cardiac amyloidosis and
hypertrophic cardiomyopathy. We show that our deep learning system is not only
capable of understanding the staggering complexity of human cardiovascular
disease, but can also be directed towards clinical problems of interest, yielding
impressive, clinical-grade diagnostic accuracy with a fraction of the training
data typically required for such tasks.
Comment: 21-page main manuscript, 4 figures. Supplementary Appendix and code
will be made available on publication.
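
The contrastive image-text training mentioned in the cardiac MRI abstract can be sketched as follows. The abstract does not state the exact loss, so this uses one common formulation, a symmetric InfoNCE loss of the kind used in CLIP-style training, as an assumption. The function names and the temperature value are hypothetical, and the toy embeddings stand in for encoder outputs.

```python
import math


def normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def log_softmax_at(row, index):
    """Log-probability of entry `index` under a softmax over `row` (numerically stable)."""
    m = max(row)
    log_sum = m + math.log(sum(math.exp(x - m) for x in row))
    return row[index] - log_sum


def contrastive_loss(image_embs, text_embs, temperature=0.1):
    """Symmetric InfoNCE: pair i of images and reports is the positive, all others negatives."""
    imgs = [normalize(v) for v in image_embs]
    txts = [normalize(v) for v in text_embs]
    n = len(imgs)
    # logits[i][j] is the scaled similarity between scan i and report j.
    logits = [
        [sum(a * b for a, b in zip(img, txt)) / temperature for txt in txts]
        for img in imgs
    ]
    # Image-to-text direction: each row should put its mass on the diagonal.
    img_to_txt = -sum(log_softmax_at(logits[i], i) for i in range(n)) / n
    # Text-to-image direction: the same, taken over columns.
    txt_to_img = -sum(
        log_softmax_at([logits[r][j] for r in range(n)], j) for j in range(n)
    ) / n
    return (img_to_txt + txt_to_img) / 2


if __name__ == "__main__":
    scans = [[1.0, 0.0], [0.0, 1.0]]
    matched_reports = [[1.0, 0.0], [0.0, 1.0]]
    swapped_reports = [[0.0, 1.0], [1.0, 0.0]]
    print(contrastive_loss(scans, matched_reports))
    print(contrastive_loss(scans, swapped_reports))
```

The loss is low when each scan embedding is closest to its own report embedding and high when the pairs are swapped, which is the signal that lets visual concepts be learned from report text without manual labels.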
Predicting Environmental Chemical Toxicity using a New Hybrid Deep Machine Learning Method
Humans are exposed to thousands of potentially toxic chemicals, including environmental chemicals such as industrial wastes, food products, solvents, air pollutants, fertilizers, pesticides, insecticides, carcinogens, drugs, metals/metalloids, and other industrial chemicals. Approximately 300,000 such chemicals are currently in use; unfortunately, little is known about their potential toxicity. Determining the human toxicity potential of chemicals remains a challenge because of the substantial resources required to assess a chemical in vivo, and only a few thousand single chemicals in commercial use have been evaluated. In this study, to predict environmental chemical toxicity, we developed a new hybrid neural network (HNN) deep learning model consisting of a Convolutional Neural Network (CNN) and a multilayer perceptron (MLP)-type feed-forward neural network (FFNN). Our HNN deep learning model, trained on thousands of chemicals, gave the best performance in the majority of cases. Taken together, our HNN deep learning models have wide applicability in predicting the toxicity of any chemical category and its mixtures.
Predicting Dose-Range Chemical Toxicity using Novel Hybrid Deep Machine-Learning Method
Humans are exposed to thousands of chemicals, including environmental chemicals. Unfortunately, little is known about their potential toxicity, as determining the toxicity remains challenging due to the substantial resources required to assess a chemical in vivo. Here, we present a novel hybrid neural network (HNN) deep learning method, called HNN-Tox, to predict chemical toxicity at different doses. To develop the HNN-Tox method, we combined two neural network frameworks, the Convolutional Neural Network (CNN) and the multilayer perceptron (MLP)-type feed-forward neural network (FFNN). Combining the CNN and FFNN in the field of environmental chemical toxicity prediction is a novel approach. We developed several binary and multiclass classification models to assess dose-range chemical toxicity, trained on thousands of chemicals with known toxicity. The performance of the HNN-Tox was compared with other machine-learning methods, including Random Forest (RF), Bootstrap Aggregation (Bagging), and Adaptive Boosting (AdaBoost). We also analyzed the dependency of model performance on varying features, descriptors, dataset size, route of exposure, and toxic dose. The HNN-Tox model, trained on 59,373 chemicals annotated with known LD50 and routes of exposure, maintained its predictive ability even after the descriptor set was reduced from 318 to 51, with accuracies of 84.9% and 84.1% and areas under the ROC curve (AUC) of 0.89 and 0.88, respectively. Further, we validated the HNN-Tox with several external toxic chemical datasets on a large scale. The HNN-Tox performed optimally or better than the other machine-learning methods for diverse chemicals. This study is the first to report a large-scale prediction of dose-range chemical toxicity with varying features. The HNN-Tox has broad applicability in predicting toxicity for diverse chemicals and could serve as an alternative approach to animal-based toxicity assessment.
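
The idea of combining a CNN branch with an MLP-type feed-forward branch, as in the two toxicity abstracts above, can be sketched as a single forward pass. The abstracts do not give layer sizes, kernel shapes, or how the branches are merged, so this is only one plausible arrangement under stated assumptions: the branches are concatenated before a sigmoid output, the weights are random and untrained, and all names (`hybrid_forward`, `init_params`) are hypothetical. Only the descriptor count of 51 is taken from the abstract.

```python
import math
import random


def relu(values):
    """Element-wise rectified linear unit."""
    return [max(0.0, x) for x in values]


def dense(x, weights, bias):
    """Fully connected layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def conv1d(x, kernel):
    """Valid 1-D convolution (cross-correlation) over the descriptor vector."""
    k = len(kernel)
    return [sum(kernel[j] * x[i + j] for j in range(k)) for i in range(len(x) - k + 1)]


def init_params(n_desc, n_kernels=4, kernel_size=3, n_hidden=8, seed=0):
    """Random, untrained weights; sizes are illustrative assumptions."""
    rng = random.Random(seed)

    def rand(n):
        return [rng.uniform(-0.5, 0.5) for _ in range(n)]

    return {
        "kernels": [rand(kernel_size) for _ in range(n_kernels)],
        "mlp_w": [rand(n_desc) for _ in range(n_hidden)],
        "mlp_b": [0.0] * n_hidden,
        "out_w": rand(n_kernels + n_hidden),
        "out_b": 0.0,
    }


def hybrid_forward(descriptors, params):
    """Return a toxicity probability from concatenated CNN and MLP features."""
    # CNN branch: convolve, apply ReLU, then global max pooling per kernel.
    cnn_features = [max(relu(conv1d(descriptors, kern))) for kern in params["kernels"]]
    # MLP branch: one hidden dense layer on the raw descriptors.
    mlp_features = relu(dense(descriptors, params["mlp_w"], params["mlp_b"]))
    # Merge the two branches (assumed: simple concatenation) and classify.
    merged = cnn_features + mlp_features
    logit = dense(merged, [params["out_w"]], [params["out_b"]])[0]
    return 1.0 / (1.0 + math.exp(-logit))


if __name__ == "__main__":
    params = init_params(n_desc=51)
    print(hybrid_forward([0.1] * 51, params))
```

With untrained weights the output carries no meaning; the sketch only shows how convolutional features over the descriptor vector and dense features from the same vector could feed one binary classifier.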