87 research outputs found
CheXpert: A Large Chest Radiograph Dataset with Uncertainty Labels and Expert Comparison
Large, labeled datasets have driven deep learning methods to achieve
expert-level performance on a variety of medical imaging tasks. We present
CheXpert, a large dataset that contains 224,316 chest radiographs of 65,240
patients. We design a labeler to automatically detect the presence of 14
observations in radiology reports, capturing uncertainties inherent in
radiograph interpretation. We investigate different approaches to using the
uncertainty labels for training convolutional neural networks that output the
probability of these observations given the available frontal and lateral
radiographs. On a validation set of 200 chest radiographic studies which were
manually annotated by 3 board-certified radiologists, we find that different
uncertainty approaches are useful for different pathologies. We then evaluate
our best model on a test set composed of 500 chest radiographic studies
annotated by a consensus of 5 board-certified radiologists, and compare the
performance of our model to that of 3 additional radiologists in the detection
of 5 selected pathologies. On Cardiomegaly, Edema, and Pleural Effusion, the
model ROC and PR curves lie above all 3 radiologist operating points. We
release the dataset to the public as a standard benchmark to evaluate
performance of chest radiograph interpretation models.
The dataset is freely available at
https://stanfordmlgroup.github.io/competitions/chexpert.
Comment: Published in AAAI 201
Style-Aware Radiology Report Generation with RadGraph and Few-Shot Prompting
Automatically generated reports from medical images promise to improve the
workflow of radiologists. Existing methods consider an image-to-report modeling
task by directly generating a fully-fledged report from an image. However, this
conflates the content of the report (e.g., findings and their attributes) with
its style (e.g., format and choice of words), which can lead to clinically
inaccurate reports. To address this, we propose a two-step approach for
radiology report generation. First, we extract the content from an image; then,
we verbalize the extracted content into a report that matches the style of a
specific radiologist. For this, we leverage RadGraph -- a graph representation
of reports -- together with large language models (LLMs). In our quantitative
evaluations, we find that our approach leads to beneficial performance. Our
human evaluation with clinical raters highlights that the AI-generated reports
are indistinguishably tailored to the style of an individual radiologist despite
leveraging only a few examples as context.
Comment: Accepted to Findings of EMNLP 202
Exploring the Boundaries of GPT-4 in Radiology
The recent success of general-domain large language models (LLMs) has
significantly changed the natural language processing paradigm towards a
unified foundation model across domains and applications. In this paper, we
focus on assessing the performance of GPT-4, the most capable LLM so far, on
the text-based applications for radiology reports, comparing against
state-of-the-art (SOTA) radiology-specific models. Exploring various prompting
strategies, we evaluated GPT-4 on a diverse range of common radiology tasks and
we found GPT-4 either outperforms or is on par with current SOTA radiology
models. With zero-shot prompting, GPT-4 already obtains substantial gains
(≈10% absolute improvement) over radiology models in temporal sentence
similarity classification (accuracy) and natural language inference (F1).
For tasks that require learning dataset-specific style or schema (e.g. findings
summarisation), GPT-4 improves with example-based prompting and matches
supervised SOTA. Our extensive error analysis with a board-certified
radiologist shows GPT-4 has a sufficient level of radiology knowledge with only
occasional errors in complex context that require nuanced domain knowledge. For
findings summarisation, GPT-4 outputs are found to be overall comparable with
existing manually-written impressions.
Comment: EMNLP 2023 mai
Clinical practice: The bleeding child. Part II: Disorders of secondary hemostasis and fibrinolysis
Bleeding complications in children may be caused by disorders of secondary hemostasis or fibrinolysis. Characteristic features in the medical history and physical examination, especially of hemophilia, are palpable deep hematomas, bleeding in joints and muscles, and recurrent bleeding. A detailed medical and family history combined with a thorough physical examination is essential to distinguish abnormal from normal bleeding and to decide whether diagnostic laboratory evaluation is necessary. Initial laboratory tests include prothrombin time and activated partial thromboplastin time. Knowledge of the classical coagulation cascade, with its intrinsic, extrinsic, and common pathways, helps identify potential coagulation defects and thereby decide which additional coagulation tests should be performed.
Objective assessment of stored blood quality by deep learning
Stored red blood cells (RBCs) are needed for life-saving blood transfusions, but they undergo continuous degradation. RBC storage lesions are often assessed by microscopic examination or biochemical and biophysical assays, which are complex, time-consuming, and destructive to fragile cells. Here we demonstrate the use of label-free imaging flow cytometry and deep learning to characterize RBC lesions. Using brightfield images, a trained neural network achieved 76.7% agreement with experts in classifying seven clinically relevant RBC morphologies associated with storage lesions, comparable to 82.5% agreement between different experts. Given that human observation and classification may not optimally discern RBC quality, we went further and eliminated subjective human annotation in the training step by training a weakly supervised neural network using only storage duration times. The feature space extracted by this network revealed a chronological progression of morphological changes that better predicted blood quality, as measured by physiological hemolytic assay readouts, than the conventional expert-assessed morphology classification system. With further training and clinical testing across multiple sites, protocols, and instruments, deep learning and label-free imaging flow cytometry might be used to routinely and objectively assess RBC storage lesions. This would automate a complex protocol, minimize laboratory sample handling and preparation, and reduce the impact of procedural errors and discrepancies between facilities and blood donors. The chronology-based machine-learning approach may also improve upon humans' assessment of morphological changes in other biomedically important progressions, such as differentiation and metastasis.
Germline mutations in ETV6 are associated with thrombocytopenia, red cell macrocytosis and predisposition to lymphoblastic leukemia
Some familial platelet disorders are associated with predisposition to leukemia, myelodysplastic syndrome (MDS) or dyserythropoietic anemia. We identified a family with autosomal dominant thrombocytopenia, high erythrocyte mean corpuscular volume (MCV) and two occurrences of B cell-precursor acute lymphoblastic leukemia (ALL). Whole-exome sequencing identified a heterozygous single-nucleotide change in ETV6 (ets variant 6), c.641C>T, encoding a p.Pro214Leu substitution in the central domain, segregating with thrombocytopenia and elevated MCV. A screen of 23 families with similar phenotypes identified 2 with ETV6 mutations. One family also had a mutation encoding p.Pro214Leu and one individual with ALL. The other family had a c.1252A>G transition producing a p.Arg418Gly substitution in the DNA-binding domain, with alternative splicing and exon skipping. Functional characterization of these mutations showed aberrant cellular localization of mutant and endogenous ETV6, decreased transcriptional repression and altered megakaryocyte maturation. Our findings underscore a key role for ETV6 in platelet formation and leukemia predisposition.