Data efficient deep learning for medical image analysis: A survey
The rapid evolution of deep learning has significantly advanced the field of
medical image analysis. However, despite these achievements, the further
enhancement of deep learning models for medical image analysis faces a
significant challenge due to the scarcity of large, well-annotated datasets. To
address this issue, recent years have witnessed a growing emphasis on the
development of data-efficient deep learning methods. This paper conducts a
thorough review of data-efficient deep learning methods for medical image
analysis. To this end, we categorize these methods based on the level of
supervision they rely on, encompassing categories such as no supervision,
inexact supervision, incomplete supervision, inaccurate supervision, and only
limited supervision. We further divide these categories into finer
subcategories. For example, we categorize inexact supervision into multiple
instance learning and learning with weak annotations. Similarly, we categorize
incomplete supervision into semi-supervised learning, active learning, and
domain-adaptive learning, among others. Furthermore, we systematically summarize
commonly used datasets for data-efficient deep learning in medical image
analysis and investigate future research directions to conclude this survey.
Comment: Under Review
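To make the inexact-supervision category concrete: in multiple instance learning (MIL), only a bag-level label, e.g. one diagnosis per whole-slide image, supervises a set of instance features. The following is a minimal, illustrative PyTorch sketch of attention-based MIL pooling; the module, dimensions, and data below are hypothetical and not drawn from any specific surveyed method.

```python
import torch
import torch.nn as nn

class AttentionMILPooling(nn.Module):
    """Attention-based pooling over a bag of instance features (illustrative)."""

    def __init__(self, feat_dim: int, hidden_dim: int = 128):
        super().__init__()
        # Scores each instance; softmax turns scores into attention weights.
        self.attn = nn.Sequential(
            nn.Linear(feat_dim, hidden_dim),
            nn.Tanh(),
            nn.Linear(hidden_dim, 1),
        )
        self.classifier = nn.Linear(feat_dim, 1)  # bag-level head

    def forward(self, bag: torch.Tensor) -> torch.Tensor:
        # bag: (num_instances, feat_dim), e.g. patch embeddings from one scan
        weights = torch.softmax(self.attn(bag), dim=0)  # (num_instances, 1)
        bag_embedding = (weights * bag).sum(dim=0)      # (feat_dim,)
        return self.classifier(bag_embedding)           # bag-level logit

# Usage: 50 hypothetical patch features, one bag-level label per scan.
bag = torch.randn(50, 512)
logit = AttentionMILPooling(512)(bag)
```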
MedKLIP: Medical Knowledge Enhanced Language-Image Pre-Training in Radiology
In this paper, we consider enhancing medical visual-language pre-training
(VLP) with domain-specific knowledge, by exploiting the paired image-text
reports from the radiological daily practice. In particular, we make the
following contributions: First, unlike existing works that directly process the
raw reports, we adopt a novel triplet extraction module to extract
medically relevant information, avoiding unnecessary complexity from language
grammar and enhancing the supervision signals; Second, we propose a novel
triplet encoding module with entity translation by querying a knowledge base,
to exploit the rich domain knowledge in the medical field, and implicitly build
relationships between medical entities in the language embedding space; Third,
we propose to use a Transformer-based fusion model for spatially aligning the
entity descriptions with visual signals at the image patch level, supporting
medical diagnosis; Fourth, we conduct thorough experiments to validate the
effectiveness of our architecture and benchmark it on numerous public datasets,
e.g., ChestX-ray14, RSNA Pneumonia, SIIM-ACR Pneumothorax, COVIDx CXR-2, COVID
Rural, and EdemaSeverity. In both zero-shot and fine-tuning settings, our model
demonstrates strong performance compared with prior methods on disease
classification and grounding.
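The fusion step can be pictured as cross-attention in which entity queries attend over image-patch keys and values. The sketch below is a schematic illustration of that spatial-alignment idea only, not the authors' released code; the class name, dimensions, and prediction head are assumptions.

```python
import torch
import torch.nn as nn

class EntityPatchFusion(nn.Module):
    """Cross-attention: entity queries attend over image patches (illustrative)."""

    def __init__(self, dim: int = 256, num_heads: int = 8):
        super().__init__()
        self.cross_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.head = nn.Linear(dim, 1)  # per-entity presence logit

    def forward(self, entity_emb: torch.Tensor, patch_emb: torch.Tensor):
        # entity_emb: (B, num_entities, dim) encoded entity descriptions
        # patch_emb:  (B, num_patches, dim)  visual features of image patches
        fused, attn = self.cross_attn(entity_emb, patch_emb, patch_emb)
        # attn: (B, num_entities, num_patches) spatially grounds each entity
        return self.head(fused).squeeze(-1), attn

logits, attn = EntityPatchFusion()(torch.randn(2, 10, 256), torch.randn(2, 196, 256))
```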
Deep learning for unsupervised domain adaptation in medical imaging: Recent advancements and future perspectives
Deep learning has demonstrated remarkable performance across various tasks in
medical imaging. However, these approaches primarily focus on supervised
learning, assuming that the training and testing data are drawn from the same
distribution. Unfortunately, this assumption may not always hold in practice.
To address this issue, unsupervised domain adaptation (UDA)
techniques have been developed to transfer knowledge from a labeled domain to a
related but unlabeled domain. In recent years, significant advancements have
been made in UDA, resulting in a wide range of methodologies, including feature
alignment, image translation, self-supervision, and disentangled representation
methods, among others. In this paper, we provide a comprehensive literature
review of recent deep UDA approaches in medical imaging from a technical
perspective. Specifically, we categorize current UDA research in medical
imaging into six groups and further divide them into finer subcategories based
on the different tasks they perform. We also discuss the respective datasets
used in the studies to assess the divergence between the different domains.
Finally, we discuss emerging areas and provide insights and discussions on
future research directions to conclude this survey.
Comment: Under Review
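One widely used instance of the feature-alignment family is adversarial alignment with a gradient reversal layer (DANN-style): a domain classifier is trained through a layer that flips gradients, which pushes the shared encoder toward domain-invariant features. A minimal PyTorch sketch, with all names and shapes illustrative rather than taken from any surveyed method:

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; reverses and scales gradients on backward."""

    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None

def grad_reverse(x, lambd=1.0):
    return GradReverse.apply(x, lambd)

# A domain classifier trained through the reversal layer penalizes features
# that reveal the domain, nudging the shared encoder toward invariance.
domain_head = nn.Sequential(nn.Linear(256, 64), nn.ReLU(), nn.Linear(64, 2))
features = torch.randn(8, 256, requires_grad=True)  # from a shared encoder
domain_logits = domain_head(grad_reverse(features))
```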
Recent Progress in Transformer-based Medical Image Analysis
The transformer originated in the field of natural language processing.
Recently, it has been adopted in the computer vision (CV) field, where it shows
great promise. Medical image analysis (MIA), as a critical branch of CV,
also greatly benefits from this state-of-the-art technique. In this review, we
first recap the attention mechanism, the core component of the transformer,
and its detailed structure. After that, we depict the recent
progress of the transformer in the field of MIA. We organize the applications
in a sequence of different tasks, including classification, segmentation,
captioning, registration, detection, enhancement, localization, and synthesis.
The mainstream classification and segmentation tasks are further divided into
eleven medical image modalities. The many experiments surveyed in this review
illustrate that transformer-based methods outperform existing approaches across
multiple evaluation metrics. Finally, we
discuss the open challenges and future opportunities in this field. This
task-modality review with the latest contents, detailed information, and
comprehensive comparison may greatly benefit the broad MIA community.
Comment: Accepted by Computers in Biology and Medicine
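For reference, the attention mechanism recapped in such reviews is scaled dot-product attention, Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V. A minimal PyTorch rendering:

```python
import math
import torch

def scaled_dot_product_attention(q, k, v):
    # q, k, v: (batch, seq_len, d_k); each query mixes values by similarity to keys
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
    weights = torch.softmax(scores, dim=-1)  # one distribution per query
    return weights @ v

q = k = v = torch.randn(2, 16, 64)
out = scaled_dot_product_attention(q, k, v)  # (2, 16, 64)
```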
Knowledge-enhanced Visual-Language Pre-training on Chest Radiology Images
While multi-modal foundation models pre-trained on large-scale data have been
successful in natural language understanding and vision recognition, their use
in medical domains is still limited due to the fine-grained nature of medical
tasks and the high demand for domain knowledge. To address this challenge, we
propose a novel approach called Knowledge-enhanced Auto Diagnosis (KAD) which
leverages existing medical domain knowledge to guide vision-language
pre-training using paired chest X-rays and radiology reports. We evaluate KAD
on four external X-ray datasets and demonstrate that its zero-shot
performance is not only comparable to that of fully-supervised models, but also
superior to the average of three expert radiologists for three (out of five)
pathologies with statistical significance. Moreover, when few-shot annotation
is available, KAD outperforms all existing approaches in fine-tuning settings,
demonstrating its potential for application in different clinical scenarios.
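Zero-shot evaluation in this setting is typically a similarity search: the image embedding is scored against encoded pathology descriptions, and the highest-scoring pathologies are predicted. A generic sketch of that scoring step (not KAD's actual inference code; all dimensions are illustrative):

```python
import torch
import torch.nn.functional as F

def zero_shot_scores(image_emb, pathology_embs):
    # image_emb: (dim,) from the image encoder
    # pathology_embs: (num_pathologies, dim) from the text/knowledge encoder
    image_emb = F.normalize(image_emb, dim=-1)
    pathology_embs = F.normalize(pathology_embs, dim=-1)
    return pathology_embs @ image_emb  # cosine similarity per pathology

scores = zero_shot_scores(torch.randn(512), torch.randn(5, 512))
```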
Self-supervised learning methods and applications in medical imaging analysis: A survey
The scarcity of high-quality annotated medical imaging datasets is a major
problem that hinders machine learning applications in the field of medical
imaging analysis and impedes its advancement. Self-supervised learning
is a recent training paradigm that enables learning robust representations
without the need for human annotation, making it an effective solution to the
scarcity of annotated medical data. This article reviews the
state-of-the-art research directions in self-supervised learning approaches for
image data, with a focus on their applications in the field of medical imaging
analysis. The article covers a set of the most recent self-supervised learning
methods from the computer vision field that are applicable to medical imaging
analysis and categorizes them as predictive, generative, and contrastive
approaches. Moreover, the article covers 40 of the most recent research papers
in the field of self-supervised learning in medical imaging analysis, aiming to
shed light on recent innovations in the field. Finally, the article concludes
with possible future research directions in the field.
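As a concrete anchor for the contrastive category, here is a minimal InfoNCE-style loss between two augmented views of the same batch, in the spirit of SimCLR (a simplified, illustrative sketch, not taken from any surveyed paper):

```python
import torch
import torch.nn.functional as F

def info_nce_loss(z1, z2, temperature=0.1):
    """Contrastive loss between two augmented views of the same batch.
    z1, z2: (batch, dim) projected embeddings; row i of z1 and z2 come from
    the same image, so positives lie on the diagonal of the similarity matrix.
    """
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature        # (batch, batch) cosine similarities
    targets = torch.arange(z1.size(0))        # positive index for each row
    return F.cross_entropy(logits, targets)

loss = info_nce_loss(torch.randn(32, 128), torch.randn(32, 128))
```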
Deep Learning of Unified Region, Edge, and Contour Models for Automated Image Segmentation
Image segmentation is a fundamental and challenging problem in computer
vision with applications spanning multiple areas, such as medical imaging,
remote sensing, and autonomous vehicles. Recently, convolutional neural
networks (CNNs) have gained traction in the design of automated segmentation
pipelines. Although CNN-based models are adept at learning abstract features
from raw image data, their performance is dependent on the availability and
size of suitable training datasets. Additionally, these models are often unable
to capture the details of object boundaries and generalize poorly to unseen
classes. In this thesis, we devise novel methodologies that address these
issues and establish robust representation learning frameworks for
fully-automatic semantic segmentation in medical imaging and mainstream
computer vision. In particular, our contributions include (1) state-of-the-art
2D and 3D image segmentation networks for computer vision and medical image
analysis, (2) an end-to-end trainable image segmentation framework that unifies
CNNs and active contour models with learnable parameters for fast and robust
object delineation, (3) a novel approach for disentangling edge and texture
processing in segmentation networks, and (4) a novel few-shot learning model in
both supervised settings and semi-supervised settings where synergies between
latent and image spaces are leveraged to learn to segment images given limited
training data.
Comment: PhD dissertation, UCLA, 202
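As generic background for the segmentation objectives such work builds on (not the thesis's specific contour formulation), here is a standard soft Dice loss commonly used to train medical segmentation networks:

```python
import torch

def soft_dice_loss(pred, target, eps=1e-6):
    """Soft Dice loss for binary segmentation.
    pred:   (batch, H, W) predicted foreground probabilities in [0, 1]
    target: (batch, H, W) binary ground-truth masks
    """
    inter = (pred * target).sum(dim=(1, 2))
    union = pred.sum(dim=(1, 2)) + target.sum(dim=(1, 2))
    return 1.0 - ((2 * inter + eps) / (union + eps)).mean()

pred = torch.rand(4, 64, 64)                    # e.g. sigmoid outputs
target = (torch.rand(4, 64, 64) > 0.5).float()  # dummy masks
loss = soft_dice_loss(pred, target)
```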