Semi-Supervised Self-Taught Deep Learning for Finger Bones Segmentation
Segmentation stands at the forefront of many high-level vision tasks. In this
study, we focus on segmenting finger bones within a newly introduced
semi-supervised self-taught deep learning framework that consists of a student
network and a stand-alone teacher module. The whole system is boosted in a
life-long learning manner, wherein at each step the teacher module provides a
refinement for the student network to learn from newly added unlabeled data.
Experimental results demonstrate the superiority of the proposed method over
conventional supervised deep learning methods. Comment: Accepted at IEEE BHI 2019.
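A minimal sketch of the self-taught loop described above, assuming a PyTorch setup: the student and teacher objects, the teacher.refine call, and the data loaders are hypothetical placeholders, since the abstract does not specify the teacher module's internals.

import torch
import torch.nn.functional as F

def self_taught_rounds(student, teacher, labeled_loader, unlabeled_loader,
                       optimizer, num_rounds=5, epochs_per_round=10):
    # Alternate between teacher-refined pseudo-labeling and student training,
    # mirroring the life-long refinement loop described in the abstract.
    for _ in range(num_rounds):
        # The stand-alone teacher refines the student's predictions on the
        # newly added unlabeled images into pseudo-labels (refine() is assumed).
        pseudo_pairs = []
        with torch.no_grad():
            for images in unlabeled_loader:
                refined = teacher.refine(images, student(images))
                pseudo_pairs.append((images, refined.argmax(dim=1)))

        # The student then learns from labeled data and the refined pseudo-labels.
        for _ in range(epochs_per_round):
            for (x, y), (xu, yu) in zip(labeled_loader, pseudo_pairs):
                loss = F.cross_entropy(student(x), y) + F.cross_entropy(student(xu), yu)
                optimizer.zero_grad()
                loss.backward()
                optimizer.step()
    return student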
Learning Intra-view and Cross-view Geometric Knowledge for Stereo Matching
Geometric knowledge has been shown to be beneficial for the stereo matching
task. However, prior attempts to integrate geometric insights into stereo
matching algorithms have largely focused on geometric knowledge from single
images, while crucial cross-view factors such as occlusion and matching
uniqueness have been overlooked. To address this gap, we propose a novel
Intra-view and Cross-view Geometric knowledge learning Network (ICGNet),
specifically crafted to assimilate both intra-view and cross-view geometric
knowledge. ICGNet harnesses the power of interest points to serve as a channel
for intra-view geometric understanding. Simultaneously, it employs the
correspondences among these points to capture cross-view geometric
relationships. This dual incorporation empowers the proposed ICGNet to leverage
both intra-view and cross-view geometric knowledge in its learning process,
substantially improving its ability to estimate disparities. Our extensive
experiments demonstrate the superiority of the ICGNet over contemporary leading
models. Comment: Accepted to CVPR 2024.
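As a rough illustration of the dual-supervision idea (not the paper's actual losses), the sketch below combines a disparity loss with an intra-view interest-point distillation term and a cross-view term that ties matched keypoints to the predicted disparity; the keypoint_teacher detector/matcher, the interest_point_head, and the loss weights are assumptions.

import torch
import torch.nn.functional as F

def icg_style_loss(stereo_net, keypoint_teacher, left, right, gt_disparity,
                   w_intra=0.5, w_cross=0.5):
    # stereo_net is assumed to return a disparity map plus per-view feature maps.
    pred_disp, left_feat, right_feat = stereo_net(left, right)

    # Main disparity regression loss.
    disp_loss = F.smooth_l1_loss(pred_disp, gt_disparity)

    # Intra-view: distill an interest-point heatmap into the left-view features.
    with torch.no_grad():
        kp_target = keypoint_teacher.detect(left)           # assumed pretrained detector
    kp_logits = stereo_net.interest_point_head(left_feat)   # assumed auxiliary head
    intra_loss = F.binary_cross_entropy_with_logits(kp_logits, kp_target)

    # Cross-view: matched interest points should agree with the predicted disparity.
    xl, y, xr = keypoint_teacher.match(left, right).unbind(dim=1)  # (N, 3) matches
    pred_at_kp = pred_disp[..., y.long(), xl.long()]
    target_disp = (xl - xr).expand_as(pred_at_kp)
    cross_loss = F.smooth_l1_loss(pred_at_kp, target_disp)

    return disp_loss + w_intra * intra_loss + w_cross * cross_loss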
MS-MT: Multi-Scale Mean Teacher with Contrastive Unpaired Translation for Cross-Modality Vestibular Schwannoma and Cochlea Segmentation
Domain shift has been a long-standing issue for medical image segmentation.
Recently, unsupervised domain adaptation (UDA) methods have achieved promising
cross-modality segmentation performance by distilling knowledge from a
label-rich source domain to a target domain without labels. In this work, we
propose a multi-scale self-ensembling based UDA framework for automatic
segmentation of two key brain structures, i.e., the Vestibular Schwannoma (VS)
and the Cochlea, on high-resolution T2 images. First, a segmentation-enhanced
contrastive unpaired image translation module is designed for image-level
domain adaptation from source T1 to target T2. Next, multi-scale deep
supervision and consistency regularization are introduced to a mean teacher
network for self-ensemble learning to further close the domain gap.
Furthermore, self-training and intensity augmentation techniques are utilized
to mitigate label scarcity and boost cross-modality segmentation performance.
Our method demonstrates promising segmentation performance with mean Dice
scores of 83.8% and 81.4% and average symmetric surface distances (ASSD) of
0.55 mm and 0.26 mm for the VS and Cochlea, respectively, in the validation
phase of the crossMoDA 2022 challenge. Comment: Accepted to the BrainLes MICCAI
proceedings (5th place solution in the MICCAI 2022 Cross-Modality Domain
Adaptation (crossMoDA) Challenge).
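A minimal mean-teacher sketch for the self-ensembling step mentioned above, assuming PyTorch: the multi-scale output lists, the noise perturbation, the EMA decay, and the loss form are illustrative assumptions rather than the exact MS-MT configuration.

import torch
import torch.nn.functional as F

@torch.no_grad()
def ema_update(teacher, student, decay=0.99):
    # Teacher weights track an exponential moving average of the student weights.
    for t_param, s_param in zip(teacher.parameters(), student.parameters()):
        t_param.mul_(decay).add_(s_param, alpha=1.0 - decay)

def consistency_loss(student, teacher, target_images, noise_std=0.1):
    # Both networks see differently perturbed target-domain images and are
    # assumed to return a list of predictions at several scales.
    student_outs = student(target_images + noise_std * torch.randn_like(target_images))
    with torch.no_grad():
        teacher_outs = teacher(target_images + noise_std * torch.randn_like(target_images))

    # Multi-scale consistency regularization between student and mean teacher.
    return sum(
        F.mse_loss(torch.softmax(s, dim=1), torch.softmax(t, dim=1))
        for s, t in zip(student_outs, teacher_outs)
    )

In training, such a consistency term would typically be added to the supervised loss on the translated source images, with ema_update applied after each optimizer step.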
Learn to Optimize Denoising Scores for 3D Generation: A Unified and Improved Diffusion Prior on NeRF and 3D Gaussian Splatting
We propose a unified framework aimed at enhancing the diffusion priors for 3D
generation tasks. Despite the critical importance of these tasks, existing
methodologies often struggle to generate high-caliber results. We begin by
examining the inherent limitations in previous diffusion priors. We identify a
divergence between the diffusion priors and the training procedures of
diffusion models that substantially impairs the quality of 3D generation. To
address this issue, we propose a novel, unified framework that iteratively
optimizes both the 3D model and the diffusion prior. Leveraging the different
learnable parameters of the diffusion prior, our approach offers multiple
configurations, affording various trade-offs between performance and
implementation complexity. Notably, our experimental results demonstrate that
our method markedly surpasses existing techniques, establishing a new state of
the art in text-to-3D generation. Furthermore, our
approach exhibits impressive performance on both NeRF and the newly introduced
3D Gaussian Splatting backbones. Additionally, our framework yields insightful
contributions to the understanding of recent score distillation methods, such
as the VSD and DDS losses.
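The alternating scheme described above, updating the 3D representation with a score-distillation-style gradient and then updating learnable parameters of the diffusion prior on renders of the current model, can be sketched roughly as below; every interface (renderer, diffusion.add_noise, diffusion.predict_noise, the choice of learnable prior parameters) is an assumption for illustration, not the paper's implementation.

import torch
import torch.nn.functional as F

def alternating_step(renderer, diffusion, text_emb, prior_params, opt_3d, opt_prior):
    # Render a view from the current 3D representation (NeRF or 3D Gaussians).
    image = renderer()
    t = torch.randint(20, 980, (1,), device=image.device)
    noise = torch.randn_like(image)
    noisy = diffusion.add_noise(image, noise, t)

    # (1) Update the 3D model with a score-distillation-style gradient.
    with torch.no_grad():
        eps_pred = diffusion.predict_noise(noisy, t, text_emb, prior_params)
    opt_3d.zero_grad()
    image.backward(gradient=(eps_pred - noise))
    opt_3d.step()

    # (2) Update the learnable part of the prior so its prediction matches the
    #     true noise on renders of the current model, narrowing the divergence
    #     between the prior and the diffusion model's own training procedure.
    eps_pred = diffusion.predict_noise(noisy.detach(), t, text_emb, prior_params)
    prior_loss = F.mse_loss(eps_pred, noise)
    opt_prior.zero_grad()
    prior_loss.backward()
    opt_prior.step()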