Although the Segment Anything Model (SAM) has demonstrated impressive
performance in 2D natural image segmentation, its application to 3D volumetric
medical images reveals significant shortcomings, namely suboptimal performance
and unstable predictions, so that an excessive number of prompt points is needed
to attain the desired outcomes. These issues can hardly be addressed by
fine-tuning SAM on medical data because the original 2D structure of SAM
neglects 3D spatial information. In this paper, we introduce SAM-Med3D, the
most comprehensive study on adapting SAM to 3D medical images. It is
comprehensive in two respects: first, we reformulate SAM as a fully 3D
architecture trained on a thoroughly processed large-scale volumetric medical
dataset; second, we provide an extensive evaluation of its performance.
Specifically, we train SAM-Med3D on over 131K 3D masks spanning 247
categories. SAM-Med3D
excels at capturing 3D spatial information, exhibiting competitive performance
with significantly fewer prompt points than the top-performing fine-tuned SAM
in the medical domain. We then evaluate its capabilities across 15 datasets and
analyze it from multiple perspectives, including anatomical structures,
modalities, targets, and generalization abilities. Compared with SAM, our
approach shows markedly improved efficiency and broad segmentation
capabilities for 3D volumetric medical images. Our code is released at
https://github.com/uni-medical/SAM-Med3D