Background: Automated segmentation of spinal MR images plays a vital role
both scientifically and clinically. However, accurately delineating posterior
spine structures presents challenges.
Methods: In this retrospective study, approved by the ethics committee, T1w
and T2w MR image series were translated into CT images in a total of n=263
CT/MR series pairs. Landmark-based registration was performed to align
image pairs. We compared 2D paired (Pix2Pix, denoising diffusion implicit
models (DDIM) image mode, DDIM noise mode) and unpaired (contrastive unpaired
translation (CUT), SynDiff) image-to-image translation approaches, using peak
signal-to-noise ratio (PSNR) as the image-quality measure. A publicly available segmentation network
segmented the synthesized CT datasets, and Dice scores were evaluated on
in-house test sets and the "MRSpineSeg Challenge" volumes. The 2D findings were
extended to 3D Pix2Pix and DDIM.
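As a minimal illustration of the quality measure, PSNR compares a synthesized CT against its registered reference CT via the mean squared error; the following NumPy sketch (not the authors' evaluation code) assumes intensity arrays of equal shape:

```python
import numpy as np

def psnr(reference, synthesized, data_range=None):
    """Peak signal-to-noise ratio (dB) between a reference CT and a synthesized CT.

    data_range: dynamic range of the reference (e.g. 255 for 8-bit images);
    if None, it is estimated from the reference array.
    """
    reference = np.asarray(reference, dtype=np.float64)
    synthesized = np.asarray(synthesized, dtype=np.float64)
    if data_range is None:
        data_range = reference.max() - reference.min()
    mse = np.mean((reference - synthesized) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10((data_range ** 2) / mse)
```

Higher PSNR indicates a synthesized CT closer to the reference; note that PSNR is only meaningful here because landmark-based registration provides voxel-wise correspondence between the MR-derived and reference CT volumes.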
Results: 2D paired methods and SynDiff exhibited similar translation
performance and Dice scores on paired data. DDIM image mode achieved the
highest image quality. SynDiff, Pix2Pix, and DDIM image mode demonstrated
similar Dice scores (0.77). For craniocaudal axis rotations, at least two
landmarks per vertebra were required for registration. The 3D translation
outperformed the 2D approach, resulting in improved Dice scores (0.80) and
anatomically accurate segmentations at a higher resolution than the original MR
images.
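The reported Dice scores measure the voxel-wise overlap between predicted and reference segmentation masks per anatomical label; a minimal NumPy sketch (illustrative only, not the challenge's evaluation code) is:

```python
import numpy as np

def dice(pred, target, label):
    """Dice similarity coefficient for one label in two segmentation masks."""
    p = np.asarray(pred) == label
    t = np.asarray(target) == label
    denom = p.sum() + t.sum()
    if denom == 0:
        return 1.0  # label absent in both masks: count as perfect agreement
    return 2.0 * np.logical_and(p, t).sum() / denom
```

A Dice score of 1.0 means perfect overlap; the improvement from 0.77 (2D) to 0.80 (3D) thus reflects a higher fraction of correctly overlapping voxels per vertebral structure.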
Conclusion: Registration with two landmarks per vertebra enabled paired
image-to-image translation from MR to CT and outperformed all unpaired
approaches. The 3D techniques provided anatomically correct segmentations,
avoiding underprediction of small structures such as the spinous process.
Comment: 35 pages, 7 figures. Code and model weights available at
https://doi.org/10.5281/zenodo.8221159 and
https://doi.org/10.5281/zenodo.819869