A systematic review of automated segmentation of 3D computed-tomography scans for volumetric body composition analysis

Abstract

INTRODUCTION Automated segmentation of computed tomography (CT) scans (labelling of pixels according to tissue type) is now possible. This technique is being adapted to achieve three-dimensional (3D) segmentation of CT scans, as opposed to segmentation of a single slice at the third lumbar vertebra (L3) alone. This systematic review evaluates the feasibility and accuracy of automated segmentation of 3D CT scans for volumetric body composition (BC) analysis, as well as the current limitations and pitfalls that clinicians and researchers should be aware of. METHODS OVID Medline, Embase and grey literature databases were searched up to October 2021. Original studies investigating automated segmentation of skeletal muscle (SM) and of visceral and subcutaneous adipose tissue (AT) from CT were included. RESULTS Seven of 92 studies met the inclusion criteria. The expertise and number of humans performing the ground-truth segmentations used to train algorithms varied between studies. There was heterogeneity in the patient characteristics, pathology and CT phases on which segmentation algorithms were developed. Reporting of anatomical CT coverage varied, with confusing terminology. Six studies covered volumetric regional slabs rather than the whole body. One study stated the use of whole-body CT, but it was not clear whether this truly meant head-to-fingertip-to-toe. Two studies used conventional computer algorithms. The other five used deep learning (DL), an artificial intelligence technique in which algorithms are organised similarly to neuronal pathways in the brain. Six of seven studies reported excellent segmentation performance (Dice similarity coefficients > 0.9 per tissue). Internal testing on unseen scans was performed for only four of seven algorithms, whilst only three were tested externally. Trained DL algorithms achieved full CT segmentation in 12 to 75 seconds versus 25 minutes for non-DL techniques. CONCLUSION Deep learning enables opportunistic, rapid and automated volumetric BC analysis of CT performed for clinical indications. However, most CT scans do not cover head-to-fingertip-to-toe; further research must validate the use of common CT regions to estimate true whole-body BC, with direct comparison to a single lumbar slice. Given the successes of DL, we expect a growing number of algorithms to emerge in addition to the seven discussed in this paper. Researchers and clinicians in the field of BC must therefore be aware of the pitfalls. A high Dice similarity coefficient does not indicate the degree to which BC tissues may be under- or overestimated, nor does it indicate algorithm precision. Consensus is needed to define accuracy and precision standards for ground-truth labelling. Creation of a large international, multicentre common CT dataset with BC ground-truth labels from multiple experts could be a robust solution.
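The point about Dice similarity coefficients can be illustrated with a minimal sketch. It assumes the standard definition of the Dice coefficient (twice the overlap divided by the sum of the two mask sizes) and uses toy one-dimensional masks rather than real CT data. The function names `dice` and `relative_volume_error`, and the signed volume-error measure itself, are illustrative assumptions and are not taken from any of the reviewed studies.

```python
def dice(pred, truth):
    """Dice similarity coefficient between two binary masks (flat sequences of 0/1)."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    overlap = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Two empty masks are treated as a perfect match (a convention, assumed here).
    return 2.0 * overlap / total if total else 1.0


def relative_volume_error(pred, truth):
    """Signed relative volume difference: positive means overestimation."""
    truth_voxels = sum(1 for t in truth if t)
    pred_voxels = sum(1 for p in pred if p)
    if truth_voxels == 0:
        raise ValueError("ground-truth mask is empty")
    return (pred_voxels - truth_voxels) / truth_voxels


# Toy ground truth: 100 tissue voxels out of 200 (hypothetical data).
truth = [1] * 100 + [0] * 100
# One hypothetical algorithm labels 10 extra voxels, another misses 10.
over = [1] * 110 + [0] * 90
under = [1] * 90 + [0] * 110

for name, mask in (("over", over), ("under", under)):
    print(name, round(dice(mask, truth), 3), relative_volume_error(mask, truth))
```

In this toy case both masks score a Dice coefficient above 0.9, yet one overestimates tissue volume by 10% and the other underestimates it by 10%. The Dice value alone does not reveal the direction of the error.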
