Cardiac structure segmentation from echocardiogram videos plays a crucial
role in diagnosing heart disease. The combination of multi-view echocardiogram
data is essential to enhance the accuracy and robustness of automated methods.
However, due to the visual disparity of the data, deriving cross-view context
information remains a challenging task, and unsophisticated fusion strategies
can even lower performance. In this study, we propose a novel Global-Local
fusion (GL-Fusion) network to jointly utilize multi-view information globally
and locally, improving the accuracy of echocardiogram analysis. Specifically,
a Multi-view Global-based Fusion Module (MGFM) is proposed to extract global
context information and to explore the cyclic relationship of different
heartbeat cycles in an echocardiogram video. Additionally, a Multi-view
Local-based Fusion Module (MLFM) is designed to extract correlations of cardiac
structures from different views. Furthermore, we collect a multi-view
echocardiogram video dataset (MvEVD) to evaluate our method. Our method
achieves an 82.29% average Dice score, which demonstrates a 7.83% improvement
over the baseline method, and outperforms other existing state-of-the-art
methods. To our knowledge, this is the first exploration of a multi-view method
for echocardiogram video segmentation. Code available at:
https://github.com/xmed-lab/GL-Fusion

Comment: Accepted by MICCAI 202
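
For readers unfamiliar with the reported metric, the following is a minimal sketch of how an average Dice score over several cardiac structures could be computed, using the standard definition Dice = 2|A ∩ B| / (|A| + |B|). The function names, the flat label-list representation, and the treatment of empty masks are assumptions made for illustration; they are not taken from the GL-Fusion code, whose evaluation protocol may differ.

```python
def dice_score(pred, target):
    """Dice = 2|A & B| / (|A| + |B|) for two binary masks given as flat 0/1 sequences."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    # Count pixels that are foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Sum of the foreground sizes of the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


def average_dice(pred_labels, target_labels, class_ids):
    """Mean Dice over the given class ids (hypothetical structure labels)."""
    scores = []
    for c in class_ids:
        # Build one binary mask per structure from the label maps.
        pred_mask = [1 if p == c else 0 for p in pred_labels]
        target_mask = [1 if t == c else 0 for t in target_labels]
        scores.append(dice_score(pred_mask, target_mask))
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Toy 4-pixel label maps; 0 is background, 1 and 2 are two structures.
    prediction = [1, 1, 2, 0]
    ground_truth = [1, 2, 2, 0]
    print(average_dice(prediction, ground_truth, class_ids=[1, 2]))
```

In this sketch the background label is excluded simply by leaving it out of `class_ids`; whether the reported average includes background is not stated in the abstract.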