Fast Pixel-Matching for Video Object Segmentation
Video object segmentation, aiming to segment the foreground objects given the
annotation of the first frame, has been attracting increasing attention. Many
state-of-the-art approaches have achieved great performance by relying on
online model updating or mask-propagation techniques. However, most online
models require high computational cost due to model fine-tuning during
inference. Most mask-propagation-based models are faster but perform relatively
poorly because they fail to adapt to variation in object appearance. In this
paper, we aim to design a new model that strikes a good balance between speed
and performance. We propose a model, called NPMCA-net, which directly localizes
foreground objects based on mask propagation and a non-local technique by
matching pixels in reference and target frames. Since we bring in information
from both the first and previous frames, our network is robust to large object
appearance variation, and can better adapt to occlusions. Extensive experiments
show that our approach achieves new state-of-the-art performance at a
fast speed (86.5% IoU on DAVIS-2016 and 72.2% IoU on
DAVIS-2017, at 0.11s per frame) when compared with methods at the same level.
Source code is available at https://github.com/siyueyu/NPMCA-net.
Comment: Accepted by Signal Processing: Image Communication
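
The idea of matching pixels between a reference frame and a target frame can be illustrated with a minimal NumPy sketch. This is not the NPMCA-net architecture: the function names `nonlocal_match` and `iou`, the cosine-similarity affinity, and the `temperature` parameter are all assumptions made for illustration. Each target pixel attends to every reference pixel, and the reference mask is propagated through the resulting softmax weights; IoU is then the ratio of intersection to union between the predicted and ground-truth masks.

```python
import numpy as np


def nonlocal_match(ref_feat, ref_mask, tgt_feat, temperature=0.05):
    """Propagate a reference mask to a target frame by pixel matching.

    Illustrative sketch only (not the authors' network).
    ref_feat: (Hr, Wr, C) reference features
    ref_mask: (Hr, Wr) foreground probability in [0, 1]
    tgt_feat: (Ht, Wt, C) target features
    Returns a (Ht, Wt) soft foreground mask for the target frame.
    """
    ht, wt, c = tgt_feat.shape
    ref = ref_feat.reshape(-1, c).astype(float)
    tgt = tgt_feat.reshape(-1, c).astype(float)

    # L2-normalise so the dot product below is a cosine similarity.
    ref = ref / (np.linalg.norm(ref, axis=1, keepdims=True) + 1e-8)
    tgt = tgt / (np.linalg.norm(tgt, axis=1, keepdims=True) + 1e-8)

    # Affinity between every target pixel and every reference pixel.
    affinity = (tgt @ ref.T) / temperature

    # Numerically stable softmax over the reference pixels.
    affinity = affinity - affinity.max(axis=1, keepdims=True)
    weights = np.exp(affinity)
    weights = weights / weights.sum(axis=1, keepdims=True)

    # Weighted vote of the reference mask values.
    return (weights @ ref_mask.reshape(-1).astype(float)).reshape(ht, wt)


def iou(pred, gt):
    """Intersection over union of two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        return 1.0  # both masks empty: treat as a perfect match
    return float(np.logical_and(pred, gt).sum() / union)
```

In a setting like the one the abstract describes, such a matching step would presumably be applied against both the first frame and the previous frame, with the two results combined by the network; how that combination is done is not specified here.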