Rethinking Context Aggregation in Natural Image Matting
For natural image matting, context information plays a crucial role in estimating alpha mattes, especially when the foreground is hard to distinguish from the background. Existing deep learning-based methods exploit specifically designed context aggregation modules to refine encoder features. However, the effectiveness of these modules has not been thoroughly explored.
In this paper, we conduct extensive experiments which reveal that these context aggregation modules are actually not as effective as expected. We also demonstrate that, when trained on large image patches, basic encoder-decoder networks with a larger receptive field can aggregate context effectively and achieve better performance. Based on these findings, we propose a simple yet effective matting network, named AEMatter, which enlarges the receptive field by incorporating an appearance-enhanced axis-wise learning block into the encoder and adopting a hybrid-transformer decoder. Experimental results on four datasets demonstrate that AEMatter significantly outperforms state-of-the-art matting methods (e.g., a 25% reduction in SAD and a 40% reduction in MSE on the Adobe Composition-1K dataset, compared against MatteFormer). The code and model are available at https://github.com/QLYoo/AEMatter.
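
To make the receptive-field idea concrete, below is a minimal PyTorch sketch of axis-wise self-attention, the general mechanism an axis-wise learning block builds on: attending along each row and then each column lets every pixel reach the entire frame at a cost linear in each axis length. The class name, head count, and pre-norm residual layout are illustrative assumptions, not AEMatter's actual design.

```python
# Illustrative sketch only: axis-wise (row-then-column) self-attention.
# Hyperparameters and layout are assumptions, not AEMatter's published design.
import torch
import torch.nn as nn

class AxisWiseAttention(nn.Module):
    """Applies self-attention along image rows, then along image columns."""

    def __init__(self, dim, heads=4):
        super().__init__()
        self.row_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.col_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)

    def forward(self, x):  # x: (B, C, H, W)
        b, c, h, w = x.shape
        # Row pass: treat each of the B*H rows as a length-W token sequence.
        rows = x.permute(0, 2, 3, 1).reshape(b * h, w, c)
        r = self.norm1(rows)
        rows = rows + self.row_attn(r, r, r, need_weights=False)[0]
        x = rows.reshape(b, h, w, c).permute(0, 3, 1, 2)
        # Column pass: treat each of the B*W columns as a length-H sequence.
        cols = x.permute(0, 3, 2, 1).reshape(b * w, h, c)
        ccols = self.norm2(cols)
        cols = cols + self.col_attn(ccols, ccols, ccols, need_weights=False)[0]
        return cols.reshape(b, w, h, c).permute(0, 3, 2, 1)

# Shape check: output matches input, but every position has attended
# (indirectly) to the whole feature map.
y = AxisWiseAttention(dim=64)(torch.randn(1, 64, 32, 32))
assert y.shape == (1, 64, 32, 32)
```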
Motion-Aware KNN Laplacian for Video Matting
This paper demonstrates how the nonlocal principle benefits video matting via the KNN Laplacian, which admits a straightforward implementation using motion-aware K-nearest neighbors. In hindsight, the fundamental problem to solve in video matting is to produce spatio-temporally coherent clusters of moving foreground pixels. The motion-aware KNN Laplacian addresses this problem effectively: given sparse user markups, typically on only one frame, it handles a variety of challenging examples featuring ambiguous foreground and background colors, changing topologies with disocclusion, significant illumination changes, fast motion, and motion blur. We expect our Laplacian to immediately benefit existing Laplacian-based systems through its improved clustering of moving foreground pixels.
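
As a concrete illustration of the construction, here is a minimal sketch of one way to build a motion-aware KNN Laplacian for a single frame: each pixel is described by its color, down-weighted position, and an optical-flow estimate, so the K-nearest-neighbor search clusters pixels that both look alike and move alike. The function name, feature weights, K, and the 1 - distance affinity are assumptions in the spirit of KNN matting, not the paper's exact formulation, which also links pixels across neighboring frames.

```python
# Illustrative sketch only: a motion-aware KNN Laplacian for one frame.
# Feature weights, K, and the affinity function are assumptions, not the
# paper's exact formulation.
import numpy as np
from scipy.sparse import coo_matrix, diags
from sklearn.neighbors import NearestNeighbors

def motion_aware_knn_laplacian(frame, flow, k=10, spatial_w=0.1, motion_w=1.0):
    """Builds a sparse KNN Laplacian L = D - W over the pixels of one frame.

    frame: (H, W, 3) float array with colors in [0, 1]
    flow:  (H, W, 2) optical-flow field (u, v) toward the next frame, in pixels
    """
    h, w, _ = frame.shape
    n = h * w
    ys, xs = np.mgrid[0:h, 0:w]
    # Per-pixel feature: color, down-weighted position, normalized motion.
    feats = np.concatenate(
        [frame.reshape(n, 3),
         spatial_w * np.stack([xs, ys], axis=-1).reshape(n, 2) / max(h, w),
         motion_w * flow.reshape(n, 2) / max(h, w)],
        axis=1)
    # K nearest neighbors in feature space; the closest "neighbor" of each
    # pixel is the pixel itself, so ask for k + 1 and drop the first column.
    dist, idx = NearestNeighbors(n_neighbors=k + 1).fit(feats).kneighbors(feats)
    rows = np.repeat(np.arange(n), k)
    cols = idx[:, 1:].ravel()
    # Affinity decays with feature distance (the common KNN-matting choice).
    vals = np.clip(1.0 - dist[:, 1:].ravel(), 0.0, 1.0)
    W = coo_matrix((vals, (rows, cols)), shape=(n, n))
    W = 0.5 * (W + W.T)                      # symmetrize the affinity graph
    D = diags(np.asarray(W.sum(axis=1)).ravel())
    return D - W                             # the graph Laplacian
```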