Arbitrary Video Style Transfer via Multi-Channel Correlation
Video style transfer is attracting increasing attention in the AI community for its numerous applications, such as augmented reality and animation production. Compared with traditional image style transfer, performing this task on video presents new challenges: how to effectively generate satisfactory stylized results for any specified style while maintaining temporal coherence across frames. To this end, we propose the Multi-Channel Correlation network (MCCNet), which can be trained to fuse exemplar style features with input content features for efficient style transfer while naturally maintaining the coherence of the input video. Specifically, MCCNet works directly in the feature space of the style and content domains, where it learns to rearrange and fuse style features according to their similarity with content features. The features generated by multi-channel correlation (MCC) contain the desired style patterns and can be further decoded into images with vivid style textures. Moreover, MCCNet is designed to explicitly align its features to the input, which ensures that the output preserves the content structure as well as temporal continuity. To further improve the performance of MCCNet under complex lighting conditions, we also introduce an illumination loss during training. Qualitative and quantitative evaluations demonstrate that MCCNet performs well in both arbitrary video and image style transfer tasks.