413 research outputs found
Perceptually Inspired Real-time Artistic Style Transfer for Video Stream
This study presents a real-time texture transfer method for artistic style transfer of video streams. We propose a parallel framework using a T-shaped kernel to enhance computational performance. To accelerate motion estimation, which is required for maintaining temporal coherence, we present a method that uses a downscaled motion field, achieving high real-time performance for texture transfer of video streams. In addition, to enhance artistic quality, we calculate a level of abstraction using visual saliency and integrate it with the texture transfer algorithm. Thus, our algorithm can stylize video with perceptual enhancements.
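As a minimal sketch of the downscaled-motion-field idea (not the paper's implementation), the snippet below estimates optical flow at a reduced resolution with OpenCV's Farneback method, upsamples the flow, and warps the previously stylized frame so the stylization stays temporally coherent at low cost. The scale factor and flow parameters are illustrative assumptions.

```python
import cv2
import numpy as np

def downscaled_backward_flow(curr_gray, prev_gray, scale=0.25):
    """Estimate, at reduced resolution, where each current-frame pixel came
    from in the previous frame, then upsample the flow to full resolution."""
    small_curr = cv2.resize(curr_gray, None, fx=scale, fy=scale)
    small_prev = cv2.resize(prev_gray, None, fx=scale, fy=scale)
    flow = cv2.calcOpticalFlowFarneback(small_curr, small_prev, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    h, w = curr_gray.shape[:2]
    flow = cv2.resize(flow, (w, h)) / scale  # upscale both the grid and the displacement magnitude
    return flow

def warp_previous_stylization(prev_stylized, flow):
    """Warp the previously stylized frame along the motion field so it aligns
    with the current frame and can anchor the new stylization."""
    h, w = flow.shape[:2]
    grid_x, grid_y = np.meshgrid(np.arange(w), np.arange(h))
    map_x = (grid_x + flow[..., 0]).astype(np.float32)
    map_y = (grid_y + flow[..., 1]).astype(np.float32)
    return cv2.remap(prev_stylized, map_x, map_y, cv2.INTER_LINEAR)
```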
MOSAIC: Multi-Object Segmented Arbitrary Stylization Using CLIP
Style transfer driven by text prompts has opened a new path for creatively stylizing images without collecting an actual style image. Despite promising results, text-driven stylization gives the user no control over the outcome. A user who wants to create an artistic image needs fine control over the stylization of individual entities in the content image, which current state-of-the-art approaches do not address. Diffusion-based style transfer methods suffer from the same issue, because regional control over the stylized output is ineffective. To address this problem, we propose a new method, Multi-Object Segmented Arbitrary Stylization Using CLIP (MOSAIC), that can apply styles to different objects in the image based on the context extracted from the input prompt. Text-based segmentation and stylization modules, both built on a vision transformer architecture, are used to segment and stylize the objects. Our method extends to arbitrary objects and styles and produces higher-quality images than current state-of-the-art methods. To our knowledge, this is the first attempt to perform text-guided, object-wise arbitrary stylization. We demonstrate the effectiveness of our approach through qualitative and quantitative analysis, showing that it generates visually appealing stylized images with enhanced control over stylization and
the ability to generalize to unseen object classes.

Comment: Camera ready, New Ideas in Vision Transformers workshop, ICCV 2023
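As a rough illustration of the object-wise, text-guided stylization described above, the sketch below composites per-object stylized regions using soft masks. The `segment_fn` and `stylize_fn` callables stand in for the paper's transformer-based text segmentation and CLIP-guided stylization modules, and the prompts are made-up examples, not MOSAIC's actual interface.

```python
import torch

def object_wise_stylization(image, object_styles, segment_fn, stylize_fn):
    """Apply a different text-described style to each named object region.

    image:         content image tensor of shape (3, H, W), values in [0, 1]
    object_styles: dict mapping object prompt -> style prompt, e.g.
                   {"dog": "van Gogh oil painting", "sky": "ukiyo-e woodblock print"}
    segment_fn:    callable (image, object_prompt) -> soft mask (1, H, W) in [0, 1]
    stylize_fn:    callable (image, style_prompt) -> stylized image (3, H, W)
    """
    output = image.clone()
    for obj_prompt, style_prompt in object_styles.items():
        mask = segment_fn(image, obj_prompt)            # where this object is
        stylized = stylize_fn(image, style_prompt)      # image rendered in that style
        output = mask * stylized + (1 - mask) * output  # blend only inside the mask
    return output
```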
MicroAST: Towards Super-Fast Ultra-Resolution Arbitrary Style Transfer
Arbitrary style transfer (AST) transfers arbitrary artistic styles onto
content images. Despite the recent rapid progress, existing AST methods are
either incapable or too slow to run at ultra-resolutions (e.g., 4K) with
limited resources, which heavily hinders their further applications. In this
paper, we tackle this dilemma by learning a straightforward and lightweight
model, dubbed MicroAST. The key insight is to completely abandon the use of
cumbersome pre-trained Deep Convolutional Neural Networks (e.g., VGG) at
inference. Instead, we design two micro encoders (content and style encoders)
and one micro decoder for style transfer. The content encoder aims at
extracting the main structure of the content image. The style encoder, coupled
with a modulator, encodes the style image into learnable dual-modulation
signals that modulate both intermediate features and convolutional filters of
the decoder, thus injecting more sophisticated and flexible style signals to
guide the stylizations. In addition, to boost the ability of the style encoder
to extract more distinct and representative style signals, we also introduce a
new style signal contrastive loss in our model. Compared to the state of the
art, our MicroAST not only produces visually superior results but also is 5-73
times smaller and 6-18 times faster, for the first time enabling super-fast
(about 0.5 seconds) AST at 4K ultra-resolutions. Code is available at
https://github.com/EndyWon/MicroAST.

Comment: Accepted by AAAI 2023
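To make the dual-modulation idea above concrete, here is a rough PyTorch sketch in which a style code (a) scales and shifts the intermediate features and (b) rescales the decoder's convolution filters per input channel. The layer sizes and the exact modulation form are assumptions for illustration; this is not the MicroAST architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DualModulatedConv(nn.Module):
    def __init__(self, in_ch, out_ch, style_dim):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_ch, in_ch, 3, 3) * 0.02)
        self.to_feat_mod = nn.Linear(style_dim, in_ch * 2)  # per-channel scale and shift
        self.to_filter_mod = nn.Linear(style_dim, in_ch)    # per-input-channel filter scale

    def forward(self, x, style_code):
        # (a) feature modulation: FiLM-style scale and shift of the input features
        scale, shift = self.to_feat_mod(style_code).chunk(2, dim=-1)
        x = x * (1 + scale[:, :, None, None]) + shift[:, :, None, None]

        # (b) filter modulation: rescale the conv kernel along its input channels,
        # folding the batch into groups so each sample gets its own kernel
        b, c, h, w = x.shape
        filt_scale = 1 + self.to_filter_mod(style_code)               # (B, in_ch)
        weight = self.weight[None] * filt_scale[:, None, :, None, None]
        weight = weight.reshape(b * weight.shape[1], c, 3, 3)
        out = F.conv2d(x.reshape(1, b * c, h, w), weight, padding=1, groups=b)
        return out.reshape(b, -1, h, w)

# usage: the style code would come from the style encoder and modulator
block = DualModulatedConv(in_ch=64, out_ch=64, style_dim=128)
features = torch.randn(2, 64, 32, 32)
style_code = torch.randn(2, 128)
out = block(features, style_code)  # (2, 64, 32, 32)
```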
Topology Optimization with Text-Guided Stylization
We propose an approach for the generation of topology-optimized structures
with text-guided appearance stylization. This methodology aims to enrich the
concurrent design of a structure's physical functionality and aesthetic
appearance. Users can effortlessly input descriptive text to govern the style
of the structure. Our system employs a hash-encoded neural network as the
implicit structure representation backbone, which serves as the foundation for
the co-optimization of structural mechanical performance, style, and
connectivity, to ensure full-color, high-quality 3D-printable solutions. We
substantiate the effectiveness of our system through extensive comparisons,
demonstrations, and a 3D printing test.
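A toy sketch of the co-optimization structure described above, assuming PyTorch: a coordinate network with a heavily simplified, single-level hash encoding predicts per-point material density and color, and a weighted sum of mechanical, style, and connectivity objectives drives the optimization. The encoding, network sizes, loss callables, and weights are illustrative assumptions, not the paper's system.

```python
import torch
import torch.nn as nn

class HashEncoding(nn.Module):
    """Toy single-level hash-grid encoding (a stand-in for a multiresolution
    hash-encoded backbone); nearest-vertex lookup, no interpolation."""
    def __init__(self, table_size=2**14, feat_dim=8, resolution=64):
        super().__init__()
        self.table = nn.Parameter(torch.randn(table_size, feat_dim) * 1e-2)
        self.resolution = resolution
        self.register_buffer("primes", torch.tensor([1, 2654435761, 805459861]))

    def forward(self, xyz):                                   # xyz in [0, 1]^3, shape (N, 3)
        idx = (xyz.clamp(0, 1) * (self.resolution - 1)).long()
        h = (idx * self.primes).sum(-1) % self.table.shape[0]
        return self.table[h]                                  # (N, feat_dim)

class ImplicitStructure(nn.Module):
    """Maps a 3D point to a material density and an RGB color."""
    def __init__(self):
        super().__init__()
        self.encoding = HashEncoding()
        self.mlp = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 4))

    def forward(self, xyz):
        out = self.mlp(self.encoding(xyz))
        return torch.sigmoid(out[:, :1]), torch.sigmoid(out[:, 1:])  # density, color

def co_optimization_step(model, optimizer, xyz, loss_fns, weights):
    """One step of the joint objective: a weighted sum of compliance
    (mechanical performance), text-guided style, and connectivity terms,
    each a caller-supplied callable loss_fn(density, color) -> scalar."""
    density, color = model(xyz)
    total = sum(w * fn(density, color) for fn, w in zip(loss_fns, weights))
    optimizer.zero_grad()
    total.backward()
    optimizer.step()
    return total.item()
```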
Two Birds, One Stone: A Unified Framework for Joint Learning of Image and Video Style Transfers
Current arbitrary style transfer models are limited to either the image or the video domain. To achieve satisfying image and video style transfer, two different models are typically required, with separate training processes on the image and video domains, respectively. In this paper, we show that this requirement can be removed by introducing UniST, a Unified Style Transfer framework for both images and videos. At the core of UniST is a domain interaction transformer (DIT), which first explores context information within each domain and then exchanges the contextualized domain information for joint learning. In particular, DIT enables the exploration of temporal information from videos for the image style transfer task, and meanwhile lets rich appearance textures from images benefit video style transfer, leading to mutual gains. Considering the heavy computation of traditional multi-head self-attention, we present a simple yet effective axial multi-head self-attention (AMSA) for DIT, which improves computational efficiency while maintaining style transfer performance. To verify the effectiveness of UniST, we conduct extensive experiments on both image and video style transfer tasks and show that UniST performs favorably against state-of-the-art approaches on both tasks. Code is available at
https://github.com/NevSNev/UniST.

Comment: International Conference on Computer Vision (ICCV 2023)
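The following compact PyTorch sketch shows the general form of axial multi-head self-attention mentioned above: attention is applied along the width axis and then along the height axis, so the cost grows with H·W·(H+W) rather than (H·W)². Head count and dimensions are illustrative assumptions; this is not the UniST code.

```python
import torch
import torch.nn as nn

class AxialSelfAttention(nn.Module):
    def __init__(self, dim, num_heads=4):
        super().__init__()
        self.row_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.col_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, x):                                   # x: (B, C, H, W)
        b, c, h, w = x.shape
        # attend along the width axis: each row is an independent sequence
        rows = x.permute(0, 2, 3, 1).reshape(b * h, w, c)
        rows, _ = self.row_attn(rows, rows, rows)
        x = rows.reshape(b, h, w, c)
        # attend along the height axis: each column is an independent sequence
        cols = x.permute(0, 2, 1, 3).reshape(b * w, h, c)
        cols, _ = self.col_attn(cols, cols, cols)
        return cols.reshape(b, w, h, c).permute(0, 3, 2, 1)  # back to (B, C, H, W)

# usage on a small feature map
attn = AxialSelfAttention(dim=64, num_heads=4)
feat = torch.randn(1, 64, 32, 32)
out = attn(feat)  # (1, 64, 32, 32)
```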
- …