413 research outputs found
Perceptually Inspired Real-time Artistic Style Transfer for Video Stream
This study presents a real-time texture transfer method for artistic style transfer of video streams. We propose a parallel framework using a T-shaped kernel to enhance computational performance. To accelerate motion estimation, which is required for maintaining temporal coherence, we present a method that uses a downscaled motion field, achieving high real-time performance for texture transfer of video streams. In addition, to enhance artistic quality, we calculate a level of abstraction using visual saliency and integrate it with the texture transfer algorithm. Thus, our algorithm can stylize video with perceptual enhancements.
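As a minimal sketch of the downscaled-motion-field idea (not the paper's implementation), the snippet below estimates optical flow at a reduced resolution with OpenCV's Farneback method, upsamples the flow, and warps the previously stylized frame so the stylization stays temporally coherent at low cost. The scale factor and flow parameters are illustrative assumptions.

```python
import cv2
import numpy as np

def downscaled_backward_flow(curr_gray, prev_gray, scale=0.25):
    """Estimate, at reduced resolution, where each current-frame pixel came
    from in the previous frame, then upsample the flow to full resolution."""
    small_curr = cv2.resize(curr_gray, None, fx=scale, fy=scale)
    small_prev = cv2.resize(prev_gray, None, fx=scale, fy=scale)
    flow = cv2.calcOpticalFlowFarneback(small_curr, small_prev, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    h, w = curr_gray.shape[:2]
    flow = cv2.resize(flow, (w, h)) / scale  # upscale both the grid and the displacement magnitude
    return flow

def warp_previous_stylization(prev_stylized, flow):
    """Warp the previously stylized frame along the motion field so it aligns
    with the current frame and can anchor the new stylization."""
    h, w = flow.shape[:2]
    grid_x, grid_y = np.meshgrid(np.arange(w), np.arange(h))
    map_x = (grid_x + flow[..., 0]).astype(np.float32)
    map_y = (grid_y + flow[..., 1]).astype(np.float32)
    return cv2.remap(prev_stylized, map_x, map_y, cv2.INTER_LINEAR)
```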
MOSAIC: Multi-Object Segmented Arbitrary Stylization Using CLIP
Style transfer driven by text prompts has opened a new path for creatively stylizing images without collecting an actual style image. Despite promising results, text-driven stylization gives the user no control over the outcome. A user who wants to create an artistic image needs fine control over the stylization of individual entities in the content image, which current state-of-the-art approaches do not address. Diffusion-based style transfer methods suffer from the same issue, because regional control over the stylized output is ineffective. To address this problem, we propose a new method, Multi-Object Segmented Arbitrary Stylization Using CLIP (MOSAIC), that can apply styles to different objects in the image based on the context extracted from the input prompt. Text-based segmentation and stylization modules, both built on a vision transformer architecture, are used to segment and stylize the objects. Our method extends to arbitrary objects and styles and produces higher-quality images than current state-of-the-art methods. To our knowledge, this is the first attempt to perform text-guided, object-wise arbitrary stylization. We demonstrate the effectiveness of our approach through qualitative and quantitative analysis, showing that it generates visually appealing stylized images with enhanced control over stylization and
the ability to generalize to unseen object classes.

Comment: Camera ready, New Ideas in Vision Transformers workshop, ICCV 2023
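As a rough illustration of the object-wise, text-guided stylization described above, the sketch below composites per-object stylized regions using soft masks. The `segment_fn` and `stylize_fn` callables stand in for the paper's transformer-based text segmentation and CLIP-guided stylization modules, and the prompts are made-up examples, not MOSAIC's actual interface.

```python
import torch

def object_wise_stylization(image, object_styles, segment_fn, stylize_fn):
    """Apply a different text-described style to each named object region.

    image:         content image tensor of shape (3, H, W), values in [0, 1]
    object_styles: dict mapping object prompt -> style prompt, e.g.
                   {"dog": "van Gogh oil painting", "sky": "ukiyo-e woodblock print"}
    segment_fn:    callable (image, object_prompt) -> soft mask (1, H, W) in [0, 1]
    stylize_fn:    callable (image, style_prompt) -> stylized image (3, H, W)
    """
    output = image.clone()
    for obj_prompt, style_prompt in object_styles.items():
        mask = segment_fn(image, obj_prompt)            # where this object is
        stylized = stylize_fn(image, style_prompt)      # image rendered in that style
        output = mask * stylized + (1 - mask) * output  # blend only inside the mask
    return output
```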
MicroAST: Towards Super-Fast Ultra-Resolution Arbitrary Style Transfer
Arbitrary style transfer (AST) transfers arbitrary artistic styles onto
content images. Despite the recent rapid progress, existing AST methods are
either incapable or too slow to run at ultra-resolutions (e.g., 4K) with
limited resources, which heavily hinders their further applications. In this
paper, we tackle this dilemma by learning a straightforward and lightweight
model, dubbed MicroAST. The key insight is to completely abandon the use of
cumbersome pre-trained Deep Convolutional Neural Networks (e.g., VGG) at
inference. Instead, we design two micro encoders (content and style encoders)
and one micro decoder for style transfer. The content encoder aims at
extracting the main structure of the content image. The style encoder, coupled
with a modulator, encodes the style image into learnable dual-modulation
signals that modulate both intermediate features and convolutional filters of
the decoder, thus injecting more sophisticated and flexible style signals to
guide the stylizations. In addition, to boost the ability of the style encoder
to extract more distinct and representative style signals, we also introduce a
new style signal contrastive loss in our model. Compared to the state of the
art, our MicroAST not only produces visually superior results but also is 5-73
times smaller and 6-18 times faster, for the first time enabling super-fast
(about 0.5 seconds) AST at 4K ultra-resolutions. Code is available at
https://github.com/EndyWon/MicroAST.

Comment: Accepted by AAAI 2023
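To make the dual-modulation idea above concrete, here is a rough PyTorch sketch in which a style code (a) scales and shifts the intermediate features and (b) rescales the decoder's convolution filters per input channel. The layer sizes and the exact modulation form are assumptions for illustration; this is not the MicroAST architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DualModulatedConv(nn.Module):
    def __init__(self, in_ch, out_ch, style_dim):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_ch, in_ch, 3, 3) * 0.02)
        self.to_feat_mod = nn.Linear(style_dim, in_ch * 2)  # per-channel scale and shift
        self.to_filter_mod = nn.Linear(style_dim, in_ch)    # per-input-channel filter scale

    def forward(self, x, style_code):
        # (a) feature modulation: FiLM-style scale and shift of the input features
        scale, shift = self.to_feat_mod(style_code).chunk(2, dim=-1)
        x = x * (1 + scale[:, :, None, None]) + shift[:, :, None, None]

        # (b) filter modulation: rescale the conv kernel along its input channels,
        # folding the batch into groups so each sample gets its own kernel
        b, c, h, w = x.shape
        filt_scale = 1 + self.to_filter_mod(style_code)               # (B, in_ch)
        weight = self.weight[None] * filt_scale[:, None, :, None, None]
        weight = weight.reshape(b * weight.shape[1], c, 3, 3)
        out = F.conv2d(x.reshape(1, b * c, h, w), weight, padding=1, groups=b)
        return out.reshape(b, -1, h, w)

# usage: the style code would come from the style encoder and modulator
block = DualModulatedConv(in_ch=64, out_ch=64, style_dim=128)
features = torch.randn(2, 64, 32, 32)
style_code = torch.randn(2, 128)
out = block(features, style_code)  # (2, 64, 32, 32)
```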
Topology Optimization with Text-Guided Stylization
We propose an approach for the generation of topology-optimized structures
with text-guided appearance stylization. This methodology aims to enrich the
concurrent design of a structure's physical functionality and aesthetic
appearance. Users can effortlessly input descriptive text to govern the style
of the structure. Our system employs a hash-encoded neural network as the
implicit structure representation backbone, which serves as the foundation for
the co-optimization of structural mechanical performance, style, and
connectivity, to ensure full-color, high-quality 3D-printable solutions. We
substantiate the effectiveness of our system through extensive comparisons,
demonstrations, and a 3D printing test.
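A toy sketch of the co-optimization structure described above, assuming PyTorch: a coordinate network with a heavily simplified, single-level hash encoding predicts per-point material density and color, and a weighted sum of mechanical, style, and connectivity objectives drives the optimization. The encoding, network sizes, loss callables, and weights are illustrative assumptions, not the paper's system.

```python
import torch
import torch.nn as nn

class HashEncoding(nn.Module):
    """Toy single-level hash-grid encoding (a stand-in for a multiresolution
    hash-encoded backbone); nearest-vertex lookup, no interpolation."""
    def __init__(self, table_size=2**14, feat_dim=8, resolution=64):
        super().__init__()
        self.table = nn.Parameter(torch.randn(table_size, feat_dim) * 1e-2)
        self.resolution = resolution
        self.register_buffer("primes", torch.tensor([1, 2654435761, 805459861]))

    def forward(self, xyz):                                   # xyz in [0, 1]^3, shape (N, 3)
        idx = (xyz.clamp(0, 1) * (self.resolution - 1)).long()
        h = (idx * self.primes).sum(-1) % self.table.shape[0]
        return self.table[h]                                  # (N, feat_dim)

class ImplicitStructure(nn.Module):
    """Maps a 3D point to a material density and an RGB color."""
    def __init__(self):
        super().__init__()
        self.encoding = HashEncoding()
        self.mlp = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 4))

    def forward(self, xyz):
        out = self.mlp(self.encoding(xyz))
        return torch.sigmoid(out[:, :1]), torch.sigmoid(out[:, 1:])  # density, color

def co_optimization_step(model, optimizer, xyz, loss_fns, weights):
    """One step of the joint objective: a weighted sum of compliance
    (mechanical performance), text-guided style, and connectivity terms,
    each a caller-supplied callable loss_fn(density, color) -> scalar."""
    density, color = model(xyz)
    total = sum(w * fn(density, color) for fn, w in zip(loss_fns, weights))
    optimizer.zero_grad()
    total.backward()
    optimizer.step()
    return total.item()
```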
Two Birds, One Stone: A Unified Framework for Joint Learning of Image and Video Style Transfers
Current arbitrary style transfer models are limited to either the image or the video domain. To achieve satisfying image and video style transfer, two different models are typically required, with separate training processes on the image and video domains, respectively. In this paper, we show that this requirement can be removed by introducing UniST, a Unified Style Transfer framework for both images and videos. At the core of UniST is a domain interaction transformer (DIT), which first explores context information within each domain and then exchanges the contextualized domain information for joint learning. In particular, DIT enables the exploration of temporal information from videos for the image style transfer task, and meanwhile lets rich appearance textures from images benefit video style transfer, leading to mutual gains. Considering the heavy computation of traditional multi-head self-attention, we present a simple yet effective axial multi-head self-attention (AMSA) for DIT, which improves computational efficiency while maintaining style transfer performance. To verify the effectiveness of UniST, we conduct extensive experiments on both image and video style transfer tasks and show that UniST performs favorably against state-of-the-art approaches on both tasks. Code is available at
https://github.com/NevSNev/UniST.

Comment: International Conference on Computer Vision (ICCV 2023)
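The following compact PyTorch sketch shows the general form of axial multi-head self-attention mentioned above: attention is applied along the width axis and then along the height axis, so the cost grows with H·W·(H+W) rather than (H·W)². Head count and dimensions are illustrative assumptions; this is not the UniST code.

```python
import torch
import torch.nn as nn

class AxialSelfAttention(nn.Module):
    def __init__(self, dim, num_heads=4):
        super().__init__()
        self.row_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.col_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, x):                                   # x: (B, C, H, W)
        b, c, h, w = x.shape
        # attend along the width axis: each row is an independent sequence
        rows = x.permute(0, 2, 3, 1).reshape(b * h, w, c)
        rows, _ = self.row_attn(rows, rows, rows)
        x = rows.reshape(b, h, w, c)
        # attend along the height axis: each column is an independent sequence
        cols = x.permute(0, 2, 1, 3).reshape(b * w, h, c)
        cols, _ = self.col_attn(cols, cols, cols)
        return cols.reshape(b, w, h, c).permute(0, 3, 2, 1)  # back to (B, C, H, W)

# usage on a small feature map
attn = AxialSelfAttention(dim=64, num_heads=4)
feat = torch.randn(1, 64, 32, 32)
out = attn(feat)  # (1, 64, 32, 32)
```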
- …