Depth-aware neural style transfer
Neural style transfer has recently received significant attention and demonstrated impressive results. An efficient solution proposed by Johnson et al. trains feed-forward convolutional neural networks by defining and optimizing perceptual loss functions. Such methods are typically based on high-level features extracted from pre-trained neural networks, where the loss functions contain two components: style loss and content loss. However, such pre-trained networks were originally designed for object recognition, and hence the high-level features often focus on the primary target and neglect other details. As a result, when input images contain multiple objects, potentially at different depths, the resulting images are often unsatisfactory because the image layout is destroyed and the boundaries between foreground and background, and between different objects, become obscured. We observe that the depth map effectively reflects the spatial distribution in an image, and that preserving the depth map of the content image after stylization helps produce an image that preserves its semantic content. In this paper, we introduce a novel approach for neural style transfer that integrates depth preservation as an additional loss, preserving overall image layout while performing style transfer.
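The three-part objective described in this abstract (content loss, style loss, and an added depth loss) can be sketched as follows. This is a minimal illustration under assumptions, not the authors' implementation: the function names, the mean-squared-error form of each term, the Gram-matrix style term, and the weights are all hypothetical, and the feature and depth maps are taken as given inputs rather than produced by pre-trained networks.

```python
def mse(a, b):
    """Mean squared error between two equal-length flat lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def flatten(rows):
    """Flatten a list of lists into one flat list."""
    return [v for row in rows for v in row]


def gram(features):
    """Gram matrix of a feature map given as C channels of N flattened values."""
    n = len(features[0])
    return [[sum(p * q for p, q in zip(ci, cj)) / n for cj in features]
            for ci in features]


def total_loss(out_feat, content_feat, style_feat, out_depth, content_depth,
               w_content=1.0, w_style=10.0, w_depth=1.0):
    """Weighted sum of content, style, and depth terms (weights are assumed)."""
    content = mse(flatten(out_feat), flatten(content_feat))
    style = mse(flatten(gram(out_feat)), flatten(gram(style_feat)))
    # Depth term: penalize changes between the depth map of the stylized
    # output and the depth map of the original content image.
    depth = mse(out_depth, content_depth)
    return w_content * content + w_style * style + w_depth * depth


# Hypothetical toy inputs: 2 channels with 2 spatial positions each.
feat = [[1.0, 2.0], [0.0, 1.0]]
print(total_loss(feat, feat, feat, [0.0, 1.0], [0.0, 0.0]))
```

In this toy call the content and style terms are zero, so the result comes entirely from the depth term, which is the part that penalizes a stylized image whose depth map drifts from that of the content image.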
Depth-aware Neural Style Transfer using Instance Normalization
Neural Style Transfer (NST) is concerned with the artistic stylization of
visual media. It can be described as the process of transferring the style of
an artistic image onto an ordinary photograph. Recently, a number of studies
have considered the enhancement of the depth-preserving capabilities of the NST
algorithms to address the undesired effects that occur when the input content
images include numerous objects at various depths. Our approach uses a deep
residual convolutional network with instance normalization layers that utilizes
an advanced depth prediction network to integrate depth preservation as an
additional loss function to content and style. We demonstrate results that are
effective in retaining the depth and global structure of content images. Three
different evaluation processes show that our system is capable of preserving
the structure of the stylized results while exhibiting style-capture
capabilities and aesthetic qualities comparable or superior to state-of-the-art
methods. Project page:
https://ioannoue.github.io/depth-aware-nst-using-in.html
Comment: 8 pages, 8 figures, Computer Graphics & Visual Computing (CGVC) 202
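For readers unfamiliar with the instance normalization layers mentioned in this abstract, the operation can be sketched as below. This is a generic illustration of the standard technique, not code from the paper: the function name and the flat-list data layout are assumptions, and the learned per-channel scale and shift that such layers usually carry are omitted.

```python
import math


def instance_norm(feature_map, eps=1e-5):
    """Normalize each channel of ONE image by its own spatial statistics.

    feature_map: list of C channels, each a flat list of H*W activations.
    Unlike batch normalization, no statistics are shared across images.
    """
    normalized = []
    for channel in feature_map:
        mean = sum(channel) / len(channel)
        var = sum((v - mean) ** 2 for v in channel) / len(channel)
        scale = 1.0 / math.sqrt(var + eps)
        normalized.append([(v - mean) * scale for v in channel])
    return normalized


# Hypothetical feature map: 2 channels with very different contrast.
out = instance_norm([[1.0, 2.0, 3.0, 4.0], [10.0, 10.0, 30.0, 30.0]])
```

After normalization each channel has approximately zero mean and unit variance, regardless of the contrast of the input image.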
Depth-aware neural style transfer for videos
Temporal consistency and content preservation are the prominent challenges in artistic video style transfer. To address these challenges, we present a technique that uses depth data, and we demonstrate it on real-world videos from the web as well as on a standard video dataset of three-dimensional computer-generated content. Our algorithm employs an image-transformation network combined with a depth encoder network for stylizing video sequences. For improved global structure preservation and temporal stability, the depth encoder network encodes ground-truth depth information, which is fused into the stylization network. To further enforce temporal coherence, we employ ConvLSTM layers in the encoder, and we also use a loss function based on depth information calculated for the output frames. We show that our approach is capable of producing stylized videos with improved temporal consistency compared to state-of-the-art methods whilst also successfully transferring the artistic style of a target painting.
Deep Bilateral Learning for Real-Time Image Enhancement
Performance is a critical challenge in mobile image processing. Given a
reference imaging pipeline, or even human-adjusted pairs of images, we seek to
reproduce the enhancements and enable real-time evaluation. For this, we
introduce a new neural network architecture inspired by bilateral grid
processing and local affine color transforms. Using pairs of input/output
images, we train a convolutional neural network to predict the coefficients of
a locally-affine model in bilateral space. Our architecture learns to make
local, global, and content-dependent decisions to approximate the desired image
transformation. At runtime, the neural network consumes a low-resolution
version of the input image, produces a set of affine transformations in
bilateral space, upsamples those transformations in an edge-preserving fashion
using a new slicing node, and then applies those upsampled transformations to
the full-resolution image. Our algorithm processes high-resolution images on a
smartphone in milliseconds, provides a real-time viewfinder at 1080p
resolution, and matches the quality of state-of-the-art approximation
techniques on a large class of image operators. Unlike previous work, our model
is trained off-line from data and therefore does not require access to the
original operator at runtime. This allows our model to learn complex,
scene-dependent transformations for which no reference implementation is
available, such as the photographic edits of a human retoucher.
Comment: 12 pages, 14 figures, Siggraph 201
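The runtime steps in this abstract (a low-resolution grid of affine transforms in bilateral space, edge-aware slicing, then application at full resolution) can be sketched in a heavily simplified form. This is an illustration under assumptions, not the paper's architecture: it uses a 1D grayscale signal instead of a colour image, a hand-written grid instead of one predicted by a network, the pixel's own intensity as the guide, and plain bilinear interpolation for slicing. All names are hypothetical.

```python
def lerp(a, b, t):
    """Linear interpolation between a and b."""
    return a + (b - a) * t


def _locate(t, size):
    """Map t in [0, 1] to a lower grid index and an interpolation weight."""
    f = min(max(t, 0.0), 1.0) * (size - 1)
    lo = min(int(f), size - 2)
    return lo, f - lo


def slice_grid(grid, x_pos, guide):
    """Interpolate a (scale, offset) pair at a position and guide intensity.

    grid[i][k] holds (scale, offset) for spatial cell i and intensity bin k;
    both grid axes are assumed to have at least 2 entries.
    """
    i, tx = _locate(x_pos, len(grid))
    k, tz = _locate(guide, len(grid[0]))
    coeffs = []
    for c in range(2):  # 0 = scale, 1 = offset
        near = lerp(grid[i][k][c], grid[i][k + 1][c], tz)
        far = lerp(grid[i + 1][k][c], grid[i + 1][k + 1][c], tz)
        coeffs.append(lerp(near, far, tx))
    return coeffs


def apply_grid(grid, signal):
    """Apply sliced per-pixel affine transforms to a full-resolution signal."""
    n = len(signal)  # assumed to be at least 2
    out = []
    for p, value in enumerate(signal):
        # Assumption: the guide is simply the pixel's own intensity.
        scale, offset = slice_grid(grid, p / (n - 1), value)
        out.append(scale * value + offset)
    return out


# Hypothetical 2x2 grid: dark pixels are left alone, bright pixels map to 1.
grid = [[(1.0, 0.0), (0.0, 1.0)], [(1.0, 0.0), (0.0, 1.0)]]
result = apply_grid(grid, [0.0, 0.5, 1.0])
```

Because the coefficients are looked up by intensity as well as by position, neighbouring pixels on opposite sides of an edge can receive different transforms even though the grid itself is coarse, which is the intuition behind the edge-preserving upsampling the abstract describes.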