Discovering Pattern Structure Using Differentiable Compositing
Patterns, which are collections of elements in regular or near-regular arrangements, are an important graphic art form, widely used for their elegant simplicity and aesthetic appeal. When a pattern is encoded as a flat image without the underlying structure, manually editing it is tedious and challenging, as one has to preserve both the individual element shapes and their original relative arrangements. State-of-the-art deep learning frameworks that operate at the pixel level are unsuitable for manipulating such patterns: they can easily disturb the shapes of the individual elements or their arrangement, and thus fail to preserve the latent structures of the input patterns. We present a novel differentiable compositing operator using pattern elements and use it to discover structures, in the form of a layered representation of graphical objects, directly from raw pattern images. This operator allows us to adapt current deep learning based image methods to handle patterns effectively. We evaluate our method on a range of patterns and demonstrate superiority in the context of pattern manipulation when compared against state-of-the-art pixel- or point-based alternatives.
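The key property of such an operator is that the rendered pattern is a smooth function of each element's placement, so gradients can flow back to the layout parameters. The following is a minimal sketch of that idea (not the paper's actual operator): elements are rendered as smooth Gaussian blobs at continuous positions and over-composited back to front, so the result is differentiable with respect to the positions.

```python
import numpy as np

def soft_element(canvas_shape, cx, cy, sigma=2.0):
    """Render one element as a smooth Gaussian blob at a continuous
    position (cx, cy); the smoothness is what makes the composite
    differentiable with respect to the placement parameters."""
    ys, xs = np.mgrid[0:canvas_shape[0], 0:canvas_shape[1]]
    return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2 * sigma ** 2))

def composite(positions, canvas_shape=(32, 32)):
    """Over-composite elements back to front: each layer's alpha
    attenuates the layers beneath it."""
    out = np.zeros(canvas_shape)
    for cx, cy in positions:
        alpha = soft_element(canvas_shape, cx, cy)
        out = alpha + (1.0 - alpha) * out
    return out

# Two elements at continuous (sub-pixel) positions.
img = composite([(8.5, 8.0), (20.25, 16.0)])
```

In an autodiff framework the same construction would let a reconstruction loss on `img` drive the element positions, which is the sense in which compositing here is "differentiable".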
CoDeF: Content Deformation Fields for Temporally Consistent Video Processing
We present the content deformation field (CoDeF) as a new type of video representation, which consists of a canonical content field aggregating the static contents of the entire video and a temporal deformation field recording the transformations from the canonical image (i.e., rendered from the canonical content field) to each individual frame along the time axis. Given a target video, these two fields are jointly optimized to reconstruct it through a carefully tailored rendering pipeline. We deliberately introduce regularizations into the optimization process, urging the canonical content field to inherit semantics (e.g., the object shape) from the video. With such a design, CoDeF naturally supports lifting image algorithms for video processing, in the sense that one can apply an image algorithm to the canonical image and effortlessly propagate the outcomes to the entire video with the aid of the temporal deformation field. We experimentally show that CoDeF is able to lift image-to-image translation to video-to-video translation and lift keypoint detection to keypoint tracking without any training. More importantly, thanks to our lifting strategy that deploys the algorithms on only one image, we achieve superior cross-frame consistency in processed videos compared to existing video-to-video translation approaches, and even manage to track non-rigid objects like water and smog. The project page can be found at https://qiuyu96.github.io/CoDeF/.
Comment: Project Webpage: https://qiuyu96.github.io/CoDeF/, Code: https://github.com/qiuyu96/CoDe
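The "lifting" idea can be illustrated with a toy sketch (assuming, for brevity, a discrete per-pixel deformation field and nearest-neighbour sampling, whereas the actual method learns continuous fields): an image operation is applied once to the canonical image, and each frame is then reproduced by warping that edited canonical image through its deformation field.

```python
import numpy as np

def warp(canonical, flow):
    """Sample the canonical image at coordinates displaced by a
    per-frame deformation field (nearest-neighbour for brevity)."""
    h, w = canonical.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    src_y = np.clip(np.round(ys + flow[..., 1]).astype(int), 0, h - 1)
    src_x = np.clip(np.round(xs + flow[..., 0]).astype(int), 0, w - 1)
    return canonical[src_y, src_x]

# Toy canonical image with one bright pixel, and two frames related to
# it by constant shifts (frame 1 is the canonical shifted by +1 pixel).
canonical = np.zeros((8, 8))
canonical[2, 2] = 1.0
flows = [np.zeros((8, 8, 2)), np.full((8, 8, 2), -1.0)]

# "Lift" an image algorithm: edit the canonical image once...
edited = canonical * 0.5
# ...then propagate the edit to every frame through the deformation fields.
frames = [warp(edited, f) for f in flows]
```

Because the edit touches only the single canonical image, every warped frame receives exactly the same change, which is the source of the cross-frame consistency the abstract emphasizes.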
Deliverable D4.2 of the PERSEE project: 3D Representation and Coding - Interim Report - Software and Architecture Definitions
Deliverable D4.2 of the ANR PERSEE project. This report was produced as part of the ANR PERSEE project (no. ANR-09-BLAN-0170); specifically, it corresponds to deliverable D4.2 of the project. Its title: 3D Representation and Coding - Interim Report - Software and Architecture Definitions.
Learning to Generate Time-Lapse Videos Using Multi-Stage Dynamic Generative Adversarial Networks
Taking a photo outside, can we predict the immediate future, e.g., how the clouds will move across the sky? We address this problem by presenting a two-stage approach, based on generative adversarial networks (GANs), to generating realistic high-resolution time-lapse videos. Given the first frame, our model learns to generate long-term future frames. The first stage generates videos with realistic content for each frame. The second stage refines the generated video from the first stage by enforcing it to be closer to real videos with regard to motion dynamics. To further encourage vivid motion in the final generated video, a Gram matrix is employed to model the motion more precisely. We build a large-scale time-lapse dataset and test our approach on this new dataset. Using our model, we are able to generate realistic high-resolution videos of 32 frames. Quantitative and qualitative experimental results demonstrate the superiority of our model over state-of-the-art models.
Comment: To appear in Proceedings of CVPR 201
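A Gram matrix summarizes the pairwise correlations between feature channels, discarding spatial layout; matching Gram statistics between generated and real sequences is one common way to encourage similar motion or texture statistics. A minimal sketch of the computation (the abstract does not specify which features the paper uses):

```python
import numpy as np

def gram_matrix(features):
    """Channel-wise Gram matrix of a feature map of shape (C, H, W):
    G[i, j] is the spatial average of the product of channels i and j."""
    c, h, w = features.shape
    f = features.reshape(c, h * w)
    return f @ f.T / (h * w)

# Toy feature map with 4 channels on an 8x8 grid.
feats = np.random.default_rng(0).standard_normal((4, 8, 8))
g = gram_matrix(feats)
```

The result is a symmetric C x C matrix whose diagonal holds each channel's mean squared activation; a loss such as `np.sum((gram_matrix(fake) - gram_matrix(real)) ** 2)` would then penalize mismatched statistics.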
SynBody: Synthetic Dataset with Layered Human Models for 3D Human Perception and Modeling
Synthetic data has emerged as a promising source for 3D human research as it
offers low-cost access to large-scale human datasets. To advance the diversity
and annotation quality of human models, we introduce a new synthetic dataset,
SynBody, with three appealing features: 1) a clothed parametric human model
that can generate a diverse range of subjects; 2) the layered human
representation that naturally offers high-quality 3D annotations to support
multiple tasks; 3) a scalable system for producing realistic data to facilitate
real-world tasks. The dataset comprises 1.2M images with corresponding accurate
3D annotations, covering 10,000 human body models, 1,187 actions, and various
viewpoints. The dataset includes two subsets for human pose and shape
estimation as well as human neural rendering. Extensive experiments on SynBody
indicate that it substantially enhances both SMPL and SMPL-X estimation.
Furthermore, the incorporation of layered annotations offers a valuable training resource for investigating Human Neural Radiance Fields (NeRF).
Comment: Accepted by ICCV 2023. Project webpage: https://synbody.github.io
Metappearance: Meta-Learning for Visual Appearance Reproduction
There currently are two main approaches to reproducing visual appearance
using Machine Learning (ML): The first is training models that generalize over
different instances of a problem, e.g., different images from a dataset. Such
models learn priors over the data corpus and use this knowledge to provide fast
inference with little input, often as a one-shot operation. However, this
generality comes at the cost of fidelity, as such methods often struggle to
achieve the final quality required. The second approach does not train a model
that generalizes across the data, but overfits to a single instance of a
problem, e.g., a flash image of a material. This produces detailed and
high-quality results, but requires time-consuming training and is, as mere
non-linear function fitting, unable to exploit previous experience. Techniques
such as fine-tuning or auto-decoders combine both approaches but are sequential
and rely on per-exemplar optimization. We suggest combining both techniques end-to-end using meta-learning: we overfit onto a single problem instance in an inner loop, while also learning how to do so efficiently in an outer loop that builds intuition over many optimization runs. We demonstrate this concept to be versatile and efficient, applying it to RGB textures, Bidirectional Reflectance Distribution Functions (BRDFs), and Spatially-varying BRDFs (svBRDFs).
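The inner/outer-loop structure described above can be sketched on a toy problem (this is a generic first-order meta-learning sketch on a scalar linear model, not the paper's appearance networks): the inner loop overfits a shared initialisation to one instance, and the outer loop nudges that initialisation so future inner loops converge faster.

```python
import numpy as np

def loss_grad(w, x, y):
    """Squared error and its gradient for a scalar linear model y ~ w*x."""
    pred = w * x
    return np.mean((pred - y) ** 2), np.mean(2.0 * (pred - y) * x)

def inner_adapt(w0, x, y, steps=5, lr=0.1):
    """Inner loop: overfit the shared initialisation to a single instance."""
    w = w0
    for _ in range(steps):
        _, g = loss_grad(w, x, y)
        w -= lr * g
    return w

# Outer loop (first-order sketch): improve the initialisation so that a
# handful of inner steps suffices on new instances from the task family.
rng = np.random.default_rng(0)
w0 = 0.0
for _ in range(200):
    slope = rng.uniform(2.0, 4.0)       # one random "problem instance"
    x = rng.standard_normal(16)
    y = slope * x
    w = inner_adapt(w0, x, y)
    _, g = loss_grad(w, x, y)           # gradient after adaptation
    w0 -= 0.05 * g                      # first-order meta-update

# The meta-learned init now adapts quickly to an unseen instance.
x_new = rng.standard_normal(16)
y_new = 3.5 * x_new
w_fast = inner_adapt(w0, x_new, y_new)
```

The outer loop never sees a "general" model; it only learns where to start the per-instance overfitting, which is how the approach keeps the fidelity of overfitting while still exploiting previous experience.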
Non-Stationary Texture Synthesis by Adversarial Expansion
The real world exhibits an abundance of non-stationary textures. Examples
include textures with large-scale structures, as well as spatially variant and
inhomogeneous textures. While existing example-based texture synthesis methods
can cope well with stationary textures, non-stationary textures still pose a
considerable challenge, which remains unresolved. In this paper, we propose a
new approach for example-based non-stationary texture synthesis. Our approach
uses a generative adversarial network (GAN), trained to double the spatial
extent of texture blocks extracted from a specific texture exemplar. Once
trained, the fully convolutional generator is able to expand the size of the
entire exemplar, as well as of any of its sub-blocks. We demonstrate that this
conceptually simple approach is highly effective for capturing large-scale
structures, as well as other non-stationary attributes of the input exemplar.
As a result, it can cope with challenging textures, which, to our knowledge, no other existing method can handle.
Comment: Accepted to SIGGRAPH 201
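The training data for such an expansion generator can be sampled directly from the exemplar: pick a 2k x 2k target block, and use its central k x k sub-block as the source the generator must expand. A minimal sketch of that sampling step (the generator and adversarial training themselves are omitted):

```python
import numpy as np

def expansion_pair(exemplar, k, rng):
    """Sample one training pair for an expansion generator: a k x k
    source block and the 2k x 2k block centred on it, which the
    generator learns to reproduce."""
    h, w = exemplar.shape[:2]
    # Choose the top-left corner of the 2k x 2k target block.
    ty = rng.integers(0, h - 2 * k + 1)
    tx = rng.integers(0, w - 2 * k + 1)
    target = exemplar[ty:ty + 2 * k, tx:tx + 2 * k]
    # The source is the central k x k sub-block of the target.
    off = k // 2
    source = target[off:off + k, off:off + k]
    return source, target

rng = np.random.default_rng(0)
exemplar = rng.random((64, 64))          # stand-in for a texture image
src, tgt = expansion_pair(exemplar, 16, rng)
```

Because the ground-truth 2k x 2k block comes from the exemplar itself, the generator is implicitly trained to continue the exemplar's local non-stationary structure outward, which is why the fully convolutional generator can later expand the whole exemplar.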