Streaming Aerial Video Textures
We present a streaming compression algorithm for huge time-varying aerial imagery. New airborne optical sensors are capable of collecting billion-pixel images at multiple frames per second. These images must be transmitted through a low-bandwidth pipe, requiring aggressive compression techniques. We achieve such compression by treating foreground portions of the imagery separately from background portions. Foreground information consists of moving objects, which form a tiny fraction of the total pixels. Background areas are compressed effectively over time using streaming wavelet analysis to compute a compact video texture map that represents several frames of raw input images. This map can be rendered efficiently using an algorithm amenable to GPU implementation. The core algorithmic contributions of this work are methods for fast, low-memory streaming wavelet compression and efficient display of wavelet video textures resulting from such compression.
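As a toy illustration of the temporal wavelet idea (not the paper's streaming algorithm; the frame shapes and the threshold are assumptions), a one-level Haar transform along the time axis separates a temporal average from detail coefficients, and static background pixels yield near-zero details that can be discarded:

```python
import numpy as np

def haar_time_compress(frames, threshold=5.0):
    """One-level Haar transform along time; zero small detail coefficients.

    frames: float array of shape (T, H, W) with T even.
    Returns (averages, details) -- a compact temporal "texture map".
    """
    even, odd = frames[0::2], frames[1::2]
    avg = (even + odd) / np.sqrt(2.0)      # low-pass: temporal average
    det = (even - odd) / np.sqrt(2.0)      # high-pass: temporal detail
    det[np.abs(det) < threshold] = 0.0     # aggressive compression step
    return avg, det

def haar_time_reconstruct(avg, det):
    """Invert the transform to recover (approximate) frames."""
    frames = np.empty((2 * avg.shape[0],) + avg.shape[1:], dtype=avg.dtype)
    frames[0::2] = (avg + det) / np.sqrt(2.0)
    frames[1::2] = (avg - det) / np.sqrt(2.0)
    return frames
```

Because background pixels barely change between frames, almost all of their detail coefficients fall below the threshold, which is where the aggressive compression comes from.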
Study of Compression Statistics and Prediction of Rate-Distortion Curves for Video Texture
Encoding textural content remains a challenge for current standardised video
codecs. It is therefore beneficial to understand video textures in terms of
both their spatio-temporal characteristics and their encoding statistics in
order to optimize encoding performance. In this paper, we analyse the
spatio-temporal features and statistics of video textures, explore the
rate-quality performance of different texture types and investigate models to
mathematically describe them. For all considered theoretical models, we employ
machine-learning regression to predict the rate-quality curves based solely on
selected spatio-temporal features extracted from uncompressed content. All
experiments were performed on homogeneous video textures to ensure validity of
the observations. The results of the regression indicate that using an
exponential model we can more accurately predict the expected rate-quality
curve (with a mean Bjøntegaard Delta rate of 0.46% over the considered
dataset) while maintaining low relative complexity. This approach is expected to be
adopted by in-loop processes for faster encoding decisions, such as
rate-distortion optimisation, adaptive quantization and partitioning.
Comment: 17 pages
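As an illustration of one such model fit, the sketch below fits an exponential rate-quality curve by log-linear least squares. This is not the paper's feature-based machine-learning regression; the model form rate = a * exp(b * quality) and any sample points are assumptions:

```python
import numpy as np

def fit_exponential_rq(quality, rate):
    """Fit rate = a * exp(b * quality) via least squares in log space.

    np.polyfit returns (slope, intercept) for degree 1, i.e. (b, log a).
    """
    b, log_a = np.polyfit(np.asarray(quality), np.log(rate), 1)
    return np.exp(log_a), b

def predict_rate(quality, a, b):
    """Evaluate the fitted rate-quality curve at the given quality points."""
    return a * np.exp(b * np.asarray(quality))
```

Given a handful of measured rate-quality points for a homogeneous texture clip, the whole curve is then summarized by just two parameters, which is what makes such models cheap enough for in-loop encoding decisions.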
Novel approaches for generating video textures
Video texture, a new type of medium, can produce a new video with a continuously varying stream of images from a recorded video. It is synthesized by reordering the input video frames in a way that can be played back without any visual discontinuity. However, video textures still suffer from a few drawbacks. For instance, existing video texture techniques can only generate new videos by rearranging the order of frames in the original video. All the individual frames are therefore unchanged, and the result can suffer from "dead ends" if the current frame cannot find similar frames to transition to.

In this thesis, we propose several new approaches for synthesizing video textures. These approaches adopt dimensionality reduction and regression techniques to generate video textures. Not only are the frames in the resulting video textures new, but the "dead end" problem is also avoided.

First, we extend the work of applying principal components analysis (PCA) and an autoregressive (AR) process to generate video textures by replacing PCA with five other dimensionality reduction techniques. Based on our experiments, using these dimensionality reduction techniques improves the quality of the video textures compared with extracting frame signatures using PCA. The synthesized video textures contain motions similar to those of the input video and are never repeated exactly; none of the synthesized frames has appeared before.

We also propose a new approach for generating video textures using probabilistic principal components analysis (PPCA) and the Gaussian process dynamical model (GPDM). GPDM is a nonparametric model for learning high-dimensional nonlinear dynamical data sets. We apply PPCA and GPDM to several movie clips to synthesize video textures that contain frames never seen before, with motions similar to those of the original videos.
Furthermore, we propose two ways of generating real-time video textures by applying incremental Isomap and incremental Spatio-temporal Isomap (IST-Isomap). Both approaches can produce good real-time video texture results. In particular, IST-Isomap, which we propose, is more suitable for sparse video data (e.g. cartoons).
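The PCA + AR pipeline described above can be sketched in a few lines of numpy: project frames to a low-dimensional space, fit an autoregressive model to the trajectory, then roll the model forward and decode, producing frames that are linear combinations of components rather than copies of input frames. The dimensions, the first-order AR model, and the noise scale are illustrative assumptions; the thesis's other dimensionality reduction techniques and GPDM are not reproduced here:

```python
import numpy as np

def pca_fit(X, k):
    """X: (T, D) flattened frames. Returns the mean and top-k components."""
    mu = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
    return mu, Vt[:k]                      # comps: (k, D)

def ar_fit(Z):
    """Fit a first-order AR model z_t = A @ z_{t-1} by least squares."""
    M, *_ = np.linalg.lstsq(Z[:-1], Z[1:], rcond=None)
    return M.T                             # (k, k)

def synthesize(mu, comps, A, z0, n, noise=0.0, rng=None):
    """Roll the AR model forward and decode each state back to a frame."""
    if rng is None:
        rng = np.random.default_rng(0)
    z, frames = z0, []
    for _ in range(n):
        z = A @ z + noise * rng.standard_normal(z.shape)
        frames.append(mu + z @ comps)      # decoded frame, never in the input
    return np.stack(frames)
```

Because each decoded frame is a fresh point in the low-dimensional space mapped back through the components, the output motion resembles the input without repeating any input frame exactly, and there is no transition step that can hit a dead end.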
Personalized Cinemagraphs using Semantic Understanding and Collaborative Learning
Cinemagraphs are a compelling way to convey dynamic aspects of a scene. In
these media, dynamic and still elements are juxtaposed to create an artistic
and narrative experience. Creating a high-quality, aesthetically pleasing
cinemagraph requires isolating objects in a semantically meaningful way and
then selecting good start times and looping periods for those objects to
minimize visual artifacts (such as tearing). To achieve this, we present a new
technique that uses object recognition and semantic segmentation as part of an
optimization method to automatically create cinemagraphs from videos that are
both visually appealing and semantically meaningful. Given a scene with
multiple objects, there are many cinemagraphs one could create. Our method
evaluates these multiple candidates and presents the best one, as determined by
a model trained to predict human preferences in a collaborative way. We
demonstrate the effectiveness of our approach with multiple results and a user
study.
Comment: To appear in ICCV 2017. Total 17 pages including the supplementary material.
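Stripped of the semantic components, the start-time and looping-period selection can be sketched as an exhaustive search for the loop whose endpoints match best. This is a toy sketch only: the mean-squared seam error and the `min_period` parameter are assumptions, and the paper's actual objective additionally involves semantic segmentation and learned preferences:

```python
import numpy as np

def best_loop(frames, min_period=2):
    """Pick a start index s and period p minimizing the seam error
    ||frame[s] - frame[s+p]||^2, so the loop closes with minimal tearing."""
    T = len(frames)
    best = (0, min_period, np.inf)
    for s in range(T - min_period):
        for p in range(min_period, T - s):
            err = np.mean((frames[s] - frames[s + p]) ** 2)
            if err < best[2]:
                best = (s, p, err)
    return best                    # (start, period, seam error)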
Visual Dynamics: Probabilistic Future Frame Synthesis via Cross Convolutional Networks
We study the problem of synthesizing a number of likely future frames from a
single input image. In contrast to traditional methods, which have tackled this
problem in a deterministic or non-parametric way, we propose a novel approach
that models future frames in a probabilistic manner. Our probabilistic model
makes it possible for us to sample and synthesize many possible future frames
from a single input image. Future frame synthesis is challenging, as it
involves low- and high-level image and motion understanding. We propose a novel
network structure, namely a Cross Convolutional Network to aid in synthesizing
future frames; this network structure encodes image and motion information as
feature maps and convolutional kernels, respectively. In experiments, our model
performs well on synthetic data, such as 2D shapes and animated game sprites,
as well as on real-world videos. We also show that our model can be applied to
tasks such as visual analogy-making, and present an analysis of the learned
network representations.
Comment: The first two authors contributed equally to this work.
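The cross-convolution idea itself is simple to sketch: content features stay fixed, while motion is applied by convolving each feature channel with a kernel sampled per future frame. The toy numpy version below omits the encoder and decoder networks entirely, and all shapes are assumptions:

```python
import numpy as np

def cross_convolve(feature_maps, kernels):
    """Apply a per-channel motion kernel to per-channel image features.

    feature_maps: (C, H, W) -- encodes image content (fixed across samples).
    kernels:      (C, k, k) -- sampled per future, encodes motion.
    Uses cross-correlation (convolution without kernel flip, as is
    conventional in deep learning), with 'same' zero padding.
    """
    C, H, W = feature_maps.shape
    k = kernels.shape[-1]
    pad = k // 2
    padded = np.pad(feature_maps, ((0, 0), (pad, pad), (pad, pad)))
    out = np.zeros_like(feature_maps)
    for c in range(C):
        for i in range(H):
            for j in range(W):
                out[c, i, j] = np.sum(padded[c, i:i + k, j:j + k] * kernels[c])
    return out
```

Sampling a different kernel set for each draw, while reusing the same feature maps, is what lets one input image yield many distinct future frames.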
Skeleton-aided Articulated Motion Generation
This work makes the first attempt to generate articulated human motion
sequences from a single image. On the one hand, we utilize paired inputs
including human skeleton information as motion embedding and a single human
image as appearance reference, to generate novel motion frames, based on the
conditional GAN infrastructure. On the other hand, a triplet loss is employed
to pursue appearance-smoothness between consecutive frames. As the proposed
framework is capable of jointly exploiting the image appearance space and
articulated/kinematic motion space, it generates realistic articulated motion
sequences, in contrast to most previous video generation methods, which yield
blurred motion effects. We test our model on two human action datasets
including KTH and Human3.6M, and the proposed framework generates very
promising results on both datasets.
Comment: ACM MM 2017
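The triplet loss mentioned above takes the standard hinge form; a minimal sketch on frame embeddings is below (the embedding source and the margin value are assumptions):

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge triplet loss on frame embeddings: pull the anchor toward the
    positive (a consecutive frame) and push it away from the negative
    (a distant frame) by at least the margin."""
    d_pos = np.sum((anchor - positive) ** 2)
    d_neg = np.sum((anchor - negative) ** 2)
    return max(0.0, d_pos - d_neg + margin)
```

With consecutive generated frames as anchor/positive pairs, minimizing this term penalizes abrupt appearance changes between frames, which is the smoothness effect described above.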
Which game are you playing? - An interactive and educational video show
The project "Which game are you playing?" is an interactive video show with an educational purpose. Video recordings projected onto three walls are switched according to where a person sits down, in other words, according to the person's »ego« role: child, parent, or adult. The videos show conflicts among young people and their resolution according to the person's »ego« role. We built the project with two programs: Smart Wall and vvvv. Realizing the project requires two computers connected by an Ethernet cable, two cameras (one placed on the ceiling and the other in front of the »ego« chairs), and three projectors.