345 research outputs found

    Streaming Aerial Video Textures

    We present a streaming compression algorithm for huge time-varying aerial imagery. New airborne optical sensors are capable of collecting billion-pixel images at multiple frames per second. These images must be transmitted through a low-bandwidth pipe requiring aggressive compression techniques. We achieve such compression by treating foreground portions of the imagery separately from background portions. Foreground information consists of moving objects, which form a tiny fraction of the total pixels. Background areas are compressed effectively over time using streaming wavelet analysis to compute a compact video texture map that represents several frames of raw input images. This map can be rendered efficiently using an algorithm amenable to GPU implementation. The core algorithmic contributions of this work are methods for fast, low-memory streaming wavelet compression and efficient display of wavelet video textures resulting from such compression
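
    As a rough illustration of the background path described above, the following minimal sketch applies one level of a Haar wavelet transform along the time axis, consuming frames pairwise so that only one frame is buffered at a time. It is not the authors' algorithm: the function names, the flat-list frame layout, and the `threshold` parameter are all assumptions made for this example.

```python
def stream_haar_pairs(frames, threshold=0.0):
    """Yield (average_frame, sparse_details) for each consecutive pair of frames.

    `frames` is any iterable of equal-length lists of pixel values (assumed layout).
    Detail coefficients with magnitude <= threshold are dropped, which is where
    the (lossy) compression of a mostly static background comes from.
    A trailing unpaired frame is silently ignored in this sketch.
    """
    previous = None
    for frame in frames:
        if previous is None:
            previous = frame  # buffer exactly one frame
            continue
        average = [(a + b) / 2 for a, b in zip(previous, frame)]
        details = {}
        for i, (a, b) in enumerate(zip(previous, frame)):
            d = (a - b) / 2
            if abs(d) > threshold:
                details[i] = d  # store only significant temporal changes
        yield average, details
        previous = None


def reconstruct_pair(average, details):
    """Invert the transform; dropped details are treated as zero."""
    first = [avg + details.get(i, 0.0) for i, avg in enumerate(average)]
    second = [avg - details.get(i, 0.0) for i, avg in enumerate(average)]
    return first, second
```

    With `threshold=0.0` the pair is reconstructed exactly; raising the threshold trades accuracy for fewer stored coefficients.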

    Study of Compression Statistics and Prediction of Rate-Distortion Curves for Video Texture

    Encoding textural content remains a challenge for current standardised video codecs. It is therefore beneficial to understand video textures in terms of both their spatio-temporal characteristics and their encoding statistics in order to optimize encoding performance. In this paper, we analyse the spatio-temporal features and statistics of video textures, explore the rate-quality performance of different texture types and investigate models to mathematically describe them. For all considered theoretical models, we employ machine-learning regression to predict the rate-quality curves based solely on selected spatio-temporal features extracted from uncompressed content. All experiments were performed on homogeneous video textures to ensure validity of the observations. The results of the regression indicate that using an exponential model we can more accurately predict the expected rate-quality curve (with a mean Bjøntegaard Delta rate of 0.46% over the considered dataset) while maintaining a low relative complexity. This is expected to be adopted by in-the-loop processes for faster encoding decisions such as rate-distortion optimisation, adaptive quantization, partitioning, etc. Comment: 17 page
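
    The abstract does not give the exact form of the exponential model, so the sketch below assumes a curve of the form y = a * exp(b * x), where x is some quality parameter and y the resulting rate. It fits `a` and `b` to measured points by least squares on log(y). The paper instead predicts the curve from spatio-temporal features; this example covers only the curve model, and all names in it are hypothetical.

```python
import math


def fit_exponential(xs, ys):
    """Fit y = a * exp(b * x) by ordinary least squares on log(y).

    Assumes every y is strictly positive and the xs are not all equal.
    """
    logs = [math.log(y) for y in ys]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_l = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxl = sum((x - mean_x) * (l - mean_l) for x, l in zip(xs, logs))
    b = sxl / sxx                       # slope of the line in log space
    a = math.exp(mean_l - b * mean_x)   # intercept mapped back from log space
    return a, b


def predict(a, b, x):
    """Evaluate the fitted exponential curve at x."""
    return a * math.exp(b * x)
```

    Fitting in log space keeps the example dependency-free; a nonlinear least-squares fit on the raw values would weight the high-rate points more heavily.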

    Novel approaches for generating video textures

    Video texture, a new type of medium, can produce a new video with a continuously varying stream of images from a recorded video. It is synthesized by reordering the input video frames in a way that can be played without any visual discontinuity. However, video textures still have a few drawbacks. For instance, video texture techniques can only generate new videos by rearranging the order of frames in the original videos. Therefore, all the individual frames are the same as before, and the result suffers from "dead ends" if the current frame cannot find similar frames to transition to. In this thesis, we propose several new approaches for synthesizing video textures. These approaches adopt dimensionality reduction and regression techniques to generate video textures. Not only are the frames in the resulting video textures new, but the "dead-end" problem is also avoided. First, we have extended the work of applying principal components analysis (PCA) and an autoregressive (AR) process to generate video textures by replacing PCA with five other dimensionality reduction techniques. Based on our experiments, using these dimensionality reduction techniques improves the quality of video textures compared with extracting frame signatures using PCA. The synthesized video textures may contain motions similar to those of the input video and will never be repeated exactly. None of the synthesized frames has appeared before. We also propose a new approach for generating video textures using probabilistic principal components analysis (PPCA) and the Gaussian process dynamical model (GPDM). GPDM is a nonparametric model for learning high-dimensional nonlinear dynamical data sets. We apply PPCA and GPDM to several movie clips to synthesize video textures that contain frames that never appeared before, with motions similar to those of the original videos.
    Furthermore, we have proposed two ways of generating real-time video textures by applying the incremental Isomap and incremental Spatio-temporal Isomap (IST-Isomap). Both approaches can produce good real-time video texture results. In particular, IST-Isomap, which we propose, is more suitable for sparse video data (e.g. cartoon
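
    To make the frame-reordering idea and its limitation concrete, here is a minimal sketch of the classic scheme this abstract starts from, not of the thesis's PCA/AR, GPDM, or Isomap methods. Frames are assumed to be flat lists of numbers, and the helper names are hypothetical. Playback proceeds sequentially and, at the last frame, jumps to the frame whose predecessor looks most like the current one.

```python
import math


def frame_distance(a, b):
    """Euclidean distance between two frames given as flat lists."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def best_jump(frames, i):
    """Return the index j whose predecessor (j - 1) most resembles frame i.

    Jumping i -> j then looks like the natural step (j - 1) -> j.
    """
    candidates = [j for j in range(1, len(frames)) if j - 1 != i]
    return min(candidates, key=lambda j: frame_distance(frames[i], frames[j - 1]))


def play(frames, start, length):
    """Produce `length` frame indices, looping back when the clip runs out."""
    order = [start]
    while len(order) < length:
        i = order[-1]
        if i + 1 < len(frames):
            order.append(i + 1)                  # normal sequential playback
        else:
            order.append(best_jump(frames, i))   # escape the end of the clip
    return order
```

    Every index in the output refers to an original frame, which is exactly the limitation the abstract describes: no new frames are created, and if no predecessor resembles the current frame, the forced jump is visible.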

    Personalized Cinemagraphs using Semantic Understanding and Collaborative Learning

    Cinemagraphs are a compelling way to convey dynamic aspects of a scene. In these media, dynamic and still elements are juxtaposed to create an artistic and narrative experience. Creating a high-quality, aesthetically pleasing cinemagraph requires isolating objects in a semantically meaningful way and then selecting good start times and looping periods for those objects to minimize visual artifacts (such as tearing). To achieve this, we present a new technique that uses object recognition and semantic segmentation as part of an optimization method to automatically create cinemagraphs from videos that are both visually appealing and semantically meaningful. Given a scene with multiple objects, there are many cinemagraphs one could create. Our method evaluates these multiple candidates and presents the best one, as determined by a model trained to predict human preferences in a collaborative way. We demonstrate the effectiveness of our approach with multiple results and a user study. Comment: To appear in ICCV 2017. Total 17 pages including the supplementary materia
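
    The choice of start time and looping period can be pictured with a brute-force search. The sketch below is only an assumption-laden illustration: it treats one object's region as a sequence of flat pixel lists and picks the (start, period) pair whose first and wrap-around frames differ least. The paper's optimization additionally involves semantic segmentation and a learned preference model, neither of which is modelled here, and the function names are hypothetical.

```python
def seam_cost(frames, start, period):
    """Squared difference between the loop's first frame and the frame it wraps to."""
    a, b = frames[start], frames[start + period]
    return sum((x - y) ** 2 for x, y in zip(a, b))


def best_loop(frames, min_period):
    """Return (start, period) with the smallest seam cost.

    Requires len(frames) > min_period; otherwise no candidate loop exists
    and a ValueError is raised.
    """
    best = None
    for start in range(len(frames)):
        for period in range(min_period, len(frames) - start):
            cost = seam_cost(frames, start, period)
            if best is None or cost < best[0]:
                best = (cost, start, period)
    if best is None:
        raise ValueError("clip is too short for the requested minimum period")
    return best[1], best[2]
```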

    Visual Dynamics: Probabilistic Future Frame Synthesis via Cross Convolutional Networks

    We study the problem of synthesizing a number of likely future frames from a single input image. In contrast to traditional methods, which have tackled this problem in a deterministic or non-parametric way, we propose a novel approach that models future frames in a probabilistic manner. Our probabilistic model makes it possible for us to sample and synthesize many possible future frames from a single input image. Future frame synthesis is challenging, as it involves low- and high-level image and motion understanding. We propose a novel network structure, namely a Cross Convolutional Network, to aid in synthesizing future frames; this network structure encodes image and motion information as feature maps and convolutional kernels, respectively. In experiments, our model performs well on synthetic data, such as 2D shapes and animated game sprites, as well as on real-world videos. We also show that our model can be applied to tasks such as visual analogy-making, and present an analysis of the learned network representations. Comment: The first two authors contributed equally to this wor
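
    The idea of encoding motion as convolutional kernels can be illustrated without any network. In the sketch below, which is an illustration under assumptions rather than the paper's architecture, a 3x3 kernel is passed in as data and applied to a 2-D feature map. In the paper the kernels come from a learned encoder; here the kernel is written by hand, and the function name is hypothetical.

```python
def cross_convolve(feature_map, kernel):
    """Apply a per-sample 3x3 kernel to a 2-D feature map with zero padding.

    Computes out[y][x] = sum over (dy, dx) of
    kernel[dy][dx] * feature_map[y + dy - 1][x + dx - 1].
    """
    h, w = len(feature_map), len(feature_map[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for dy in range(3):
                for dx in range(3):
                    yy, xx = y + dy - 1, x + dx - 1
                    if 0 <= yy < h and 0 <= xx < w:  # outside pixels count as zero
                        total += kernel[dy][dx] * feature_map[yy][xx]
            out[y][x] = total
    return out


# A kernel with a single 1 to the left of centre copies each pixel from its
# left neighbour, so image content moves one pixel to the right.
SHIFT_RIGHT = [[0, 0, 0],
               [1, 0, 0],
               [0, 0, 0]]
```

    Sampling different kernels for the same feature map then yields different plausible motions of the same content, which is the intuition behind the probabilistic formulation.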

    Skeleton-aided Articulated Motion Generation

    This work makes the first attempt to generate articulated human motion sequences from a single image. On the one hand, we utilize paired inputs, including human skeleton information as motion embedding and a single human image as appearance reference, to generate novel motion frames based on the conditional GAN infrastructure. On the other hand, a triplet loss is employed to pursue appearance-smoothness between consecutive frames. As the proposed framework is capable of jointly exploiting the image appearance space and articulated/kinematic motion space, it generates realistic articulated motion sequences, in contrast to most previous video generation methods, which yield blurred motion effects. We test our model on two human action datasets, KTH and Human3.6M, and the proposed framework generates very promising results on both datasets. Comment: ACM MM 201
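
    For readers unfamiliar with the triplet loss mentioned above, the following sketch shows its standard hinge form on plain feature vectors. How the paper builds its triplets is not stated in the abstract; the reading assumed here is that the anchor is a generated frame, the positive is a temporally adjacent frame, and the negative is a distant one. The margin value is likewise an assumption.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard hinge-style triplet loss.

    The loss is zero once the anchor is closer to the positive than to the
    negative by at least `margin`; otherwise it grows with the violation.
    """
    return max(0.0, squared_distance(anchor, positive)
               - squared_distance(anchor, negative) + margin)
```

    Under the assumed triplet construction, minimizing this term pulls consecutive frames together in feature space, which is one way to encourage smooth appearance over time.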

    Which game are you playing? - An interactive and educational video show

    The project “Which game are you playing?” is an interactive video show with an educational purpose. Video recordings projected on three walls are swapped according to where a person sits down, in other words, according to the person’s »ego« role: child, parent or adult. The videos show conflicts among young people and their resolution according to the person's »ego« role. We made the project with two programs: Smart Wall and vvvv. Realizing the project requires two computers connected with an Ethernet cable, two cameras (one placed on the ceiling and the other in front of the »ego« chairs), and three projectors