Intelligent visual media processing: when graphics meets vision
The computer graphics and computer vision communities have been working closely together in recent
years, and a variety of algorithms and applications have been developed to analyze and manipulate the visual media
around us. There are three major driving forces behind this phenomenon: i) the availability of big data from the
Internet has created a demand for dealing with the ever-increasing, vast amount of resources; ii) powerful processing
tools, such as deep neural networks, provide effective ways for learning how to deal with heterogeneous visual data;
iii) new data capture devices, such as the Kinect, bridge between algorithms for 2D image understanding and
3D model analysis. These driving forces have emerged only recently, and we believe that the computer graphics
and computer vision communities are still at the beginning of their honeymoon phase. In this work we survey
recent research on how computer vision techniques benefit computer graphics techniques and vice versa, and cover
research on analysis, manipulation, synthesis, and interaction. We also discuss existing problems and suggest
possible further research directions.
Content-Preserving Warps for 3D Video Stabilization
We describe a technique that transforms a video from a hand-held video camera so that it appears as if it were taken with a directed camera motion. Our method adjusts the video to appear as if it were taken from nearby viewpoints, allowing 3D camera movements to be simulated. By aiming only for perceptual plausibility, rather than accurate reconstruction, we are able to develop algorithms that can effectively recreate dynamic scenes from a single source video. Our technique first recovers the original 3D camera motion and a sparse set of 3D, static scene points using an off-the-shelf structure-from-motion system. Then, a desired camera path is computed either automatically (e.g., by fitting a linear or quadratic path) or interactively. Finally, our technique performs a least-squares optimization that computes a spatially-varying warp from each input video frame into an output frame. The warp is computed to both follow the sparse displacements suggested by the recovered 3D structure, and avoid deforming the content in the video frame. Our experiments on stabilizing challenging videos of dynamic scenes demonstrate the effectiveness of our technique.
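The automatic path-fitting step mentioned in this abstract can be illustrated with a minimal sketch. It assumes the recovered camera motion is available as one coordinate per frame and fits a straight line to it by ordinary least squares; the function names `fit_linear_path` and `smooth_path` are hypothetical and not taken from the paper, which may fit its paths differently.

```python
def fit_linear_path(times, positions):
    """Least-squares line p(t) = a + b * t through one camera coordinate."""
    n = len(times)
    mean_t = sum(times) / n
    mean_p = sum(positions) / n
    # Closed-form solution of the 1D linear regression normal equations.
    var_t = sum((t - mean_t) ** 2 for t in times)
    cov_tp = sum((t - mean_t) * (p - mean_p) for t, p in zip(times, positions))
    b = cov_tp / var_t
    a = mean_p - b * mean_t
    return a, b


def smooth_path(times, positions):
    """Replace a shaky coordinate track with the fitted linear path."""
    a, b = fit_linear_path(times, positions)
    return [a + b * t for t in times]


# Illustrative shaky x-coordinates of a camera over four frames.
frames = [0, 1, 2, 3]
shaky_x = [0.0, 1.2, 1.8, 3.0]
print(smooth_path(frames, shaky_x))
```

In a full pipeline the same fit would be applied to each translation coordinate (and some parameterization of rotation) before the per-frame warp is computed.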
Real-time content-aware video retargeting on the Android platform for tunnel vision assistance
As mobile devices continue to rise in popularity, advances in overall mobile device processing power lead to further expansion of their capabilities. This, coupled with the fact that many people suffer from low vision, leaves substantial room for advancing mobile development for low vision assistance. Computer vision is capable of assisting and accommodating individuals with blind spots or tunnel vision by extracting the necessary information and presenting it to the user in a manner they are able to visualize. Such a system would enable individuals with low vision to function with greater ease. Additionally, offering assistance on a mobile platform allows greater access. The objective of this thesis is to develop a computer vision application for low vision assistance on the Android mobile device platform. Specifically, the goal of the application is to reduce the effects tunnel vision inflicts on individuals. This is accomplished by providing an in-depth real-time video retargeting model that builds upon previous works and applications. Seam carving is a content-aware retargeting operator which defines 8-connected paths, or seams, of pixels. The optimality of these seams is based on a specific energy function. Discrete removal of these seams permits changes in the aspect ratio while simultaneously preserving important regions. The video retargeting model incorporates spatial and temporal considerations to provide effective image and video retargeting. Data reduction techniques are utilized in order to generate an efficient model. Additionally, a minimalistic multi-operator approach is constructed to diminish the disadvantages experienced by individual operators. In the event automated techniques fail, interactive options are provided that allow for user intervention. Evaluation of the application and its video retargeting model is based on its comparison to existing standard algorithms and its ability to extend itself to real-time. 
Performance metrics are obtained for both PC environments and mobile device platforms for comparison.
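The seam carving operator described in this abstract can be sketched with the standard dynamic-programming formulation. The sketch below assumes a precomputed energy map given as a list of rows and handles only a single vertical seam in a still image; the thesis's temporal handling, data reduction, and multi-operator logic are not reproduced, and the function names are illustrative.

```python
def find_vertical_seam(energy):
    """Return one column index per row forming a minimum-energy 8-connected seam."""
    rows, cols = len(energy), len(energy[0])
    cost = [row[:] for row in energy]
    # Forward pass: cheapest cumulative cost to reach each pixel from the top row.
    for r in range(1, rows):
        for c in range(cols):
            lo, hi = max(c - 1, 0), min(c + 1, cols - 1)
            cost[r][c] += min(cost[r - 1][lo:hi + 1])
    # Backward pass: start at the cheapest bottom pixel and walk upward.
    c = min(range(cols), key=lambda j: cost[-1][j])
    seam = [c]
    for r in range(rows - 1, 0, -1):
        lo, hi = max(c - 1, 0), min(c + 1, cols - 1)
        c = min(range(lo, hi + 1), key=lambda j: cost[r - 1][j])
        seam.append(c)
    seam.reverse()
    return seam


def remove_vertical_seam(image, seam):
    """Delete the seam pixel from every row, narrowing the image by one column."""
    return [row[:c] + row[c + 1:] for row, c in zip(image, seam)]


energy = [[1, 4, 3],
          [5, 2, 6],
          [7, 8, 1]]
seam = find_vertical_seam(energy)
print(seam)                                # [0, 1, 2]
print(remove_vertical_seam(energy, seam))  # [[4, 3], [5, 6], [7, 8]]
```

Repeating the find-and-remove step changes the aspect ratio one column at a time while leaving high-energy regions largely intact.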
Prototypicality effects in global semantic description of objects
In this paper, we introduce a novel approach for semantic description of
object features based on the prototypicality effects of the Prototype Theory.
Our prototype-based description model encodes and stores the semantic meaning
of an object, while describing its features using the semantic prototype
computed by CNN classification models. Our method uses semantic prototypes to
create discriminative descriptor signatures that describe an object
highlighting its most distinctive features within the category. Our experiments
show that: i) our descriptor preserves the semantic information used by the
CNN-models in classification tasks; ii) our distance metric can be used as the
object's typicality score; iii) our descriptor signatures are semantically
interpretable and enable the simulation of the prototypical organization of
objects within a category.
Comment: Paper accepted at IEEE Winter Conference on Applications of Computer
Vision 2019 (WACV2019). Content: 10 pages (8 + 2 references) with 7 figures
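The idea of using distance to a prototype as a typicality score can be illustrated with a toy sketch. The abstract does not specify how the prototype or the distance metric is defined, so the code below makes two loud assumptions: the prototype is the per-dimension mean of a category's feature vectors, and the distance is Euclidean. Both function names are hypothetical.

```python
import math


def category_prototype(features):
    """Assumed prototype: per-dimension mean of the category's feature vectors."""
    n = len(features)
    return [sum(column) / n for column in zip(*features)]


def distance_to_prototype(feature, prototype):
    """Assumed typicality score: Euclidean distance (smaller means more typical)."""
    return math.sqrt(sum((f - p) ** 2 for f, p in zip(feature, prototype)))


# Toy 2D features standing in for CNN activations of one category.
category_features = [[0.0, 0.0], [2.0, 0.0], [1.0, 3.0]]
prototype = category_prototype(category_features)
print(prototype)
print(distance_to_prototype([4.0, 5.0], prototype))
```

Under these assumptions, members close to the prototype would be ranked as typical and distant members as atypical.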
IDA: Improved Data Augmentation Applied to Salient Object Detection
In this paper, we present an Improved Data Augmentation (IDA) technique
focused on Salient Object Detection (SOD). Standard data augmentation
techniques proposed in the literature, such as image cropping, rotation,
flipping, and resizing, only generate variations of the existing examples,
providing a limited generalization. Our method combines image inpainting,
affine transformations, and the linear combination of different generated
background images with salient objects extracted from labeled data. Our
proposed technique enables more precise control of the object's position and
size while preserving background information. The background choice is based on
an inter-image optimization, while object size follows a uniform random
distribution within a specified interval, and the object position is
intra-image optimal. We show that our method improves the segmentation quality
when used for training state-of-the-art neural networks on several well-known
SOD datasets. Combining our method with others surpasses
traditional techniques such as horizontal flipping by 0.52% in F-measure and 1.19%
in Precision. We also provide an evaluation on 7 different SOD datasets, with
9 distinct evaluation metrics and an average ranking of the evaluated methods.
Comment: Accepted for presentation at SIBGRAPI 2020 - 33rd Conference on
Graphics, Patterns and Images
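One plausible reading of combining a generated background with an extracted salient object is mask-weighted blending, sketched below for grayscale images stored as lists of rows. This is an assumption made for illustration, not the IDA implementation: the inpainting, affine transformation, and optimization steps are omitted, and `composite` is a hypothetical name.

```python
def composite(background, obj, mask):
    """Blend an object onto a background: out = mask * obj + (1 - mask) * background."""
    return [
        [m * o + (1.0 - m) * b for b, o, m in zip(b_row, o_row, m_row)]
        for b_row, o_row, m_row in zip(background, obj, mask)
    ]


background = [[0.2, 0.2], [0.2, 0.2]]   # e.g. an inpainted, object-free image
salient_obj = [[1.0, 1.0], [1.0, 1.0]]  # object pixels taken from labeled data
mask = [[1.0, 0.0], [0.5, 0.0]]         # saliency mask with values in [0, 1]
print(composite(background, salient_obj, mask))
```

In this reading the mask also serves as the ground-truth saliency map for the newly generated training image.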
Sketch Video Synthesis
Understanding semantic intricacies and high-level concepts is essential in
image sketch generation, and this challenge becomes even more formidable when
applied to the domain of videos. To address this, we propose a novel
optimization-based framework for sketching videos represented by the frame-wise
Bézier curve. In detail, we first propose a cross-frame stroke initialization
approach to warm up the location and the width of each curve. Then, we optimize
the locations of these curves by utilizing a semantic loss based on CLIP
features and a newly designed consistency loss using the self-decomposed 2D
atlas network. Built upon these design elements, the resulting sketch video
showcases impressive visual abstraction and temporal coherence. Furthermore, by
transforming a video into SVG lines through the sketching process, our method
unlocks applications in sketch-based video editing and video doodling, enabled
through video composition, as exemplified in the teaser.
Comment: Webpage: https://sketchvideo.github.io/ Github:
https://github.com/yudianzheng/SketchVide
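Since each stroke in this abstract is a Bézier curve, a small sketch of how such a curve is evaluated from its control points may help. It uses de Casteljau's algorithm, a standard evaluation method; the paper's differentiable rasterization and loss functions are not shown, and the function names are illustrative.

```python
def bezier_point(control_points, t):
    """Evaluate a Bézier curve at t in [0, 1] with de Casteljau's algorithm."""
    pts = [tuple(p) for p in control_points]
    # Repeatedly interpolate between neighbouring points until one remains.
    while len(pts) > 1:
        pts = [
            tuple((1 - t) * a + t * b for a, b in zip(p, q))
            for p, q in zip(pts, pts[1:])
        ]
    return pts[0]


def sample_stroke(control_points, n=5):
    """Sample n evenly spaced parameter values along one stroke."""
    return [bezier_point(control_points, i / (n - 1)) for i in range(n)]


# A cubic stroke defined by four 2D control points.
stroke = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, 0.0)]
print(bezier_point(stroke, 0.5))  # (0.5, 0.75)
print(sample_stroke(stroke))
```

An optimization-based method would adjust the control points per frame; sampling as above is only the forward evaluation of a stroke.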