6,136 research outputs found

    Quality Assessment of Free-viewpoint Videos by Quantifying the Elastic Changes of Multi-Scale Motion Trajectories

    Full text link
    Virtual viewpoint synthesis is an essential process for many immersive applications, including Free-viewpoint TV (FTV). A widely used technique for viewpoint synthesis is Depth-Image-Based Rendering (DIBR). However, such techniques may introduce challenging non-uniform spatial-temporal structure-related distortions. Most existing state-of-the-art quality metrics fail to handle these distortions, especially the temporal structure inconsistencies observed during the switch between different viewpoints. To tackle this problem, an elastic metric and multi-scale trajectory based video quality metric (EM-VQM) is proposed in this paper. Dense motion trajectories are first used as a proxy for selecting temporally sensitive regions, where local geometric distortions might significantly diminish the perceived quality. Afterwards, the amount of temporal structure inconsistency and unsmooth viewpoint transitions is quantified by calculating 1) the amount of motion trajectory deformation under an elastic metric and 2) the spatial-temporal structural dissimilarity. According to comprehensive experimental results on two FTV video datasets, the proposed metric significantly outperforms the state-of-the-art metrics designed for free-viewpoint videos, achieving gains of 12.86% and 16.75% in median Pearson linear correlation coefficient on the two datasets over the best competing metric, respectively.
    Comment: 13 pages
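
    For concreteness, a minimal sketch of one common elastic-metric surrogate for comparing motion trajectories (the square-root velocity representation). This is an illustration only, not the paper's EM-VQM pipeline, and it skips the reparameterization alignment a full elastic metric would include:

```python
import numpy as np

def srvf(traj, eps=1e-8):
    """Square-root velocity representation of a trajectory.

    traj: (T, 2) array of image-plane positions over T frames.
    Returns a (T-1, 2) array q with q_t = v_t / sqrt(||v_t||),
    a common building block for elastic curve metrics.
    """
    v = np.diff(traj, axis=0)                       # frame-to-frame velocities
    speed = np.linalg.norm(v, axis=1, keepdims=True)
    return v / np.sqrt(speed + eps)

def elastic_distance(traj_ref, traj_dist):
    """L2 distance between SRVFs as a rough proxy for trajectory deformation.

    A full elastic metric would additionally optimize over reparameterizations
    (e.g. by dynamic programming); that step is omitted here.
    """
    q1, q2 = srvf(traj_ref), srvf(traj_dist)
    return np.sqrt(np.sum((q1 - q2) ** 2))

# Toy example: a smooth trajectory vs. a jittered copy of it.
t = np.linspace(0, 1, 60)
ref = np.stack([np.cos(2 * np.pi * t), np.sin(2 * np.pi * t)], axis=1)
jittered = ref + 0.05 * np.random.randn(*ref.shape)
print(elastic_distance(ref, jittered))
```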

    Real-time collision detection method for deformable bodies

    Full text link
    This paper presents a real-time solution for collision detection between objects based on their physical properties. Traditional approaches to collision detection often rely on geometric relationships, computing the intersections between polygons. Such techniques are computationally expensive when applied to deformable objects. As an alternative, we implicitly approximate the 3D mesh with a spherical surface. This allows us to perform coarse-level collision detection at extremely fast speed. A dynamic programming based procedure is then applied to identify the collision in fine detail. Our method demonstrates better prevention of collision tunnelling and works more efficiently than the state of the art.
    Comment: Computer-Aided Design, 201
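
    A minimal sketch of the coarse-phase idea, assuming meshes are given as (N, 3) vertex arrays; the paper's implicit spherical approximation and its dynamic-programming fine phase are not reproduced here:

```python
import numpy as np

def bounding_sphere(vertices):
    """Cheap enclosing sphere: centroid plus maximum distance to it.

    Not the minimal sphere, but sufficient for a conservative coarse test.
    """
    center = vertices.mean(axis=0)
    radius = np.linalg.norm(vertices - center, axis=1).max()
    return center, radius

def spheres_overlap(mesh_a, mesh_b):
    """Coarse-phase test: if the spheres do not overlap, the meshes cannot
    collide and the expensive fine-phase (e.g. per-triangle) test is skipped."""
    ca, ra = bounding_sphere(mesh_a)
    cb, rb = bounding_sphere(mesh_b)
    return np.linalg.norm(ca - cb) <= ra + rb

# Hypothetical usage with two deforming meshes as (N, 3) vertex arrays.
a = np.random.rand(100, 3)
b = np.random.rand(100, 3) + 2.0    # translated far away: no overlap expected
print(spheres_overlap(a, b))
```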

    A Novel Semantics and Feature Preserving Perspective for Content Aware Image Retargeting

    Full text link
    There is an increasing need for efficient image retargeting techniques to adapt content to various forms of digital media. With the rapid growth of mobile communications and dynamic web page layouts, one often needs to resize media content to fit the desired display sizes. For the varied layouts of web pages and the typically small screens of handheld portable devices, important content in the original image gets obscured when it is resized by uniform scaling. Images therefore need to be resized in a content-aware manner that automatically discards irrelevant information and presents the salient features more prominently. Several image retargeting techniques have been proposed with the content awareness of the input image in mind. However, these techniques fail to be globally effective across various kinds of images and target sizes. The major problem is their inability to process images with minimal visual distortion while also retaining the meaning conveyed by the image. In this dissertation, we present a novel perspective on content aware image retargeting that can be implemented in real time. We introduce a novel method of analysing semantic information within the input image while also maintaining the important and visually significant features. We present the various nuances of our algorithm mathematically and logically, and show that the results are better than those of the state-of-the-art techniques.
    Comment: 74 Pages, 46 Figures, Masters Thesis
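
    For illustration, a sketch of the generic gradient-energy importance map that content-aware retargeting methods commonly build on. This is not the thesis's semantic approach; the helper name and the crude column-removal step are assumptions made for the example:

```python
import numpy as np

def importance_map(gray):
    """Gradient-magnitude energy, the classic stand-in for 'importance'
    in content-aware retargeting (e.g. seam carving).

    gray: (H, W) float array. Returns an (H, W) non-negative energy map;
    low-energy regions are the first candidates for removal.
    """
    gy, gx = np.gradient(gray.astype(np.float64))
    return np.abs(gx) + np.abs(gy)

# Toy usage: rank columns by total energy and drop the least important ones.
img = np.random.rand(120, 160)
energy = importance_map(img)
cols_to_drop = np.argsort(energy.sum(axis=0))[:20]   # 20 cheapest columns
retargeted = np.delete(img, cols_to_drop, axis=1)    # crude width reduction
print(retargeted.shape)
```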

    HDFD --- A High Deformation Facial Dynamics Benchmark for Evaluation of Non-Rigid Surface Registration and Classification

    Full text link
    Objects that undergo non-rigid deformation are common in the real world. A typical and challenging example is the human face. While various techniques have been developed for deformable shape registration and classification, benchmarks with detailed labels and landmarks suitable for evaluating such techniques are still limited. In this paper, we present HDFD, a novel facial dynamics dataset that addresses the gaps in existing datasets, comprising 4D funny faces with substantial non-isometric deformation and 4D visual-audio faces of spoken phrases in a minority language (Welsh). Both subsets are captured from 21 participants. The sequences are manually landmarked, and the spoken phrases are further rated by a Welsh expert for level of fluency. These annotations are useful for quantitative evaluation of both registration and classification tasks. We further develop a methodology to evaluate several recent non-rigid surface registration techniques, using our dynamic sequences as test cases. The study demonstrates the significance and usefulness of our new dataset --- a challenging benchmark for future techniques.
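
    As an illustration of how landmarked sequences support quantitative evaluation of registration, a sketch of a common landmark-based error measure; the paper's exact protocol is not specified here, and the array shapes and helper names are assumptions:

```python
import numpy as np

def landmark_error(registered_landmarks, reference_landmarks):
    """Mean and maximum Euclidean distance between corresponding landmarks
    after a registration method has warped the source frame onto the
    reference. Both inputs: (K, 3) arrays of manually annotated landmarks.
    """
    d = np.linalg.norm(registered_landmarks - reference_landmarks, axis=1)
    return d.mean(), d.max()

# Hypothetical usage on one registered frame of a 4D sequence.
ref = np.random.rand(30, 3)
reg = ref + 0.001 * np.random.randn(30, 3)
mean_err, max_err = landmark_error(reg, ref)
print(mean_err, max_err)
```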

    Diffusion framework for geometric and photometric data fusion in non-rigid shape analysis

    Full text link
    In this paper, we explore the use of the diffusion geometry framework for the fusion of geometric and photometric information in local and global shape descriptors. Our construction is based on the definition of a diffusion process on the shape manifold embedded into a high-dimensional space, where the embedding coordinates represent the photometric information. Experimental results show that such data fusion is useful in coping with different challenges of shape analysis where purely geometric and purely photometric methods fail.
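
    A rough sketch of the general idea, assuming points and per-point colors given as arrays: photometric coordinates are appended to the geometric ones before building a diffusion (heat-kernel) descriptor. The weighting beta, the kernel width sigma, and the diffusion times below are made-up parameters, not the paper's construction:

```python
import numpy as np
from scipy.sparse.csgraph import laplacian
from scipy.linalg import eigh

def fused_hks(points, colors, beta=1.0, sigma=0.2, times=(0.1, 1.0, 10.0)):
    """Heat-kernel-style descriptor on a point set whose photometric values
    are appended (weighted by beta) to its geometric coordinates.

    points: (N, 3) positions, colors: (N, 3) per-point photometric values.
    Returns an (N, len(times)) descriptor matrix.
    """
    X = np.hstack([points, beta * colors])             # joint embedding
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2 * sigma ** 2))                 # dense Gaussian affinity
    np.fill_diagonal(W, 0.0)
    L = laplacian(W, normed=True)                      # graph Laplacian
    vals, vecs = eigh(L)                               # diffusion spectrum
    hks = np.stack([(np.exp(-t * vals) * vecs ** 2).sum(axis=1) for t in times],
                   axis=1)
    return hks

# Toy usage on a small random colored point cloud.
pts, cols = np.random.rand(200, 3), np.random.rand(200, 3)
print(fused_hks(pts, cols).shape)                      # (200, 3)
```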

    Cycle-IR: Deep Cyclic Image Retargeting

    Full text link
    Supervised deep learning techniques have achieved great success in various fields by freeing models from the limitations of handcrafted representations. However, most previous image retargeting algorithms still employ fixed design principles, such as using a gradient map or handcrafted features to compute a saliency map, which inevitably restricts their generality. Deep learning techniques may help address this issue, but the challenge is that training a deep retargeting model would require a large-scale image retargeting dataset, and building such a dataset requires huge human effort. In this paper, we propose a novel deep cyclic image retargeting approach, called Cycle-IR, which is the first to implement image retargeting with a single deep model without relying on any explicit user annotations. Our idea is built on the reverse mapping from the retargeted images to the given input images. If the retargeted image has serious distortion or excessive loss of important visual information, the reverse mapping is unlikely to restore the input image well. We constrain this forward-reverse consistency by introducing a cyclic perception coherence loss. In addition, we propose a simple yet effective image retargeting network (IRNet) to implement the image retargeting process. Our IRNet contains a spatial and channel attention layer, which is able to discriminate visually important regions of input images effectively, especially in cluttered images. Given arbitrary sizes of input images and desired aspect ratios, our Cycle-IR can produce visually pleasing target images directly. Extensive experiments on the standard RetargetMe dataset show the superiority of our Cycle-IR. In addition, our Cycle-IR outperforms the Multiop method and obtains the best result in the user study. Code is available at https://github.com/mintanwei/Cycle-IR.
    Comment: 12 pages
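
    A minimal sketch of the forward-reverse consistency idea, assuming a retargeting network callable as retarget_net(x, size). The real Cycle-IR uses a cyclic perception coherence loss and an attention-based IRNet; here a plain L1 term and a bilinear-resize stand-in are used purely for illustration:

```python
import torch
import torch.nn.functional as F

def cycle_retarget_loss(x, retarget_net, lam=1.0):
    """Retarget the input to a new aspect ratio, map it back to the original
    size, and penalise the reconstruction error (forward-reverse consistency).

    x: (B, C, H, W) input batch; retarget_net(x, size) -> image of `size`.
    """
    b, c, h, w = x.shape
    target_size = (h, w // 2)                       # e.g. halve the width
    y = retarget_net(x, target_size)                # forward retargeting
    x_rec = retarget_net(y, (h, w))                 # reverse mapping
    return lam * F.l1_loss(x_rec, x)

# Hypothetical stand-in for IRNet: plain bilinear resizing (no learning).
def dummy_retarget(x, size):
    return F.interpolate(x, size=size, mode="bilinear", align_corners=False)

x = torch.rand(2, 3, 64, 96)
print(cycle_retarget_loss(x, dummy_retarget).item())
```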

    HDM-Net: Monocular Non-Rigid 3D Reconstruction with Learned Deformation Model

    Full text link
    Monocular dense 3D reconstruction of deformable objects is a hard, ill-posed problem in computer vision. Current techniques either require dense correspondences and rely on motion and deformation cues, or assume that a highly accurate reconstruction (referred to as a template) of at least a single frame is given in advance and operate in the manner of non-rigid tracking. Accurate computation of dense point tracks often requires multiple frames and can be computationally expensive. Availability of a template is a very strong prior which restricts system operation to a pre-defined environment and scenarios. In this work, we propose a new hybrid approach for monocular non-rigid reconstruction which we call Hybrid Deformation Model Network (HDM-Net). In our approach, the deformation model is learned by a deep neural network with a combination of domain-specific loss functions. We train the network with multiple states of a non-rigidly deforming structure with a known shape at rest. HDM-Net learns different reconstruction cues, including texture-dependent surface deformations, shading and contours. We show the generalisability of HDM-Net to states not present in the training dataset, with unseen textures and under new illumination conditions. Experiments with noisy data and a comparison with other methods demonstrate the robustness and accuracy of the proposed approach and suggest possible application scenarios of the new technique in interventional diagnostics and augmented reality.
    Comment: 9 pages, 9 figures
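
    A toy sketch of combining a 3D geometry term with a 2D contour cue when training such a network; the weights and the specific loss terms are assumptions and do not reproduce HDM-Net's domain-specific losses:

```python
import torch
import torch.nn.functional as F

def hybrid_deformation_loss(pred_pts, gt_pts, pred_silhouette, gt_silhouette,
                            w_geom=1.0, w_contour=0.1):
    """Combine a 3D supervision term on surface vertices with a silhouette
    (contour) term, in the spirit of training a network to regress a
    deforming surface from a single image.

    pred_pts, gt_pts: (B, N, 3) vertices; silhouettes: (B, 1, H, W) in [0, 1].
    """
    geom = F.mse_loss(pred_pts, gt_pts)                        # 3D supervision
    contour = F.binary_cross_entropy(pred_silhouette.clamp(1e-6, 1 - 1e-6),
                                     gt_silhouette)            # 2D contour cue
    return w_geom * geom + w_contour * contour

pred = torch.rand(2, 100, 3)
gt = pred + 0.01 * torch.randn_like(pred)
sil_pred = torch.rand(2, 1, 32, 32)
sil_gt = (torch.rand(2, 1, 32, 32) > 0.5).float()
print(hybrid_deformation_loss(pred, gt, sil_pred, sil_gt).item())
```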

    Fast detection of multiple objects in traffic scenes with a common detection framework

    Full text link
    Traffic scene perception (TSP) aims to extract accurate on-road environment information in real time, which involves three phases: detection of objects of interest, recognition of detected objects, and tracking of objects in motion. Since recognition and tracking often rely on the results of detection, the ability to detect objects of interest effectively plays a crucial role in TSP. In this paper, we focus on three important classes of objects: traffic signs, cars, and cyclists. We propose to detect all three classes of objects in a single learning-based detection framework. The proposed framework consists of a dense feature extractor and detectors for the three classes. Once the dense features have been extracted, they are shared by all detectors. The advantage of using one common framework is that detection is much faster, since all dense features need to be evaluated only once in the testing phase. In contrast, most previous works have designed specific detectors using different features for each of these objects. To enhance the robustness of the features to noise and image deformations, we introduce spatially pooled features as part of the aggregated channel features. To further improve generalization performance, we propose an object subcategorization method as a means of capturing intra-class variation of objects. We experimentally demonstrate the effectiveness and efficiency of the proposed framework in three detection applications: traffic sign detection, car detection, and cyclist detection. The proposed framework achieves competitive performance with state-of-the-art approaches on several benchmark datasets.
    Comment: Appearing in IEEE Transactions on Intelligent Transportation Systems
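
    A minimal sketch of the shared-feature structure: the channels are computed once per image and every class-specific detector consumes the same channels. The toy channels and threshold "detectors" below are placeholders, not the paper's boosted detectors on spatially pooled aggregated channel features:

```python
import numpy as np

def channel_features(image):
    """Compute dense feature channels once per image (here: gray intensity
    and gradient magnitude as crude stand-ins for aggregated channel features)."""
    gray = image.mean(axis=2)
    gy, gx = np.gradient(gray)
    return np.stack([gray, np.hypot(gx, gy)], axis=0)   # (C, H, W)

def run_detectors(image, detectors):
    """Evaluate all class-specific detectors on the SAME shared channels,
    so the expensive feature extraction happens only once."""
    channels = channel_features(image)
    return {name: det(channels) for name, det in detectors.items()}

# Hypothetical toy detectors: threshold the mean of the gradient channel.
detectors = {
    "traffic_sign": lambda ch: ch[1].mean() > 0.10,
    "car":          lambda ch: ch[1].mean() > 0.05,
    "cyclist":      lambda ch: ch[1].mean() > 0.08,
}
print(run_detectors(np.random.rand(120, 160, 3), detectors))
```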

    Rigid and Non-rigid Shape Evolutions for Shape Alignment and Recovery in Images

    Full text link
    The same type of object in different images may vary in shape because of rigid and non-rigid deformations, occluding foreground, and cluttered background. This work is concerned with shape extraction in such challenging situations. We approach shape extraction through shape alignment and recovery. This paper presents a novel and general method for shape alignment and recovery using a single example shape, based on deterministic energy minimization. Our idea is to use a general model of shape deformation in minimizing active contour energies. Given an \emph{a priori} form of the shape deformation, we show how the curve evolution equation corresponding to the shape deformation can be derived. This curve evolution is called the prior variation shape evolution (PVSE). We also derive the energy-minimizing PVSE for minimizing active contour energies. For shape recovery, we propose to use a PVSE that deforms the shape while preserving its shape characteristics. For choosing such a shape-preserving PVSE, a theory of shape preservability of the PVSE is established. Experimental results validate the theory and the formulations, and they demonstrate the effectiveness of our method.
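
    As a generic illustration (not necessarily the paper's exact PVSE), if the contour C(p; theta) is restricted to a parametric deformation family, gradient descent on an active-contour energy E induces a flow on the parameters by the chain rule:

```latex
% Generic gradient flow on deformation parameters \theta of a contour
% C(p;\theta), minimizing an active-contour energy E[C] via the chain rule.
\frac{\mathrm{d}\theta}{\mathrm{d}t}
  = -\nabla_{\theta} E\bigl[C(\cdot;\theta)\bigr]
  = -\int_{0}^{1} \Bigl\langle
        \frac{\delta E}{\delta C}\bigl(C(p;\theta)\bigr),
        \frac{\partial C}{\partial \theta}(p;\theta)
      \Bigr\rangle \,\mathrm{d}p ,
\qquad
\frac{\partial C}{\partial t}
  = \frac{\partial C}{\partial \theta}\,\frac{\mathrm{d}\theta}{\mathrm{d}t}.
```

    By construction the evolution stays inside the prescribed deformation family, which mirrors the idea of evolving the shape only through an assumed deformation model.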

    Skeletal Representations and Applications

    Full text link
    When representing a solid object there are alternatives to the use of traditional explicit (surface meshes) or implicit (zero crossing of implicit functions) methods. Skeletal representations encode shape information in a mixed fashion: they are composed of a set of explicit primitives, yet they are able to efficiently encode the shape's volume as well as its topology. I will discuss, in two dimensions, how symmetry can be used to reduce the dimensionality of the data (from a 2D solid to a 1D curve), and how this relates to the classical definition of skeletons by the Medial Axis Transform. While the medial axis of a 2D shape is composed of a set of curves, in 3D it results in a set of sheets connected in a complex fashion. Because of this complexity, medial skeletons are difficult to use in practical applications. Curve skeletons address this problem by strictly requiring their geometry to be one-dimensional, resulting in an intuitive yet powerful shape representation. In this report I will define both medial and curve skeletons and discuss their mutual relationship. I will also present several algorithms for their computation and a variety of scenarios where skeletons are employed, with a special focus on geometry processing and shape analysis.
    Comment: 42 pages, SFU Depth Exa
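
    A minimal 2D example of the medial axis transform using scikit-image; curve skeletons and the 3D medial sheets discussed in the report are beyond this sketch:

```python
import numpy as np
from skimage.morphology import medial_axis

# Binary disk with a slot cut into it: a simple 2D solid.
yy, xx = np.mgrid[:128, :128]
shape = (xx - 64) ** 2 + (yy - 64) ** 2 < 50 ** 2
shape &= ~((np.abs(xx - 64) < 6) & (yy < 64))

# Medial axis transform: a 1-pixel-wide skeleton plus the distance to the
# boundary at each skeleton pixel, which together suffice to reconstruct
# the original solid (the defining property of the MAT).
skeleton, dist = medial_axis(shape, return_distance=True)
print(skeleton.sum(), dist[skeleton].max())
```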