6,907 research outputs found

    Learning how to be robust: Deep polynomial regression

    Polynomial regression is a recurrent problem with a large number of applications. In computer vision it often appears in motion analysis. Whatever the application, standard methods for regression of polynomial models tend to deliver biased results when the input data is heavily contaminated by outliers. Moreover, the problem is even harder when the outliers have strong structure. Departing from problem-tailored heuristics for robust estimation of parametric models, we explore deep convolutional neural networks. Our work aims to find a generic approach for training deep regression models without the explicit need for supervised annotation. We bypass the need for a tailored loss function on the regression parameters by attaching to our model a differentiable, hard-wired decoder corresponding to the polynomial operation at hand. We demonstrate the value of our findings by comparing with standard robust regression methods. Furthermore, we show how to use such models for a real computer vision problem, i.e., video stabilization. The qualitative and quantitative experiments show that neural networks are able to learn robustness for general polynomial regression, with results that clearly surpass the scores of traditional robust estimation methods. Comment: 18 pages, conference
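
    A minimal sketch of this idea, under assumptions: the class name PolyRegressor, the small MLP standing in for the deep CNN, and the smooth-L1 data-space loss are illustrative choices, not the authors' architecture. The network predicts polynomial coefficients, and a fixed differentiable decoder evaluates the polynomial, so the loss is computed on reconstructed data rather than on coefficient annotations.

    # Minimal sketch (not the authors' architecture): a small network predicts
    # polynomial coefficients and a hard-wired, differentiable decoder evaluates
    # the polynomial, so the loss lives in data space and no coefficient labels
    # are needed.
    import torch
    import torch.nn as nn

    class PolyRegressor(nn.Module):                   # hypothetical name
        def __init__(self, n_points: int, degree: int = 2):
            super().__init__()
            self.degree = degree
            self.net = nn.Sequential(                 # toy MLP standing in for a deep CNN
                nn.Linear(2 * n_points, 128), nn.ReLU(),
                nn.Linear(128, degree + 1),           # polynomial coefficients
            )

        def forward(self, x, y):
            coeffs = self.net(torch.cat([x, y], dim=-1))          # (B, degree+1)
            # Hard-wired decoder: evaluate the polynomial at the input abscissae.
            powers = torch.stack([x ** k for k in range(self.degree + 1)], dim=-1)
            return (powers * coeffs.unsqueeze(1)).sum(dim=-1)     # (B, n_points)

    # Training compares the decoded curve with the (possibly outlier-contaminated)
    # observations, e.g. with a robust loss, so no coefficient supervision is used.
    model = PolyRegressor(n_points=64, degree=2)
    x = torch.linspace(-1, 1, 64).expand(8, -1)
    y = 0.5 * x ** 2 - 0.1 * x + 0.05 * torch.randn_like(x)       # synthetic curves
    loss = nn.functional.smooth_l1_loss(model(x, y), y)
    loss.backward()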

    Transitioning360: Content-aware NFoV Virtual Camera Paths for 360° Video Playback

    Real-time low-complexity digital video stabilization in the compressed domain

    Detecting and removing visual distractors for video aesthetic enhancement

    Personal videos often contain visual distractors: objects that were accidentally captured and can divert viewers' attention from the main subjects. We propose a method to automatically detect and localize these distractors by learning from a manually labeled dataset. To achieve spatially and temporally coherent detection, we extract features at the Temporal-Superpixel (TSP) level within a traditional SVM-based learning framework. We also experiment with end-to-end learning using Convolutional Neural Networks (CNNs), which achieves slightly higher performance than the other methods. The classification result is further refined in a post-processing step based on graph-cut optimization. Experimental results show that our method achieves an accuracy of 81% and a recall of 86%. We demonstrate several ways of removing the detected distractors to improve video quality, including video hole filling, video frame replacement, and camera path re-planning. The user study results show that our method can significantly improve the aesthetic quality of videos.
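
    A minimal sketch of the per-segment classification stage, under assumptions: the scikit-learn RBF-kernel SVM, the feature dimensionality, and the helper name classify_tsps are illustrative; the paper's TSP segmentation, feature set, CNN variant, and graph-cut refinement are not reproduced here.

    # Sketch: classify each temporal superpixel (TSP) as distractor vs. not,
    # using an SVM over per-segment features; a graph-cut over the TSP
    # adjacency graph could then smooth the outputs into a coherent mask.
    import numpy as np
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    def classify_tsps(train_feats, train_labels, test_feats):
        """train_feats: (N, D) per-TSP features (e.g. colour, motion, saliency
        statistics); train_labels: 1 = distractor, 0 = subject/background."""
        clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", probability=True))
        clf.fit(train_feats, train_labels)
        return clf.predict_proba(test_feats)[:, 1]    # per-TSP distractor probability

    # Toy usage with random stand-in features.
    rng = np.random.default_rng(0)
    probs = classify_tsps(rng.normal(size=(200, 16)),
                          rng.integers(0, 2, size=200),
                          rng.normal(size=(50, 16)))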

    CAMHID: Camera motion histogram descriptor and its application to cinematographic shot classification

    In this paper, we propose a nonparametric camera motion descriptor for video shot classification. In the proposed method, a motion vector field (MVF) is constructed for each pair of consecutive video frames by computing the motion vector (MV) of each macroblock. The MVFs are then divided into a number of local regions of equal size, and the inconsistent/noisy MVs of each local region are eliminated by a motion consistency analysis. The remaining MVs of each local region, collected over a number of consecutive frames, are assembled into a compact representation: a matrix is formed from the MVs and decomposed using singular value decomposition to capture the dominant motion, and the angle of the principal component retaining the most variance is computed and quantized into a histogram that represents the motion of the local region. The local histograms are combined to represent the global camera motion. The effectiveness of the proposed motion descriptor for video shot classification is tested using a support vector machine. First, the proposed camera motion descriptors are computed on a video data set consisting of regular camera motion patterns (e.g., pan, zoom, tilt, static). Then, we apply the camera motion descriptors, with an extended set of features, to the classification of cinematographic shots. The experimental results show that the proposed shot-level camera motion descriptor has strong discriminative capability and classifies different camera motion patterns across different videos effectively. We also show that our approach outperforms state-of-the-art methods.
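
    A rough sketch of the descriptor construction described above, under assumptions: the motion-consistency filtering is omitted, the angle quantization scheme is a guess, and the names region_histogram / camhid_descriptor are illustrative, not the paper's implementation.

    # Sketch: per local region, stack its motion vectors from consecutive frames
    # into a matrix, take the dominant direction via SVD, quantize its angle into
    # a histogram bin, and concatenate the local histograms into a global descriptor.
    import numpy as np

    def region_histogram(mvs, n_bins=8):
        """mvs: (N, 2) motion vectors of one local region over several frames."""
        # Dominant motion = first right singular vector of the centred MV matrix.
        _, _, vt = np.linalg.svd(mvs - mvs.mean(axis=0), full_matrices=False)
        angle = np.arctan2(vt[0, 1], vt[0, 0]) % np.pi        # orientation in [0, pi)
        hist = np.zeros(n_bins)
        hist[int(angle / np.pi * n_bins) % n_bins] = 1.0      # quantized angle bin
        return hist

    def camhid_descriptor(regions):
        """Concatenate per-region histograms into the global camera-motion descriptor."""
        return np.concatenate([region_histogram(r) for r in regions])

    # Toy usage: 4 local regions, each with motion vectors from a few frames;
    # the resulting descriptor could feed an SVM for shot classification.
    rng = np.random.default_rng(0)
    desc = camhid_descriptor([rng.normal(size=(30, 2)) for _ in range(4)])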