5,444 research outputs found

    Learning to Look Around: Intelligently Exploring Unseen Environments for Unknown Tasks

    Full text link
    It is common to implicitly assume access to intelligently captured inputs (e.g., photos from a human photographer), yet autonomously capturing good observations is itself a major challenge. We address the problem of learning to look around: if a visual agent has the ability to voluntarily acquire new views to observe its environment, how can it learn efficient exploratory behaviors to acquire informative observations? We propose a reinforcement learning solution, where the agent is rewarded for actions that reduce its uncertainty about the unobserved portions of its environment. Based on this principle, we develop a recurrent neural network-based approach to perform active completion of panoramic natural scenes and 3D object shapes. Crucially, the learned policies are not tied to any recognition task nor to the particular semantic content seen during training. As a result, 1) the learned "look around" behavior is relevant even for new tasks in unseen environments, and 2) training data acquisition involves no manual labeling. Through tests in diverse settings, we demonstrate that our approach learns useful generic policies that transfer to new unseen tasks and environments. Completion episodes are shown at https://goo.gl/BgWX3W

    Disparity map generation based on trapezoidal camera architecture for multiview video

    Get PDF
    Visual content acquisition is a strategic functional block of any visual system. Despite its wide possibilities, the arrangement of cameras for the acquisition of good quality visual content for use in multi-view video remains a huge challenge. This paper presents the mathematical description of trapezoidal camera architecture and relationships which facilitate the determination of camera position for visual content acquisition in multi-view video, and depth map generation. The strong point of Trapezoidal Camera Architecture is that it allows for adaptive camera topology by which points within the scene, especially the occluded ones can be optically and geometrically viewed from several different viewpoints either on the edge of the trapezoid or inside it. The concept of maximum independent set, trapezoid characteristics, and the fact that the positions of cameras (with the exception of few) differ in their vertical coordinate description could very well be used to address the issue of occlusion which continues to be a major problem in computer vision with regards to the generation of depth map

    Free Viewpoint Video Based on Stitching Technique

    Get PDF
    Image stitching is a technique used for creating one panoramic scene from multiple images. It is used in panoramic photography and video where the viewer can only scroll horizontally and vertically across the scene. However, stitching has not been used for creating free-viewpoint videos (FVV) where viewers can change their viewing points freely and smoothly while playing the video. current research, implemented FVV playing system using image stitching, this system allows users to enjoy the capability of moving their viewpoint freely and smoothly. To develop this system, user should capture MVV from different viewpoints and with appropriate region area for each pair of cameras then the system stitch the overlapped video to create stitched video/videos to display it in FVV playing system with applying freely and smoothly switching and interpolation of viewpoints over video playback. Current research evaluated the performance of video playing system based on system idea, system accuracy, smoothness, and user satisfaction. The results of evaluation have been very positive in most aspects
    • …
    corecore