224 research outputs found

    Casual 3D photography

    We present an algorithm that enables casual 3D photography. Given a set of input photos captured with a hand-held cell phone or DSLR camera, our algorithm reconstructs a 3D photo: a central, panoramic, textured, normal-mapped, multi-layered geometric mesh representation. 3D photos can be stored compactly and are optimized for rendering from viewpoints near the capture viewpoints. They can be rendered using a standard rasterization pipeline to produce perspective views with motion parallax. When viewed in VR, 3D photos provide geometrically consistent views for both eyes. Our geometric representation also allows interacting with the scene using 3D geometry-aware effects, such as adding new objects to the scene and artistic lighting effects. Our 3D photo reconstruction algorithm starts with a standard structure-from-motion and multi-view stereo reconstruction of the scene. The dense stereo reconstruction is made robust to imperfect capture conditions using a novel near-envelope cost volume prior that discards erroneous near-depth hypotheses. We propose a novel parallax-tolerant stitching algorithm that warps the depth maps into the central panorama and stitches two color-and-depth panoramas for the front and back scene surfaces. The two panoramas are fused into a single non-redundant, well-connected geometric mesh. We provide videos demonstrating users interactively viewing and manipulating our 3D photos.
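
The near-envelope idea can be illustrated with a small sketch: hypotheses in the stereo cost volume that are nearer than a conservative per-pixel depth bound are penalized before the winner is selected. The function name, penalty value, and envelope here are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def apply_near_envelope(cost_volume, depth_hypotheses, near_envelope, penalty=1e6):
    """Penalize depth hypotheses closer than a per-pixel near envelope.

    cost_volume      : (H, W, D) matching costs for D depth hypotheses
    depth_hypotheses : (D,) candidate depths, ascending
    near_envelope    : (H, W) conservative lower bound on the true depth
    """
    # Broadcast: a hypothesis is invalid where its depth is nearer
    # than the pixel's envelope depth.
    invalid = depth_hypotheses[None, None, :] < near_envelope[:, :, None]
    out = cost_volume.copy()
    out[invalid] += penalty
    return out

# Tiny example: a 1x1 image with 3 depth hypotheses, all equally cheap
cv = np.zeros((1, 1, 3))
depths = np.array([1.0, 2.0, 4.0])
env = np.array([[1.5]])            # the surface cannot be nearer than 1.5
filtered = apply_near_envelope(cv, depths, env)
best = depths[np.argmin(filtered[0, 0])]  # erroneous near depth is ruled out
```

Without the prior, the ambiguous cost volume could pick the spurious near depth 1.0; with it, the nearest surviving hypothesis (2.0) wins.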

    Image-Based Scene Representations for Head-Motion Parallax in 360° Panoramas

    Creation and delivery of “RealVR” experiences essentially consists of the following four main steps: capture, processing, representation and rendering. In this chapter, we present, compare, and discuss two recent end-to-end approaches, Parallax360 by Luo et al. [9] and MegaParallax by Bertel et al. [3]. Both propose complete pipelines for RealVR content generation and novel-view synthesis with head-motion parallax for 360° environments. Parallax360 uses a robotic arm for capturing thousands of input views on the surface of a sphere. Based on precomputed disparity motion fields and pairwise optical flow, novel viewpoints are synthesized on the fly using flow-based blending of the nearest two to three input views, which provides compelling head-motion parallax. MegaParallax proposes a pipeline for RealVR content generation and rendering that emphasizes casual, hand-held capturing. The approach introduces view-dependent flow-based blending to enable novel-view synthesis with head-motion parallax within a viewing area determined by the field of view of the input cameras and the capturing radius. We describe both methods and discuss their similarities and differences in corresponding steps in the RealVR pipeline and show selected results. The chapter ends by discussing advantages and disadvantages as well as outlining the most important limitations and future work. This project has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No 66599
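
Flow-based blending of the two nearest views can be sketched as follows. This is a deliberately simplified assumption of the idea: weights come from angular proximity on the capture circle, and the "flow" is reduced to a single horizontal disparity rather than a dense optical-flow field; none of the names match the papers' code.

```python
import numpy as np

def blend_weights(view_angle, left_angle, right_angle):
    """Linear blending weights for the two neighbouring captured views
    on a circular rig, based on the desired viewing angle."""
    t = (view_angle - left_angle) / (right_angle - left_angle)
    t = float(np.clip(t, 0.0, 1.0))
    return 1.0 - t, t          # weight for left view, weight for right view

def blend_views(left_img, right_img, flow_l2r, t):
    """Flow-based blend: shift each view toward the in-between viewpoint
    along the (precomputed) flow before averaging. Flow is collapsed to
    one horizontal disparity here for clarity."""
    shift = flow_l2r * t
    warped_left = np.roll(left_img, int(round(shift)), axis=1)
    warped_right = np.roll(right_img, -int(round(flow_l2r - shift)), axis=1)
    return (1 - t) * warped_left + t * warped_right

# A viewpoint exactly halfway between two captures gets equal weights
wl, wr = blend_weights(15.0, 10.0, 20.0)
mid = blend_views(np.full((2, 4), 10.0), np.full((2, 4), 20.0), flow_l2r=2.0, t=0.5)
```

The key property is that the weights vary smoothly with head position, which is what yields continuous motion parallax as the user moves.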

    Distortion Estimation Through Explicit Modeling of the Refractive Surface

    Precise calibration is a must for highly reliable 3D computer vision algorithms. A challenging case is when the camera is behind protective glass or a transparent object: due to refraction, the image is heavily distorted; the pinhole camera model alone cannot be used, and a distortion correction step is required. By directly modeling the geometry of the refractive media, we build the image generation process by tracing individual light rays from the camera to a target. Comparing the generated images to their distorted (observed) counterparts, we estimate the geometry parameters of the refractive surface via model inversion by employing an RBF neural network. We present an image collection methodology that produces data suited for finding the distortion parameters and test our algorithm on synthetic and real-world data. We analyze the results of the algorithm. Comment: Accepted to ICANN 201
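
The core primitive of such ray tracing is refraction at an interface. A minimal sketch of Snell's law in vector form (standard physics, not the paper's specific surface model or parameterization):

```python
import numpy as np

def refract(d, n, eta):
    """Refract unit direction d at a surface with unit normal n
    (pointing against the incoming ray), with eta = n1 / n2.
    Returns the refracted unit direction, or None on total
    internal reflection."""
    cos_i = -np.dot(d, n)
    sin2_t = eta**2 * (1.0 - cos_i**2)
    if sin2_t > 1.0:
        return None            # total internal reflection
    cos_t = np.sqrt(1.0 - sin2_t)
    return eta * d + (eta * cos_i - cos_t) * n

# A ray entering glass (n1 = 1.0 -> n2 = 1.5) at 30 degrees incidence
d = np.array([np.sin(np.radians(30)), -np.cos(np.radians(30)), 0.0])
n = np.array([0.0, 1.0, 0.0])
t = refract(d, n, 1.0 / 1.5)   # sin(theta_t) = sin(30)/1.5 = 1/3
```

Tracing a ray camera-to-target means applying this once per refractive interface; the surface geometry parameters enter through the normal `n` at each hit point.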

    Complexity of Discrete Energy Minimization Problems

    Discrete energy minimization is widely used in computer vision and machine learning for problems such as MAP inference in graphical models. The problem, in general, is notoriously intractable, and finding the global optimal solution is known to be NP-hard. However, is it possible to approximate this problem with a reasonable ratio bound on the solution quality in polynomial time? We show in this paper that the answer is no. Specifically, we show that general energy minimization, even in the 2-label pairwise case, and planar energy minimization with three or more labels are exp-APX-complete. This finding rules out the existence of any approximation algorithm with a sub-exponential approximation ratio in the input size for these two problems, including constant factor approximations. Moreover, we collect and review the computational complexity of several subclass problems and arrange them on a complexity scale consisting of three major complexity classes -- PO, APX, and exp-APX, corresponding to problems that are solvable, approximable, and inapproximable in polynomial time. Problems in the first two complexity classes can serve as alternative tractable formulations to the inapproximable ones. This paper can help vision researchers select an appropriate model for an application or guide them in designing new algorithms. Comment: ECCV'16 accepted
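
To make the 2-label pairwise problem concrete, here is the energy being minimized and the exponential-time exact baseline. The toy instance and function names are illustrative; enumeration over all 2^n labelings is exactly what hardness rules out for large n.

```python
import itertools

def energy(labels, unary, pairwise, edges):
    """Pairwise discrete energy: sum of per-node unary terms plus a
    shared pairwise term over the graph edges. labels is a 0/1 tuple."""
    e = sum(unary[i][labels[i]] for i in range(len(labels)))
    e += sum(pairwise[labels[i]][labels[j]] for i, j in edges)
    return e

def brute_force_map(unary, pairwise, edges):
    """Exact MAP by enumeration -- exponential in the number of nodes,
    which is why it is only feasible for toy problems."""
    n = len(unary)
    return min(itertools.product([0, 1], repeat=n),
               key=lambda lab: energy(lab, unary, pairwise, edges))

# A 3-node chain with a Potts-style smoothness cost of 1 per disagreement:
# the middle node prefers label 1, but smoothness pulls it to 0.
unary = [[0.0, 2.0], [1.5, 0.0], [0.0, 2.0]]
pairwise = [[0.0, 1.0], [1.0, 0.0]]
edges = [(0, 1), (1, 2)]
best = brute_force_map(unary, pairwise, edges)
```

Chains like this one are in fact tractable by dynamic programming (one of the "alternative tractable formulations" the paper refers to); the hardness results concern general graphs.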

    Object class recognition using combination of colour dense SIFT and texture descriptors

    Object class recognition has recently become one of the most popular research fields. This is due to its importance in many applications such as image classification, retrieval, indexing, and searching. The main aim of object class recognition is determining how to make computers understand and automatically identify which object or scene is being displayed in an image. Despite the many efforts that have been made, it is still considered one of the most challenging tasks, mainly due to inter-class and intra-class variations such as occlusion, background clutter, viewpoint changes, pose, scale, and illumination. Feature extraction is one of the important steps in any object class recognition system. Different image features have been proposed in the literature to increase categorisation accuracy, such as appearance, texture, and shape descriptors. In this paper, we propose to combine dense colour scale-invariant feature transform (dense colour SIFT) appearance descriptors with different texture descriptors. The colour completed local binary pattern (CCLBP) and completed local ternary pattern (CLTP) are integrated with dense colour SIFT due to the importance of texture information in the image. Using different pattern sizes to extract the CLTP and CCLBP texture descriptors helps capture dense texture information from the image. Bag of features is also used in the proposed system with each descriptor, while a late fusion strategy is used in the classification stage. The proposed system achieved high recognition accuracy rates when applied to several datasets, namely the SUN-397, OT4N, OT8, and Event sport datasets, accomplishing 38.9%, 95.9%, 89.02%, and 88.167%, respectively.
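
The late-fusion step can be sketched in a few lines: each descriptor channel (e.g. dense colour SIFT, CLTP, CCLBP) produces its own per-class classifier scores, and the final decision combines them. The weighting scheme and toy scores below are illustrative assumptions, not the paper's exact fusion rule.

```python
import numpy as np

def late_fusion(score_lists, weights=None):
    """Combine per-class scores from several descriptor channels by a
    (optionally weighted) average, then pick the argmax class."""
    scores = np.asarray(score_lists, dtype=float)   # (channels, classes)
    if weights is None:
        weights = np.ones(len(scores)) / len(scores)
    fused = np.average(scores, axis=0, weights=weights)
    return int(np.argmax(fused)), fused

# Two channels, three classes: the channels disagree, fusion decides
sift_scores = [0.2, 0.5, 0.3]       # appearance channel favours class 1
texture_scores = [0.1, 0.3, 0.6]    # texture channel favours class 2
label, fused = late_fusion([sift_scores, texture_scores])
```

The design point of late fusion is that each channel is trained independently on its own bag-of-features representation, so a weak channel cannot corrupt the others' features, only dilute the final vote.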

    Capture, Reconstruction, and Representation of the Visual Real World for Virtual Reality

    We provide an overview of the concerns, current practice, and limitations for capturing, reconstructing, and representing the real world visually within virtual reality. Given that our goals are to capture, transmit, and depict complex real-world phenomena to humans, these challenges cover the opto-electro-mechanical, computational, informational, and perceptual fields. Practically producing a system for real-world VR capture requires navigating a complex design space and pushing the state of the art in each of these areas. As such, we outline several promising directions for future work to improve the quality and flexibility of real-world VR capture systems.

    Joint Inference in Weakly-Annotated Image Datasets via Dense Correspondence

    We present a principled framework for inferring pixel labels in weakly-annotated image datasets. Most previous, example-based approaches to computer vision rely on a large corpus of densely labeled images. However, for large, modern image datasets, such labels are expensive to obtain and are often unavailable. We establish a large-scale graphical model spanning all labeled and unlabeled images, then solve it to infer pixel labels jointly for all images in the dataset while enforcing consistent annotations over similar visual patterns. This model requires significantly less labeled data and assists in resolving ambiguities by propagating inferred annotations from images with stronger local visual evidence to images with weaker local evidence. We apply our proposed framework to two computer vision problems, namely image annotation with semantic segmentation, and object discovery and co-segmentation (segmenting multiple images containing a common object). Extensive numerical evaluations and comparisons show that our method consistently outperforms the state-of-the-art in automatic annotation and semantic labeling, while requiring significantly less labeled data. In contrast to previous co-segmentation techniques, our method manages to discover and segment objects well even in the presence of substantial amounts of noise images (images not containing the common object), as is typical for datasets collected from Internet search.
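
The propagation idea can be illustrated with a generic label-propagation sketch over a similarity graph: each node mixes its own evidence with its neighbours', so confident labels flow to weakly evidenced images. This is a textbook-style stand-in (uniform mixing parameter, row-stochastic affinities), not the paper's dense-correspondence graphical model.

```python
import numpy as np

def propagate_labels(label_probs, affinity, alpha=0.5, iters=10):
    """Iteratively smooth per-node class beliefs over a similarity graph.

    label_probs : (N, C) initial per-node class beliefs (local evidence)
    affinity    : (N, N) row-stochastic weights between nodes
    alpha       : how much to trust neighbours vs. own evidence
    """
    f = label_probs.copy()
    for _ in range(iters):
        f = alpha * (affinity @ f) + (1 - alpha) * label_probs
        f /= f.sum(axis=1, keepdims=True)   # keep each row a distribution
    return f

# Node 0 has confident evidence for class 0; node 1 is ambiguous.
# Coupling them lets node 0's label leak over to node 1.
y = np.array([[0.9, 0.1],
              [0.5, 0.5]])
w = np.array([[0.5, 0.5],
              [0.5, 0.5]])
out = propagate_labels(y, w)
```

After propagation the ambiguous node inherits a preference for class 0, which mirrors the abstract's point about resolving ambiguities via stronger-evidence neighbours.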

    A Survey of Methods for Volumetric Scene Reconstruction from Photographs

    Scene reconstruction, the task of generating a 3D model of a scene given multiple 2D photographs taken of the scene, is an old and difficult problem in computer vision. Since its introduction, scene reconstruction has found application in many fields, including robotics, virtual reality, and entertainment. Volumetric models are a natural choice for scene reconstruction. Three broad classes of volumetric reconstruction techniques have been developed based on geometric intersections, color consistency, and pair-wise matching. Some of these techniques have spawned a number of variations and undergone considerable refinement. This paper is a survey of techniques for volumetric scene reconstruction.
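
The color-consistency class of methods (voxel coloring, space carving) rests on one test: a voxel on a true surface projects to similar colors in all cameras that see it. A minimal sketch, with an illustrative standard-deviation criterion and threshold (real systems account for visibility and noise models):

```python
import numpy as np

def photo_consistent(samples, threshold=30.0):
    """Colour-consistency test: keep a voxel if the RGB samples it
    projects to in the input photos agree, i.e. their per-channel
    standard deviation is small on average.

    samples : (k, 3) RGB samples of the voxel from k cameras
    """
    samples = np.asarray(samples, dtype=float)
    return float(samples.std(axis=0).mean()) < threshold

# A voxel on a red surface: all views see roughly the same red
on_surface = [[200, 10, 10], [198, 12, 9], [205, 8, 11]]
# A voxel in free space: each view sees whatever lies behind it
in_free_space = [[200, 10, 10], [20, 200, 30], [90, 90, 90]]
keep1 = photo_consistent(on_surface)
keep2 = photo_consistent(in_free_space)
```

Space carving sweeps this test over a voxel grid, deleting inconsistent voxels until only a photo-consistent shell remains.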

    On Mixing in Pairwise Markov Random Fields with Application to Social Networks

    We consider pairwise Markov random fields, which have a number of important applications in statistical physics, image processing, and machine learning, such as the Ising model and the labeling problem, to name a couple. Our own motivation comes from the need to produce synthetic models for social networks with attributes. First, we give conditions for rapid mixing of the associated Glauber dynamics and consider interesting particular cases. Then, for pairwise Markov random fields with submodular energy functions, we construct monotone perfect simulations.
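
For the Ising special case, one step of the Glauber dynamics whose mixing is analyzed looks like this: pick a uniform random site and resample it from its heat-bath conditional given its neighbours. A minimal sketch with an assumed ring topology and no external field; the paper's general pairwise setting replaces the field term with arbitrary pairwise energies.

```python
import math
import random

def glauber_step(spins, neighbours, beta, rng=None):
    """One Glauber (heat-bath) update for an Ising model.

    spins      : list of +1/-1 values, mutated in place
    neighbours : dict mapping site index -> list of neighbour indices
    beta       : inverse temperature
    """
    rng = rng or random.Random()
    i = rng.randrange(len(spins))
    field = sum(spins[j] for j in neighbours[i])
    # Conditional probability of spin +1 given the neighbouring spins
    p_plus = 1.0 / (1.0 + math.exp(-2.0 * beta * field))
    spins[i] = 1 if rng.random() < p_plus else -1
    return spins

# A 3-spin ring at inverse temperature beta = 0.5
spins = [1, -1, 1]
neighbours = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
spins = glauber_step(spins, neighbours, 0.5, rng=random.Random(0))
```

The heat-bath rule is monotone in the neighbouring configuration, which is the property that makes coupling-from-the-past-style perfect simulation possible in the submodular case.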

    On a coupled PDE model for image restoration

    In this paper, we consider a new coupled PDE model for image restoration. Both the image and the edge variables are incorporated by coupling them through two different PDEs. It is shown that the initial-boundary value problem has global-in-time dissipative solutions (in a sense going back to P.-L. Lions), and several properties of these solutions are established. This is a rough draft; the final version of the paper will contain a modelling part and numerical experiments.
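
The abstract does not state the equations, but the coupled image/edge structure can be illustrated with a generic edge-gated diffusion in the spirit of Ambrosio-Tortorelli-type models: the image variable u diffuses where the edge variable v is small, while v relaxes toward an indicator of large image gradients. Everything below (time step, gradient indicator, clipping) is an illustrative assumption, not the paper's model.

```python
import numpy as np

def coupled_step(u, v, dt=0.1, lam=1.0):
    """One explicit Euler step of a generic coupled image/edge system
    (periodic boundary via np.roll)."""
    ux = np.gradient(u, axis=1)
    uy = np.gradient(u, axis=0)
    grad2 = ux**2 + uy**2
    # Edge-gated diffusion of u (5-point Laplacian, conductivity 1 - v)
    lap = (np.roll(u, 1, 0) + np.roll(u, -1, 0)
           + np.roll(u, 1, 1) + np.roll(u, -1, 1) - 4 * u)
    u_new = u + dt * (1.0 - v) * lap
    # Edge variable relaxes toward a gradient indicator in [0, 1)
    v_new = v + dt * lam * (grad2 / (1.0 + grad2) - v)
    return u_new, np.clip(v_new, 0.0, 1.0)

# Run a few steps on a noisy 16x16 image
u = np.random.default_rng(0).random((16, 16))
v = np.zeros_like(u)
for _ in range(5):
    u, v = coupled_step(u, v)
```

With dt = 0.1 the explicit diffusion step is stable (the 2D 5-point scheme requires dt <= 0.25 at unit conductivity), so u stays bounded while v rises near edges and suppresses smoothing there.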