584,781 research outputs found

    Point-based modeling from a single image

    The complexity of virtual environments has grown spectacularly over recent years, mainly thanks to the availability of cheap, high-performance graphics cards. As graphics cards improve and geometry complexity grows, many of the objects present in a scene project to only a few pixels on the screen. This wastes computing effort on transforming and clipping large numbers of polygons that could be replaced by a single point or a small set of points. Recently, efficient rendering algorithms for point models have been proposed. However, little attention has been paid to building a point-based modeler that exploits the advantages such a representation can provide. In this paper we present a modeler, built entirely on points, that can generate 3D geometry from an image. It takes an image as input and creates a point-based representation from it. A set of operators then allows the user to modify this representation in order to produce 3D geometry from the image. With our system it is possible to quickly generate complex geometries that would be difficult to model with a polygon-based modeler.
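
    The abstract gives no code, so the following is only a minimal sketch of the basic idea, assuming a hypothetical image_to_points/extrude interface (none of these names come from the paper): every pixel becomes a colored point on a plane, and an editing operator displaces a selected subset of points along the depth axis.

        import numpy as np

        def image_to_points(rgb):                          # rgb: (H, W, 3) uint8 image
            h, w, _ = rgb.shape
            ys, xs = np.mgrid[0:h, 0:w]
            pts = np.stack([xs.ravel(), ys.ravel(), np.zeros(h * w)], axis=1)
            colors = rgb.reshape(-1, 3) / 255.0            # one colored point per pixel
            return pts, colors

        def extrude(points, mask, depth):                  # a point-level editing operator
            out = points.copy()
            out[mask, 2] += depth                          # push selected points along z
            return out

        rgb = (np.random.rand(64, 64, 3) * 255).astype(np.uint8)   # stand-in for the input image
        points, colors = image_to_points(rgb)
        points = extrude(points, points[:, 0] > 32, depth=10.0)    # crude "lift the right half"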

    Indoor scene 3D modeling with single image

    3D modeling is a fundamental and very important research area in computer vision and computer graphics. One specific category of this research field is indoor scene 3D modeling. Many efforts have been devoted to its development, but this particular type of modeling is far from mature. Some researchers have focused on single-view reconstruction, which recovers a 3D model from a single 2D indoor image. This is based on the Manhattan world assumption, which states that structural edges are usually parallel to the X, Y, and Z axes of the Cartesian coordinate system defined in a scene. Parallel lines, when projected to a 2D image, become straight lines that converge to a vanishing point. Single-view reconstruction uses these constraints to build a 3D model from a 2D image alone. However, this is not an easy task due to the lack of depth information in the 2D image. With the development and maturity of 3D imaging methods such as stereo vision, structured light triangulation, and laser stripe triangulation, devices that deliver 2D images with associated depth information, forming so-called RGBD images, are becoming more popular. Processing of RGB color images and depth images can be combined to ease the 3D modeling of indoor scenes. Two methods combining 2D and 3D processing are developed in this thesis for comparison: one is region-growing segmentation, and the other is RANSAC planar segmentation performed directly in 3D. Their results are compared and the resulting 3D models are illustrated. The 3D modeling stage consists of plane labeling; automatic detection of floor, wall, and boundary points; partitioning of wall domains using the automatically detected wall and wall-boundary points in the 2D image; and construction of the 3D model by extruding from the boundary points obtained on the floor plane. Tests were conducted to verify the method.
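
    As a companion to the RANSAC planar segmentation mentioned above, here is an illustrative plane fit on an (N, 3) point cloud; the iteration count, inlier threshold, and synthetic data are arbitrary choices for the sketch, not parameters from the thesis.

        import numpy as np

        def ransac_plane(points, n_iters=500, thresh=0.02, rng=np.random.default_rng(0)):
            best_inliers, best_model = None, None
            for _ in range(n_iters):
                p0, p1, p2 = points[rng.choice(len(points), 3, replace=False)]
                normal = np.cross(p1 - p0, p2 - p0)
                if np.linalg.norm(normal) < 1e-9:          # degenerate (collinear) sample
                    continue
                normal /= np.linalg.norm(normal)
                d = -normal @ p0
                dist = np.abs(points @ normal + d)         # point-to-plane distances
                inliers = dist < thresh
                if best_inliers is None or inliers.sum() > best_inliers.sum():
                    best_inliers, best_model = inliers, (normal, d)
            return best_model, best_inliers

        cloud = np.random.rand(2000, 3)
        cloud[:1000, 2] = 0.0                              # synthetic floor at z = 0
        (normal, d), floor_mask = ransac_plane(cloud)
        # Repeated calls on the remaining points would peel off further wall planes.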

    Single-picture reconstruction and rendering of trees for plausible vegetation synthesis

    State-of-the-art approaches for tree reconstruction either put limiting constraints on the input side (requiring multiple photographs, a scanned point cloud or intensive user input) or provide a representation only suitable for front views of the tree. In this paper we present a complete pipeline for synthesizing and rendering detailed trees from a single photograph with minimal user effort. Since the overall shape and appearance of each tree is recovered from a single photograph of the tree crown, artists can benefit from georeferenced images to populate landscapes with native tree species. A key element of our approach is a compact representation of dense tree crowns through a radial distance map. Our first contribution is an automatic algorithm for generating such representations from a single exemplar image of a tree. We create a rough estimate of the crown shape by solving a thin-plate energy minimization problem, and then add detail through a simplified shape-from-shading approach. The use of seamless texture synthesis results in an image-based representation that can be rendered from arbitrary view directions at different levels of detail. Distant trees benefit from an output-sensitive algorithm inspired by relief mapping. For close-up trees we use a billboard cloud where leaflets are distributed inside the crown shape through a space colonization algorithm. In both cases our representation ensures efficient preservation of the crown shape. Major benefits of our approach are that it recovers the overall shape from a single tree image, requires no tree-modeling knowledge and minimal authoring effort, and yields an image-based representation that is easy to compress and thus suitable for network streaming.
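
    To make the radial distance map idea concrete, below is a toy 2D analogue (the paper defines the map over view directions around the 3D crown, and nothing here comes from the authors' code): rays are marched from the centroid of a binary crown silhouette and the distance to the boundary is stored per direction.

        import numpy as np

        def radial_distance_map(mask, n_dirs=360):         # mask: (H, W) bool silhouette
            ys, xs = np.nonzero(mask)
            cy, cx = ys.mean(), xs.mean()                  # crown centroid
            dmap = np.zeros(n_dirs)
            for i, theta in enumerate(np.linspace(0, 2 * np.pi, n_dirs, endpoint=False)):
                r = 0.0
                while True:                                # march outward until leaving the mask
                    y, x = int(cy + r * np.sin(theta)), int(cx + r * np.cos(theta))
                    if not (0 <= y < mask.shape[0] and 0 <= x < mask.shape[1]) or not mask[y, x]:
                        break
                    r += 1.0
                dmap[i] = r
            return dmap

        yy, xx = np.mgrid[0:128, 0:128]
        mask = (yy - 64) ** 2 + (xx - 64) ** 2 < 40 ** 2   # circular "crown" for the demo
        print(radial_distance_map(mask)[:5])               # roughly 40 in every direction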

    Flow-based GAN for 3D Point Cloud Generation from a Single Image

    Generating a 3D point cloud from a single 2D image is of great importance for 3D scene understanding applications. To reconstruct the whole 3D shape of the object shown in the image, existing deep learning based approaches use either explicit or implicit generative modeling of point clouds, both of which, however, suffer from limited quality. In this work, we aim to alleviate this issue by introducing a hybrid explicit-implicit generative modeling scheme, which inherits the flow-based explicit generative models for sampling point clouds at arbitrary resolutions while improving the detailed 3D structures of the point clouds by leveraging implicit generative adversarial networks (GANs). We evaluate our method on the large-scale synthetic dataset ShapeNet, with the experimental results demonstrating the superior performance of the proposed method. In addition, the generalization ability of our method is demonstrated on cross-category synthetic images as well as on real images from the PASCAL3D+ dataset. Comment: 13 pages, 5 figures, accepted to BMVC202
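
    The abstract only describes the hybrid scheme at a high level; the snippet below is a schematic reading of it in PyTorch with invented module sizes: a flow is assumed to have produced a coarse sample of arbitrary resolution, a small refiner sharpens the points, and a set discriminator supplies the adversarial term.

        import torch
        import torch.nn as nn

        class Refiner(nn.Module):                          # per-point displacement network
            def __init__(self):
                super().__init__()
                self.net = nn.Sequential(nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, 3))
            def forward(self, pts):                        # pts: (B, N, 3)
                return pts + self.net(pts)                 # residual refinement of the flow sample

        class Discriminator(nn.Module):                    # PointNet-style set critic
            def __init__(self):
                super().__init__()
                self.point = nn.Sequential(nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, 64))
                self.head = nn.Linear(64, 1)
            def forward(self, pts):
                return self.head(self.point(pts).max(dim=1).values)

        coarse = torch.randn(4, 2048, 3)                   # stand-in for samples drawn from a flow
        refiner, disc = Refiner(), Discriminator()
        fake = refiner(coarse)
        g_loss = torch.nn.functional.softplus(-disc(fake)).mean()   # non-saturating generator loss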

    C-Flow: Conditional Generative Flow Models for Images and 3D Point Clouds

    Flow-based generative models have highly desirable properties such as exact log-likelihood evaluation and exact latent-variable inference; however, they are still in their infancy and have not received as much attention as alternative generative models. In this paper, we introduce C-Flow, a novel conditioning scheme that brings normalizing flows to an entirely new scenario with great possibilities for multi-modal data modeling. C-Flow is based on a parallel sequence of invertible mappings in which a source flow guides the target flow at every step, enabling fine-grained control over the generation process. We also devise a new strategy to model unordered 3D point clouds that, in combination with the conditioning scheme, makes it possible to address 3D reconstruction from a single image and its inverse problem of rendering an image given a point cloud. We demonstrate that our conditioning method is highly adaptable, being applicable also to image manipulation, style transfer, and multi-modal image-to-image mapping across a range of domains, including RGB images, segmentation maps, and edge masks.
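
    A hand-wavy sketch of one conditioned affine coupling step, just to illustrate the idea of a source flow guiding a target flow; the layer sizes and wiring are invented and are not taken from the C-Flow paper.

        import torch
        import torch.nn as nn

        class ConditionedCoupling(nn.Module):
            def __init__(self, dim, cond_dim, hidden=128):
                super().__init__()
                half = dim // 2
                self.net = nn.Sequential(
                    nn.Linear(half + cond_dim, hidden), nn.ReLU(),
                    nn.Linear(hidden, 2 * (dim - half)))
            def forward(self, x, cond):                    # x: (B, dim), cond: source-flow features
                half = x.shape[1] // 2
                x1, x2 = x[:, :half], x[:, half:]
                s, t = self.net(torch.cat([x1, cond], dim=1)).chunk(2, dim=1)
                s = torch.tanh(s)                          # keep scales bounded for stability
                y2 = x2 * torch.exp(s) + t                 # affine transform of the second half
                log_det = s.sum(dim=1)                     # contribution to log|det J|
                return torch.cat([x1, y2], dim=1), log_det

        layer = ConditionedCoupling(dim=6, cond_dim=16)
        y, log_det = layer(torch.randn(8, 6), torch.randn(8, 16))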

    Single View Reconstruction for Human Face and Motion with Priors

    Single view reconstruction is fundamentally an under-constrained problem. We aim to develop new approaches to model the human face and human motion with model priors that restrict the space of possible solutions. First, we develop a novel approach to recover 3D shape from a single-view image under challenging conditions, such as large variations in illumination and pose. The problem is addressed by employing the techniques of non-linear manifold embedding and alignment. Specifically, local image models for each patch of the facial image and local surface models for each patch of the 3D shape are learned using a non-linear dimensionality reduction technique, and the correspondences between these local models are then learned by a manifold alignment method. Local models remove the dependence on large training databases for human face modeling. By combining the local shapes, the global shape of a face can be reconstructed directly from a single linear system of equations via least squares. Unfortunately, this learning-based approach cannot be successfully applied to the problem of human motion modeling due to the internal and external variations in single-view, video-based, marker-less motion capture. Therefore, we introduce a new model-based approach for capturing human motion using a stream of depth images from a single depth sensor. While a depth sensor provides metric 3D information, using a single sensor, instead of a camera array, results in a view-dependent and incomplete measurement of object motion. We develop a novel two-stage template fitting algorithm that is invariant to subject size and view-point variations, and robust to occlusions. Starting from a known pose, our algorithm first estimates a body configuration through temporal registration, which is used to search the template motion database for a best match. The best-match body configuration as well as its corresponding surface mesh model are deformed to fit the input depth map, filling in the part that is occluded in the input and compensating for differences in pose and body size between the input image and the template. Our approach does not require any markers, user interaction, or appearance-based tracking. Experiments show that our approaches achieve good modeling results for human face and motion, and are capable of dealing with a variety of challenges in single view reconstruction, e.g., occlusion.
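
    The "combine local shapes through a single linear system" step can be pictured with the small least-squares sketch below; the patch layout and predicted values are synthetic, and the real method fuses learned local surface models rather than raw per-vertex depths.

        import numpy as np

        n_vertices = 100
        # each hypothetical patch: (global vertex indices, locally predicted depth values)
        patches = [(np.arange(i, i + 20), np.random.rand(20)) for i in range(0, 100, 10)]

        rows, rhs = [], []
        for idx, pred in patches:
            for j, v in zip(idx, pred):
                if j < n_vertices:
                    row = np.zeros(n_vertices)
                    row[j] = 1.0                           # one equation per local prediction
                    rows.append(row)
                    rhs.append(v)

        A, b = np.vstack(rows), np.array(rhs)
        global_depths, *_ = np.linalg.lstsq(A, b, rcond=None)   # least-squares fusion of overlapping patches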

    3D Hallway Modeling Using a Single Image

    Real-time, low-resource corridor reconstruction using a single consumer-grade RGB camera is a powerful tool that enables a fast, inexpensive solution for indoor mobility of a visually impaired person or a robot. The perspective and known geometry of a corridor are used to extract the important features of the image and create a 3D model from a single image. Multiple 3D models can be combined to increase confidence and provide a global 3D model. This paper presents our results on 3D corridor modeling using single images. First, a simple but effective 3D corridor modeling approach is introduced which makes very few assumptions about the camera. Second, a perspective-based Hough transform algorithm is proposed to detect vertical lines in order to determine the edges of the corridor. Finally, issues in real-time implementation on a smartphone are discussed. Experimental results are provided to validate the proposed approach. Index Terms: indoor modeling, vanishing point, visual impairment, perspective-based Hough transform.
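
    The paper's detector is a perspective-based Hough transform; as a simplified stand-in, the sketch below uses OpenCV's ordinary probabilistic Hough transform to pull out near-vertical segments from a corridor image. The file name and all thresholds are placeholders, not values from the paper.

        import cv2
        import numpy as np

        img = cv2.imread("corridor.jpg", cv2.IMREAD_GRAYSCALE)    # hypothetical input frame
        edges = cv2.Canny(img, 50, 150)
        segments = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=60,
                                   minLineLength=40, maxLineGap=5)

        vertical = []
        if segments is not None:
            for x1, y1, x2, y2 in segments.reshape(-1, 4):
                angle = abs(np.degrees(np.arctan2(y2 - y1, x2 - x1)))
                if 80 <= angle <= 100:                     # within 10 degrees of vertical
                    vertical.append((x1, y1, x2, y2))
        # The corridor edges would then be chosen from these near-vertical candidates.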

    CVTHead: One-shot Controllable Head Avatar with Vertex-feature Transformer

    Reconstructing personalized animatable head avatars has significant implications for AR/VR. Existing methods for achieving explicit face control of 3D Morphable Models (3DMM) typically rely on multi-view images or videos of a single subject, making the reconstruction process complex. Additionally, the traditional rendering pipeline is time-consuming, limiting real-time animation possibilities. In this paper, we introduce CVTHead, a novel approach that generates controllable neural head avatars from a single reference image using point-based neural rendering. CVTHead treats the sparse vertices of the mesh as a point set and employs the proposed Vertex-feature Transformer to learn local feature descriptors for each vertex. This enables the modeling of long-range dependencies among all the vertices. Experimental results on the VoxCeleb dataset demonstrate that CVTHead achieves performance comparable to state-of-the-art graphics-based methods. Moreover, it enables efficient rendering of novel human heads with various expressions, head poses, and camera views. These attributes can be explicitly controlled using the coefficients of 3DMMs, facilitating versatile and realistic animation in real-time scenarios. Comment: WACV202
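
    A minimal illustration of the "mesh vertices as a point set processed by a transformer" idea, using a stock PyTorch encoder; the layer sizes, and the assumption of a FLAME-sized vertex count, are placeholders rather than CVTHead's actual architecture.

        import torch
        import torch.nn as nn

        class VertexFeatureTransformer(nn.Module):
            def __init__(self, d_model=128, n_heads=4, n_layers=3):
                super().__init__()
                self.embed = nn.Linear(3, d_model)         # per-vertex positional embedding
                layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
                self.encoder = nn.TransformerEncoder(layer, n_layers)
            def forward(self, verts):                      # verts: (B, V, 3) mesh vertices
                return self.encoder(self.embed(verts))     # (B, V, d_model) vertex descriptors

        feats = VertexFeatureTransformer()(torch.randn(2, 5023, 3))   # e.g. a FLAME-sized mesh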

    Subpixel temperature estimation from single-band thermal infrared imagery

    Target temperature estimation from thermal infrared (TIR) imagery is a complex task that becomes increasingly difficult as the target size approaches the size of a projected pixel. At that point the assumption of pixel homogeneity is invalid, since the radiance value recorded at the sensor is the result of energy contributions from the target material and any other background material that falls within the pixel boundary. More often than not, thermal infrared pixels are heterogeneous, and therefore subpixel temperature extraction becomes an important capability. Typical subpixel estimation approaches make use of multispectral or hyperspectral sensors. These technologies are expensive, and multispectral or hyperspectral thermal imagery might not be readily available for a target of interest. A methodology was developed to retrieve the temperature of an object that is smaller than a projected pixel of a single-band TIR image using physics-based modeling. Physics-based refers to the utilization of the Multi-Service Electro-optic Signature (MuSES) heat transfer model, the MODerate spectral resolution atmospheric TRANsmission (MODTRAN) atmospheric propagation algorithm, and the Digital Imaging and Remote Sensing Image Generation (DIRSIG) synthetic image generation model to reproduce a collected thermal image under a number of user-supplied conditions. A target space is created and searched to determine the temperature of the subpixel target of interest in a collected TIR image. The methodology was tested by applying it to single-band thermal imagery collected during an airborne campaign. The emissivity of the targets of interest ranged from 0.02 to 0.91, and the temperature extraction errors for the high-emissivity targets were similar to those reported in published papers that employed multi-band techniques.
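
    The core of the target-space search can be sketched with a much-simplified radiance model: a single band, an area-weighted blend of target and background blackbody emission, and no atmospheric or reflected terms, unlike the MuSES/MODTRAN/DIRSIG chain used in the thesis. The fill fraction, emissivities, and temperatures below are made-up numbers.

        import numpy as np

        H, C, K = 6.626e-34, 2.998e8, 1.381e-23            # Planck, speed of light, Boltzmann

        def planck(wavelength_m, temp_k):                  # spectral radiance, W / (m^2 sr m)
            return (2 * H * C**2 / wavelength_m**5 /
                    (np.exp(H * C / (wavelength_m * K * temp_k)) - 1.0))

        def pixel_radiance(t_target, t_background, fill_fraction, emis_t, emis_b, wl=10e-6):
            return (fill_fraction * emis_t * planck(wl, t_target) +
                    (1 - fill_fraction) * emis_b * planck(wl, t_background))

        observed = pixel_radiance(320.0, 290.0, 0.3, 0.9, 0.95)   # synthetic "collected" pixel
        candidates = np.arange(280.0, 360.0, 0.5)
        errors = [abs(pixel_radiance(t, 290.0, 0.3, 0.9, 0.95) - observed) for t in candidates]
        print("estimated target temperature:", candidates[int(np.argmin(errors))])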

    Hierarchical Graphical Models for Multigroup Shape Analysis using Expectation Maximization with Sampling in Kendall's Shape Space

    This paper proposes a novel framework for multi-group shape analysis relying on a hierarchical graphical statistical model on shapes within a population. The framework represents individual shapes as point sets modulo translation, rotation, and scale, following the notion in Kendall shape space. While individual shapes are derived from their group shape model, each group shape model is derived from a single population shape model. The hierarchical model follows the natural organization of population data, and the top level in the hierarchy provides a common frame of reference for multigroup shape analysis, e.g. classification and hypothesis testing. Unlike typical shape-modeling approaches, the proposed model is a generative model that defines a joint distribution of object-boundary data and the shape-model variables. Furthermore, it naturally enforces optimal correspondences during the process of model fitting and thereby subsumes the so-called correspondence problem. The proposed inference scheme employs an expectation maximization (EM) algorithm that treats the individual and group shape variables as hidden random variables and integrates them out before estimating the parameters (population mean and variance and the group variances). The underpinning of the EM algorithm is the sampling of point sets, in Kendall shape space, from their posterior distribution, for which we exploit a highly efficient scheme based on Hamiltonian Monte Carlo simulation. Experiments in this paper use the fitted hierarchical model to perform (1) hypothesis testing for comparison between pairs of groups using permutation testing and (2) classification for image retrieval. The paper validates the proposed framework on simulated data and demonstrates results on real data. Comment: 9 pages, 7 figures, International Conference on Machine Learning 201
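
    As background for the "point sets modulo translation, rotation, and scale" representation, the helper below maps a point set to its Kendall pre-shape and aligns two pre-shapes with orthogonal Procrustes; it illustrates the shape-space quotient only, not the paper's hierarchical model, EM, or Hamiltonian Monte Carlo sampler.

        import numpy as np

        def preshape(x):                                   # x: (N, d) point set
            x = x - x.mean(axis=0)                         # quotient out translation
            return x / np.linalg.norm(x)                   # quotient out scale

        def align(x, y):                                   # rotate pre-shape x onto pre-shape y
            u, _, vt = np.linalg.svd(x.T @ y)
            r = u @ vt
            if np.linalg.det(r) < 0:                       # keep a proper rotation
                u[:, -1] *= -1
                r = u @ vt
            return x @ r

        a, b = np.random.rand(30, 2), np.random.rand(30, 2)
        dist = np.linalg.norm(align(preshape(a), preshape(b)) - preshape(b))   # Procrustes distance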