13 research outputs found

    SG-VAE: Scene Grammar Variational Autoencoder to Generate New Indoor Scenes

    Deep generative models have been used in recent years to learn coherent latent representations in order to synthesize high-quality images. In this work, we propose a neural network that learns a generative model for sampling consistent indoor scene layouts. Our method learns the co-occurrences, and the appearance parameters such as shape and pose, of different object categories through a grammar-based auto-encoder, resulting in a compact and accurate representation of scene layouts. In contrast to existing grammar-based methods that rely on a user-specified grammar, we construct the grammar automatically by extracting a set of production rules from reasoning about object co-occurrences in the training data. The extracted grammar can represent a scene as an augmented parse tree. The proposed auto-encoder encodes these parse trees to a latent code and decodes the latent code back to a parse tree, thereby ensuring that the generated scene is always valid. We experimentally demonstrate that the proposed auto-encoder not only learns to generate valid scenes (i.e. the arrangements and appearances of objects) but also learns coherent latent representations in which nearby latent samples decode to similar scene outputs. The resulting generative model is applicable to several computer vision tasks such as 3D pose and layout estimation from RGB-D data.
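    The abstract does not spell out how production rules are induced from co-occurrence statistics. Purely as an illustration of that general idea, here is a minimal sketch; the `extract_rules` function, the support threshold, and the toy scenes are hypothetical and are not the authors' algorithm:

```python
from collections import Counter
from itertools import combinations

def extract_rules(scenes, min_support=0.5):
    """Derive parent -> child production rules from pairwise
    co-occurrence frequencies across training scenes."""
    pair_counts = Counter()
    obj_counts = Counter()
    for scene in scenes:
        cats = set(scene)
        obj_counts.update(cats)
        pair_counts.update(combinations(sorted(cats), 2))
    rules = []
    for (a, b), n in pair_counts.items():
        # attach the rarer category under the more frequent one
        parent, child = (a, b) if obj_counts[a] >= obj_counts[b] else (b, a)
        # keep the rule if the child mostly appears alongside the parent
        if n / obj_counts[child] >= min_support:
            rules.append((parent, child))
    return sorted(rules)

# Toy training data: each scene is a list of object categories.
scenes = [
    ["bed", "nightstand", "lamp"],
    ["bed", "nightstand"],
    ["bed", "wardrobe"],
    ["desk", "chair"],
]
print(extract_rules(scenes))
```

    A real grammar induction would also have to handle rule recursion, multiplicities, and the appearance parameters (shape, pose) attached to each node of the parse tree.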

    Data-driven structural priors for shape completion

    No full text

    Mechanical load testing of solar panels - beyond certification testing

    No full text
    Mechanical load tests are a commonly performed stress test in which pressure is applied to the front and back sides of solar panels. In this paper we review the motivation for load tests and the different ways of performing them. We then discuss emerging durability concerns and ways in which load tests can be modified and/or enhanced by combining them with other characterization methods. In particular, we present data from a new tool in which the loads are applied using vacuum and air pressure from the rear side of the panels, leaving the front side available for electroluminescence (EL) and current-voltage (IV) characterization while the panels are in the bent state. Tightly closed cracks in the cells can be temporarily opened by such a test, enabling a prediction of how the panel would degrade in the field were these cracks to open up over time. Based on this predictive crack-opening test, we introduce the concept of using a quick load test on each panel in the factory as a quality-control tool and potentially as a type of burn-in test to initiate cracks that would certainly form early on during a panel's field life. We examine the stresses seen by the cells under panel load through finite element modeling and demonstrate the importance of constraining the panel motion during testing in the same way it will be constrained when mounted in the field.
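    The paper relies on finite element modeling for the actual laminate; as a much cruder back-of-the-envelope, the bending stress in a homogeneous plate under uniform pressure can be sketched with the classic closed-form coefficient for a simply supported square plate. The function name and the illustrative numbers below are hypothetical; a real panel is a glass/encapsulant/cell laminate and cannot be treated as a single plate:

```python
def plate_max_stress(q, b, t, beta=0.2874):
    """Maximum bending stress (Pa) at the center of a simply supported
    square plate of side b (m) and thickness t (m) under uniform
    pressure q (Pa). beta ~ 0.287 is the standard coefficient for a
    simply supported square plate with Poisson's ratio ~ 0.3."""
    return beta * q * b**2 / t**2

# Illustrative only: 2400 Pa is the standard IEC 61215 front/back test
# load; 3.2 mm is a typical front-glass thickness, 1 m a nominal side.
stress = plate_max_stress(q=2400.0, b=1.0, t=0.0032)
print(f"{stress / 1e6:.1f} MPa")  # ~ 67 MPa for these illustrative numbers
```

    Even this toy estimate shows why boundary conditions matter: changing the support from simply supported to clamped, or the side length from the mounting-rail span to the full panel, changes the stress by large factors, which is the point the paper makes with FEM.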

    Guiding monocular depth estimation using depth-attention volume

    No full text
    Recovering scene depth from a single image is an ill-posed problem that requires additional priors, often referred to as monocular depth cues, to disambiguate between different 3D interpretations. In recent works, those priors have been learned end-to-end from large datasets using deep neural networks. In this paper, we propose guiding depth estimation to favor planar structures, which are ubiquitous especially in indoor environments. This is achieved by incorporating a non-local coplanarity constraint into the network with a novel attention mechanism called the depth-attention volume (DAV). Experiments on two popular indoor datasets, NYU-Depth-v2 and ScanNet, show that our method achieves state-of-the-art depth estimation results while using only a fraction of the number of parameters needed by competing methods. Code is available at: https://github.com/HuynhLam/DAV
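    The DAV itself is a learned attention volume (see the linked repository); the sketch below only illustrates the generic non-local aggregation idea it builds on, where each pixel's depth is refined as a feature-similarity-weighted average over all pixels. The function name and shapes are hypothetical:

```python
import numpy as np

def attention_refine(depth, features, temperature=1.0):
    """Refine a coarse depth map by non-local attention: each pixel's
    depth becomes a softmax-similarity-weighted average of all depths.
    depth: (h, w) array; features: (h, w, c) per-pixel feature vectors."""
    h, w = depth.shape
    f = features.reshape(h * w, -1)
    # pairwise dot-product similarities -> row-wise softmax weights
    sim = f @ f.T / temperature
    sim -= sim.max(axis=1, keepdims=True)  # numerical stability
    a = np.exp(sim)
    a /= a.sum(axis=1, keepdims=True)
    return (a @ depth.reshape(-1)).reshape(h, w)
```

    When a group of pixels shares near-identical features (e.g. they lie on one plane), their refined depths are pulled toward a common weighted average, which is the coplanarity intuition behind the DAV.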

    LabNet: An Image Repository for Virtual Science Laboratories

    No full text
    There has been substantial recent research on image and shape storage and retrieval, and several large image/shape repositories and databases already exist in the literature. However, these repositories hold generic content: most consist of English-labelled images of the general world. Because they were not populated with a specific field of interest in mind, they are unlikely to offer sufficient coverage of images and shapes for particular domains such as high school science. Hence, we develop LabNet, an image repository for high school science that contains images related to high school science subjects and laboratory courses. We use Canny's algorithm to detect object edges in crawled images, and then apply morphological operations to segment and extract the object images. The extracted object images have no background and can therefore be used for scene modelling and synthesis. LabNet can also be useful for high school science research as well as an educational tool for elementary science classes and laboratory exercises.
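    The edges-then-morphology extraction pipeline can be sketched as follows. This is an assumption-laden toy version: it substitutes a plain gradient-magnitude threshold for the Canny detector the paper uses, and the function name and threshold are hypothetical:

```python
import numpy as np
from scipy import ndimage

def extract_object(img, thresh=0.1):
    """Segment the dominant foreground object from a grayscale image:
    gradient-magnitude edges, morphological closing plus hole filling,
    then keep the largest connected component as the object mask."""
    gy, gx = np.gradient(img.astype(float))
    edges = np.hypot(gx, gy) > thresh          # crude stand-in for Canny
    closed = ndimage.binary_closing(edges, structure=np.ones((3, 3)))
    filled = ndimage.binary_fill_holes(closed)  # edges ring -> solid blob
    labels, n = ndimage.label(filled)
    if n == 0:
        return np.zeros_like(filled)
    sizes = ndimage.sum(filled, labels, range(1, n + 1))
    return labels == (np.argmax(sizes) + 1)     # largest component only

# Usage: zero out the background to get a background-free object image.
img = np.zeros((20, 20)); img[5:15, 5:15] = 1.0
cutout = img * extract_object(img)
```

    Real crawled photos would additionally need color handling, noise suppression before edge detection, and per-image threshold tuning.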