
    A Closer Look at Jamnitzer's Polyhedra

    The Renaissance artist Wentzel Jamnitzer designed a series of intriguing polyhedra in perspective in his book “Perspectiva Corporum Regularium”. In this paper we investigate the possible principles behind the construction of the polyhedra and create 3D computer models of them. Comparing these to the originals, we get an idea of how successful he was in drawing the complex structures from imagination. Furthermore, we analyse Jamnitzer's use of linear perspective, an important key to creating such drawings.

    Self-learning voxel-based multi-camera occlusion maps for 3D reconstruction

    The quality of a shape-from-silhouettes 3D reconstruction technique strongly depends on the completeness of the silhouettes from each of the cameras. Static occlusion, caused for example by furniture, makes reconstruction difficult, as we assume no prior knowledge of the shape and size of occluding objects in the scene. In this paper we present a self-learning algorithm that builds an occlusion map for each camera from a voxel perspective. This information is then used to determine which cameras need to be evaluated when reconstructing the 3D model at every voxel in the scene. We show promising results in a multi-camera setup with seven cameras, where the object is reconstructed significantly better than with state-of-the-art methods, despite the occluding object in the center of the room.
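
    The per-voxel camera selection described in the abstract above can be illustrated with a minimal sketch. It assumes the occlusion maps are already available as per-camera flags; the function name `carve_voxel` and the `min_cameras` parameter are hypothetical and not taken from the paper, and the learning step itself is not shown.

```python
from typing import Sequence


def carve_voxel(in_silhouette: Sequence[bool],
                occluded: Sequence[bool],
                min_cameras: int = 2) -> bool:
    """Decide whether a single voxel is occupied (illustrative only).

    in_silhouette[i]: the voxel projects inside camera i's silhouette.
    occluded[i]:      camera i's occlusion map marks this voxel as hidden
                      (assumed to be given here, not learned).
    """
    if len(in_silhouette) != len(occluded):
        raise ValueError("one flag per camera is required in both inputs")

    # Only cameras with an unobstructed view of the voxel get a vote.
    usable = [seen for seen, occ in zip(in_silhouette, occluded) if not occ]

    # Too few usable views: treat the voxel as empty (a conservative choice).
    if len(usable) < min_cameras:
        return False

    # Classic silhouette carving: every usable camera must agree.
    return all(usable)


# Camera 1 misses the object, but only because furniture blocks its view.
with_map = carve_voxel([True, False, True], [False, True, False])
without_map = carve_voxel([True, False, True], [False, False, False])
```

    In this toy case the voxel is kept when the occlusion flag excludes the blocked camera and carved away when all three cameras are evaluated.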

    Learning a Hierarchical Latent-Variable Model of 3D Shapes

    We propose the Variational Shape Learner (VSL), a generative model that learns the underlying structure of voxelized 3D shapes in an unsupervised fashion. Through the use of skip-connections, our model can successfully learn and infer a latent, hierarchical representation of objects. Furthermore, realistic 3D objects can be easily generated by sampling the VSL's latent probabilistic manifold. We show that our generative model can be trained end-to-end from 2D images to perform single image 3D model retrieval. Experiments show, both quantitatively and qualitatively, the improved generalization of our proposed model over a range of tasks, performing better than or comparably to various state-of-the-art alternatives.
    Comment: Accepted as oral presentation at International Conference on 3D Vision (3DV), 201

    Proof of concept of a workflow methodology for the creation of basic canine head anatomy veterinary education tool using augmented reality

    Neuroanatomy can be challenging to both teach and learn within the undergraduate veterinary medicine and surgery curriculum. Traditional techniques have been used for many years, but there is now a move towards alternative digital models and interactive 3D models to engage the learner. However, digital innovations have typically involved the medical curriculum rather than the veterinary curriculum. Therefore, we aimed to create a simple workflow methodology that shows how simple it is to create a mobile augmented reality application of basic canine head anatomy. Using canine CT and MRI scans and widely available software programs, we demonstrate how to create an interactive model of head anatomy. This was applied to augmented reality for a popular Android mobile device to demonstrate the user-friendly interface. Here we present the processes, challenges and resolutions for the creation of a highly accurate, data-based anatomical model that could potentially be used in the veterinary curriculum. This proof of concept study provides an excellent framework for the creation of augmented reality training products for veterinary education. The lack of similar resources within this field provides the ideal platform to extend this into other areas of veterinary education and beyond.

    Multimedia Semantic Integrity Assessment Using Joint Embedding Of Images And Text

    Real-world multimedia data is often composed of multiple modalities such as an image or a video with associated text (e.g. captions, user comments, etc.) and metadata. Such multimodal data packages are prone to manipulations, where a subset of these modalities can be altered to misrepresent or repurpose data packages, with possible malicious intent. It is, therefore, important to develop methods to assess or verify the integrity of these multimedia packages. Using computer vision and natural language processing methods to directly compare the image (or video) and the associated caption to verify the integrity of a media package is only possible for a limited set of objects and scenes. In this paper, we present a novel deep learning-based approach for assessing the semantic integrity of multimedia packages containing images and captions, using a reference set of multimedia packages. We construct a joint embedding of images and captions with deep multimodal representation learning on the reference dataset in a framework that also provides image-caption consistency scores (ICCSs). The integrity of query media packages is assessed as the inlierness of the query ICCSs with respect to the reference dataset. We present the MultimodAl Information Manipulation dataset (MAIM), a new dataset of media packages from Flickr, which we make available to the research community. We use both the newly created dataset as well as Flickr30K and MS COCO datasets to quantitatively evaluate our proposed approach. The reference dataset does not contain unmanipulated versions of tampered query packages. Our method is able to achieve F1 scores of 0.75, 0.89 and 0.94 on MAIM, Flickr30K and MS COCO, respectively, for detecting semantically incoherent media packages.
    Comment: *Ayush Jaiswal and Ekraam Sabir contributed equally to the work in this pape
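
    The idea of judging a query package by the inlierness of its consistency score relative to a reference set can be sketched as follows. This is a deliberately simple stand-in that uses an empirical percentile; it is not the outlier-detection method of the paper, and the names `inlierness`, `is_consistent` and the `min_inlierness` threshold are hypothetical.

```python
from bisect import bisect_right
from typing import Sequence


def inlierness(query_score: float, reference_scores: Sequence[float]) -> float:
    """Fraction of reference scores that are at or below the query score.

    A value near 0 means the query's image-caption consistency is lower
    than almost everything seen in the reference set.
    """
    ordered = sorted(reference_scores)
    if not ordered:
        raise ValueError("the reference set must not be empty")
    return bisect_right(ordered, query_score) / len(ordered)


def is_consistent(query_score: float,
                  reference_scores: Sequence[float],
                  min_inlierness: float = 0.05) -> bool:
    """Accept the package if its score is not an extreme low outlier."""
    return inlierness(query_score, reference_scores) >= min_inlierness


# Made-up scores purely for illustration.
reference = [0.6, 0.7, 0.8, 0.9]
typical = inlierness(0.75, reference)
suspicious = is_consistent(0.1, reference)
```

    The threshold trades missed manipulations against false alarms, so in practice it would have to be tuned on held-out data.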

    Implicit 3D Orientation Learning for 6D Object Detection from RGB Images

    We propose a real-time RGB-based pipeline for object detection and 6D pose estimation. Our novel 3D orientation estimation is based on a variant of the Denoising Autoencoder that is trained on simulated views of a 3D model using Domain Randomization. This so-called Augmented Autoencoder has several advantages over existing methods: it does not require real, pose-annotated training data, generalizes to various test sensors and inherently handles object and view symmetries. Instead of learning an explicit mapping from input images to object poses, it provides an implicit representation of object orientations defined by samples in a latent space. Our pipeline achieves state-of-the-art performance on the T-LESS dataset both in the RGB and RGB-D domain. We also evaluate on the LineMOD dataset, where we can compete with other synthetically trained approaches. We further increase performance by correcting 3D orientation estimates to account for perspective errors when the object deviates from the image center, and show extended results.
    Comment: Code available at: https://github.com/DLR-RM/AugmentedAutoencode
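
    One way to read "an implicit representation of object orientations defined by samples in a latent space" is as a lookup against stored latent codes of known views. The sketch below assumes such a codebook and a cosine-similarity match; both are assumptions made for illustration, the encoder is not shown, and the names `cosine` and `nearest_orientation` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two latent codes of equal length."""
    if len(a) != len(b):
        raise ValueError("latent codes must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("latent codes must be non-zero")
    return dot / (norm_a * norm_b)


def nearest_orientation(query: Sequence[float],
                        codebook: List[Tuple[str, Sequence[float]]]) -> str:
    """Return the orientation label whose stored code best matches the query."""
    if not codebook:
        raise ValueError("the codebook must not be empty")
    label, _ = max(codebook, key=lambda entry: cosine(query, entry[1]))
    return label


# Toy two-dimensional codes; real latent codes would come from the encoder.
codebook = [("yaw_0", [1.0, 0.0]), ("yaw_90", [0.0, 1.0])]
best = nearest_orientation([0.9, 0.2], codebook)
```

    Because the match is made against sampled views rather than regressed directly, views that look identical simply share near-identical codes, which is one intuition for how symmetries can be handled without special cases.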

    Multi-set canonical correlation analysis for 3D abnormal gait behaviour recognition based on virtual sample generation

    Small sample datasets and two-dimensional (2D) approaches are challenges for vision-based abnormal gait behaviour recognition (AGBR). The lack of three-dimensional (3D) structure of the human body limits 2D-based methods in abnormal gait virtual sample generation (VSG). In this paper, 3D AGBR based on VSG and multi-set canonical correlation analysis (3D-AGRBMCCA) is proposed. First, the unstructured point cloud data of gait are obtained by using a structured light sensor. A 3D parametric body model is then deformed to fit the point cloud data, both in shape and posture. The features of point cloud data are then converted to a high-level structured representation of the body. The parametric body model is used for VSG based on the estimated body pose and shape data. Symmetry virtual samples, pose-perturbation virtual samples and various body-shape virtual samples with multi-views are generated to extend the training samples. The spatial-temporal features of the abnormal gait behaviour from different views, body pose and shape parameters are then extracted by a convolutional-neural-network-based Long Short-Term Memory network. These are projected onto a uniform pattern space using deep-learning-based multi-set canonical correlation analysis. Experiments on four publicly available datasets show that the proposed system performs well under various conditions.