69 research outputs found

    WAYLA - Generating Images from Eye Movements

    Full text link
    We present a method for reconstructing images viewed by observers based only on their eye movements. By exploring the relationships between gaze patterns and image stimuli, the "What Are You Looking At?" (WAYLA) system learns to synthesize photo-realistic images that are similar to the original pictures being viewed. The WAYLA approach is based on the Conditional Generative Adversarial Network (Conditional GAN) image-to-image translation technique of Isola et al. We consider two specific applications - the first, of reconstructing newspaper images from gaze heat maps, and the second, of detailed reconstruction of images containing only text. The newspaper image reconstruction process is divided into two image-to-image translation operations, the first mapping gaze heat maps into image segmentations, and the second mapping the generated segmentation into a newspaper image. We validate the performance of our approach using various evaluation metrics, along with human visual inspection. All results confirm the ability of our network to perform image generation tasks using eye tracking data

    Adversarial Semantic Scene Completion from a Single Depth Image

    Full text link
    We propose a method to reconstruct, complete and semantically label a 3D scene from a single input depth image. We improve the accuracy of the regressed semantic 3D maps by a novel architecture based on adversarial learning. In particular, we suggest using multiple adversarial loss terms that not only enforce realistic outputs with respect to the ground truth, but also an effective embedding of the internal features. This is done by correlating the latent features of the encoder working on partial 2.5D data with the latent features extracted from a variational 3D auto-encoder trained to reconstruct the complete semantic scene. In addition, differently from other approaches that operate entirely through 3D convolutions, at test time we retain the original 2.5D structure of the input during downsampling to improve the effectiveness of the internal representation of our model. We test our approach on the main benchmark datasets for semantic scene completion to qualitatively and quantitatively assess the effectiveness of our proposal.Comment: 2018 International Conference on 3D Vision (3DV

    Modelling the Distribution of 3D Brain MRI using a 2D Slice VAE

    Full text link
    Probabilistic modelling has been an essential tool in medical image analysis, especially for analyzing brain Magnetic Resonance Images (MRI). Recent deep learning techniques for estimating high-dimensional distributions, in particular Variational Autoencoders (VAEs), opened up new avenues for probabilistic modeling. Modelling of volumetric data has remained a challenge, however, because constraints on available computation and training data make it difficult effectively leverage VAEs, which are well-developed for 2D images. We propose a method to model 3D MR brain volumes distribution by combining a 2D slice VAE with a Gaussian model that captures the relationships between slices. We do so by estimating the sample mean and covariance in the latent space of the 2D model over the slice direction. This combined model lets us sample new coherent stacks of latent variables to decode into slices of a volume. We also introduce a novel evaluation method for generated volumes that quantifies how well their segmentations match those of true brain anatomy. We demonstrate that our proposed model is competitive in generating high quality volumes at high resolutions according to both traditional metrics and our proposed evaluation.Comment: accepted for publication at MICCAI 2020. Code available https://github.com/voanna/slices-to-3d-brain-vae
    • …
    corecore