WAYLA - Generating Images from Eye Movements
We present a method for reconstructing images viewed by observers based only
on their eye movements. By exploring the relationships between gaze patterns
and image stimuli, the "What Are You Looking At?" (WAYLA) system learns to
synthesize photo-realistic images that are similar to the original pictures
being viewed. The WAYLA approach is based on the Conditional Generative
Adversarial Network (Conditional GAN) image-to-image translation technique of
Isola et al. We consider two specific applications: the first, reconstructing
newspaper images from gaze heat maps, and the second, detailed reconstruction
of images containing only text. The newspaper image
reconstruction process is divided into two image-to-image translation
operations, the first mapping gaze heat maps into image segmentations, and the
second mapping the generated segmentation into a newspaper image. We validate
the performance of our approach using various evaluation metrics, along with
human visual inspection. All results confirm the ability of our network to
perform image generation tasks using eye-tracking data.
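Although the abstract gives no code, the two-stage pipeline maps naturally onto chained image-to-image generators in the style of Isola et al.'s pix2pix. Below is a minimal, hypothetical PyTorch sketch; the tiny_generator and tiny_discriminator networks, tensor shapes, and loss weights are placeholder assumptions, not the WAYLA architecture.

```python
# Hypothetical two-stage sketch: gaze heat map -> segmentation -> image.
# Toy networks stand in for the pix2pix U-Net generator / PatchGAN discriminator.
import torch
import torch.nn as nn

def tiny_generator(in_ch, out_ch):
    # Placeholder for a U-Net generator as used in pix2pix (Isola et al.).
    return nn.Sequential(
        nn.Conv2d(in_ch, 64, 3, padding=1), nn.ReLU(),
        nn.Conv2d(64, out_ch, 3, padding=1), nn.Tanh())

def tiny_discriminator(in_ch):
    # Placeholder for a PatchGAN discriminator conditioned on the generator input.
    return nn.Sequential(
        nn.Conv2d(in_ch, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
        nn.Conv2d(64, 1, 4, stride=2, padding=1))

G1 = tiny_generator(1, 1)       # stage 1: heat map (1 ch) -> segmentation (1 ch)
G2 = tiny_generator(1, 3)       # stage 2: segmentation -> RGB newspaper image
D2 = tiny_discriminator(1 + 3)  # judges (segmentation, image) pairs

bce, l1 = nn.BCEWithLogitsLoss(), nn.L1Loss()
heatmap  = torch.randn(4, 1, 64, 64)  # dummy batch of gaze heat maps
real_seg = torch.randn(4, 1, 64, 64)  # ground-truth segmentations
real_img = torch.randn(4, 3, 64, 64)  # ground-truth newspaper images

fake_seg = G1(heatmap)
fake_img = G2(fake_seg)

# pix2pix-style generator objective: adversarial term plus weighted L1 term,
# with an extra L1 term supervising the intermediate segmentation.
pred = D2(torch.cat([fake_seg, fake_img], dim=1))
g_loss = (bce(pred, torch.ones_like(pred))
          + 100.0 * l1(fake_img, real_img)
          + l1(fake_seg, real_seg))
g_loss.backward()
```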
Adversarial Semantic Scene Completion from a Single Depth Image
We propose a method to reconstruct, complete and semantically label a 3D
scene from a single input depth image. We improve the accuracy of the regressed
semantic 3D maps by a novel architecture based on adversarial learning. In
particular, we suggest using multiple adversarial loss terms that enforce not
only realistic outputs with respect to the ground truth but also an effective
embedding of the internal features. This is done by correlating the
latent features of the encoder working on partial 2.5D data with the latent
features extracted from a variational 3D auto-encoder trained to reconstruct
the complete semantic scene. In addition, differently from other approaches
that operate entirely through 3D convolutions, at test time we retain the
original 2.5D structure of the input during downsampling to improve the
effectiveness of the internal representation of our model. We test our approach
on the main benchmark datasets for semantic scene completion to qualitatively
and quantitatively assess the effectiveness of our proposal.
Comment: 2018 International Conference on 3D Vision (3DV).
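As a purely illustrative reading of this loss design, the sketch below pulls the latent features of a partial-input 2.5D encoder toward the latents of a 3D variational auto-encoder trained on complete scenes, alongside two adversarial terms. The function names, shapes, the L2 form of the alignment term, and the unit weighting are assumptions, not the paper's exact formulation.

```python
# Hypothetical multi-term loss: adversarial terms on the completed volume and
# on the embedding, plus alignment of partial-input latents with the latents
# of a 3D VAE trained to reconstruct the complete semantic scene.
import torch
import torch.nn.functional as F

def latent_alignment_loss(z_partial, z_complete):
    # Pull the 2.5D encoder's embedding toward the (frozen) 3D VAE latent.
    return F.mse_loss(z_partial, z_complete.detach())

def generator_loss(d_out_volume, d_out_latent, z_partial, z_complete):
    ones_v = torch.ones_like(d_out_volume)
    ones_z = torch.ones_like(d_out_latent)
    adv_volume = F.binary_cross_entropy_with_logits(d_out_volume, ones_v)
    adv_latent = F.binary_cross_entropy_with_logits(d_out_latent, ones_z)
    return adv_volume + adv_latent + latent_alignment_loss(z_partial, z_complete)

# Dummy shapes: discriminator logits and 256-d latent codes for a batch of 2.
d_vol, d_lat = torch.randn(2, 1), torch.randn(2, 1)
z_p, z_c = torch.randn(2, 256), torch.randn(2, 256)
print(generator_loss(d_vol, d_lat, z_p, z_c))
```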
Modelling the Distribution of 3D Brain MRI using a 2D Slice VAE
Probabilistic modelling has been an essential tool in medical image analysis,
especially for analyzing brain Magnetic Resonance Images (MRI). Recent deep
learning techniques for estimating high-dimensional distributions, in
particular Variational Autoencoders (VAEs), opened up new avenues for
probabilistic modelling. Modelling volumetric data has remained a challenge,
however, because constraints on available computation and training data make it
difficult to effectively leverage VAEs, which are well-developed for 2D images. We
propose a method to model the distribution of 3D MR brain volumes by combining a 2D
slice VAE with a Gaussian model that captures the relationships between slices.
We do so by estimating the sample mean and covariance in the latent space of
the 2D model over the slice direction. This combined model lets us sample new
coherent stacks of latent variables to decode into slices of a volume. We also
introduce a novel evaluation method for generated volumes that quantifies how
well their segmentations match those of true brain anatomy. We demonstrate that
our proposed model is competitive in generating high quality volumes at high
resolutions according to both traditional metrics and our proposed evaluation.
Comment: Accepted for publication at MICCAI 2020. Code available at
https://github.com/voanna/slices-to-3d-brain-vae
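The Gaussian model over slice latents can be summarised in a few lines. Here is a minimal NumPy sketch under assumed shapes; vae_decode stands in for the 2D slice VAE's decoder, and the authors' actual implementation is in the repository linked above.

```python
# Minimal sketch: treat each volume's stack of slice latents as one long
# vector, fit a Gaussian (sample mean and covariance) over the training set,
# then draw a new coherent stack of latents and decode it slice by slice.
import numpy as np

n_vols, n_slices, latent_dim = 200, 8, 4   # assumed, not the paper's sizes
rng = np.random.default_rng(0)

# Stand-in for slice-VAE encodings: z[i] holds volume i's slice latents,
# flattened to a single vector of length n_slices * latent_dim.
z = rng.normal(size=(n_vols, n_slices * latent_dim))

mu = z.mean(axis=0)               # sample mean over the training volumes
cov = np.cov(z, rowvar=False)     # captures how neighbouring slices co-vary

# One draw from N(mu, cov) yields a coherent stack of slice latents.
z_new = rng.multivariate_normal(mu, cov).reshape(n_slices, latent_dim)
# volume = np.stack([vae_decode(z_s) for z_s in z_new])  # hypothetical decoder
```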
- …