3,929 research outputs found

    GANVO: Unsupervised Deep Monocular Visual Odometry and Depth Estimation with Generative Adversarial Networks

    Full text link
    In the last decade, supervised deep learning approaches have been extensively employed in visual odometry (VO) applications, which is not feasible in environments where labelled data is not abundant. On the other hand, unsupervised deep learning approaches for localization and mapping in unknown environments from unlabelled data have received comparatively less attention in VO research. In this study, we propose a generative unsupervised learning framework that predicts 6-DoF pose camera motion and monocular depth map of the scene from unlabelled RGB image sequences, using deep convolutional Generative Adversarial Networks (GANs). We create a supervisory signal by warping view sequences and assigning the re-projection minimization to the objective loss function that is adopted in multi-view pose estimation and single-view depth generation network. Detailed quantitative and qualitative evaluations of the proposed framework on the KITTI and Cityscapes datasets show that the proposed method outperforms both existing traditional and unsupervised deep VO methods providing better results for both pose estimation and depth recovery.Comment: ICRA 2019 - accepte

    An Adversarial Super-Resolution Remedy for Radar Design Trade-offs

    Full text link
    Radar is of vital importance in many fields, such as autonomous driving, safety and surveillance applications. However, it suffers from stringent constraints on its design parametrization leading to multiple trade-offs. For example, the bandwidth in FMCW radars is inversely proportional with both the maximum unambiguous range and range resolution. In this work, we introduce a new method for circumventing radar design trade-offs. We propose the use of recent advances in computer vision, more specifically generative adversarial networks (GANs), to enhance low-resolution radar acquisitions into higher resolution counterparts while maintaining the advantages of the low-resolution parametrization. The capability of the proposed method was evaluated on the velocity resolution and range-azimuth trade-offs in micro-Doppler signatures and FMCW uniform linear array (ULA) radars, respectively.Comment: Accepted in EUSIPCO 2019, 5 page

    Dense 3D Object Reconstruction from a Single Depth View

    Get PDF
    In this paper, we propose a novel approach, 3D-RecGAN++, which reconstructs the complete 3D structure of a given object from a single arbitrary depth view using generative adversarial networks. Unlike existing work which typically requires multiple views of the same object or class labels to recover the full 3D geometry, the proposed 3D-RecGAN++ only takes the voxel grid representation of a depth view of the object as input, and is able to generate the complete 3D occupancy grid with a high resolution of 256^3 by recovering the occluded/missing regions. The key idea is to combine the generative capabilities of autoencoders and the conditional Generative Adversarial Networks (GAN) framework, to infer accurate and fine-grained 3D structures of objects in high-dimensional voxel space. Extensive experiments on large synthetic datasets and real-world Kinect datasets show that the proposed 3D-RecGAN++ significantly outperforms the state of the art in single view 3D object reconstruction, and is able to reconstruct unseen types of objects.Comment: TPAMI 2018. Code and data are available at: https://github.com/Yang7879/3D-RecGAN-extended. This article extends from arXiv:1708.0796

    Multimodal Sensor Fusion In Single Thermal image Super-Resolution

    Full text link
    With the fast growth in the visual surveillance and security sectors, thermal infrared images have become increasingly necessary ina large variety of industrial applications. This is true even though IR sensors are still more expensive than their RGB counterpart having the same resolution. In this paper, we propose a deep learning solution to enhance the thermal image resolution. The following results are given:(I) Introduction of a multimodal, visual-thermal fusion model that ad-dresses thermal image super-resolution, via integrating high-frequency information from the visual image. (II) Investigation of different net-work architecture schemes in the literature, their up-sampling methods,learning procedures, and their optimization functions by showing their beneficial contribution to the super-resolution problem. (III) A bench-mark ULB17-VT dataset that contains thermal images and their visual images counterpart is presented. (IV) Presentation of a qualitative evaluation of a large test set with 58 samples and 22 raters which shows that our proposed model performs better against state-of-the-arts
    • …
    corecore