4,665 research outputs found

    The Right (Angled) Perspective: Improving the Understanding of Road Scenes Using Boosted Inverse Perspective Mapping

    Full text link
    Many tasks performed by autonomous vehicles such as road marking detection, object tracking, and path planning are simpler in bird's-eye view. Hence, Inverse Perspective Mapping (IPM) is often applied to remove the perspective effect from a vehicle's front-facing camera and to remap its images into a 2D domain, resulting in a top-down view. Unfortunately, however, this leads to unnatural blurring and stretching of objects at further distance, due to the resolution of the camera, limiting applicability. In this paper, we present an adversarial learning approach for generating a significantly improved IPM from a single camera image in real time. The generated bird's-eye-view images contain sharper features (e.g. road markings) and a more homogeneous illumination, while (dynamic) objects are automatically removed from the scene, thus revealing the underlying road layout in an improved fashion. We demonstrate our framework using real-world data from the Oxford RobotCar Dataset and show that scene understanding tasks directly benefit from our boosted IPM approach.Comment: equal contribution of first two authors, 8 full pages, 6 figures, accepted at IV 201

    ST-GAN: Spatial Transformer Generative Adversarial Networks for Image Compositing

    Full text link
    We address the problem of finding realistic geometric corrections to a foreground object such that it appears natural when composited into a background image. To achieve this, we propose a novel Generative Adversarial Network (GAN) architecture that utilizes Spatial Transformer Networks (STNs) as the generator, which we call Spatial Transformer GANs (ST-GANs). ST-GANs seek image realism by operating in the geometric warp parameter space. In particular, we exploit an iterative STN warping scheme and propose a sequential training strategy that achieves better results compared to naive training of a single generator. One of the key advantages of ST-GAN is its applicability to high-resolution images indirectly since the predicted warp parameters are transferable between reference frames. We demonstrate our approach in two applications: (1) visualizing how indoor furniture (e.g. from product images) might be perceived in a room, (2) hallucinating how accessories like glasses would look when matched with real portraits.Comment: Accepted to CVPR 2018 (website & code: https://chenhsuanlin.bitbucket.io/spatial-transformer-GAN/

    Generic Object Detection and Segmentation for Real-World Environments

    Get PDF

    Super-resolution assessment and detection

    Get PDF
    Super Resolution (SR) techniques are powerful digital manipulation tools that have significantly impacted various industries due to their ability to enhance the resolution of lower quality images and videos. Yet, the real-world adaptation of SR models poses numerous challenges, which blind SR models aim to overcome by emulating complex real-world degradations. In this thesis, we investigate these SR techniques, with a particular focus on comparing the performance of blind models to their non-blind counterparts under various conditions. Despite recent progress, the proliferation of SR techniques raises concerns about their potential misuse. These methods can easily manipulate real digital content and create misrepresentations, which highlights the need for robust SR detection mechanisms. In our study, we analyze the limitations of current SR detection techniques and propose a new detection system that exhibits higher performance in discerning real and upscaled videos. Moreover, we conduct several experiments to gain insights into the strengths and weaknesses of the detection models, providing a better understanding of their behavior and limitations. Particularly, we target 4K videos, which are rapidly becoming the standard resolution in various fields such as streaming services, gaming, and content creation. As part of our research, we have created and utilized a unique dataset in 4K resolution, specifically designed to facilitate the investigation of SR techniques and their detection
    • …
    corecore