122 research outputs found

    Generative Adversarial Networks in Computer Vision: A Survey and Taxonomy

    Get PDF
    Generative adversarial networks (GANs) have been extensively studied in the past few years. Arguably their most significant impact has been in the area of computer vision where great advances have been made in challenges such as plausible image generation, image-to-image translation, facial attribute manipulation and similar domains. Despite the significant successes achieved to date, applying GANs to real-world problems still poses significant challenges, three of which we focus on here. These are: (1) the generation of high quality images, (2) diversity of image generation, and (3) stable training. Focusing on the degree to which popular GAN technologies have made progress against these challenges, we provide a detailed review of the state of the art in GAN-related research in the published scientific literature. We further structure this review through a convenient taxonomy we have adopted based on variations in GAN architectures and loss functions. While several reviews for GANs have been presented to date, none have considered the status of this field based on their progress towards addressing practical challenges relevant to computer vision. Accordingly, we review and critically discuss the most popular architecture-variant, and loss-variant GANs, for tackling these challenges. Our objective is to provide an overview as well as a critical analysis of the status of GAN research in terms of relevant progress towards important computer vision application requirements. As we do this we also discuss the most compelling applications in computer vision in which GANs have demonstrated considerable success along with some suggestions for future research directions. Code related to GAN-variants studied in this work is summarized on https://github.com/sheqi/GAN_Review.Comment: Accepted by ACM Computing Surveys, 23 November 202

    Human segmentation in surveillance video with deep learning

    Get PDF
    Advanced intelligent surveillance systems are able to automatically analyze video of surveillance data without human intervention. These systems allow high accuracy of human activity recognition and then a high-level activity evaluation. To provide such features, an intelligent surveillance system requires a background subtraction scheme for human segmentation that captures a sequence of images containing moving humans from the reference background image. This paper proposes an alternative approach for human segmentation in videos through the use of a deep convolutional neural network. Two specific datasets were created to train our network, using the shapes of 35 different moving actors arranged on background images related to the area where the camera is located, allowing the network to take advantage of the entire site chosen for video surveillance. To assess the proposed approach, we compare our results with an Adobe Photoshop tool called Select Subject, the conditional generative adversarial network Pix2Pix, and the fully-convolutional model for real-time instance segmentation Yolact. The results show that the main benefit of our method is the possibility to automatically recognize and segment people in videos without constraints on camera and people movements in the scene (Video, code and datasets are available at http://graphics.unibas.it/www/HumanSegmentation/index.md.html)
    corecore