122 research outputs found
Generative Adversarial Networks in Computer Vision: A Survey and Taxonomy
Generative adversarial networks (GANs) have been extensively studied in the
past few years. Arguably their most significant impact has been in the area of
computer vision where great advances have been made in challenges such as
plausible image generation, image-to-image translation, facial attribute
manipulation and similar domains. Despite the significant successes achieved to
date, applying GANs to real-world problems still poses significant challenges,
three of which we focus on here. These are: (1) the generation of high quality
images, (2) diversity of image generation, and (3) stable training. Focusing on
the degree to which popular GAN technologies have made progress against these
challenges, we provide a detailed review of the state of the art in GAN-related
research in the published scientific literature. We further structure this
review through a convenient taxonomy we have adopted based on variations in GAN
architectures and loss functions. While several reviews for GANs have been
presented to date, none have considered the status of this field based on their
progress towards addressing practical challenges relevant to computer vision.
Accordingly, we review and critically discuss the most popular
architecture-variant and loss-variant GANs for tackling these challenges. Our
objective is to provide an overview as well as a critical analysis of the
status of GAN research in terms of relevant progress towards important computer
vision application requirements. As we do this we also discuss the most
compelling applications in computer vision in which GANs have demonstrated
considerable success along with some suggestions for future research
directions. Code related to GAN-variants studied in this work is summarized on
https://github.com/sheqi/GAN_Review.
Comment: Accepted by ACM Computing Surveys, 23 November 202
Human segmentation in surveillance video with deep learning
Advanced intelligent surveillance systems can automatically analyze surveillance video without human intervention. These systems enable highly accurate human activity recognition, followed by high-level activity evaluation. To provide such features, an intelligent surveillance system requires a background subtraction scheme for human segmentation that extracts a sequence of images containing moving humans from the reference background image. This paper proposes an alternative approach to human segmentation in videos based on a deep convolutional neural network. Two specific datasets were created to train our network, using the shapes of 35 different moving actors arranged on background images of the area where the camera is located, allowing the network to take advantage of the entire site chosen for video surveillance. To assess the proposed approach, we compare our results with the Adobe Photoshop tool Select Subject, the conditional generative adversarial network Pix2Pix, and Yolact, a fully convolutional model for real-time instance segmentation. The results show that the main benefit of our method is the ability to automatically recognize and segment people in videos without constraints on camera or people movements in the scene (video, code and datasets are available at http://graphics.unibas.it/www/HumanSegmentation/index.md.html).