77 research outputs found

    Learning to Annotate Part Segmentation with Gradient Matching

    Full text link
    The success of state-of-the-art deep neural networks heavily relies on the presence of large-scale labelled datasets, which are extremely expensive and time-consuming to annotate. This paper focuses on tackling semi-supervised part segmentation tasks by generating high-quality images with a pre-trained GAN and labelling the generated images with an automatic annotator. In particular, we formulate the annotator learning as a learning-to-learn problem. Given a pre-trained GAN, the annotator learns to label object parts in a set of randomly generated images such that a part segmentation model trained on these synthetic images with their predicted labels obtains low segmentation error on a small validation set of manually labelled images. We further reduce this nested-loop optimization problem to a simple gradient matching problem and efficiently solve it with an iterative algorithm. We show that our method can learn annotators from a broad range of labelled images including real images, generated images, and even analytically rendered images. Our method is evaluated with semi-supervised part segmentation tasks and significantly outperforms other semi-supervised competitors when the amount of labelled examples is extremely limited.Comment: ICLR 202

    Recent Advances in Deep Learning Techniques for Face Recognition

    Full text link
    In recent years, researchers have proposed many deep learning (DL) methods for various tasks, and particularly face recognition (FR) made an enormous leap using these techniques. Deep FR systems benefit from the hierarchical architecture of the DL methods to learn discriminative face representation. Therefore, DL techniques significantly improve state-of-the-art performance on FR systems and encourage diverse and efficient real-world applications. In this paper, we present a comprehensive analysis of various FR systems that leverage the different types of DL techniques, and for the study, we summarize 168 recent contributions from this area. We discuss the papers related to different algorithms, architectures, loss functions, activation functions, datasets, challenges, improvement ideas, current and future trends of DL-based FR systems. We provide a detailed discussion of various DL methods to understand the current state-of-the-art, and then we discuss various activation and loss functions for the methods. Additionally, we summarize different datasets used widely for FR tasks and discuss challenges related to illumination, expression, pose variations, and occlusion. Finally, we discuss improvement ideas, current and future trends of FR tasks.Comment: 32 pages and citation: M. T. H. Fuad et al., "Recent Advances in Deep Learning Techniques for Face Recognition," in IEEE Access, vol. 9, pp. 99112-99142, 2021, doi: 10.1109/ACCESS.2021.309613

    The robustness of animated text CAPTCHAs

    Get PDF
    PhD ThesisCAPTCHA is standard security technology that uses AI techniques to tells computer and human apart. The most widely used CAPTCHA are text-based CAPTCHA schemes. The robustness and usability of these CAPTCHAs relies mainly on the segmentation resistance mechanism that provides robustness against individual character recognition attacks. However, many CAPTCHAs have been shown to have critical flaws caused by many exploitable invariants in their design, leaving only a few CAPTCHA schemes resistant to attacks, including ReCAPTCHA and the Wikipedia CAPTCHA. Therefore, new alternative approaches to add motion to the CAPTCHA are used to add another dimension to the character cracking algorithms by animating the distorted characters and the background, which are also supported by tracking resistance mechanisms that prevent the attacks from identifying the main answer through frame-toframe attacks. These technologies are used in many of the new CAPTCHA schemes including the Yahoo CAPTCHA, CAPTCHANIM, KillBot CAPTCHAs, non-standard CAPTCHA and NuCAPTCHA. Our first question: can the animated techniques included in the new CAPTCHA schemes provide the required level of robustness against the attacks? Our examination has shown many of the CAPTCHA schemes that use the animated features can be broken through tracking attacks including the CAPTCHA schemes that uses complicated tracking resistance mechanisms. The second question: can the segmentation resistance mechanism used in the latest standard text-based CAPTCHA schemes still provide the additional required level of resistance against attacks that are not present missed in animated schemes? Our test against the latest version of ReCAPTCHA and the Wikipedia CAPTCHA exposed vulnerability problems against the novel attacks mechanisms that achieved a high success rate against them. The third question: how much space is available to design an animated text-based CAPTCHA scheme that could provide a good balance between security and usability? We designed a new animated text-based CAPTCHA using guidelines we designed based on the results of our attacks on standard and animated text-based CAPTCHAs, and we then tested its security and usability to answer this question. ii In this thesis, we put forward different approaches to examining the robustness of animated text-based CAPTCHA schemes and other standard text-based CAPTCHA schemes against segmentation and tracking attacks. Our attacks included several methodologies that required thinking skills in order to distinguish the animated text from the other animated noises, including the text distorted by highly tracking resistance mechanisms that displayed them partially as animated segments and which looked similar to noises in other CAPTCHA schemes. These attacks also include novel attack mechanisms and other mechanisms that uses a recognition engine supported by attacking methods that exploit the identified invariants to recognise the connected characters at once. Our attacks also provided a guideline for animated text-based CAPTCHAs that could provide resistance to tracking and segmentation attacks which we designed and tested in terms of security and usability, as mentioned before. Our research also contributes towards providing a toolbox for breaking CAPTCHAs in addition to a list of robustness and usability issues in the current CAPTCHA design that can be used to provide a better understanding of how to design a more resistant CAPTCHA scheme

    A survey on generative adversarial networks for imbalance problems in computer vision tasks

    Get PDF
    Any computer vision application development starts off by acquiring images and data, then preprocessing and pattern recognition steps to perform a task. When the acquired images are highly imbalanced and not adequate, the desired task may not be achievable. Unfortunately, the occurrence of imbalance problems in acquired image datasets in certain complex real-world problems such as anomaly detection, emotion recognition, medical image analysis, fraud detection, metallic surface defect detection, disaster prediction, etc., are inevitable. The performance of computer vision algorithms can significantly deteriorate when the training dataset is imbalanced. In recent years, Generative Adversarial Neural Networks (GANs) have gained immense attention by researchers across a variety of application domains due to their capability to model complex real-world image data. It is particularly important that GANs can not only be used to generate synthetic images, but also its fascinating adversarial learning idea showed good potential in restoring balance in imbalanced datasets. In this paper, we examine the most recent developments of GANs based techniques for addressing imbalance problems in image data. The real-world challenges and implementations of synthetic image generation based on GANs are extensively covered in this survey. Our survey first introduces various imbalance problems in computer vision tasks and its existing solutions, and then examines key concepts such as deep generative image models and GANs. After that, we propose a taxonomy to summarize GANs based techniques for addressing imbalance problems in computer vision tasks into three major categories: 1. Image level imbalances in classification, 2. object level imbalances in object detection and 3. pixel level imbalances in segmentation tasks. We elaborate the imbalance problems of each group, and provide GANs based solutions in each group. Readers will understand how GANs based techniques can handle the problem of imbalances and boost performance of the computer vision algorithms
    • …
    corecore