2,794 research outputs found

    Deep Multi-task Multi-label CNN for Effective Facial Attribute Classification

    Facial Attribute Classification (FAC) has attracted increasing attention in computer vision and pattern recognition. However, state-of-the-art FAC methods perform face detection/alignment and FAC independently, so the inherent dependencies between these tasks are not fully exploited. In addition, most methods predict all facial attributes with the same CNN architecture, ignoring the different learning complexities of the attributes. To address the above problems, we propose a novel deep multi-task multi-label CNN, termed DMM-CNN, for effective FAC. Specifically, DMM-CNN jointly optimizes two closely related tasks (i.e., facial landmark detection and FAC) to improve the performance of FAC by taking advantage of multi-task learning. To deal with the diverse learning complexities of facial attributes, we divide the attributes into two groups: objective attributes and subjective attributes. Two different network architectures are respectively designed to extract features for the two groups of attributes, and a novel dynamic weighting scheme is proposed to automatically assign the loss weight to each facial attribute during training. Furthermore, an adaptive thresholding strategy is developed to effectively alleviate the problem of class imbalance in multi-label learning. Experimental results on the challenging CelebA and LFWA datasets show the superiority of the proposed DMM-CNN method over several state-of-the-art FAC methods.
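    The adaptive thresholding idea can be sketched generically: instead of a fixed 0.5 cut-off on every sigmoid output, each attribute gets its own decision threshold tied to its positive-label rate in the training set. The rule below is a minimal illustration under that assumption, not the paper's exact strategy.

```python
import numpy as np

def adaptive_thresholds(scores, train_pos_rate):
    # Per-attribute decision thresholds chosen so that the fraction of
    # predicted positives matches each attribute's positive-label rate in
    # the training set (mitigating class imbalance in multi-label learning).
    # scores: (n_samples, n_attrs) sigmoid outputs in [0, 1]
    # train_pos_rate: (n_attrs,) fraction of positive labels per attribute
    return np.array([np.quantile(scores[:, j], 1.0 - r)
                     for j, r in enumerate(train_pos_rate)])

rng = np.random.default_rng(0)
scores = rng.random((2000, 3))              # stand-in sigmoid outputs
rates = np.array([0.05, 0.3, 0.5])          # per-attribute positive rates
th = adaptive_thresholds(scores, rates)
preds = scores >= th                        # predictions respect class balance
```

With a fixed 0.5 threshold, a rare attribute (e.g. 5% positives) would be predicted positive far too often on such scores; the per-attribute quantile keeps the predicted positive rate aligned with the training distribution.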

    Spatial-Contextual Discrepancy Information Compensation for GAN Inversion

    Most existing GAN inversion methods either achieve accurate reconstruction but lack editability or offer strong editability at the cost of fidelity. Hence, how to balance the distortion-editability trade-off is a significant challenge for GAN inversion. To address this challenge, we introduce a novel spatial-contextual discrepancy information compensation-based GAN-inversion method (SDIC), which consists of a discrepancy information prediction network (DIPN) and a discrepancy information compensation network (DICN). SDIC follows a "compensate-and-edit" paradigm and successfully bridges the gap in image details between the original image and the reconstructed/edited image. On the one hand, DIPN encodes the multi-level spatial-contextual information of the original and initial reconstructed images and then predicts a spatial-contextual guided discrepancy map with two hourglass modules. In this way, a reliable discrepancy map that models the contextual relationship and captures fine-grained image details is learned. On the other hand, DICN incorporates the predicted discrepancy information into both the latent code and the GAN generator with different transformations, generating high-quality reconstructed/edited images. This effectively compensates for the loss of image details during GAN inversion. Both quantitative and qualitative experiments demonstrate that our proposed method achieves an excellent distortion-editability trade-off at a fast inference speed for both image inversion and editing tasks.

    A new controller design of electro-hydraulic servo system based on empirical mode decomposition

    The signal of an electro-hydraulic servo system is non-stationary and time-varying due to the influence of vibration, noise, and mechanical impact. The traditional digital filter always suffers delay in the time domain, and the delay increases with frequency. Considering the features of the electro-hydraulic servo system, the Hilbert-Huang transform is an effective method to decompose the original signal and obtain the noise components. In this paper, some improvements are made to the Hilbert-Huang transform method and a new real-time on-line filtering method is proposed. The improved filter is able to separate the noise and other interference components from the original signal and remove them in real time. Based on this new on-line filter, a new controller is also designed. Compared with the traditional digital filter, the new controller achieves much better control performance.
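    The filtering idea (decompose the signal, identify the fast oscillatory components, subtract them) can be sketched with a bare-bones empirical-mode-decomposition sift in NumPy. This is only an offline illustration of the principle behind the Hilbert-Huang approach; the paper's contribution is a real-time on-line variant, which this sketch does not reproduce.

```python
import numpy as np

def sift(x, n_iter=8):
    # Simplified EMD sifting: extract the fastest oscillatory component
    # (first IMF) by repeatedly subtracting the mean of linear envelopes
    # through the local maxima and minima.
    h = x.copy()
    idx = np.arange(len(x))
    for _ in range(n_iter):
        d = np.diff(h)
        maxima = np.where((d[:-1] > 0) & (d[1:] <= 0))[0] + 1
        minima = np.where((d[:-1] < 0) & (d[1:] >= 0))[0] + 1
        if len(maxima) < 2 or len(minima) < 2:
            break
        # envelopes via linear interpolation, endpoints pinned to the signal
        up = np.interp(idx, np.r_[0, maxima, len(x) - 1],
                       np.r_[h[0], h[maxima], h[-1]])
        lo = np.interp(idx, np.r_[0, minima, len(x) - 1],
                       np.r_[h[0], h[minima], h[-1]])
        h = h - (up + lo) / 2.0
    return h  # approximate first IMF (highest-frequency content)

# Denoise by subtracting the first IMF, treated here as the noise component.
t = np.linspace(0.0, 1.0, 2000)
clean = np.sin(2 * np.pi * 2 * t)                  # slow servo signal
noisy = clean + 0.3 * np.sin(2 * np.pi * 80 * t)   # high-frequency disturbance
imf1 = sift(noisy)
filtered = noisy - imf1
```

Unlike a conventional low-pass filter, the subtraction happens sample-for-sample against the decomposed component, which is why the EMD route avoids the frequency-dependent phase delay the abstract criticizes.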

    Terrace-like structure in the above-threshold ionization spectrum of an atom in an IR+XUV two-color laser field

    Based on the frequency-domain theory, we investigate the above-threshold ionization (ATI) process of an atom in a two-color laser field with infrared (IR) and extreme ultraviolet (XUV) frequencies, where the photon energy of the XUV laser is close to or larger than the atomic ionization threshold. Using channel analysis, we find that the two laser fields play different roles in the ionization process: the XUV laser determines the ionization probability through the number of photons the atom absorbs from it, while the IR laser accelerates the ionized electron and hence widens the electron kinetic energy spectrum. As a result, the ATI spectrum presents a terrace-like structure. Using the saddle-point approximation, we obtain a classical formula which can predict the cutoff of each plateau in the terrace-like ATI spectrum. Furthermore, we find that the difference in height between two neighboring plateaus in the terrace-like structure of the ATI spectrum increases as the frequency of the XUV laser increases.
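    In atomic units, the classical ("simple-man") picture behind such a cutoff can be sketched numerically: the electron leaves with a drift momentum set by the absorbed XUV photons, and the IR vector potential at the ionization time shifts its final momentum, spreading each XUV photon-number channel into a plateau. All parameter values below are illustrative assumptions, not the paper's laser settings, and the formula is the generic classical estimate rather than the paper's derived expression.

```python
import numpy as np

# Classical sketch of the IR-induced energy spread, in atomic units.
Ip = 0.5        # hydrogen ionization potential (a.u.) -- assumed
w_xuv = 1.0     # XUV photon energy (a.u.), above threshold -- assumed
q = 2           # number of XUV photons absorbed (plateau index)
A0 = 0.5        # IR vector-potential amplitude (a.u.) -- assumed

# Drift momentum from absorbing q XUV photons.
p0 = np.sqrt(2.0 * (q * w_xuv - Ip))
# The IR vector potential at the ionization phase shifts the final momentum.
phases = np.linspace(0.0, 2.0 * np.pi, 1001)
E_final = 0.5 * (p0 + A0 * np.sin(phases)) ** 2
cutoff = E_final.max()   # upper edge of this plateau: (p0 + A0)^2 / 2
```

Scanning the plateau index q reproduces the terrace: each additional XUV photon raises the channel's drift energy, and the IR field broadens each channel into a plateau of its own.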

    When Sparse Neural Network Meets Label Noise Learning: A Multistage Learning Framework

    Recent methods in network pruning have indicated that a dense neural network involves a sparse subnetwork (called a winning ticket), which can achieve similar test accuracy to its dense counterpart with far fewer network parameters. Generally, these methods search for the winning tickets on well-labeled data. Unfortunately, in many real-world applications, the training data are unavoidably contaminated with noisy labels, thereby leading to performance deterioration of these methods. To address the above-mentioned problem, we propose a novel two-stream sample selection network (TS3-Net), which consists of a sparse subnetwork and a dense subnetwork, to effectively identify the winning ticket with noisy labels. The training of TS3-Net contains an iterative procedure that switches between training both subnetworks and pruning the smallest-magnitude weights of the sparse subnetwork. In particular, we develop a multistage learning framework including a warm-up stage, a semisupervised alternate learning stage, and a label refinement stage, to progressively train the two subnetworks. In this way, the classification capability of the sparse subnetwork can be gradually improved at a high sparsity level. Extensive experimental results on both synthetic and real-world noisy datasets (including MNIST, CIFAR-10, CIFAR-100, ANIMAL-10N, Clothing1M, and WebVision) demonstrate that our proposed method achieves state-of-the-art performance with very small memory consumption for label noise learning. Code is available at https://github.com/Runqing-forMost/TS3-Net/tree/master.
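    The pruning half of the iterative procedure, removing the smallest-magnitude weights, can be sketched as a generic magnitude-pruning step. Whether pruning is applied globally or per layer is not stated in the abstract, so global pruning over one weight tensor is assumed here.

```python
import numpy as np

def prune_smallest(weights, sparsity):
    # Zero out the `sparsity` fraction of weights with the smallest
    # magnitude, keeping the rest intact (magnitude pruning).
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy()
    threshold = np.partition(flat, k - 1)[k - 1]  # k-th smallest magnitude
    mask = np.abs(weights) > threshold
    return weights * mask

rng = np.random.default_rng(0)
w = rng.normal(size=(100, 100))       # stand-in weight matrix
w_pruned = prune_smallest(w, 0.8)     # 80% of entries zeroed
```

In lottery-ticket-style training this step alternates with further training, so the surviving weights are repeatedly fine-tuned at an increasing sparsity level.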

    Drop Loss for Person Attribute Recognition With Imbalanced Noisy-Labeled Samples

    Person attribute recognition (PAR) aims to simultaneously predict multiple attributes of a person. Existing deep learning-based PAR methods have achieved impressive performance. Unfortunately, these methods usually ignore the fact that different attributes have an imbalance in the number of noisy-labeled samples in the PAR training datasets, thus leading to suboptimal performance. To address this problem of imbalanced noisy-labeled samples, we propose a novel and effective loss, called drop loss, for PAR. In the drop loss, the attributes are treated differently in an easy-to-hard manner. In particular, the noisy-labeled candidates, which are identified according to their gradient norms, are dropped with a higher drop rate for the harder attributes. This adaptively alleviates the adverse effect of imbalanced noisy-labeled samples on model learning. To illustrate the effectiveness of the proposed loss, we train a simple ResNet-50 model based on the drop loss and term it DropNet. Experimental results on two representative PAR tasks (facial attribute recognition and pedestrian attribute recognition) demonstrate that the proposed DropNet achieves comparable or better performance in terms of both balanced accuracy and classification accuracy over several state-of-the-art PAR methods.
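    The core mechanism, dropping the samples with the largest gradient norms at an attribute-specific rate, can be sketched for a binary-cross-entropy setting, where |sigmoid(z) − y| is a standard per-sample gradient-norm proxy. This is an illustration of the idea only; the paper's exact gradient-norm definition and easy-to-hard schedule are not reproduced.

```python
import numpy as np

def drop_loss(logits, labels, drop_rate):
    # Per attribute, samples whose gradient-norm proxy |sigmoid(z) - y| is
    # largest are treated as likely noisy and dropped at an
    # attribute-specific rate before the BCE loss is averaged.
    # logits, labels: (n_samples, n_attrs); drop_rate: per-attribute rates.
    p = 1.0 / (1.0 + np.exp(-logits))
    grad_norm = np.abs(p - labels)   # BCE gradient magnitude w.r.t. logits
    bce = -(labels * np.log(p + 1e-12) + (1 - labels) * np.log(1 - p + 1e-12))
    losses = []
    for j, r in enumerate(drop_rate):
        keep = int((1.0 - r) * len(logits))
        order = np.argsort(grad_norm[:, j])      # smallest gradient norm first
        losses.append(bce[order[:keep], j].mean())
    return np.mean(losses)
```

A harder (noisier) attribute gets a larger entry in `drop_rate`, so more of its high-gradient candidates are excluded, which is how the loss treats attributes differently.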

    Knowledge Distillation Meets Label Noise Learning: Ambiguity-Guided Mutual Label Refinery

    Knowledge distillation (KD), which aims at transferring the knowledge from a complex network (a teacher) to a simpler and smaller network (a student), has received considerable attention in recent years. Typically, most existing KD methods work on well-labeled data. Unfortunately, real-world data often inevitably involve noisy labels, thus leading to performance deterioration of these methods. In this article, we study a little-explored but important issue, i.e., KD with noisy labels. To this end, we propose a novel KD method, called ambiguity-guided mutual label refinery KD (AML-KD), to train the student model in the presence of noisy labels. Specifically, based on the pretrained teacher model, a two-stage label refinery framework is introduced to refine labels gradually. In the first stage, we perform label propagation (LP) with small-loss selection guided by the teacher model, improving the learning capability of the student model. In the second stage, we perform mutual LP between the teacher and student models in a mutual-benefit way. During the label refinery, an ambiguity-aware weight estimation (AWE) module is developed to address the problem of ambiguous samples, avoiding overfitting these samples. One distinct advantage of AML-KD is that it is capable of learning a high-accuracy and low-cost student model with label noise. The experimental results on synthetic and real-world noisy datasets show the effectiveness of our AML-KD against state-of-the-art KD methods and label noise learning (LNL) methods. Code is available at https://github.com/Runqing-forMost/AML-KD.
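    The small-loss selection heuristic used in the first stage can be sketched in a few lines: samples on which the (teacher-guided) loss is smallest are taken as likely clean. The label-propagation and AWE modules are beyond this illustration.

```python
import numpy as np

def small_loss_select(losses, clean_fraction):
    # Return the indices of the `clean_fraction` of samples with the
    # smallest loss, treated as likely-clean for the next training round.
    k = int(clean_fraction * len(losses))
    return np.argsort(losses)[:k]
```

In practice the per-sample losses would come from the pretrained teacher, and `clean_fraction` is tied to the estimated noise rate; both are assumptions of this sketch.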