Dynamic Loss For Robust Learning
Label noise and class imbalance commonly coexist in real-world data. Previous
works on robust learning, however, usually address only one type of data
bias and underperform when facing both. To bridge this gap, this work
presents a novel meta-learning-based dynamic loss that automatically adjusts
the objective functions with the training process to robustly learn a
classifier from long-tailed noisy data. Concretely, our dynamic loss comprises
a label corrector and a margin generator, which respectively correct noisy
labels and generate additive per-class classification margins by perceiving the
underlying data distribution as well as the learning state of the classifier.
Equipped with a new hierarchical sampling strategy that enriches a small amount
of unbiased metadata with diverse and hard samples, the two components in the
dynamic loss are optimized jointly through meta-learning and guide the
classifier to adapt well to clean and balanced test data. Extensive experiments
show our method achieves state-of-the-art accuracy on multiple real-world and
synthetic datasets with various types of data biases, including CIFAR-10/100,
Animal-10N, ImageNet-LT, and WebVision. Code will soon be publicly available.
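The additive per-class margins can be illustrated with a small numerical sketch. This is not the paper's learned margin generator: the margin values below are hypothetical constants, and the loss simply subtracts the true-class margin from its logit (in the spirit of margin-based losses such as LDAM) so the classifier must beat the other classes by that margin.

```python
import numpy as np

def margin_cross_entropy(logits, targets, margins):
    """Cross-entropy with an additive per-class margin.

    Subtracting margins[y] from the true-class logit forces the network
    to score the true class higher by that margin, which typically helps
    tail classes.  In the paper the margins come from a learned
    generator; here they are fixed, illustrative values.
    """
    logits = np.asarray(logits, dtype=float).copy()
    n = len(targets)
    logits[np.arange(n), targets] -= margins[targets]
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(n), targets].mean()

logits = np.array([[2.0, 0.5], [0.2, 1.5]])
targets = np.array([0, 1])
no_margin = margin_cross_entropy(logits, targets, np.zeros(2))
with_margin = margin_cross_entropy(logits, targets, np.array([1.0, 1.0]))
# a positive margin makes the same predictions look less confident,
# so the loss (and hence the gradient pressure) increases
```

A dynamic loss would adjust these margin values during training rather than fixing them in advance.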
Automated optical inspection of FAST’s reflector surface using drones and computer vision
The Five-hundred-meter Aperture Spherical radio Telescope (FAST) is the world's largest single-dish radio telescope. Its large reflecting surface achieves unprecedented sensitivity but is prone to damage, such as dents and holes, caused by naturally occurring falling objects. Hence, the timely and accurate detection of surface defects is crucial for FAST's stable operation. Conventional manual inspection involves human inspectors climbing up and examining the large surface visually, a time-consuming and potentially unreliable process. To accelerate the inspection process and increase its accuracy, this work takes the first step towards automating the inspection of FAST by integrating deep-learning techniques with drone technology. First, a drone flies over the surface along a predetermined route. Since surface defects vary significantly in scale and show high inter-class similarity, directly applying existing deep detectors to drone imagery is highly prone to missing and misidentifying defects. As a remedy, we introduce cross-fusion, a dedicated plug-in operation for deep detectors that enables the adaptive fusion of multi-level features in a point-wise selective fashion, depending on local defect patterns. Consequently, strong semantics and fine-grained details are dynamically fused at different positions to support the accurate detection of defects of various scales and types. Our AI-powered drone-based automated inspection is time-efficient, reliable, and highly accessible, which supports the long-term and stable operation of FAST.
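A minimal sketch of point-wise selective fusion, assuming the multi-level feature maps have already been resized to a common resolution; the gate logits below stand in for the learned selection branch, whose exact design is not specified here.

```python
import numpy as np

def pointwise_selective_fusion(features, gate_logits):
    """Fuse L feature maps (each C x H x W) with per-position weights.

    A softmax over the level axis turns gate_logits (L x H x W) into
    per-pixel mixing weights, so each spatial position can independently
    favour fine-grained detail (shallow levels) or strong semantics
    (deep levels), depending on the local defect pattern.
    """
    g = gate_logits - gate_logits.max(axis=0, keepdims=True)
    gates = np.exp(g)
    gates /= gates.sum(axis=0, keepdims=True)       # (L, H, W), sums to 1
    stacked = np.stack(features, axis=0)            # (L, C, H, W)
    return (gates[:, None] * stacked).sum(axis=0)   # (C, H, W)

rng = np.random.default_rng(0)
feats = [rng.normal(size=(3, 4, 4)) for _ in range(2)]
gates = np.zeros((2, 4, 4))
gates[0] = 50.0                       # heavily favour level 0 everywhere
fused = pointwise_selective_fusion(feats, gates)
```

Because the weights are computed per position, different pixels of the same image can draw on different feature levels, which is the key difference from a global (per-image) fusion weight.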
Tree-CNN: from generalization to specialization
Traditional convolutional neural networks (CNNs) classify all categories with a single network, passing every sample through exactly the same network flow. In fact, it is far more challenging for a single network to distinguish a schooner from a ketch than from a chair. To address this, we propose a new image classification architecture composed of a clustering algorithm and the Tree-CNN. The clustering algorithm groups similar fine categories into coarse categories. The Tree-CNN comprises a Trunk-CNN for coarse classification of all categories and Branch-CNNs that treat different groups of similar categories differently. Each Branch-CNN is fine-tuned from the Trunk-CNN, extracts specialized feature maps, and divides its coarse category into fine categories. However, Branch-CNNs introduce extra computation and are hard to train. To address this, we introduce an adaptive algorithm to balance computation and accuracy. We have tested Tree-CNNs based on CaffeNet, VGG16, and GoogLeNet on Caltech101 and Caltech256 for image classification. Experimental results show the superiority of the proposed Tree-CNN.
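The coarse-to-fine routing can be sketched as follows; the trunk and branch models here are hypothetical callables standing in for the Trunk-CNN and Branch-CNNs.

```python
def tree_classify(x, trunk, branches):
    """Route a sample through the Trunk-CNN's coarse prediction, then
    through the Branch-CNN specialised for that coarse group.

    trunk(x) -> coarse group id; branches[g](x) -> fine label within
    group g.  Only one branch runs per sample, which is the extra cost
    the adaptive algorithm balances against accuracy.
    """
    group = trunk(x)
    return group, branches[group](x)

# toy stand-ins: group 0 = "boats" (schooner/ketch), group 1 = "furniture"
trunk = lambda x: 0 if x < 0.5 else 1
branches = {0: lambda x: "schooner" if x < 0.25 else "ketch",
            1: lambda x: "chair"}
```

The visually similar fine categories (schooner vs. ketch) are separated only inside their dedicated branch, while dissimilar categories never compete within the same classifier head.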
A multi-task approach to face deblurring
Image deblurring is a foundational problem with numerous applications, and face deblurring is one of its most interesting branches. We propose a convolutional neural network (CNN)-based architecture that embraces multi-scale deep features. In this paper, we address the deblurring problem with transfer learning via a multi-task embedding network; the proposed method is effective at restoring both implicit and explicit structures from blurred images. In addition, by introducing perceptual features into the deblurring process and adopting a generative adversarial network, we develop a new method that deblurs face images while preserving more facial features and details. Extensive experiments compared with state-of-the-art deblurring algorithms demonstrate the effectiveness of the proposed approach.
Single-Image Super Resolution of Remote Sensing Images with Real-World Degradation Modeling
Limited resolution is one of the most important factors hindering the application of remote sensing images (RSIs). Single-image super resolution (SISR) is a technique to improve the spatial resolution of digital images and has attracted the attention of many researchers. In recent years, with the advancement of deep learning (DL) frameworks, many DL-based SISR models have been proposed and have achieved state-of-the-art performance; however, most SISR models for RSIs use the bicubic downsampler to construct low-resolution (LR) and high-resolution (HR) training pairs. Considering that the quality of actual RSIs depends on a variety of factors, such as illumination, atmosphere, imaging sensor responses, and signal processing, training on "ideal" datasets results in a dramatic drop in model performance on real RSIs. To address this issue, we propose to build a more realistic training dataset by modeling the degradation with blur kernels and imaging noises. We also design a novel residual balanced attention network (RBAN) as a generator to estimate super-resolution results from the LR inputs. To encourage RBAN to generate more realistic textures, we apply a UNet-shaped discriminator for adversarial training. Both referenced evaluations on synthetic data and non-referenced evaluations on actual images were carried out. Experimental results validate the effectiveness of the proposed framework, and our model exhibits state-of-the-art performance in quantitative evaluation and visual quality. We believe that the proposed framework can facilitate the transfer of super-resolution techniques from research to practical applications in RSI processing.
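The degradation modelling can be sketched as a blur-downsample-noise pipeline; the uniform kernel and noise level below are illustrative placeholders for the blur kernels and imaging noises that would be estimated from real RSIs.

```python
import numpy as np

def degrade(hr, kernel, scale, noise_sigma, rng):
    """Build a realistic LR training image from an HR one:
    1) convolve with a blur kernel (edge-padded, normalised),
    2) subsample by `scale`,
    3) add Gaussian imaging noise.
    This replaces the idealised bicubic downsampler used by most
    SISR training pipelines.
    """
    k = kernel / kernel.sum()
    pad = k.shape[0] // 2
    padded = np.pad(hr.astype(float), pad, mode="edge")
    blurred = np.zeros_like(hr, dtype=float)
    for i in range(hr.shape[0]):
        for j in range(hr.shape[1]):
            blurred[i, j] = (padded[i:i + k.shape[0],
                                    j:j + k.shape[1]] * k).sum()
    lr = blurred[::scale, ::scale]
    return lr + rng.normal(0.0, noise_sigma, lr.shape)

rng = np.random.default_rng(0)
hr = np.full((8, 8), 5.0)                 # constant toy HR patch
lr = degrade(hr, np.ones((3, 3)), 2, 0.0, rng)
```

Training pairs built this way expose the network to the same blur and noise statistics it will meet on actual sensor imagery.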
Moving Object Detection Using Scanning Camera on a High-Precision Intelligent Holder
During moving object detection in an intelligent visual surveillance system, scenarios with complex backgrounds frequently appear. Traditional methods, such as frame difference and optical flow, may not be able to handle this problem well. In such scenarios, we use a modified algorithm for background modeling. In this paper, we use edge detection to obtain an edge difference image, which enhances robustness to illumination variation. Then we apply a multi-block temporal-analyzing LBP (Local Binary Pattern) algorithm for segmentation. Finally, connected-component analysis is used to locate the object. We also build a hardware platform whose core consists of DSP (Digital Signal Processor) and FPGA (Field Programmable Gate Array) platforms and a high-precision intelligent holder.
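The edge-difference step can be sketched as follows; the simple forward-difference gradient is a stand-in for whatever edge detector the system actually uses.

```python
import numpy as np

def edge_difference_mask(frame_a, frame_b, thresh):
    """Difference edge (gradient-magnitude) images instead of raw
    intensities.

    A global illumination change shifts pixel values but largely
    preserves gradients, so thresholding the edge difference yields a
    change mask that resists lighting variation.
    """
    def edges(img):
        f = img.astype(float)
        gx = np.abs(np.diff(f, axis=1, append=f[:, -1:]))
        gy = np.abs(np.diff(f, axis=0, append=f[-1:, :]))
        return gx + gy
    return (np.abs(edges(frame_a) - edges(frame_b)) > thresh).astype(np.uint8)

base = np.zeros((6, 6))
base[2:4, 2:4] = 10.0               # a bright object
moved = np.zeros((6, 6))
moved[3:5, 3:5] = 10.0              # the same object, shifted
```

Note that uniformly brightening a frame leaves the mask empty, while actual object motion produces nonzero mask pixels.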
Real-Time Tracking Framework with Adaptive Features and Constrained Labels
This paper proposes a novel tracking framework with adaptive features and constrained labels (AFCL) to handle illumination variation, occlusion, and appearance changes caused by position variation. A novel ensemble classifier, incorporating the Forward–Backward error and a location constraint, is applied to obtain precise coordinates of the promising bounding boxes. The Forward–Backward error enhances the adaptation and accuracy of the binary features, whereas the location constraint overcomes label noise to a certain degree. We use a combiner that evaluates the online templates and the outputs of the classifier to accommodate complex situations. Evaluation on a widely used tracking benchmark shows that the proposed framework significantly improves tracking accuracy while reducing processing time. The framework has been implemented and tested on an embedded system using TMS320C6416 and Cyclone III kernel processors, and the results show that satisfactory performance is achieved.
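The Forward–Backward error can be sketched in a few lines; `forward` and `backward` here are hypothetical point trackers (e.g. single optical-flow steps), not part of the paper's implementation.

```python
import numpy as np

def forward_backward_filter(points, forward, backward, max_err):
    """Track each point forward in time, then backward; keep only the
    points whose back-tracked position returns close to the original.

    A small forward-backward error indicates a reliably tracked
    feature, so unreliable points are discarded before estimating the
    bounding box.
    """
    kept = []
    for p in points:
        p_fwd = forward(p)
        p_back = backward(p_fwd)
        err = np.linalg.norm(np.asarray(p_back) - np.asarray(p, dtype=float))
        if err <= max_err:
            kept.append(p)
    return kept

pts = [(0.0, 0.0), (2.0, 3.0)]
fwd = lambda p: (p[0] + 1.0, p[1])       # toy forward tracker: shift right
good_bwd = lambda p: (p[0] - 1.0, p[1])  # consistent backward tracker
bad_bwd = lambda p: p                    # drifting tracker: never returns
```

Filtering on this error is what makes the binary features adaptive: only points that survive the round trip contribute to the update.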
Delving into Sample Loss Curve to Embrace Noisy and Imbalanced Data
Corrupted labels and class imbalance are commonly encountered in practically collected training data, and both easily lead to over-fitting of deep neural networks (DNNs). Existing approaches alleviate these issues with a sample re-weighting strategy that re-weights samples via a designed weighting function. However, such a strategy is applicable only when the training data contain a single type of bias.
In practice, however, biased samples with corrupted labels and samples from tail classes commonly co-exist in training data.
How to handle them simultaneously is a key but under-explored problem. In this paper, we find that these two types of biased samples, though they have similar transient loss values, show distinguishable trends and characteristics in their loss curves, which can provide valuable priors for sample weight assignment. Motivated by this, we delve into loss curves and propose a novel probe-and-allocate training strategy: in the probing stage, we train the network on the whole biased training set without intervention and record the loss curve of each sample as an additional attribute; in the allocating stage, we feed the resulting attribute to a newly designed curve-perception network, named CurveNet, which learns to identify the bias type of each sample and adaptively assign proper weights through meta-learning.
The slow training speed of meta-learning also hinders its application.
To solve this, we propose skip-layer meta optimization (SLMO), which accelerates training by skipping the bottom layers.
Extensive synthetic and real-world experiments validate the proposed method, which achieves state-of-the-art performance on multiple challenging benchmarks.
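The probing stage can be sketched as recording each sample's loss per epoch and summarising the resulting curve; the two hand-crafted features below (mean level and trend) are a crude stand-in for what CurveNet learns.

```python
import numpy as np

def record_loss_curves(losses_per_epoch):
    """Stack per-sample losses over E epochs into an (N, E) array of
    loss curves -- the extra per-sample attribute recorded in the
    probing stage."""
    return np.stack(losses_per_epoch, axis=1)

def curve_summary(curves):
    """Two illustrative curve features: noisy-label samples tend to
    stay at a high loss (large mean, flat trend), while clean
    tail-class samples start high but keep decreasing (negative
    trend).  The curve shape, not the instantaneous loss, is what
    separates the two bias types."""
    mean_level = curves.mean(axis=1)
    trend = curves[:, -1] - curves[:, 0]
    return mean_level, trend

noisy = [2.0, 2.1, 2.0, 2.0]   # synthetic curve of a mislabeled sample
tail = [2.0, 1.2, 0.6, 0.2]    # synthetic curve of a clean tail sample
curves = record_loss_curves([np.array([n, t]) for n, t in zip(noisy, tail)])
mean_level, trend = curve_summary(curves)
```

At epoch 0 both samples have identical loss, so any method weighting on transient loss alone cannot tell them apart, while the trend feature separates them cleanly.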
Experimental study on the influence of high frequency PWM harmonics on the losses of induction motor
To study the influence of high-frequency PWM harmonics on the losses of induction motors, this paper first introduces the latest international standard, IEC/TS 60034-2-3, for separating the losses and measuring the efficiency of induction motors under PWM power supply. Then, considering that the skin effect of the windings in induction motors is more apparent under PWM power supply, this paper replaces the stator DC resistance in the IEC/TS 60034-2-3 standard with the stator AC resistance to obtain more accurate separation results. On this basis, an improved loss-separation method for converter-fed induction motors is proposed, using an accurate time-step finite element method to calculate the rotor copper loss under the no-load condition, which cannot be accounted for in the IEC/TS 60034-2-3 standard. Finally, this method is used to measure and separate the losses of a 5.5 kW converter-fed motor under different operating conditions. The results show that the losses of an induction motor driven by a PWM converter are obviously greater than those under sinusoidal drive, especially under light-load or no-load conditions. The total no-load losses of the PWM-converter-driven motor increase by more than 20.0% compared with the sinusoidal driving condition.
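The effect of replacing the DC stator resistance with the skin-effect-corrected AC value can be illustrated with a simple I²R calculation; the resistance and current figures below are purely illustrative, not measurements from the paper.

```python
def stator_copper_loss(i_rms, r_phase):
    """Three-phase stator copper loss P_cu = 3 * I^2 * R, with R the
    per-phase resistance.

    Using the AC resistance (larger than the DC value because the skin
    effect concentrates high-frequency PWM harmonic currents near the
    conductor surface) yields a larger, more realistic loss figure.
    """
    return 3.0 * i_rms ** 2 * r_phase

i_rms = 10.0   # A, illustrative stator phase current
r_dc = 0.50    # ohm per phase, illustrative DC resistance
r_ac = 0.58    # ohm, illustrative skin-effect-corrected AC resistance
p_dc = stator_copper_loss(i_rms, r_dc)
p_ac = stator_copper_loss(i_rms, r_ac)
```

Under these illustrative numbers the copper-loss estimate rises by 16%, showing why a DC-resistance-based separation systematically understates stator losses under PWM supply.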