8,123 research outputs found

    CNN-based small object detection and visualization with feature activation mapping

    Object detection is a well-studied topic, but the detection of small objects still lacks attention. Detecting small objects is difficult because of their small size, occlusion, and complex backgrounds. Small object detection is important in a number of applications, including the detection of small insects. One application is spider detection and removal. Spiders are frequently found on grapes and broccoli sold at supermarkets, which poses a significant safety issue and generates negative publicity for the industry. In this paper, we present a fine-tuned VGG16 network for detection of small objects such as spiders. Furthermore, we introduce a simple technique called “feature activation mapping” for object visualization from VGG16 feature maps. The testing accuracy of our network on tiny spiders with various backgrounds is 84%, compared to 72% using fine-tuned Faster R-CNN and 95.32% using CAM. Even though our feature activation mapping technique has mid-range test accuracy, it provides more detailed shape and size of spiders than CAM, which is important for the application area. A data set for spider detection is made available online. The authors would like to thank the Australian Government Research Training Program for funding this research. This research was conducted by the Australian Research Council Center of Excellence for Robotic Vision (CE140100016), http://www.roboticvision.org

    Solar Power Plant Detection on Multi-Spectral Satellite Imagery using Weakly-Supervised CNN with Feedback Features and m-PCNN Fusion

    Most traditional convolutional neural networks (CNNs) implement a bottom-up (feed-forward) approach for image classification. However, many scientific studies demonstrate that visual perception in primates relies on both bottom-up and top-down connections. Therefore, in this work, we propose a CNN with a feedback structure for solar power plant detection on middle-resolution satellite images. To express the strength of the top-down connections, we introduce a feedback CNN (FB-Net) to a baseline CNN model used for solar power plant classification on multi-spectral satellite data. Moreover, we introduce a method to improve class activation mapping (CAM) for our FB-Net, which takes advantage of a multi-channel pulse coupled neural network (m-PCNN) for weakly-supervised localization of solar power plants from the features of the proposed FB-Net. For the proposed FB-Net CAM with m-PCNN, experiments showed promising results on both the solar power plant image classification and detection tasks. Comment: 9 pages, 9 figures, 4 tables

    Res2Net: A New Multi-scale Backbone Architecture

    Representing features at multiple scales is of great importance for numerous vision tasks. Recent advances in backbone convolutional neural networks (CNNs) continually demonstrate stronger multi-scale representation ability, leading to consistent performance gains on a wide range of applications. However, most existing methods represent the multi-scale features in a layer-wise manner. In this paper, we propose a novel building block for CNNs, namely Res2Net, by constructing hierarchical residual-like connections within one single residual block. The Res2Net represents multi-scale features at a granular level and increases the range of receptive fields for each network layer. The proposed Res2Net block can be plugged into state-of-the-art backbone CNN models, e.g., ResNet, ResNeXt, and DLA. We evaluate the Res2Net block on all these models and demonstrate consistent performance gains over baseline models on widely used datasets, e.g., CIFAR-100 and ImageNet. Ablation studies and experimental results on representative computer vision tasks, i.e., object detection, class activation mapping, and salient object detection, further verify the superiority of the Res2Net over the state-of-the-art baseline methods. The source code and trained models are available at https://mmcheng.net/res2net/. Comment: 11 pages, 7 figures
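
    As a rough illustration of the "hierarchical residual-like connections within one single residual block" mentioned above, the following minimal sketch splits the channels into groups and feeds each group's output into the next group. The function name `res2net_split_transform`, the use of arbitrary callables in place of 3x3 convolutions, and the omission of the surrounding 1x1 convolutions and the block's identity shortcut are all assumptions made for brevity. This is not the authors' implementation.

```python
import numpy as np

def res2net_split_transform(x, transforms):
    """Hierarchical split-and-transform step (illustrative only).

    x          : array of shape (channels, H, W)
    transforms : list of s-1 callables, hypothetical stand-ins for the
                 per-group 3x3 convolutions
    """
    # Split the channels into s equal groups (s = len(transforms) + 1).
    groups = np.split(x, len(transforms) + 1, axis=0)
    outputs = [groups[0]]          # the first group passes through unchanged
    prev = None
    for group, transform in zip(groups[1:], transforms):
        # Each later group also receives the previous group's output,
        # so its effective receptive field grows with the group index.
        inp = group if prev is None else group + prev
        prev = transform(inp)
        outputs.append(prev)
    # Concatenate the group outputs back along the channel axis.
    return np.concatenate(outputs, axis=0)
```

    With four single-channel groups of ones and a doubling function as the stand-in transform, the per-group outputs are 1, 2, 6, and 14, which shows how later groups accumulate the results of earlier ones.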

    Evaluating Generalization Ability of Convolutional Neural Networks and Capsule Networks for Image Classification via Top-2 Classification

    Image classification is a challenging problem that aims to identify the category of the object in an image. In recent years, deep Convolutional Neural Networks (CNNs) have been applied to this task, and impressive improvements have been achieved. However, some research has shown that the output of CNNs can be easily altered by adding relatively small perturbations to the input image, such as modifying a few pixels. Recently, Capsule Networks (CapsNets) were proposed, which can help eliminate this limitation. Experiments on the MNIST dataset revealed that capsules can characterize the features of an object better than CNNs. However, it is hard to find a suitable quantitative method to compare the generalization ability of CNNs and CapsNets. In this paper, we propose a new image classification task called Top-2 classification to evaluate the generalization ability of CNNs and CapsNets. The models are trained on single-label image samples, as in the traditional image classification task. In the test stage, however, we randomly concatenate two test image samples that have different labels, and then use the trained models to predict the top-2 labels on the unseen, newly created two-label image samples. This task provides precise quantitative results for comparing the generalization ability of CNNs and CapsNets. Returning to the CapsNet, because it uses a Full Connectivity (FC) mechanism among all capsules, it requires many parameters. To reduce the number of parameters, we introduce a Parameter-Sharing (PS) mechanism between capsules. Experiments on five widely used benchmark image datasets demonstrate that the method significantly reduces the number of parameters without losing effectiveness in extracting features. Further, on the Top-2 classification task, the proposed PS CapsNets obtain higher accuracy than the traditional CNNs and FC CapsNets by a large margin. Comment: This paper is under consideration at Computer Vision and Image Understanding
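
    The Top-2 test procedure described above can be sketched roughly as follows. The helper names `make_top2_samples` and `top2_accuracy`, the side-by-side concatenation layout, and the exact-match scoring rule are assumptions for illustration; the abstract does not specify these details.

```python
import numpy as np

def make_top2_samples(images, labels, n_pairs, rng):
    """Build two-label test samples by concatenating random image pairs.

    images : array of shape (N, H, W); labels : array of shape (N,)
    """
    if len(np.unique(labels)) < 2:
        raise ValueError("need at least two distinct labels")
    pairs, pair_labels = [], []
    while len(pairs) < n_pairs:
        i, j = rng.integers(0, len(images), size=2)
        if labels[i] == labels[j]:
            continue  # the task requires two different labels
        # Assumption: images are joined side by side along the width axis.
        pairs.append(np.concatenate([images[i], images[j]], axis=1))
        pair_labels.append((int(labels[i]), int(labels[j])))
    return np.stack(pairs), pair_labels

def top2_accuracy(scores, pair_labels):
    """Fraction of samples whose two highest-scoring classes are exactly
    the two true labels (assumed scoring rule)."""
    top2 = np.argsort(scores, axis=1)[:, -2:]
    hits = [set(map(int, pred)) == set(true)
            for pred, true in zip(top2, pair_labels)]
    return float(np.mean(hits))
```

    Here `scores` would be the per-class outputs of a model trained only on single-label images, evaluated on the concatenated samples.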