CNN-based small object detection and visualization with feature activation mapping
Object detection is a well-studied topic; however, the detection of small objects still receives little attention. Detecting small objects is difficult because of their small size, occlusion, and complex backgrounds. Small object detection is important in a number of applications, including the detection of small insects. One such application is spider detection and removal: spiders are frequently found on grapes and broccoli sold at supermarkets, which poses a significant safety issue and generates negative publicity for the industry. In this paper, we present a fine-tuned VGG16 network for the detection of small objects such as spiders. Furthermore, we introduce a simple technique called “feature activation mapping” for object visualization from VGG16 feature maps. The testing accuracy of our network on tiny spiders with various backgrounds is 84%, compared to 72% using a fine-tuned Faster R-CNN and 95.32% using CAM. Even though our feature activation mapping technique has mid-range test accuracy, it provides more detailed shape and size of the spiders than CAM, which is important for this application area. A dataset for spider detection is made available online.
The authors would like to thank the Australian Government Research Training Program for funding this research. This research was conducted by the Australian Research Council Centre of Excellence for Robotic Vision (CE140100016), http://www.roboticvision.org
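The abstract does not spell out how “feature activation mapping” is computed from the VGG16 feature maps, so the following is only a minimal sketch of one plausible reading: average a late convolutional layer's activations into a single map, upsample it, and normalize it for overlay on the input image. The layer index, normalization, and PyTorch/torchvision usage are assumptions for illustration, not the authors' implementation.

```python
# Hedged sketch of a feature-activation-style heatmap from VGG16 feature maps.
# Layer choice (index 29, the last ReLU of the conv stack) and normalization
# are assumptions, not the paper's exact method.
import torch
import torch.nn.functional as F
from torchvision import models, transforms
from PIL import Image

vgg = models.vgg16(weights="IMAGENET1K_V1").eval()

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])

def feature_activation_map(img: Image.Image, layer_idx: int = 29) -> torch.Tensor:
    """Average one conv layer's activations and upsample to the input size."""
    x = preprocess(img).unsqueeze(0)                 # (1, 3, 224, 224)
    feats = vgg.features[: layer_idx + 1](x)         # (1, C, H, W) feature maps
    fmap = F.relu(feats.mean(dim=1, keepdim=True))   # channel average, positive part
    fmap = F.interpolate(fmap, size=(224, 224), mode="bilinear",
                         align_corners=False)
    fmap = (fmap - fmap.min()) / (fmap.max() - fmap.min() + 1e-8)
    return fmap.squeeze()                            # (224, 224) heatmap in [0, 1]
```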
Solar Power Plant Detection on Multi-Spectral Satellite Imagery using Weakly-Supervised CNN with Feedback Features and m-PCNN Fusion
Most traditional convolutional neural networks (CNNs) implement a bottom-up (feed-forward) approach to image classification. However, many scientific studies demonstrate that visual perception in primates relies on both bottom-up and top-down connections. Therefore, in this work, we propose a CNN with a feedback structure for solar power plant detection on medium-resolution satellite images. To capture the strength of the top-down connections, we introduce a feedback CNN (FB-Net) into a baseline CNN model used for solar power plant classification on multi-spectral satellite data. Moreover, we introduce a method that improves class activation mapping (CAM) for our FB-Net by taking advantage of a multi-channel pulse coupled neural network (m-PCNN) for weakly-supervised localization of solar power plants from the features of the proposed FB-Net. For the proposed FB-Net CAM with m-PCNN, experiments demonstrate promising results on both the solar power plant image classification and detection tasks.
Comment: 9 pages, 9 figures, 4 tables
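For reference, the baseline step the authors build on, class activation mapping, can be sketched as a weighted sum of the last convolutional feature maps using the classifier weights of the target class. The FB-Net feedback connections and the m-PCNN fusion described above are specific to the paper and are not reproduced here; the array shapes and names below are illustrative assumptions.

```python
# Minimal sketch of standard class activation mapping (CAM). This shows only
# the baseline CAM step; the paper's FB-Net and m-PCNN fusion are not modeled.
import numpy as np

def class_activation_map(conv_feats: np.ndarray,
                         fc_weights: np.ndarray,
                         class_idx: int) -> np.ndarray:
    """
    conv_feats : (C, H, W) activations of the last conv layer
    fc_weights : (num_classes, C) weights of the final linear classifier
    class_idx  : index of the target class (e.g. "solar power plant")
    """
    w = fc_weights[class_idx]                       # (C,) weights for the class
    cam = np.tensordot(w, conv_feats, axes=(0, 0))  # weighted sum over channels -> (H, W)
    cam = np.maximum(cam, 0.0)                      # keep positive evidence
    cam -= cam.min()
    if cam.max() > 0:
        cam /= cam.max()                            # normalize to [0, 1]
    return cam
```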
Res2Net: A New Multi-scale Backbone Architecture
Representing features at multiple scales is of great importance for numerous
vision tasks. Recent advances in backbone convolutional neural networks (CNNs)
continually demonstrate stronger multi-scale representation ability, leading to
consistent performance gains on a wide range of applications. However, most
existing methods represent the multi-scale features in a layer-wise manner. In
this paper, we propose a novel building block for CNNs, namely Res2Net, by
constructing hierarchical residual-like connections within one single residual
block. The Res2Net represents multi-scale features at a granular level and
increases the range of receptive fields for each network layer. The proposed
Res2Net block can be plugged into the state-of-the-art backbone CNN models,
e.g., ResNet, ResNeXt, and DLA. We evaluate the Res2Net block on all these
models and demonstrate consistent performance gains over baseline models on
widely-used datasets, e.g., CIFAR-100 and ImageNet. Further ablation studies
and experimental results on representative computer vision tasks, i.e., object
detection, class activation mapping, and salient object detection, further
verify the superiority of the Res2Net over the state-of-the-art baseline
methods. The source code and trained models are available on
https://mmcheng.net/res2net/.Comment: 11 pages, 7 figure
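Based on the description above, a Res2Net-style block can be sketched as follows: the channels coming out of a 1x1 convolution are split into groups, each 3x3 convolution receives its own split plus the previous group's output, and the results are concatenated and fused before the residual addition. This simplified PyTorch sketch omits batch normalization, strides, and the downsampling path of the official implementation at https://mmcheng.net/res2net/.

```python
# Simplified Res2Net-style block: hierarchical residual-like connections
# within a single residual block, widening the receptive field per layer.
import torch
import torch.nn as nn

class Res2NetBlock(nn.Module):
    def __init__(self, channels: int, scale: int = 4):
        super().__init__()
        assert channels % scale == 0
        self.scale = scale
        width = channels // scale
        self.conv_in = nn.Conv2d(channels, channels, kernel_size=1)
        # one 3x3 conv per split except the first, which is passed through
        self.convs = nn.ModuleList(
            nn.Conv2d(width, width, kernel_size=3, padding=1)
            for _ in range(scale - 1)
        )
        self.conv_out = nn.Conv2d(channels, channels, kernel_size=1)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        splits = torch.chunk(self.relu(self.conv_in(x)), self.scale, dim=1)
        outs = [splits[0]]                      # first split: identity path
        prev = None
        for i, conv in enumerate(self.convs):
            inp = splits[i + 1] if prev is None else splits[i + 1] + prev
            prev = self.relu(conv(inp))         # each group sees prior outputs
            outs.append(prev)
        out = self.conv_out(torch.cat(outs, dim=1))
        return self.relu(out + x)               # residual connection
```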
Interpretable classification of Alzheimer's disease pathologies with a convolutional neural network pipeline.
Neuropathologists assess vast brain areas to identify diverse and subtly differentiated morphologies. Standard semi-quantitative scoring approaches, however, are coarse-grained and lack precise neuroanatomic localization. We report a proof-of-concept deep learning pipeline that identifies specific neuropathologies (amyloid plaques and cerebral amyloid angiopathy) in immunohistochemically stained archival slides. Using automated segmentation of stained objects and a cloud-based interface, we annotate more than 70,000 plaque candidates from 43 whole slide images (WSIs) to train and evaluate convolutional neural networks. The networks achieve strong plaque classification on a 10-WSI hold-out set (0.993 and 0.743 areas under the receiver operating characteristic and precision-recall curves, respectively). Prediction confidence maps visualize morphology distributions at high resolution. The resulting network-derived amyloid beta (Aβ) burden scores correlate well with established semi-quantitative scores on a 30-WSI blinded hold-out set. Finally, saliency mapping demonstrates that the networks learn patterns that agree with accepted pathologic features. This scalable means of augmenting a neuropathologist's ability suggests a route to neuropathologic deep phenotyping.
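As a point of reference, the two reported metrics (area under the ROC curve and under the precision-recall curve) can be computed on a hold-out set along the lines below; the label and score arrays are placeholders for the hold-out annotations and network confidences, and average precision is used as the usual estimator of the PR-curve area.

```python
# Minimal sketch of scoring plaque classification with ROC AUC and PR AUC.
# y_true / y_score are placeholder arrays, not the paper's data.
import numpy as np
from sklearn.metrics import roc_auc_score, average_precision_score

y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])           # plaque vs. non-plaque labels
y_score = np.array([0.92, 0.10, 0.85, 0.40,            # network confidence per candidate
                    0.30, 0.05, 0.77, 0.60])

roc_auc = roc_auc_score(y_true, y_score)                # area under the ROC curve
pr_auc = average_precision_score(y_true, y_score)       # area under the PR curve
print(f"ROC AUC = {roc_auc:.3f}, PR AUC = {pr_auc:.3f}")
```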
Evaluating Generalization Ability of Convolutional Neural Networks and Capsule Networks for Image Classification via Top-2 Classification
Image classification is a challenging problem that aims to identify the category of the object in an image. In recent years, deep Convolutional Neural Networks (CNNs) have been applied to this task, and impressive improvements have been achieved. However, some research has shown that the output of CNNs can be easily altered by adding relatively small perturbations to the input image, such as modifying a few pixels. Recently, Capsule Networks (CapsNets) were proposed, which can help eliminate this limitation. Experiments on the MNIST dataset revealed that capsules can characterize the features of objects better than CNNs. However, it is hard to find a suitable quantitative method for comparing the generalization ability of CNNs and CapsNets. In this paper, we propose a new image classification task called Top-2 classification to evaluate the generalization ability of CNNs and CapsNets. The models are trained on single-label image samples, as in the traditional image classification task. In the test stage, however, we randomly concatenate two test image samples with different labels and then use the trained models to predict the top-2 labels on the unseen, newly created two-label image samples. This task provides precise quantitative results for comparing the generalization ability of CNNs and CapsNets. Returning to CapsNets: because they use a Full-Connectivity (FC) mechanism among all capsules, they require many parameters. To reduce the number of parameters, we introduce a Parameter-Sharing (PS) mechanism between capsules. Experiments on five widely used benchmark image datasets demonstrate that the method significantly reduces the number of parameters without losing effectiveness in feature extraction. Further, on the Top-2 classification task, the proposed PS CapsNets obtain substantially higher accuracy than traditional CNNs and FC CapsNets.
Comment: This paper is under consideration at Computer Vision and Image Understanding
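A minimal sketch of the Top-2 evaluation protocol described in this abstract follows: two test images with different labels are concatenated into one sample, and the model is scored on whether its two highest-scoring classes match the two true labels. The concatenation axis, pairing loop, and exact scoring rule are assumptions where the abstract leaves details open, and the model is assumed to accept the wider concatenated input.

```python
# Hedged sketch of Top-2 classification evaluation; details are assumptions.
import random
import torch

def top2_accuracy(model, images, labels, n_pairs: int = 1000) -> float:
    """images: (N, C, H, W) tensor; labels: (N,) tensor of class ids."""
    model.eval()
    correct = 0
    with torch.no_grad():
        for _ in range(n_pairs):
            i, j = random.sample(range(len(images)), 2)
            while labels[i] == labels[j]:                      # require two different labels
                i, j = random.sample(range(len(images)), 2)
            pair = torch.cat([images[i], images[j]], dim=-1)   # concatenate along width
            logits = model(pair.unsqueeze(0)).squeeze(0)
            top2 = set(logits.topk(2).indices.tolist())        # two highest-scoring classes
            if top2 == {labels[i].item(), labels[j].item()}:
                correct += 1
    return correct / n_pairs
```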