Deep Multibranch Fusion Residual Network for Insect Pest Recognition
Early insect pest recognition is one of the critical factors for agricultural yield, so an effective method to recognize the category of insect pests has become a significant issue in the agricultural field. In this paper, we propose a new residual block to learn multi-scale representations. Each block contains three branches: one is parameter-free, and the others contain several successive convolution layers. Moreover, we propose a module, embedded into the new residual block, that recalibrates the channel-wise feature responses and models the relationship between the three branches. By stacking this kind of block, we construct the Deep Multi-branch Fusion Residual Network (DMF-ResNet). To evaluate the model, we first test it on the CIFAR-10 and CIFAR-100 benchmark datasets, where DMF-ResNet significantly outperforms the baseline models. We then construct DMF-ResNet variants of different depths for high-resolution image classification and apply them to insect pest recognition. On the IP102 dataset, DMF-ResNet achieves better accuracy than the baseline models and other state-of-the-art methods. These empirical experiments demonstrate the effectiveness of our approach.
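The block structure described in this abstract can be sketched roughly as follows. This is a minimal PyTorch illustration, not the authors' code: the branch depths, the 3x3 kernel sizes, and the squeeze-and-excitation-style recalibration with a reduction ratio of 16 are assumptions made for the sketch.

```python
import torch
import torch.nn as nn

class MultiBranchResidualBlock(nn.Module):
    """Hypothetical three-branch residual block with channel recalibration."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        # Branch 1 is the parameter-free identity path (applied in forward()).
        # Branch 2: two successive 3x3 convolutions.
        self.branch2 = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
        )
        # Branch 3: three successive 3x3 convolutions (larger receptive field).
        self.branch3 = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
        )
        # SE-style module that recalibrates channel-wise responses of the
        # fused convolutional branches (an assumed stand-in for the paper's module).
        self.recalibrate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1), nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1), nn.Sigmoid(),
        )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        fused = self.branch2(x) + self.branch3(x)
        fused = fused * self.recalibrate(fused)   # channel-wise reweighting
        return self.relu(x + fused)               # identity = parameter-free branch
```

Stacking such blocks (with occasional strided/projection variants to change resolution and width) would yield a DMF-ResNet-style backbone, but the exact stage configuration is not specified in the abstract.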
Image Aesthetics Assessment Using Composite Features from off-the-Shelf Deep Models
Deep convolutional neural networks have recently achieved great success on the image aesthetics assessment task. In this paper, we propose an efficient method that takes the global, local, and scene-aware information of images into consideration and classifies the composite features extracted from the corresponding pretrained deep learning models with a support vector machine. Contrary to popular methods that require fine-tuning or training a new model from scratch, our training-free method directly takes the deep features generated by off-the-shelf models for image classification and scene recognition. We also analyze the factors that influence performance from two aspects: the architecture of the deep neural network and the contribution of local and scene-aware information. It turns out that deep residual networks produce more aesthetics-aware image representations, and that composite features improve overall performance. Experiments on common large-scale aesthetics assessment benchmarks demonstrate that our method outperforms state-of-the-art results in photo aesthetics assessment.
Comment: Accepted by ICIP 201
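A minimal sketch of such a training-free pipeline is given below, assuming a torchvision ResNet-50 as the off-the-shelf backbone and a scikit-learn LinearSVC as the classifier. The specific backbone, the center crop used as a stand-in for the local branch, and the omission of a Places365-pretrained scene model are all simplifying assumptions, not the paper's exact setup.

```python
import numpy as np
import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image
from sklearn.svm import LinearSVC

# Off-the-shelf ImageNet backbone; the final classifier layer is replaced by
# an identity so the 2048-d pooled features are exposed directly.
resnet = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
resnet.fc = torch.nn.Identity()
resnet.eval()

preprocess = T.Compose([
    T.Resize(256), T.CenterCrop(224), T.ToTensor(),
    T.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])

@torch.no_grad()
def composite_features(path: str) -> np.ndarray:
    """Concatenate a global descriptor with a crop-based 'local' descriptor."""
    img = Image.open(path).convert("RGB")
    global_feat = resnet(preprocess(img).unsqueeze(0)).squeeze(0)
    # A central crop stands in for the local branch; a scene-aware branch
    # (e.g. a Places365-pretrained CNN) would be concatenated the same way.
    local = img.crop((img.width // 4, img.height // 4,
                      3 * img.width // 4, 3 * img.height // 4))
    local_feat = resnet(preprocess(local).unsqueeze(0)).squeeze(0)
    return torch.cat([global_feat, local_feat]).numpy()

# Usage sketch (image_paths and labels are placeholders):
# X = np.stack([composite_features(p) for p in image_paths])
# clf = LinearSVC().fit(X, labels)   # labels: 0 = low aesthetics, 1 = high
```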
Binary Patterns Encoded Convolutional Neural Networks for Texture Recognition and Remote Sensing Scene Classification
Designing powerful discriminative texture features robust to realistic imaging conditions is a challenging computer vision problem with many applications, including material recognition and the analysis of satellite or aerial imagery. In the past, most texture description approaches were based on dense orderless statistical distributions of local features. However, most recent approaches to texture recognition and remote sensing scene classification are based on Convolutional Neural Networks (CNNs). The de facto practice when learning these CNN models is to use RGB patches as input, with training performed on large amounts of labeled data (ImageNet). In this paper, we show that Binary Patterns encoded CNN models, codenamed TEX-Nets, trained using mapped coded images with explicit texture information, provide complementary information to the standard RGB deep models. Additionally, two deep architectures, namely early and late fusion, are investigated to combine the texture and color information. To the best of our knowledge, we are the first to investigate Binary Patterns encoded CNNs and different deep network fusion architectures for texture recognition and remote sensing scene classification. We perform comprehensive experiments on four texture recognition datasets and four remote sensing scene classification benchmarks: UC-Merced with 21 scene categories, WHU-RS19 with 19 scene classes, RSSCN7 with 7 categories, and the recently introduced large-scale aerial image dataset (AID) with 30 aerial scene types. We demonstrate that TEX-Nets provide complementary information to standard RGB deep models of the same network architecture. Our late fusion TEX-Net architecture always improves the overall performance compared to the standard RGB network on both recognition problems, and our final combination outperforms the state of the art without employing fine-tuning or an ensemble of RGB network architectures.
Comment: To appear in ISPRS Journal of Photogrammetry and Remote Sensing
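The texture-coded input and late-fusion idea can be illustrated roughly as below. This is not the TEX-Net implementation: the uniform LBP coding from scikit-image, the ResNet-18 streams, the skipped input normalization, and score averaging as the fusion rule are assumptions made for the sketch.

```python
import numpy as np
import torch
import torchvision.models as models
from skimage.color import rgb2gray
from skimage.feature import local_binary_pattern

def lbp_coded(image_rgb: np.ndarray) -> np.ndarray:
    """Encode an HxWx3 uint8 image as a 3-channel LBP-coded map in [0, 1]."""
    gray = rgb2gray(image_rgb)
    codes = local_binary_pattern(gray, P=8, R=1, method="uniform")
    codes = codes / codes.max()
    return np.repeat(codes[..., None], 3, axis=2).astype(np.float32)

# One stream sees the RGB image, the other the texture-coded image.
rgb_net = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
tex_net = models.resnet18(weights=None)   # would be trained on LBP-coded images
rgb_net.eval(); tex_net.eval()

@torch.no_grad()
def late_fusion_scores(image_rgb: np.ndarray) -> torch.Tensor:
    """Average the class scores of the RGB and texture streams (late fusion).
    Input normalization is omitted here for brevity."""
    to_tensor = lambda a: torch.from_numpy(a).permute(2, 0, 1).unsqueeze(0).float()
    rgb_scores = rgb_net(to_tensor(image_rgb.astype(np.float32) / 255.0)).softmax(-1)
    tex_scores = tex_net(to_tensor(lbp_coded(image_rgb))).softmax(-1)
    return (rgb_scores + tex_scores) / 2
```

Early fusion would instead concatenate (or stack) the RGB and texture-coded maps at the input and train a single network on the combined tensor.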
Data Dropout: Optimizing Training Data for Convolutional Neural Networks
Deep learning models learn to fit training data while being expected to generalize well to test data. Most works aim at finding such models by creatively designing architectures and fine-tuning parameters. To adapt to particular tasks, hand-crafted information such as image priors has also been incorporated into end-to-end learning. However, very little progress has been made on investigating how an individual training sample influences the generalization ability of a model. In other words, to achieve high generalization accuracy, do we really need all the samples in a training dataset? In this paper, we demonstrate that deep learning models such as convolutional neural networks may not favor all training samples, and that generalization accuracy can be further improved by dropping those unfavorable samples. Specifically, the influence of removing a training sample is quantifiable, and we propose a Two-Round Training approach to achieve higher generalization accuracy: we locate unfavorable samples after the first round of training and then retrain the model from scratch on the reduced training dataset in the second round. Since our approach is essentially different from fine-tuning or further training, the computational cost should not be a concern. Our extensive experimental results indicate that, with identical settings, the proposed approach can boost the performance of well-known networks on both high-level computer vision problems such as image classification and low-level vision problems such as image denoising.
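A hedged sketch of the two-round idea follows. Ranking samples by their per-sample loss after the first round is an assumed stand-in for the paper's influence measure, and the drop fraction, optimizer, and training schedule are illustrative placeholders.

```python
import torch
from torch.utils.data import DataLoader, Subset

def two_round_training(model_fn, train_set, drop_fraction=0.05, epochs=10, device="cpu"):
    """Train, drop the most 'unfavorable' samples, then retrain from scratch."""

    def train(model, dataset):
        loader = DataLoader(dataset, batch_size=128, shuffle=True)
        opt = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
        loss_fn = torch.nn.CrossEntropyLoss()
        model.train()
        for _ in range(epochs):
            for x, y in loader:
                x, y = x.to(device), y.to(device)
                opt.zero_grad()
                loss_fn(model(x), y).backward()
                opt.step()
        return model

    # Round 1: train on the full dataset.
    model = train(model_fn().to(device), train_set)

    # Score each training sample; a high loss is treated as "unfavorable"
    # (an assumption standing in for the paper's influence criterion).
    model.eval()
    per_sample_loss = torch.nn.CrossEntropyLoss(reduction="none")
    losses = []
    with torch.no_grad():
        for x, y in DataLoader(train_set, batch_size=256):
            losses.append(per_sample_loss(model(x.to(device)), y.to(device)).cpu())
    losses = torch.cat(losses)

    # Keep everything except the top drop_fraction highest-loss samples.
    keep = losses.argsort()[: int(len(train_set) * (1 - drop_fraction))]
    reduced_set = Subset(train_set, keep.tolist())

    # Round 2: retrain a fresh model from scratch on the reduced dataset.
    return train(model_fn().to(device), reduced_set)
```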