2,800 research outputs found
Modelling Local Deep Convolutional Neural Network Features to Improve Fine-Grained Image Classification
We propose a local modelling approach using deep convolutional neural
networks (CNNs) for fine-grained image classification. Recently, deep CNNs
trained from large datasets have considerably improved the performance of
object recognition. However, to date there has been limited work using these
deep CNNs as local feature extractors. This partly stems from CNNs having
internal representations which are high dimensional, thereby making such
representations difficult to model using stochastic models. To overcome this
issue, we propose to reduce the dimensionality of one of the internal fully
connected layers, in conjunction with layer-restricted retraining to avoid
retraining the entire network. The distribution of low-dimensional features
obtained from the modified layer is then modelled using a Gaussian mixture
model. Comparative experiments show that considerable performance improvements
can be achieved on the challenging Fish and UEC FOOD-100 datasets.Comment: 5 pages, three figure
Underwater Fish Detection with Weak Multi-Domain Supervision
Given a sufficiently large training dataset, it is relatively easy to train a
modern convolution neural network (CNN) as a required image classifier.
However, for the task of fish classification and/or fish detection, if a CNN
was trained to detect or classify particular fish species in particular
background habitats, the same CNN exhibits much lower accuracy when applied to
new/unseen fish species and/or fish habitats. Therefore, in practice, the CNN
needs to be continuously fine-tuned to improve its classification accuracy to
handle new project-specific fish species or habitats. In this work we present a
labelling-efficient method of training a CNN-based fish-detector (the Xception
CNN was used as the base) on relatively small numbers (4,000) of project-domain
underwater fish/no-fish images from 20 different habitats. Additionally, 17,000
of known negative (that is, missing fish) general-domain (VOC2012) above-water
images were used. Two publicly available fish-domain datasets supplied
additional 27,000 of above-water and underwater positive/fish images. By using
this multi-domain collection of images, the trained Xception-based binary
(fish/not-fish) classifier achieved 0.17% false-positives and 0.61%
false-negatives on the project's 20,000 negative and 16,000 positive holdout
test images, respectively. The area under the ROC curve (AUC) was 99.94%.Comment: Published in the 2019 International Joint Conference on Neural
Networks (IJCNN-2019), Budapest, Hungary, July 14-19, 2019,
https://www.ijcnn.org/ , https://ieeexplore.ieee.org/document/885190
SIMCO: SIMilarity-based object COunting
We present SIMCO, the first agnostic multi-class object counting approach.
SIMCO starts by detecting foreground objects through a novel Mask RCNN-based
architecture trained beforehand (just once) on a brand-new synthetic 2D shape
dataset, InShape; the idea is to highlight every object resembling a primitive
2D shape (circle, square, rectangle, etc.). Each object detected is described
by a low-dimensional embedding, obtained from a novel similarity-based head
branch; this latter implements a triplet loss, encouraging similar objects
(same 2D shape + color and scale) to map close. Subsequently, SIMCO uses this
embedding for clustering, so that different types of objects can emerge and be
counted, making SIMCO the very first multi-class unsupervised counter.
Experiments show that SIMCO provides state-of-the-art scores on counting
benchmarks and that it can also help in many challenging image understanding
tasks
- …