Modelling Local Deep Convolutional Neural Network Features to Improve Fine-Grained Image Classification
We propose a local modelling approach using deep convolutional neural
networks (CNNs) for fine-grained image classification. Recently, deep CNNs
trained from large datasets have considerably improved the performance of
object recognition. However, to date there has been limited work using these
deep CNNs as local feature extractors. This partly stems from CNNs having
internal representations which are high dimensional, thereby making such
representations difficult to model using stochastic models. To overcome this
issue, we propose to reduce the dimensionality of one of the internal fully
connected layers, in conjunction with layer-restricted retraining to avoid
retraining the entire network. The distribution of low-dimensional features
obtained from the modified layer is then modelled using a Gaussian mixture
model. Comparative experiments show that considerable performance improvements
can be achieved on the challenging Fish and UEC FOOD-100 datasets.
Comment: 5 pages, three figures
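The Gaussian mixture modelling step described in the abstract above can be illustrated with a minimal sketch. It assumes one diagonal-covariance GMM per class whose parameters have already been fitted, and it classifies an image by summing the log-likelihoods of its local features. Both choices are assumptions made for illustration, as the abstract does not specify them. The names `CLASS_GMMS` and `classify`, and the toy 2-D parameters, are hypothetical.

```python
import math

def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def gmm_log_likelihood(x, gmm):
    """Log-likelihood of x under a mixture, using log-sum-exp for stability."""
    terms = [
        math.log(weight) + log_gaussian_diag(x, mean, var)
        for weight, mean, var in gmm
    ]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))

def classify(local_features, class_gmms):
    """Sum log-likelihoods over local features and return the best class."""
    scores = {
        label: sum(gmm_log_likelihood(x, gmm) for x in local_features)
        for label, gmm in class_gmms.items()
    }
    return max(scores, key=scores.get), scores

# Hypothetical pre-fitted mixtures: (weight, mean, variance) per component.
# Real features would come from the reduced fully connected layer.
CLASS_GMMS = {
    "class_a": [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [1.0, 1.0], [1.0, 1.0])],
    "class_b": [(0.5, [5.0, 5.0], [1.0, 1.0]), (0.5, [6.0, 6.0], [1.0, 1.0])],
}

if __name__ == "__main__":
    features = [[0.2, -0.1], [0.9, 1.2], [0.5, 0.4]]
    label, _ = classify(features, CLASS_GMMS)
    print(label)
```

Working in the log domain keeps the per-feature likelihoods from underflowing when many local features are combined, which is one reason low-dimensional features are easier to model this way.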
Recognition and Classification of Fast Food Images
Image processing is widely used for food recognition, and recent research has proposed many algorithms for food identification and classification. In this paper, we use a simple yet powerful deep learning technique to recognize and classify different categories of fast food images. We use a pre-trained Convolutional Neural Network (CNN) as a feature extractor to train an image category classifier. CNNs can learn rich feature representations which often perform much better than handcrafted features such as histogram of oriented gradients (HOG), local binary patterns (LBP), or speeded up robust features (SURF). A multiclass linear Support Vector Machine (SVM) classifier trained on the extracted CNN features is used to classify fast food images into ten different classes. On two different benchmark databases, we obtained a success rate of 99.5%, which is higher than the accuracy achieved using bag of features (BoF) and SURF.
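The classification stage described above (a multiclass linear SVM trained on extracted features) can be sketched as follows. This is only an illustration under stated assumptions: the 2-D toy vectors stand in for CNN features, the class names are hypothetical, and the one-vs-rest scheme with hinge-loss subgradient descent is a simple substitute for whatever SVM solver the authors used.

```python
def score(weights, bias, x):
    """Linear decision value w.x + b."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias

def train_binary_svm(features, targets, epochs=200, lr=0.05, reg=0.01):
    """Hinge-loss subgradient descent; targets are +1.0 or -1.0."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(features, targets):
            violated = y * score(weights, bias, x) < 1.0
            # L2 regularisation shrinks the weights on every step.
            weights = [w * (1.0 - lr * reg) for w in weights]
            if violated:
                # Move the hyperplane towards satisfying the margin for x.
                weights = [w + lr * y * xi for w, xi in zip(weights, x)]
                bias += lr * y
    return weights, bias

def train_one_vs_rest(features, labels):
    """Train one binary SVM per class against all other classes."""
    return {
        label: train_binary_svm(
            features, [1.0 if other == label else -1.0 for other in labels]
        )
        for label in sorted(set(labels))
    }

def predict(models, x):
    """Pick the class whose binary SVM gives the largest decision value."""
    return max(models, key=lambda label: score(*models[label], x))

# Hypothetical stand-ins for CNN feature vectors and their class labels.
FEATURES = [[2.0, 0.0], [2.2, 0.1], [0.0, 2.0], [0.1, 2.2], [-2.0, -2.0], [-2.1, -1.9]]
LABELS = ["burger", "burger", "pizza", "pizza", "fries", "fries"]
MODELS = train_one_vs_rest(FEATURES, LABELS)

if __name__ == "__main__":
    print(predict(MODELS, [2.1, 0.0]))
```

In practice the feature vectors would be activations taken from a late layer of the pre-trained network, and a library SVM implementation would replace the hand-written training loop.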
A Deep Learning Model to Recognize Food Contaminating Beetle Species Based on Elytra Fragments
Insect pests are often associated with food contamination and public health risks. Accurate and timely species-specific identification of pests is a key step to scale impacts, trace back the contamination process and promptly set intervention measures, which usually have serious economic impact. The current procedure involves visual inspection by human analysts of pest fragments recovered from food samples, a time-consuming and error-prone process. Deep Learning models have been widely applied for image recognition, outperforming other machine learning algorithms; however, only a few studies have applied deep learning to food contamination detection. In this paper, we describe our solution for automatic identification of 15 storage product beetle species frequently detected in food inspection. Our approach is based on a convolutional neural network trained on a dataset of 6900 microscopic images of elytra fragments, obtaining an overall accuracy of 83.8% in cross validation. Notably, this classification performance is obtained without the need to design and select domain-specific image features, thus demonstrating the promising prospects of Deep Learning models in detecting food contamination.
CuisineNet: Food Attributes Classification using Multi-scale Convolution Network
Diversity of food and its attributes represents the culinary habits of
peoples from different countries. Thus, this paper addresses the problem of
identifying food culture of people around the world and its flavor by
classifying two main food attributes, cuisine and flavor. A deep learning model
based on multi-scale convolutional networks is proposed for extracting more
accurate features from input images. The aggregation of multi-scale convolution
layers with different kernel sizes is also used for weighting the feature
results from different scales. In addition, a joint loss function based on
Negative Log Likelihood (NLL) is used to fit the model probability to
multi-labeled classes for a multi-modal classification task. Furthermore, this work
provides a new dataset for food attributes, so-called Yummly48K, extracted from
the popular food website, Yummly. Our model is assessed on the constructed
Yummly48K dataset. The experimental results show that our proposed method
yields average F1 scores of 65% and 62% on the validation and test sets,
respectively, outperforming state-of-the-art models.
Comment: 8 pages, Submitted in CCIA 201
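The joint NLL-based loss mentioned in the abstract above can be sketched as the sum of one negative log-likelihood term per attribute. This is a minimal illustration, assuming one softmax output per attribute (cuisine and flavor) and a single target class for each. The `flavor_weight` parameter and all function names are hypothetical, and the exact formulation used by the authors may differ.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over a list of raw scores."""
    top = max(logits)
    log_norm = top + math.log(sum(math.exp(z - top) for z in logits))
    return [z - log_norm for z in logits]

def nll(logits, target):
    """Negative log-likelihood of the target class index."""
    return -log_softmax(logits)[target]

def joint_nll(cuisine_logits, cuisine_target,
              flavor_logits, flavor_target, flavor_weight=1.0):
    """Sum of the per-attribute NLL terms (the weighting is an assumption)."""
    return (nll(cuisine_logits, cuisine_target)
            + flavor_weight * nll(flavor_logits, flavor_target))

if __name__ == "__main__":
    # Toy scores for 4 hypothetical cuisines and 2 hypothetical flavors.
    loss = joint_nll([2.0, 0.5, 0.1, -1.0], 0, [0.3, 1.2], 1)
    print(round(loss, 4))
```

Minimising the summed loss lets a shared feature extractor serve both attribute heads, since gradients from both terms flow into the same layers.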
A deep representation for depth images from synthetic data
Convolutional Neural Networks (CNNs) trained on large scale RGB databases
have become the secret sauce in the majority of recent approaches for object
categorization from RGB-D data. Thanks to colorization techniques, these
methods exploit the filters learned from 2D images to extract meaningful
representations in 2.5D. Still, the perceptual signature of these two kinds of
images is very different, with the first usually strongly characterized by
textures, and the second mostly by silhouettes of objects. Ideally, one would
like to have two CNNs, one for RGB and one for depth, each trained on a
suitable data collection, able to capture the perceptual properties of each
channel for the task at hand. This has not been possible so far, due to the
lack of a suitable depth database. This paper addresses this issue, proposing
to opt for synthetically generated images rather than collecting by hand a 2.5D
large scale database. While being clearly a proxy for real data, synthetic
images allow one to trade quality for quantity, making it possible to generate a
virtually infinite amount of data. We show that training on such a
data collection, using the very same architecture typically used on visual
data, learns very different filters, resulting in depth features (a) able to
better characterize the different facets of depth images, and (b) complementary
with respect to those derived from CNNs pre-trained on 2D datasets. Experiments
on two publicly available databases show the power of our approach.
One-Shot Fine-Grained Instance Retrieval
Fine-Grained Visual Categorization (FGVC) has achieved significant progress
recently. However, the number of fine-grained species could be huge and
dynamically increasing in real scenarios, making it difficult to recognize
unseen objects under the current FGVC framework. This raises an open issue to
perform large-scale fine-grained identification without a complete training
set. Aiming to conquer this issue, we propose a retrieval task named One-Shot
Fine-Grained Instance Retrieval (OSFGIR). "One-Shot" denotes the ability to
identify unseen objects through a fine-grained retrieval task assisted by
an incomplete auxiliary training set. This paper first presents a detailed
description of the OSFGIR task and our collected OSFGIR-378K dataset. Next, we
propose the Convolutional and Normalization Networks (CN-Nets) learned on the
auxiliary dataset to generate a concise and discriminative representation.
Finally, we present a coarse-to-fine retrieval framework consisting of three
components, i.e., coarse retrieval, fine-grained retrieval, and query
expansion. The framework progressively retrieves images with
similar semantics, and performs fine-grained identification. Experiments show
our OSFGIR framework achieves significantly better accuracy and efficiency than
existing FGVC and image retrieval methods, and thus could be a better solution for
large-scale fine-grained object identification.
Comment: Accepted by MM2017, 9 pages, 7 figures
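The three-stage framework named in the abstract above (coarse retrieval, fine-grained retrieval, query expansion) can be sketched roughly as below. The abstract gives only the stage names, so the details here are assumptions: cosine similarity as the ranking score, separate coarse and fine descriptors per gallery image, and query expansion by averaging the query with its top-ranked results. The gallery contents and all names are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def rank(query, candidates, descriptor):
    """Sort candidate ids by descending similarity to the query vector."""
    return sorted(candidates, key=lambda i: cosine(query, descriptor(i)),
                  reverse=True)

def retrieve(query_coarse, query_fine, gallery, shortlist_size=3, expand_with=1):
    """Coarse shortlist, fine re-ranking, then one round of query expansion."""
    # Stage 1: keep only the images that are semantically close to the query.
    shortlist = rank(query_coarse, list(gallery),
                     lambda i: gallery[i]["coarse"])[:shortlist_size]
    # Stage 2: re-rank the shortlist with the fine-grained descriptor.
    ranked = rank(query_fine, shortlist, lambda i: gallery[i]["fine"])
    # Stage 3: average the query with its top results and rank once more.
    vectors = [query_fine] + [gallery[i]["fine"] for i in ranked[:expand_with]]
    expanded = [sum(column) / len(vectors) for column in zip(*vectors)]
    return rank(expanded, shortlist, lambda i: gallery[i]["fine"])

# Hypothetical gallery with toy coarse (2-D) and fine (3-D) descriptors.
GALLERY = {
    "sparrow_1": {"coarse": [1.0, 0.0], "fine": [1.0, 0.1, 0.0]},
    "sparrow_2": {"coarse": [0.9, 0.1], "fine": [0.9, 0.2, 0.1]},
    "finch_1": {"coarse": [0.8, 0.2], "fine": [0.0, 1.0, 0.2]},
    "car_1": {"coarse": [0.0, 1.0], "fine": [0.1, 0.0, 1.0]},
}

if __name__ == "__main__":
    print(retrieve([1.0, 0.05], [1.0, 0.15, 0.05], GALLERY))
```

Restricting the fine-grained comparison to a coarse shortlist is what keeps the cost manageable on a large gallery, since the more expensive ranking runs on only a few candidates.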