NU-InNet: Thai Food Image Recognition Using Convolutional Neural Networks on Smartphone
Convolutional Neural Networks (CNNs) are now widely used in many applications, image recognition among them. Most research in this field uses CNNs mainly to improve recognition effectiveness, while the processing time and the number of parameters (the model size) are not treated as primary factors. This paper studies Thai food image recognition on a smartphone, reducing the processing time and the model size so that the network is practical on such devices. A new network called NU-InNet (Naresuan University Inception Network), which adopts the Inception module concept used in GoogLeNet, is proposed. It is applied to and tested on a Thai food database called THFOOD-50, which contains 50 kinds of famous Thai food. NU-InNet reduces the processing time and the model size by factors of 2 and 10, respectively, compared to GoogLeNet, while maintaining recognition precision at the same level as GoogLeNet. This significant reduction in processing time and model size makes the proposed network well suited to a Thai food recognition application on a smartphone.
clcNet: Improving the Efficiency of Convolutional Neural Network using Channel Local Convolutions
Depthwise convolution and grouped convolution have been successfully applied
to improve the efficiency of convolutional neural networks (CNNs). We suggest
that these models can be considered as special cases of a generalized
convolution operation, named channel local convolution (CLC), where an output
channel is computed using a subset of the input channels. This definition
entails computation dependency relations between input and output channels,
which can be represented by a channel dependency graph (CDG). By modifying the
CDG of grouped convolution, a new CLC kernel named interlaced grouped
convolution (IGC) is created. Stacking IGC and GC kernels results in a
convolution block (named CLC Block) for approximating regular convolution. By
resorting to the CDG as an analysis tool, we derive the rule for setting the
meta-parameters of IGC and GC and the framework for minimizing the
computational cost. A new CNN model named clcNet is then constructed using CLC
blocks, which shows significantly higher computational efficiency and fewer
parameters than state-of-the-art networks when tested on the
ImageNet-1K dataset. Source code is available at
https://github.com/dqzhang17/clcnet.torch
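The channel dependency idea in this abstract can be illustrated with a small sketch. It is not taken from the clcNet source code: the function names, the specific interlacing rule (channel index modulo the number of groups), and the toy channel counts are assumptions chosen for illustration. The sketch builds a dependency map from each output channel to the input channels it reads, composes two stacked layers, and checks whether every final output channel depends on every input channel.

```python
def grouped_graph(c_in, c_out, groups, interlaced=False):
    """Map each output channel to the set of input channels it reads.

    interlaced=False: contiguous groups, as in ordinary grouped convolution.
    interlaced=True:  ASSUMED rule for illustration: a channel with index c
                      belongs to group c % groups, on both input and output.
    """
    graph = {}
    for j in range(c_out):
        if interlaced:
            k = j % groups
            graph[j] = {i for i in range(c_in) if i % groups == k}
        else:
            k = j // (c_out // groups)
            size = c_in // groups
            graph[j] = set(range(k * size, (k + 1) * size))
    return graph


def compose(first, second):
    """Dependencies of second(first(x)) on the original input channels."""
    return {j: set().union(*(first[m] for m in deps))
            for j, deps in second.items()}


def is_full(graph, c_in):
    """True if every output channel depends on all c_in input channels."""
    return all(deps == set(range(c_in)) for deps in graph.values())


def conv_params(c_in, c_out, k, groups=1):
    """Weight count of a k x k convolution with the given number of groups."""
    return k * k * c_in * c_out // groups


# Toy setting (hypothetical sizes): 8 channels throughout.
interlaced_then_grouped = compose(
    grouped_graph(8, 8, groups=2, interlaced=True),
    grouped_graph(8, 8, groups=4),
)
grouped_twice = compose(
    grouped_graph(8, 8, groups=2),
    grouped_graph(8, 8, groups=2),
)
```

Under these assumptions, the interlaced-then-contiguous stack reaches every input channel, while two contiguous grouped layers with the same grouping stay confined to their own block; `conv_params` shows how the grouping divides the weight count relative to a regular convolution.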
MirBot: A collaborative object recognition system for smartphones using convolutional neural networks
MirBot is a collaborative application for smartphones that allows users to
perform object recognition. This app can be used to take a photograph of an
object, select the region of interest and obtain the most likely class (dog,
chair, etc.) by means of similarity search using features extracted from a
convolutional neural network (CNN). The answers provided by the system can be
validated by the user so as to improve the results for future queries. All the
images are stored together with a series of metadata, thus enabling a
multimodal incremental dataset labeled with synset identifiers from the WordNet
ontology. This dataset grows continuously thanks to the users' feedback, and is
publicly available for research. This work details the MirBot object
recognition system, analyzes the statistics gathered after more than four years
of usage, describes the image classification methodology, and performs an
exhaustive evaluation using handcrafted features, convolutional neural codes
and different transfer learning techniques. After comparing various models and
transformation methods, the results show that the CNN features maintain the
accuracy of MirBot constant over time, despite the increasing number of new
classes. The app is freely available at the Apple and Google Play stores.
Comment: Accepted in Neurocomputing, 201
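Classification by similarity search over CNN features, as described in the MirBot abstract, can be sketched as a nearest-neighbour vote. This is a minimal illustration and not MirBot's implementation: the cosine metric, the value of `k`, the function names, and the two-dimensional toy feature vectors are all assumptions (real CNN features would have hundreds or thousands of dimensions).

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def classify(query, gallery, k=3):
    """Return the majority label among the k stored items most similar to query.

    gallery is a list of (feature_vector, label) pairs, standing in for the
    user-validated images stored by the system.
    """
    ranked = sorted(gallery,
                    key=lambda item: cosine_similarity(query, item[0]),
                    reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy gallery with made-up 2-D features, for illustration only.
gallery = [
    ([1.0, 0.0], "dog"),
    ([0.9, 0.1], "dog"),
    ([0.0, 1.0], "chair"),
    ([0.1, 0.9], "chair"),
]
predicted = classify([0.8, 0.2], gallery, k=3)
```

Because new labelled items are simply appended to the gallery, a scheme of this kind accommodates new classes without retraining the feature extractor, which is consistent with the incremental dataset the abstract describes.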
Large-Scale Plant Classification with Deep Neural Networks
This paper discusses the potential of applying deep learning techniques for
plant classification and its usage for citizen science in large-scale
biodiversity monitoring. We show that plant classification using near
state-of-the-art convolutional network architectures like ResNet50 achieves
significant improvements in accuracy compared to the most widespread plant
classification application in test sets composed of thousands of different
species labels. We find that the predictions can be confidently used as a
baseline classification in citizen science communities like iNaturalist (or its
Spanish fork, Natusfera) which in turn can share their data with biodiversity
portals like GBIF.
Comment: 5 pages, 3 figures, 1 table. Published in the Proceedings of the ACM
Computing Frontiers Conference 201
Advertisement billboard detection and geotagging system with inductive transfer learning in deep convolutional neural network
In this paper, we propose an approach to detect and geotag advertisement billboards in real time. Our approach uses AlexNet, a Deep Convolutional Neural Network (DCNN) pre-trained on 1000 image classification categories. To improve the performance of the pre-trained network, we retrain it on additional advertisement billboard images using an inductive transfer learning approach, and then fine-tune the output layer to advertisement billboard related categories. The detected advertisement billboard images are then geotagged by inserting Exif metadata into the image file. Experimental results show that the approach achieves 92.7% training accuracy for advertisement billboard detection and 71.86% overall testing accuracy.
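The geotagging step relies on the way Exif stores GPS coordinates: latitude and longitude are each written as three rational numbers (degrees, minutes, seconds) plus a reference letter (N/S or E/W). The sketch below shows only that conversion; the function name and the example coordinates are hypothetical, and writing the resulting values into an image file would require an Exif library, which is not shown here and is not specified in the abstract.

```python
from fractions import Fraction


def to_exif_gps(decimal_degrees, is_latitude):
    """Convert signed decimal degrees to an Exif-style (ref, rationals) pair.

    Returns the reference letter and three (numerator, denominator) tuples
    for degrees, minutes, and seconds.
    """
    if is_latitude:
        ref = "N" if decimal_degrees >= 0 else "S"
    else:
        ref = "E" if decimal_degrees >= 0 else "W"

    # Exact decimal arithmetic avoids floating-point drift in the seconds.
    value = abs(Fraction(str(decimal_degrees)))
    degrees = int(value)
    minutes_full = (value - degrees) * 60
    minutes = int(minutes_full)
    seconds = ((minutes_full - minutes) * 60).limit_denominator(10000)

    return ref, ((degrees, 1), (minutes, 1),
                 (seconds.numerator, seconds.denominator))


def from_exif_gps(ref, rationals):
    """Inverse conversion, useful for checking a round trip."""
    (d, dd), (m, md), (s, sd) = rationals
    value = Fraction(d, dd) + Fraction(m, md) / 60 + Fraction(s, sd) / 3600
    return float(-value if ref in ("S", "W") else value)


# Hypothetical coordinates, for illustration only.
lat_ref, lat = to_exif_gps(13.7563, is_latitude=True)
lon_ref, lon = to_exif_gps(-100.25, is_latitude=False)
```

Storing the sign in the reference letter rather than in the numbers matches the Exif convention, in which the rational components are unsigned.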