Waste management using an automatic sorting system for carrot fruit based on image processing technique and improved deep neural networks

Author: Jahanbakhshi Ahmad
Mahmoudi Majid
Momeny Mohammad
Radeva Petia
Publication venue: Elsevier BV
Publication date: 31/01/2023

In this study, we address the classification of carrot fruit in order to manage and control carrot waste using improved deep neural networks. We study the carrot classification problem in depth and show that convolutional neural networks (CNNs) are a straightforward way to solve it. Additionally, we improve the CNN by learning a pooling function that combines average pooling and max pooling. We show experimentally that this merging operation increases the accuracy of carrot classification compared with other merging methods. For this purpose, images of 878 carrot samples of various shapes (regular and irregular) were taken and, after preprocessing, classified by the improved deep CNN. For comparison, image features were extracted using the Histogram of Oriented Gradients (HOG) and Local Binary Pattern (LBP) methods and classified by Multi-Layer Perceptron (MLP), Gradient Boosting Tree (GBT), and K-Nearest Neighbors (KNN) algorithms. Finally, the proposed method based on the improved CNN was compared with these classification algorithms. The proposed Batch Normalization (BN)-CNN method based on mixed pooling achieved an accuracy of 99.43% for grading carrots. Therefore, CNNs can be effective in increasing marketability, controlling waste, and improving the traditional methods used for grading carrot fruit.
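The idea of combining average pooling and max pooling can be sketched as follows. This is a minimal pure-Python illustration, not the authors' implementation: the abstract does not give the exact form of the learned pooling function, so the linear blend, the mixing weight `alpha`, and the function names are assumptions made for this example.

```python
def mixed_pool(window, alpha):
    """Blend max and average pooling over one window.

    alpha is a hypothetical mixing weight in [0, 1]:
    alpha = 1 gives pure max pooling, alpha = 0 pure average pooling.
    In a trained network this weight would be learned, not fixed.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    max_value = max(window)
    avg_value = sum(window) / len(window)
    return alpha * max_value + (1.0 - alpha) * avg_value


def mixed_pool_2d(feature_map, size, alpha):
    """Apply mixed pooling over non-overlapping size x size windows."""
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, rows - size + 1, size):
        pooled_row = []
        for c in range(0, cols - size + 1, size):
            # Flatten the current window into a list of values.
            window = [feature_map[r + i][c + j]
                      for i in range(size) for j in range(size)]
            pooled_row.append(mixed_pool(window, alpha))
        pooled.append(pooled_row)
    return pooled
```

With `alpha` fixed at 0.5, each output value is the midpoint between the window's maximum and its mean; letting the network learn `alpha` is one plausible reading of "learning a pooling function".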

Diposit Digital de la Universitat de Barcelona

DC-SPP-YOLO: Dense Connection and Spatial Pyramid Pooling Based YOLO for Object Detection

Author: Huang Zhanchao
Wang Jianlin
Publication date: 20/03/2019

Although the YOLOv2 approach is extremely fast at object detection, its backbone network has limited feature extraction ability and fails to make full use of multi-scale local region features, which restricts improvements in object detection accuracy. Therefore, this paper proposes DC-SPP-YOLO (Dense Connection and Spatial Pyramid Pooling Based YOLO), an approach for improving the object detection accuracy of YOLOv2. Specifically, dense connection of convolution layers is employed in the backbone network of YOLOv2 to strengthen feature extraction and alleviate the vanishing-gradient problem. Moreover, an improved spatial pyramid pooling is introduced to pool and concatenate the multi-scale local region features, so that the network can learn object features more comprehensively. The DC-SPP-YOLO model is established and trained with a new loss function composed of mean square error and cross entropy, and is then used to perform object detection. Experiments demonstrate that the mAP (mean Average Precision) of DC-SPP-YOLO on the PASCAL VOC and UA-DETRAC datasets is higher than that of YOLOv2; by strengthening feature extraction and using the multi-scale local region features, DC-SPP-YOLO achieves better object detection accuracy than YOLOv2. Comment: 23 pages, 9 figures, 9 tables
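Pooling and concatenating multi-scale local region features can be sketched as follows. This is a minimal pure-Python illustration on a single-channel feature map, not the authors' implementation: the kernel sizes (3 and 5), the stride of 1, the border handling, and the function names are assumptions chosen for this example.

```python
def max_pool_same(feature_map, kernel):
    """Stride-1 max pooling with a kernel x kernel window.

    The output has the same height and width as the input;
    windows at the border only use in-bounds cells.
    """
    rows, cols = len(feature_map), len(feature_map[0])
    half = kernel // 2
    pooled = []
    for r in range(rows):
        pooled_row = []
        for c in range(cols):
            window = [feature_map[i][j]
                      for i in range(max(0, r - half), min(rows, r + half + 1))
                      for j in range(max(0, c - half), min(cols, c + half + 1))]
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled


def spp_concat(feature_map, kernel_sizes=(3, 5)):
    """Return the input plus one pooled map per kernel size, stacked as channels."""
    return [feature_map] + [max_pool_same(feature_map, k) for k in kernel_sizes]
```

Because every pooled map keeps the input's spatial size, the maps can be stacked along the channel axis, giving later layers access to local regions at several scales at once.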

arXiv.org e-Print Archive

Deep neural network for traffic sign recognition systems: An analysis of spatial transformers and stochastic optimisation methods

Author: Arcos García Álvaro
Soria Morillo Luis Miguel
Álvarez García Juan Antonio
Publication venue: Elsevier BV
Publication date: 01/03/2018

This paper presents a Deep Learning approach for traffic sign recognition systems. Several classification experiments are conducted on publicly available traffic sign datasets from Germany and Belgium using a Deep Neural Network that comprises Convolutional layers and Spatial Transformer Networks. These trials are designed to measure the impact of diverse factors, with the end goal of designing a Convolutional Neural Network that improves on the state of the art in traffic sign classification. First, different adaptive and non-adaptive stochastic gradient descent optimisation algorithms such as SGD, SGD-Nesterov, RMSprop and Adam are evaluated. Subsequently, multiple combinations of Spatial Transformer Networks placed at distinct positions within the main neural network are analysed. The proposed Convolutional Neural Network achieves a recognition accuracy of 99.71% on the German Traffic Sign Recognition Benchmark, outperforming previous state-of-the-art methods while also being more efficient in terms of memory requirements. Funding: Ministerio de Economía y Competitividad TIN2017-82113-C2-1-R; Ministerio de Economía y Competitividad TIN2013-46801-C4-1-
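The four optimisers named in the abstract differ only in how they turn a gradient into a parameter update. The sketch below writes out the textbook single-parameter update rules in pure Python; it is an illustration under stated assumptions, not the paper's training code. The function names, the default hyperparameters, and the caller-supplied `grad` callable are all choices made for this example.

```python
import math


def sgd_step(w, grad, lr):
    """Plain SGD: move against the gradient (non-adaptive)."""
    return w - lr * grad(w)


def nesterov_step(w, v, grad, lr, mu=0.9):
    """SGD with Nesterov momentum: gradient is taken at the look-ahead point."""
    v = mu * v - lr * grad(w + mu * v)
    return w + v, v


def rmsprop_step(w, s, grad, lr, rho=0.9, eps=1e-8):
    """RMSprop: scale the step by a running average of squared gradients."""
    g = grad(w)
    s = rho * s + (1.0 - rho) * g * g
    return w - lr * g / (math.sqrt(s) + eps), s


def adam_step(w, m, v, t, grad, lr, b1=0.9, b2=0.999, eps=1e-8):
    """Adam: bias-corrected first and second moment estimates (t starts at 1)."""
    g = grad(w)
    m = b1 * m + (1.0 - b1) * g
    v = b2 * v + (1.0 - b2) * g * g
    m_hat = m / (1.0 - b1 ** t)
    v_hat = v / (1.0 - b2 ** t)
    return w - lr * m_hat / (math.sqrt(v_hat) + eps), m, v
```

SGD and SGD-Nesterov use one global step size (non-adaptive), whereas RMSprop and Adam rescale each step by a running estimate of gradient magnitude (adaptive), which is the distinction the abstract draws.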

Crossref

idUS. Depósito de Investigación Universidad de Sevilla

CIFAR-10: KNN-based Ensemble of Classifiers

Author: Abouelnaga Yehya
Ali Ola S.
Moustafa Mohamed
Rady Hager
Publication date: 15/11/2016

In this paper, we study the performance of different classifiers on the CIFAR-10 dataset and build an ensemble of classifiers to reach better performance. We show that, on CIFAR-10, the errors of K-Nearest Neighbors (KNN) and a Convolutional Neural Network (CNN) are, on some classes, mutually exclusive, so the two yield higher accuracy when combined. We reduce KNN overfitting using Principal Component Analysis (PCA) and ensemble the KNN with a CNN to increase its accuracy. Our approach improves our best CNN model from 93.33% to 94.03%.
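One way to combine a KNN classifier with a CNN is to average their per-class scores. The sketch below is a minimal pure-Python illustration, not the authors' method: the abstract does not say how the ensemble is formed, so the weighted average, the `cnn_weight` parameter, and the function names are assumptions, and the PCA step is omitted for brevity.

```python
import math
from collections import Counter


def knn_class_probs(train_x, train_y, query, k, num_classes):
    """Turn the labels of the k nearest neighbours into vote fractions."""
    ranked = sorted((math.dist(x, query), y) for x, y in zip(train_x, train_y))
    votes = Counter(label for _, label in ranked[:k])
    return [votes.get(c, 0) / k for c in range(num_classes)]


def ensemble_predict(cnn_probs, knn_probs, cnn_weight=0.5):
    """Weighted average of the two score vectors; returns (label, scores).

    cnn_weight is a hypothetical knob, not a value from the paper.
    """
    combined = [cnn_weight * p + (1.0 - cnn_weight) * q
                for p, q in zip(cnn_probs, knn_probs)]
    label = max(range(len(combined)), key=combined.__getitem__)
    return label, combined
```

In this form the ensemble can overturn a low-confidence CNN prediction when the neighbours agree on a different class, which is where classifiers with non-overlapping errors gain accuracy.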

arXiv.org e-Print Archive

Crossref

AUC Knowledge Fountain (American Univ. in Cairo)

Improvements to deep convolutional neural networks for LVCSR

Author: Aravkin Aleksandr Y.
Beran Tomas
Dahl George E.
Kingsbury Brian
Mohamed Abdel-rahman
Ramabhadran Bhuvana
Sainath Tara N.
Saon George
Soltau Hagen
Publication date: 10/12/2013

Deep Convolutional Neural Networks (CNNs) are more powerful than Deep Neural Networks (DNNs), as they are better able to reduce spectral variation in the input signal. This has also been confirmed experimentally, with CNNs showing relative improvements in word error rate (WER) of 4-12% compared to DNNs across a variety of LVCSR tasks. In this paper, we describe different methods to further improve CNN performance. First, we conduct a deep analysis comparing limited weight sharing and full weight sharing with state-of-the-art features. Second, we apply various pooling strategies that have shown improvements in computer vision to an LVCSR speech task. Third, we introduce a method to effectively incorporate speaker adaptation, namely fMLLR, into log-mel features. Fourth, we introduce an effective strategy to use dropout during Hessian-free sequence training. We find that with these improvements, particularly with fMLLR and dropout, we are able to achieve an additional 2-3% relative improvement in WER on a 50-hour Broadcast News task over our previous best CNN baseline. On a larger 400-hour BN task, we find an additional 4-5% relative improvement over our previous best CNN baseline. Comment: 6 pages, 1 figure
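The improvements above are stated as relative, not absolute, reductions in WER. The short sketch below shows the arithmetic; the function name and the WER values in the usage note are hypothetical and are not taken from the paper.

```python
def relative_wer_improvement(baseline_wer, new_wer):
    """Relative WER reduction in percent: (baseline - new) / baseline * 100."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return (baseline_wer - new_wer) / baseline_wer * 100.0
```

For example, dropping from a hypothetical baseline WER of 20.0% to 19.4% is an absolute gain of 0.6 points but a relative improvement of about 3%, which is the kind of figure the abstract reports.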

arXiv.org e-Print Archive

CiteSeerX