A Counting Method of Red Jujube Based on Improved YOLOv5s
Complex environmental factors such as illumination and occlusion between leaves and fruits or among fruits make it challenging to quickly identify and count red jujubes in orchards. A counting method for red jujubes based on an improved YOLOv5s was proposed, which achieved fast and accurate detection of red jujubes and reduced the model scale and estimation error. ShuffleNet V2 was used as the backbone of the model to improve detection ability and reduce model weight. In addition, the Stem, a novel data loading module, was proposed to prevent the loss of information due to changes in feature map size. PANet was replaced by BiFPN to enhance the model's feature fusion capability and improve its accuracy. Finally, the improved YOLOv5s detection model was used to count red jujubes. The experimental results showed that the overall performance of the improved model was better than that of YOLOv5s. Compared with YOLOv5s, the improved model had 6.25% of the parameters and 8.33% of the model size of the original network, and the Precision, Recall, F1-score, AP, and FPS improved by 4.3%, 2.0%, 3.1%, 0.6%, and 3.6%, respectively. In addition, RMSE and MAPE decreased by 20.87% and 5.18%, respectively. Therefore, the improved model has advantages in memory occupation and recognition accuracy, and the method provides a basis for estimating red jujube yield by vision.
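The error metrics named in the abstract above can be illustrated with a minimal Python sketch. It assumes the standard textbook definitions of RMSE, MAPE, and F1-score, which the abstract does not spell out. The function names and the per-image counts are hypothetical and are not taken from the paper.

```python
import math


def rmse(true_counts, predicted_counts):
    """Root mean squared error between true and predicted per-image counts."""
    n = len(true_counts)
    squared = sum((t - p) ** 2 for t, p in zip(true_counts, predicted_counts))
    return math.sqrt(squared / n)


def mape(true_counts, predicted_counts):
    """Mean absolute percentage error, in percent.

    Assumes every true count is non-zero, otherwise the ratio is undefined.
    """
    n = len(true_counts)
    ratios = sum(abs(t - p) / t for t, p in zip(true_counts, predicted_counts))
    return 100.0 * ratios / n


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2.0 * precision * recall / (precision + recall)


# Hypothetical per-image fruit counts, for illustration only.
true_counts = [10, 20, 40]
predicted_counts = [9, 22, 40]

print(rmse(true_counts, predicted_counts))
print(mape(true_counts, predicted_counts))
```

Lower RMSE and MAPE mean the predicted counts track the manual counts more closely, which is the sense in which the abstract reports both metrics decreasing.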
Precise Single-stage Detector
Two problems in SSD still cause inaccurate results: (1) during feature
extraction, as semantic information is acquired layer by layer, local
information is gradually lost, resulting in less representative feature maps;
(2) during Non-Maximum Suppression (NMS), because the classification and
regression tasks are inconsistent, the classification confidence and the
predicted detection position cannot accurately indicate the position of the
prediction boxes. Methods: To address these issues, we propose a new
architecture, a modified version
of Single Shot Multibox Detector (SSD), named Precise Single Stage Detector
(PSSD). Firstly, we improve the features by adding extra layers to SSD.
Secondly, we construct a simple and effective feature enhancement module to
expand the receptive field step by step for each layer and enhance its local
and semantic information. Finally, we design a more efficient loss function to
predict the IOU between the prediction boxes and ground truth boxes, and the
threshold IOU guides classification training and attenuates the scores, which
are used by the NMS algorithm. Main Results: With these optimizations, the
proposed PSSD model achieves strong real-time performance. Specifically, on a
Titan Xp with a 320-pixel input, PSSD achieves 33.8 mAP at 45 FPS on the MS
COCO benchmark and 81.28 mAP at 66 FPS on Pascal VOC 2007, outperforming
state-of-the-art object detection models. The proposed model also performs well
with a larger input size: at 512 pixels, PSSD obtains 37.2 mAP at 27 FPS on MS
COCO and 82.82 mAP at 40 FPS on Pascal VOC 2007. The experimental results show
that the proposed model offers a better trade-off between speed and accuracy.
Comment: We will submit it soon to an IEEE Transactions journal. Due to the
character limit, we cannot upload the full abstract. Please read the PDF file
for more details.
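The idea of attenuating classification scores with a predicted IoU before NMS can be sketched in Python as follows. This is a minimal illustration, not the PSSD implementation: it assumes the attenuation is a simple product of classification score and predicted IoU, which the abstract does not specify, and it uses plain greedy NMS. The function names, box values, and threshold are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def iou_aware_nms(boxes, cls_scores, pred_ious, iou_threshold=0.5):
    """Greedy NMS ranked by classification score times predicted IoU.

    Assumption: the product is one plausible way to attenuate the scores;
    the source abstract does not give the exact formula.
    Returns the indices of the kept boxes, best first.
    """
    scores = [s * q for s, q in zip(cls_scores, pred_ious)]
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap any already kept box too much.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Hypothetical detections: boxes 0 and 1 overlap heavily, box 2 is separate.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
cls_scores = [0.9, 0.8, 0.7]
pred_ious = [0.5, 0.9, 0.8]
print(iou_aware_nms(boxes, cls_scores, pred_ious))
```

In this example box 0 has the highest classification score but a low predicted IoU, so the better-localized box 1 is ranked first and suppresses it. That is the mismatch between confidence and localization quality that the abstract describes.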
Object Detection in 20 Years: A Survey
Object detection, as one of the most fundamental and challenging problems in
computer vision, has received great attention in recent years. Its development
in the past two decades can be regarded as an epitome of computer vision
history. If we think of today's object detection as a technical aesthetic
powered by deep learning, then turning back the clock 20 years we would
witness the wisdom of the cold weapon era. This paper extensively reviews 400+
papers of object detection in the light of its technical evolution, spanning
over a quarter-century's time (from the 1990s to 2019). A number of topics have
been covered in this paper, including the milestone detectors in history,
detection datasets, metrics, fundamental building blocks of the detection
system, speed up techniques, and the recent state of the art detection methods.
This paper also reviews some important detection applications, such as
pedestrian detection, face detection, text detection, etc., and provides an in-depth
analysis of their challenges as well as technical improvements in recent years.
Comment: This work has been submitted to the IEEE TPAMI for possible
publication.
A DCNN-based Arbitrarily-Oriented Object Detector for Quality Control and Inspection Application
Following the success of machine vision systems for on-line automated quality
control and inspection processes, an object recognition solution is presented
in this work for two different specific applications, i.e., the detection of
quality control items in surgery toolboxes prepared for sterilizing in a
hospital, as well as the detection of defects in vessel hulls to prevent
potential structural failures. The solution has two stages. First, a feature
pyramid architecture based on Single Shot MultiBox Detector (SSD) is used to
improve the detection performance, and a statistical analysis based on ground
truth is employed to select parameters of a range of default boxes. Second, a
lightweight neural network is exploited to achieve oriented detection results
using a regression method. The first stage of the proposed method is capable of
detecting the small targets considered in the two scenarios. The second
stage, despite its simplicity, detects elongated targets effectively
while maintaining high running efficiency.
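An oriented detection result can be illustrated with a short Python sketch that turns a regressed box into its four corner points. It assumes the box is parametrized as a center, a width, a height, and a rotation angle in radians. The abstract does not state the parametrization the authors regress, so the function name and this representation are hypothetical.

```python
import math


def oriented_box_corners(cx, cy, w, h, theta):
    """Corner points of a box centered at (cx, cy), rotated by theta radians.

    Assumption: (cx, cy, w, h, theta) is one common oriented-box encoding,
    not necessarily the one used in the paper.
    """
    c, s = math.cos(theta), math.sin(theta)
    # Corners of the unrotated box, relative to its center.
    half = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    # Rotate each offset by theta, then translate to the box center.
    return [(cx + dx * c - dy * s, cy + dx * s + dy * c) for dx, dy in half]


# Hypothetical elongated target: 4 units long, 1 unit wide, rotated 30 degrees.
for corner in oriented_box_corners(5.0, 5.0, 4.0, 1.0, math.radians(30)):
    print(corner)
```

For an elongated target, a rotated box of this kind covers far less background than the axis-aligned box enclosing the same corners.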
RGB-D Salient Object Detection: A Survey
Salient object detection (SOD), which simulates the human visual perception
system to locate the most attractive object(s) in a scene, has been widely
applied to various computer vision tasks. With the advent of depth
sensors, depth maps, whose rich spatial information can help boost
SOD performance, can now be captured easily. Although various RGB-D
based SOD models with promising performance have been proposed over the past
several years, an in-depth understanding of these models and challenges in this
topic remains lacking. In this paper, we provide a comprehensive survey of
RGB-D based SOD models from various perspectives, and review related benchmark
datasets in detail. Further, considering that the light field can also provide
depth maps, we review SOD models and popular benchmark datasets from this
domain as well. Moreover, to investigate the SOD ability of existing models, we
carry out a comprehensive evaluation, as well as attribute-based evaluation of
several representative RGB-D based SOD models. Finally, we discuss several
challenges and open directions of RGB-D based SOD for future research. All
collected models, benchmark datasets, source code links, datasets constructed
for attribute-based evaluation, and codes for evaluation will be made publicly
available at https://github.com/taozh2017/RGBDSODsurvey
Comment: 24 pages, 12 figures. Has been accepted by Computational Visual Media
Thinking Twice: Clinical-Inspired Thyroid Ultrasound Lesion Detection Based on Feature Feedback
Accurate detection of thyroid lesions is a critical aspect of computer-aided
diagnosis. However, most existing detection methods perform only one feature
extraction process and then fuse multi-scale features, which can be affected by
noise and blurred features in ultrasound images. In this study, we propose a
novel detection network based on a feature feedback mechanism inspired by
clinical diagnosis. The mechanism involves first roughly observing the overall
picture and then focusing on the details of interest. It comprises two parts: a
feedback feature selection module and a feature feedback pyramid. The feedback
feature selection module efficiently selects the features extracted in the
first phase in both space and channel dimensions to generate high semantic
prior knowledge, which is similar to coarse observation. The feature feedback
pyramid then uses this high semantic prior knowledge to enhance feature
extraction in the second phase and adaptively fuses the two features, similar
to fine observation. Additionally, since radiologists often focus on the shape
and size of lesions for diagnosis, we propose an adaptive detection head
strategy to aggregate multi-scale features. Our proposed method achieves an AP
of 70.3% and AP50 of 99.0% on the thyroid ultrasound dataset and meets the
real-time requirement. The code is available at
https://github.com/HIT-wanglingtao/Thinking-Twice.
Comment: 20 pages, 11 figures, released code for
https://github.com/HIT-wanglingtao/Thinking-Twic
MelNet: A Real-Time Deep Learning Algorithm for Object Detection
In this study, a novel deep learning algorithm for object detection, named
MelNet, was introduced. MelNet underwent training utilizing the KITTI dataset
for object detection. Following 300 training epochs, MelNet attained an mAP
(mean average precision) score of 0.732. Additionally, three alternative models
(YOLOv5, EfficientDet, and Faster-RCNN-MobileNetv3) were trained on the KITTI
dataset and compared with MelNet for object detection.
The outcomes underscore the efficacy of employing transfer learning in
certain instances. Notably, preexisting models trained on prominent datasets
(e.g., ImageNet, COCO, and Pascal VOC) yield superior results. Another finding
underscores the viability of creating a new model tailored to a specific
scenario and training it on a specific dataset. This investigation demonstrates
that MelNet, trained exclusively on the KITTI dataset, also surpasses
EfficientDet after 150 epochs. Consequently, after training, MelNet's
performance closely matches that of the pre-trained models.
Comment: 11 pages, 9 figures, 5 tables