End-to-end training of object class detectors for mean average precision
We present a method for training CNN-based object class detectors directly
using mean average precision (mAP) as the training loss, in a truly end-to-end
fashion that includes non-maximum suppression (NMS) at training time. This
contrasts with the traditional approach of training a CNN for a window
classification loss, then applying NMS only at test time, even though mAP is the
evaluation metric rather than classification accuracy. However, mAP
following NMS forms a piecewise-constant structured loss over thousands of
windows, with gradients that do not convey useful information for gradient
descent. Hence, we define new, general gradient-like quantities for piecewise
constant functions, which have wide applicability. We describe how to calculate
these efficiently for mAP following NMS, enabling us to train a detector based on
Fast R-CNN directly for mAP. This model achieves equivalent performance to the
standard Fast R-CNN on the PASCAL VOC 2007 and 2012 datasets, while being
conceptually more appealing as the very same model and loss are used at both
training and test time.

Comment: This version has minor additions to results (ablation study) and discussion.
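The central difficulty the abstract describes can be seen in a few lines of code. A minimal sketch (not the paper's method, just an illustration): average precision depends only on the ranking of the windows, so perturbing a score without changing the ranking leaves AP unchanged, i.e. the loss is piecewise constant in the scores and its gradient is zero almost everywhere.

```python
def average_precision(scores, is_positive):
    """AP over a ranked list: mean of precision values at each true positive."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    tp = 0
    precisions = []
    for rank, i in enumerate(order, start=1):
        if is_positive[i]:
            tp += 1
            precisions.append(tp / rank)
    return sum(precisions) / max(sum(is_positive), 1)

scores = [0.9, 0.8, 0.3, 0.2]
labels = [1, 0, 1, 0]
ap1 = average_precision(scores, labels)
# Nudging a score without changing the ranking leaves AP identical:
ap2 = average_precision([0.9, 0.8, 0.31, 0.2], labels)
assert ap1 == ap2  # zero gradient almost everywhere
```

This is why the paper must define gradient-like quantities for piecewise-constant functions instead of relying on ordinary gradients.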
Video Based Fish Species Detection Using Faster Region Convolution Neural Network
Fish recognition and classification represent significant challenges in marine biology and agriculture, promising fields for advancing research. Despite advancements in real-time data collection, underwater fish recognition and classification still require improvement due to challenges such as variations in fish size and shape, image quality issues, and environmental changes. Feature learning approaches, particularly those using convolutional neural networks (CNNs), have shown promise in addressing these challenges. This study focuses on video-based fish species classification, employing a feature learning-based extraction method through CNNs. The process involves two main stages: detection and classification. To address detection and classification in video, a Faster Region Convolutional Neural Network (Faster R-CNN) with transfer learning techniques is applied, achieving a mean average precision of 84% for detection and classification tasks. These techniques offer promising avenues for enhancing fish recognition and classification in diverse environments.
Learning non-maximum suppression
Object detectors have hugely profited from moving towards an end-to-end
learning paradigm: combining proposals, features, and the classifier into one
neural network improved results two-fold on general object detection. One
indispensable component is non-maximum suppression (NMS), a post-processing
algorithm responsible for merging all detections that belong to the same
object. The de facto standard NMS algorithm is still fully hand-crafted,
suspiciously simple, and -- being based on greedy clustering with a fixed
distance threshold -- forces a trade-off between recall and precision. We
propose a new network architecture designed to perform NMS, using only boxes
and their scores. We report experiments for person detection on PETS and for
general object categories on the COCO dataset. Our approach shows promise,
providing improved localization and occlusion handling.

Comment: Added "Supplementary material" title.
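The "suspiciously simple" hand-crafted algorithm the abstract criticizes can be sketched in a few lines. This is a minimal illustration of classic greedy NMS, not the paper's learned replacement: keep the highest-scoring box, suppress every remaining box that overlaps it beyond a fixed IoU threshold, and repeat. The fixed threshold is exactly what forces the recall/precision trade-off.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def greedy_nms(boxes, scores, iou_threshold=0.5):
    """Return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: -scores[i])
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Greedy clustering: drop everything too close to the chosen box.
        order = [i for i in order if iou(boxes[best], boxes[i]) < iou_threshold]
    return keep

boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = greedy_nms(boxes, scores)  # the second box is suppressed as a duplicate
```

A lower threshold suppresses more aggressively (hurting recall for crowded scenes); a higher one admits duplicates (hurting precision), which is the trade-off the proposed learned NMS network aims to escape.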
Ball detection for boccia game analysis
The present article proposes the training, testing and comparison of two models for ball detection, taking into account their final implementation in a Boccia game analysis computer-vision algorithm, within the 'iBoccia' framework. The goal is to have a versatile and flexible algorithm across different game environments. The selected ball detectors were a Histogram-of-Oriented-Gradients feature based Support Vector Machine (HOG-SVM) and a Convolutional Neural Network (CNN) based on a less complex implementation of the You Only Look Once model (Tiny-YOLO). Both detectors were evaluated offline and in real time. The results showed that their performance was similar in both evaluations; however, Tiny-YOLO outperformed HOG-SVM by a small margin on all the metrics used. In real time, both detectors achieved an accuracy of approximately 90%. Despite the high accuracy values, the detector requires further improvement, because a single non-detection can influence the computer-vision algorithm's output, making the system unreliable.

FCT - Fundação para a Ciência e a Tecnologia (SFRH/BD/133314/2017). This article is supported by the project Deus ex Machina: NORTE-01-0145-FEDER-000026, supported by the Norte Portugal Regional Operational Programme (NORTE 2020), under the PORTUGAL 2020 Partnership Agreement, through the European Regional Development Fund (ERDF). Vinicius Silva also thanks FCT for the PhD scholarship SFRH/BD/133314/2017.
Towards Accurate One-Stage Object Detection with AP-Loss
One-stage object detectors are trained by optimizing a classification loss and
a localization loss simultaneously, with the former suffering greatly from the
extreme foreground-background class imbalance caused by the large number of anchors.
This paper alleviates this issue by proposing a novel framework to replace the
classification task in one-stage detectors with a ranking task, and adopting
the Average-Precision loss (AP-loss) for the ranking problem. Due to its
non-differentiability and non-convexity, the AP-loss cannot be optimized
directly. For this purpose, we develop a novel optimization algorithm, which
seamlessly combines the error-driven update scheme of perceptron learning with
the backpropagation algorithm in deep networks. We verify the good convergence
properties of the proposed algorithm theoretically and empirically. Experimental results
demonstrate notable performance improvement in state-of-the-art one-stage
detectors based on AP-loss over different kinds of classification-losses on
various benchmarks, without changing the network architectures. Code is
available at https://github.com/cccorn/AP-loss.

Comment: 13 pages, 7 figures, 4 tables, main paper + supplementary material, accepted to CVPR 2019.
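The "error-driven update scheme in perceptron learning" that the abstract invokes is the classic rule that sidesteps non-differentiability: instead of differentiating a step-function loss, the update direction is taken directly from the error signal itself. A hedged toy sketch of that scheme alone (not the paper's full AP-loss algorithm, which couples such updates with backpropagation over pairwise ranking errors):

```python
def perceptron_train(samples, labels, epochs=10, lr=1.0):
    """Train a linear classifier with error-driven updates.

    samples: list of feature tuples; labels: +1 / -1.
    The 0-1 prediction error is non-differentiable, yet the rule
    w <- w + lr * y * x (applied only on mistakes) still converges
    on linearly separable data.
    """
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1
            if pred != y:  # error-driven: update only when a mistake occurs
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b

# Linearly separable toy data
X = [(2.0, 1.0), (1.5, 2.0), (-1.0, -1.5), (-2.0, -1.0)]
y = [1, 1, -1, -1]
w, b = perceptron_train(X, y)
preds = [1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1 for x in X]
```

The AP-loss paper's contribution, per the abstract, is transplanting this mistake-driven signal into the ranking setting so that the non-differentiable, non-convex AP-loss can still drive deep-network training through backpropagation.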