39 research outputs found

    End-to-end training of object class detectors for mean average precision

    Get PDF
    We present a method for training CNN-based object class detectors directly using mean average precision (mAP) as the training loss, in a truly end-to-end fashion that includes non-maximum suppression (NMS) at training time. This contrasts with the traditional approach of training a CNN for a window classification loss and applying NMS only at test time, even though mAP, not classification accuracy, is the evaluation metric. However, mAP following NMS forms a piecewise-constant structured loss over thousands of windows, with gradients that convey no useful information for gradient descent. Hence, we define new, general gradient-like quantities for piecewise-constant functions, which have wide applicability. We describe how to calculate these efficiently for mAP following NMS, enabling us to train a detector based on Fast R-CNN directly for mAP. This model achieves performance equivalent to the standard Fast R-CNN on the PASCAL VOC 2007 and 2012 datasets, while being conceptually more appealing, as the very same model and loss are used at both training and test time.
    Comment: This version has minor additions to results (ablation study) and discussion.
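    The piecewise-constant nature of AP is easy to see in code. A minimal sketch (the function name and toy data are illustrative, not from the paper): AP depends only on the ranking induced by the scores, so perturbing scores without changing their order leaves AP exactly unchanged, which is why its gradients carry no information.

    ```python
    import numpy as np

    def average_precision(scores, labels):
        """AP of a ranked list: mean of the precision at each true positive."""
        order = np.argsort(-np.asarray(scores, dtype=float))
        labels = np.asarray(labels)[order]
        tp = np.cumsum(labels)                       # true positives so far
        precision = tp / np.arange(1, len(labels) + 1)
        return float(precision[labels == 1].mean())

    scores = [0.9, 0.8, 0.7, 0.6]
    labels = [1, 0, 1, 0]
    ap = average_precision(scores, labels)           # precisions at hits: 1/1 and 2/3
    # Any perturbation that preserves the ranking leaves AP identical:
    ap2 = average_precision([0.91, 0.79, 0.71, 0.59], labels)
    # ap == ap2, even though every score changed
    ```
    
    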

    Video Based Fish Species Detection Using Faster Region Convolution Neural Network

    Get PDF
    Fish recognition and classification represent significant challenges in marine biology and aquaculture, and are promising fields for advancing research. Despite advances in real-time data collection, underwater fish recognition and classification still require improvement due to challenges such as variations in fish size and shape, image quality issues, and environmental changes. Feature learning approaches, particularly those utilizing convolutional neural networks (CNNs), have shown promise in addressing these challenges. This study focuses on video-based fish species classification, employing a feature learning-based extraction method through CNNs. The process involves two main stages: detection and classification. To address detection and classification in video, a Faster Region-based Convolutional Neural Network (Faster R-CNN) with transfer learning techniques is applied, achieving a mean average precision of 84% on the detection and classification tasks. These techniques offer promising avenues for enhancing fish recognition and classification in diverse environments.
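    Mean average precision for detection, as reported above, rests on matching detections to ground-truth boxes by intersection-over-union (IoU). A minimal sketch of that matching step (function names, boxes, and the 0.5 threshold are illustrative assumptions, not details from this paper): each score-sorted detection is greedily matched to the best unused ground-truth box, counting as a true positive only if IoU meets the threshold.

    ```python
    def iou(a, b):
        """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
        x1, y1 = max(a[0], b[0]), max(a[1], b[1])
        x2, y2 = min(a[2], b[2]), min(a[3], b[3])
        inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
        area_a = (a[2] - a[0]) * (a[3] - a[1])
        area_b = (b[2] - b[0]) * (b[3] - b[1])
        return inter / (area_a + area_b - inter)

    def match_detections(dets, gts, thr=0.5):
        """Greedily match score-sorted (box, score) detections to ground truth.
        Returns a True/False (TP/FP) flag per detection, highest score first."""
        used, flags = set(), []
        for box, _score in sorted(dets, key=lambda d: -d[1]):
            best, best_iou = None, thr
            for i, gt in enumerate(gts):
                if i not in used and iou(box, gt) >= best_iou:
                    best, best_iou = i, iou(box, gt)
            if best is not None:
                used.add(best)
                flags.append(True)
            else:
                flags.append(False)
        return flags

    gts = [(0, 0, 10, 10), (20, 20, 30, 30)]
    dets = [((1, 1, 10, 10), 0.9), ((21, 20, 30, 30), 0.8), ((50, 50, 60, 60), 0.7)]
    flags = match_detections(dets, gts)   # two true positives, one false positive
    ```

    The per-detection TP/FP flags, accumulated over a dataset, give the precision-recall curve from which AP is computed per class and averaged into mAP.
    
    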

    Learning non-maximum suppression

    Full text link
    Object detectors have profited hugely from moving towards an end-to-end learning paradigm: merging proposals, features, and the classifier into one neural network improved results two-fold on general object detection. One indispensable component is non-maximum suppression (NMS), a post-processing algorithm responsible for merging all detections that belong to the same object. The de facto standard NMS algorithm is still fully hand-crafted, suspiciously simple, and -- being based on greedy clustering with a fixed distance threshold -- forces a trade-off between recall and precision. We propose a new network architecture designed to perform NMS, using only boxes and their scores. We report experiments for person detection on PETS and for general object categories on the COCO dataset. Our approach shows promise, providing improved localization and occlusion handling.
    Comment: Added "Supplementary material" title.
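    The hand-crafted algorithm this paper proposes to replace fits in a few lines. A minimal sketch of standard greedy NMS (the function name, boxes, and 0.5 IoU threshold are illustrative): repeatedly keep the highest-scoring remaining box and discard all boxes that overlap it above the threshold, which is exactly the greedy clustering with a fixed threshold criticized in the abstract.

    ```python
    def nms(boxes, scores, iou_thr=0.5):
        """Greedy NMS: keep the top-scoring box, drop overlaps above iou_thr, repeat."""
        def iou(a, b):
            x1, y1 = max(a[0], b[0]), max(a[1], b[1])
            x2, y2 = min(a[2], b[2]), min(a[3], b[3])
            inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
            union = ((a[2] - a[0]) * (a[3] - a[1])
                     + (b[2] - b[0]) * (b[3] - b[1]) - inter)
            return inter / union

        order = sorted(range(len(boxes)), key=lambda i: -scores[i])
        keep = []
        while order:
            i = order.pop(0)
            keep.append(i)
            # suppress every remaining box that overlaps the kept one too much
            order = [j for j in order if iou(boxes[i], boxes[j]) < iou_thr]
        return keep

    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    scores = [0.9, 0.8, 0.7]
    kept = nms(boxes, scores)   # box 1 is suppressed by the overlapping box 0
    ```

    The fixed `iou_thr` is the trade-off knob: raising it recovers more of the recall lost to suppression but admits duplicate detections, hurting precision.
    
    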

    Ball detection for boccia game analysis

    Get PDF
    The present article proposes the training, testing, and comparison of two models for ball detection, taking into account their final implementation in a Boccia game analysis computer-vision algorithm within the 'iBoccia' framework. The goal is to have a versatile and flexible algorithm across different game environments. The selected ball detectors were a Histogram-of-Oriented-Gradients feature-based Support Vector Machine (HOG-SVM) and a Convolutional Neural Network (CNN) based on a less complex implementation of the You Only Look Once model (Tiny-YOLO). Both detectors were evaluated offline and in real time. The subsequent results showed that their performance was similar in both evaluations; however, Tiny-YOLO outperformed HOG-SVM by a small margin on all the metrics used. In real time, both detectors achieved an accuracy of approximately 90%. Despite the high accuracy values, the detector requires further improvement, because a single non-detection can influence the computer-vision algorithm's output, making the system unreliable.
    FCT - Fundação para a Ciência e a Tecnologia (SFRH/BD/133314/2017). This article is supported by the project Deus ex Machina: NORTE-01-0145-FEDER-000026, supported by the Norte Portugal Regional Operational Programme (NORTE 2020), under the PORTUGAL 2020 Partnership Agreement, through the European Regional Development Fund (ERDF). Vinicius Silva also thanks FCT for the PhD scholarship SFRH/BD/133314/2017.

    Towards Accurate One-Stage Object Detection with AP-Loss

    Full text link
    One-stage object detectors are trained by optimizing a classification loss and a localization loss simultaneously, with the former suffering greatly from the extreme foreground-background class imbalance caused by the large number of anchors. This paper alleviates this issue by proposing a novel framework that replaces the classification task in one-stage detectors with a ranking task, adopting the Average-Precision loss (AP-loss) for the ranking problem. Due to its non-differentiability and non-convexity, the AP-loss cannot be optimized directly. For this purpose, we develop a novel optimization algorithm that seamlessly combines the error-driven update scheme of perceptron learning with the backpropagation algorithm of deep networks. We verify the good convergence properties of the proposed algorithm theoretically and empirically. Experimental results demonstrate notable performance improvements from AP-loss over different kinds of classification losses in state-of-the-art one-stage detectors on various benchmarks, without changing the network architectures. Code is available at https://github.com/cccorn/AP-loss.
    Comment: 13 pages, 7 figures, 4 tables, main paper + supplementary material, accepted to CVPR 2019.
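    The error-driven idea can be illustrated with a deliberately loose sketch (an assumption-laden simplification, not the paper's exact formulation, which uses a smoothed step function and an interpolated AP): each positive example is penalized by the fraction of higher-ranked entries that are negatives, and instead of a true gradient, an update direction pushes that positive up and the offending negatives down.

    ```python
    import numpy as np

    def ap_loss_updates(scores, labels):
        """Simplified AP-style ranking loss with error-driven update directions.
        Per positive: the fraction of entries ranked above it that are negatives.
        The loss is piecewise constant in the scores, but the returned 'update'
        still points in a useful direction, perceptron-style."""
        scores = np.asarray(scores, dtype=float)
        labels = np.asarray(labels)
        pos = np.where(labels == 1)[0]
        neg = np.where(labels == 0)[0]
        update = np.zeros_like(scores)
        loss = 0.0
        for i in pos:
            above_neg = [j for j in neg if scores[j] > scores[i]]
            rank = 1 + int(np.sum(scores > scores[i]))   # 1-based rank of positive i
            term = len(above_neg) / rank
            loss += term / len(pos)
            update[i] += term                            # push the positive up
            for j in above_neg:                          # push violators down
                update[j] -= term / max(len(above_neg), 1)
        return loss, update

    scores = [0.3, 0.9, 0.5]   # one positive (index 0) ranked below two negatives
    labels = [1, 0, 0]
    loss, update = ap_loss_updates(scores, labels)
    # loss = 2/3; update raises the positive and lowers both negatives
    ```

    In the paper this error-driven signal replaces the gradient of the non-differentiable ranking term inside standard backpropagation; everything upstream of the scores is trained normally.
    
    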