Hierarchical shot detector

Author: Cao Jiale
Han Jungong
Li Xuelong
Pang Yanwei
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 27/02/2020

The single shot detector simultaneously predicts object categories and regression offsets of the default boxes. Despite its high efficiency, this structure has some inappropriate designs: (1) the classification result of the default box is improperly assigned to the regressed box during inference, and (2) regressing only once is not good enough for accurate object detection. To solve the first problem, a novel reg-offset-cls (ROC) module is proposed. It contains three hierarchical steps: box regression, feature sampling location prediction, and regressed box classification with the features at the offset locations. To further solve the second problem, a hierarchical shot detector (HSD) is proposed, which stacks two ROC modules and one feature enhanced module. The second ROC takes the regressed boxes and the feature sampling locations from the first ROC as its inputs. Meanwhile, the feature enhanced module inserted between the two ROCs aims to extract local and non-local context. Experiments on the MS COCO and PASCAL VOC datasets demonstrate the superiority of the proposed HSD. Without bells and whistles, HSD outperforms all one-stage methods at real-time speed.
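The idea of regressing a default box more than once can be illustrated with a minimal sketch. It assumes the center-size offset encoding (dx, dy, dw, dh) commonly used by SSD-style detectors; the abstract does not state the exact parameterization, and the `decode` helper and all numeric values here are hypothetical.

```python
import math

def decode(box, offsets):
    """Apply (dx, dy, dw, dh) offsets to a (cx, cy, w, h) box.

    Assumed encoding: center shifts are scaled by box size,
    width and height are scaled exponentially.
    """
    cx, cy, w, h = box
    dx, dy, dw, dh = offsets
    return (cx + dx * w, cy + dy * h, w * math.exp(dw), h * math.exp(dh))

# Hypothetical default box and offsets, for illustration only.
default_box = (50.0, 50.0, 20.0, 20.0)

# First regression step: refine the default box.
first = decode(default_box, (0.5, 0.0, 0.0, 0.0))

# Second regression step: refine the already regressed box,
# mirroring how the second ROC takes the first ROC's boxes as input.
second = decode(first, (0.0, 0.25, 0.0, 0.0))
```

In this sketch the second step starts from the output of the first rather than from the default box, which is the stacking behaviour the abstract describes; the feature sampling and classification steps are not modelled.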

Crossref

Warwick Research Archives Portal Repository

MSB R‐CNN: A multi‐stage balanced defect detection network

Author: Cao Jiangzhong
Cheng Yongqiang
Lan Shangwei
Wu Zongze
Xu Zhihua
Yang Zhijing
Publication venue: 'MDPI AG'
Publication date: 01/08/2021

Deep learning networks are applied for defect detection, among which Cascade R‐CNN is a multi‐stage object detection network and is state of the art in terms of accuracy and efficiency. However, it is still a challenge for Cascade R‐CNN to deal with complex and diverse defects, as the widely varied shapes of defects make traditional convolution filters inefficient at extracting features. Additionally, the imbalance in features, losses and samples causes lower accuracy. To address the above challenges, this paper proposes a multi‐stage balanced R‐CNN (MSB R‐CNN) for defect detection based on Cascade R‐CNN. Firstly, deformable convolution is adopted in different stages of the backbone network to improve its adaptability to the varying shapes of the defects. Then, the features obtained by the backbone network are refined and enhanced by the balanced feature pyramid. To overcome the imbalance between classification and regression loss, the balanced L1 loss is applied at different stages. Finally, for sample selection, the intersection over union (IoU) balanced sampler and the online hard example mining (OHEM) sampler are combined at different stages to make the sampling more reasonable, which improves the accuracy and convergence of the model. The results of our experiments on the DAGM2007 dataset show that our network (MSB R‐CNN) achieves a mean average precision (mAP) of 67.5%, an increase of 1.5% mAP compared to Cascade R‐CNN.
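The intersection over union (IoU) measure that the balanced sampler relies on can be sketched as follows. This is the standard IoU definition for axis-aligned boxes, not code from the paper; the `(x1, y1, x2, y2)` corner format and the `iou` function name are assumptions made for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b

    # Width and height of the overlap region, clamped at zero
    # so that disjoint boxes contribute no intersection area.
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih

    # Union = sum of the two areas minus the doubly counted overlap.
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0
```

For example, `iou((0, 0, 2, 2), (1, 1, 3, 3))` gives 1/7, since the boxes overlap in a unit square and together cover an area of 7. An IoU-balanced sampler would group candidate boxes by such values before drawing training samples; that sampling step is not shown here.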

Multidisciplinary Digital Publishing Institute

Repository@Hull - Worktribe

Directory of Open Access Journals

YOLO-Former: YOLO Shakes Hand With ViT

Author: Borhani Yasamin
Ghanbarzadeh Armin
Khoramdel Javad
Moori Ahmad
Najafi Esmaeil
Publication date: 11/01/2024

The proposed YOLO-Former method integrates the ideas of the transformer and YOLOv4 to create a highly accurate and efficient object detection system. The method leverages the fast inference speed of YOLOv4 and incorporates the advantages of the transformer architecture through the integration of convolutional attention and transformer modules. The results demonstrate the effectiveness of the proposed approach, with a mean average precision (mAP) of 85.76% on the Pascal VOC dataset, while maintaining high prediction speed with a frame rate of 10.85 frames per second. The contribution of this work lies in demonstrating how the combination of these two state-of-the-art techniques can lead to further improvements in the field of object detection.

arXiv.org e-Print Archive

A Ranking-based, Balanced Loss Function Unifying Classification and Localisation in Object Detection

Author: Akbas Emre
Cam Baris Can
Kalkan Sinan
Oksuz Kemal
Publication date: 06/12/2020

We propose average Localisation-Recall-Precision (aLRP), a unified, bounded, balanced and ranking-based loss function for both classification and localisation tasks in object detection. aLRP extends the Localisation-Recall-Precision (LRP) performance metric (Oksuz et al., 2018), inspired by how Average Precision (AP) Loss extends precision to a ranking-based loss function for classification (Chen et al., 2020). aLRP has the following distinct advantages: (i) aLRP is the first ranking-based loss function for both classification and localisation tasks. (ii) Thanks to using ranking for both tasks, aLRP naturally enforces high-quality localisation for high-precision classification. (iii) aLRP provides provable balance between positives and negatives. (iv) Compared to on average ~6 hyperparameters in the loss functions of state-of-the-art detectors, aLRP Loss has only one hyperparameter, which we did not tune in practice. On the COCO dataset, aLRP Loss improves its ranking-based predecessor, AP Loss, by up to around 5 AP points, achieves 48.9 AP without test time augmentation and outperforms all one-stage detectors. Code available at: https://github.com/kemaloksuz/aLRPLoss. Comment: NeurIPS 2020 spotlight paper.

arXiv.org e-Print Archive

OpenMETU (Middle East Technical University)