3 research outputs found

    Infrared Object Detection Method Based on DBD-YOLOv8

    No full text
    An improved method for infrared object detection, DBD-YOLOv8 (DCN-BiRA-DyHeads-YOLOv8), is presented. It addresses the inherent limitations of YOLOv8 in low signal-to-noise-ratio scenarios and complex tasks, focusing on improving multi-scale feature representation within the YOLOv8 framework and on filtering out irrelevant regions. To this end, two key modules, D_C2f and D_SPPF, are integrated; both use deformable convolutions (DCN) to dynamically adjust the network's visual receptive fields. Furthermore, a Bi-level Routing Attention mechanism (BRA) and Dynamic Heads (DyHeads) are adapted within the feature fusion network, refining feature maps and enhancing semantic representation through attention. Compared to the YOLOv8-n/s/m/l/x series models, DBD-YOLOv8 demonstrates significant improvements: average mAP@0.5 values of 84.8%, 96.3%, 99.7%, and 76.0% on the FLIR, OTCBVS (Dataset 01), OTCBVS (Dataset 03), and VEDAI benchmark datasets, respectively, representing increases of 7.9%, 1.5%, 0.1%, and 3.5%. The model's inference times on these datasets, 10.9 ms, 32.0 ms, 37.3 ms, and 28.4 ms respectively, meet real-time requirements.
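    The deformable-convolution idea behind the D_C2f and D_SPPF modules can be illustrated with a minimal sketch: each kernel tap samples the feature map at its regular grid position plus a learned 2-D offset, using bilinear interpolation for fractional coordinates. This is a pure-NumPy toy for a single 3x3 output location, not the paper's implementation (which would use a GPU operator such as torchvision's DeformConv2d); the function names are illustrative only.

    ```python
    import numpy as np

    def bilinear_sample(feat, y, x):
        """Bilinearly sample a 2-D feature map at fractional (y, x); zero outside."""
        H, W = feat.shape
        y0, x0 = int(np.floor(y)), int(np.floor(x))
        val = 0.0
        for dy in (0, 1):
            for dx in (0, 1):
                yy, xx = y0 + dy, x0 + dx
                if 0 <= yy < H and 0 <= xx < W:
                    # bilinear weights shrink with distance from (y, x)
                    val += feat[yy, xx] * (1 - abs(y - yy)) * (1 - abs(x - xx))
        return val

    def deform_conv_point(feat, weight, offsets, cy, cx):
        """One 3x3 deformable-convolution output at (cy, cx).

        weight:  9 kernel weights, row-major over the 3x3 grid.
        offsets: 9 learned (dy, dx) offsets, one per tap; with all-zero
                 offsets this reduces to an ordinary 3x3 convolution.
        """
        out, k = 0.0, 0
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                oy, ox = offsets[k]
                out += weight[k] * bilinear_sample(feat, cy + dy + oy, cx + dx + ox)
                k += 1
        return out
    ```

    With zero offsets the sampling grid is the usual fixed 3x3 neighborhood; non-zero offsets let the network warp its receptive field toward informative regions, which is the adaptivity the D_C2f/D_SPPF modules exploit.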

    DS-YOLOv8-Based Object Detection Method for Remote Sensing Images

    No full text
    The improved YOLOv8 model (DCN_C2f+SC_SA+YOLOv8, hereinafter DS-YOLOv8) is proposed to address object detection challenges in complex remote sensing image tasks. It aims to overcome two limitations: the restricted receptive field caused by fixed convolutional kernels in the YOLO backbone, and inadequate multi-scale feature learning, which results from the spatial and channel attention fusion mechanism's inability to adapt to the feature distribution of the input data. The DS-YOLOv8 model introduces the Deformable Convolution C2f (DCN_C2f) module in the backbone network to enable adaptive adjustment of the network's receptive field. Additionally, a lightweight Self-Calibrating Shuffle Attention (SC_SA) module is designed for the spatial and channel attention mechanisms. This design allows adaptive encoding of contextual information, preventing the loss of feature detail caused by repeated convolutions and improving the representation of multi-scale, occluded, and small object features. Moreover, the DS-YOLOv8 model incorporates the dynamic non-monotonic focusing mechanism of Wise-IoU as its position regression loss function to further enhance performance. Experimental results demonstrate the excellent performance of the DS-YOLOv8 model on several public datasets: RSOD, NWPU VHR-10, DIOR, and VEDAI. The average mAP@0.5 values achieved are 97.7%, 92.9%, 89.7%, and 78.9%, respectively, and the average mAP@0.5:0.95 values are 74.0%, 64.3%, 70.7%, and 51.1%. Importantly, the model maintains real-time inference. Compared to the YOLOv8 series models, DS-YOLOv8 demonstrates significant performance improvements and outperforms other mainstream models in detection accuracy.
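    The Wise-IoU regression loss mentioned above can be sketched in its simplest (v1) form: the plain IoU loss is rescaled by a distance-based focusing factor computed from the box centers and the smallest enclosing box. The sketch below is a plain-Python illustration under that assumption, not the DS-YOLOv8 training code; in real training the enclosing-box denominator is detached from the gradient, which plain NumPy cannot express.

    ```python
    import math

    def iou(box, gt):
        """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
        ix1, iy1 = max(box[0], gt[0]), max(box[1], gt[1])
        ix2, iy2 = min(box[2], gt[2]), min(box[3], gt[3])
        inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
        area = lambda b: (b[2] - b[0]) * (b[3] - b[1])
        union = area(box) + area(gt) - inter
        return inter / union if union > 0 else 0.0

    def wise_iou_v1(box, gt):
        """Wise-IoU v1 loss: exp(center distance / enclosing-box diagonal^2) * (1 - IoU).

        The factor grows when the predicted center drifts from the target,
        focusing the regression on poorly-localized boxes.
        """
        bx, by = (box[0] + box[2]) / 2, (box[1] + box[3]) / 2
        gx, gy = (gt[0] + gt[2]) / 2, (gt[1] + gt[3]) / 2
        wg = max(box[2], gt[2]) - min(box[0], gt[0])  # enclosing box width
        hg = max(box[3], gt[3]) - min(box[1], gt[1])  # enclosing box height
        r = math.exp(((bx - gx) ** 2 + (by - gy) ** 2) / (wg ** 2 + hg ** 2))
        return r * (1.0 - iou(box, gt))
    ```

    For a perfect prediction the factor is exp(0) = 1 and the loss is 0; as the centers separate, the exponential factor amplifies the IoU loss.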