Edge-guided Representation Learning for Underwater Object Detection
Underwater object detection (UOD) is crucial for marine economic development,
environmental protection, and the planet's sustainable development. The main
challenges of this task arise from low contrast, small objects, and the mimicry of
aquatic organisms. The key to addressing these challenges is to focus the model
on obtaining more discriminative information. We observe that the edges of
underwater objects are highly distinctive, so objects can be distinguished from
low-contrast or mimicry environments by their edges. Motivated by this observation, we
propose an Edge-guided Representation Learning Network, termed ERL-Net, that
aims to achieve discriminative representation learning and aggregation under
the guidance of edge cues. Firstly, we introduce an edge-guided attention
module to model the explicit boundary information, which generates more
discriminative features. Secondly, a feature aggregation module is proposed to
aggregate the multi-scale discriminative features by regrouping them into three
levels, effectively aggregating global and local information for locating and
recognizing underwater objects. Finally, we propose a wide and asymmetric
receptive field block to enable features to have a wider receptive field,
allowing the model to focus on more small object information. Comprehensive
experiments on three challenging underwater datasets show that our method
achieves superior performance on the UOD task.
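To illustrate how explicit edge cues can reweight features, here is a minimal sketch. It is not ERL-Net's edge-guided attention module, which is learned end to end; it assumes a fixed Sobel operator as the edge extractor and a simple residual gate. The names `sobel_magnitude` and `edge_gate` are hypothetical.

```python
def sobel_magnitude(img):
    """Sobel gradient magnitude of a 2D list of floats (border pixels stay 0)."""
    h, w = len(img), len(img[0])
    kx = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
    ky = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = sum(kx[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(ky[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out


def edge_gate(features, img):
    """Scale each feature by 1 + normalized edge strength (assumed gating rule)."""
    edges = sobel_magnitude(img)
    peak = max(max(row) for row in edges) or 1.0  # avoid dividing by zero
    return [[f * (1.0 + e / peak) for f, e in zip(frow, erow)]
            for frow, erow in zip(features, edges)]


# Toy input: a vertical step edge between columns 1 and 2.
img = [[0.0, 0.0, 1.0, 1.0, 1.0] for _ in range(5)]
feats = [[1.0] * 5 for _ in range(5)]
gated = edge_gate(feats, img)
```

Features next to the step edge are amplified, while features in flat regions pass through unchanged, which is the intuition behind using boundaries to separate objects from low-contrast backgrounds.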
Semantic-aware Texture-Structure Feature Collaboration for Underwater Image Enhancement
Underwater image enhancement has become an attractive topic as a significant
technology in marine engineering and aquatic robotics. However, the limited
number of datasets and imperfect hand-crafted ground truth weaken its
robustness to unseen scenarios, and hamper the application to high-level vision
tasks. To address the above limitations, we develop an efficient and compact
enhancement network in collaboration with a high-level semantic-aware
pretrained model, aiming to exploit its hierarchical feature representation as
an auxiliary for the low-level underwater image enhancement. Specifically, we
characterize the shallow-layer features as textures and the deep-layer
features as structures in the semantic-aware model, and propose a
multi-path Contextual Feature Refinement Module (CFRM) to refine features in
multiple scales and model the correlation between different features. In
addition, a feature dominative network is devised to perform channel-wise
modulation on the aggregated texture and structure features for the adaptation
to different feature patterns of the enhancement network. Extensive experiments
on benchmarks demonstrate that the proposed algorithm achieves more appealing
results and outperforms state-of-the-art methods by large margins. We also
apply the proposed algorithm to the underwater salient object detection task to
reveal the favorable semantic-aware ability for high-level vision tasks. The
code is available at STSC.
Comment: Accepted by ICRA202
Underwater target detection based on improved YOLOv7
Underwater target detection is a crucial aspect of ocean exploration.
However, conventional underwater target detection methods face several
challenges such as inaccurate feature extraction, slow detection speed and lack
of robustness in complex underwater environments. To address these limitations,
this study proposes an improved YOLOv7 network (YOLOv7-AC) for underwater
target detection. The proposed network utilizes an ACmixBlock module to replace
the 3x3 convolution block in the E-ELAN structure, and incorporates skip
connections and a 1x1 convolution architecture between ACmixBlock modules to
improve feature extraction and inference speed. Additionally, a
ResNet-ACmix module is designed to avoid feature information loss and reduce
computation, while a Global Attention Mechanism (GAM) is inserted in the
backbone and head parts of the model to improve feature extraction.
Furthermore, the K-means++ algorithm is used instead of K-means to obtain
anchor boxes and enhance model accuracy. Experimental results show that the
improved YOLOv7 network outperforms the original YOLOv7 model and other popular
underwater target detection methods. The proposed network achieved mean
average precision (mAP) values of 89.6% and 97.4% on the URPC dataset and
Brackish dataset, respectively, and ran at a higher frame rate (frames per
second, FPS) than the original YOLOv7 model. The source code for this study is
publicly available at https://github.com/NZWANG/YOLOV7-AC. In conclusion, the
improved YOLOv7 network proposed in this study represents a promising solution
for underwater target detection and holds great potential for practical
applications in various underwater tasks.
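The K-means++ anchor step mentioned above can be sketched as follows. This is a minimal illustration, not the YOLOv7-AC code: it assumes boxes are given as (width, height) pairs, uses 1 - IoU as the distance (the usual YOLO convention, which the abstract does not state), and assumes the data contain at least `k` distinct boxes. The function names are hypothetical.

```python
import random


def iou_wh(a, b):
    """IoU of two (width, height) boxes aligned at a shared corner."""
    inter = min(a[0], b[0]) * min(a[1], b[1])
    return inter / (a[0] * a[1] + b[0] * b[1] - inter)


def kmeanspp_init(boxes, k, rng):
    """K-means++ seeding: sample each new center with probability
    proportional to the squared distance to the nearest chosen center."""
    centers = [rng.choice(boxes)]
    while len(centers) < k:
        d2 = [min(1.0 - iou_wh(b, c) for c in centers) ** 2 for b in boxes]
        centers.append(rng.choices(boxes, weights=d2, k=1)[0])
    return centers


def anchor_kmeans(boxes, k, iters=50, seed=0):
    """Cluster box shapes into k anchors, returned sorted by area."""
    rng = random.Random(seed)
    centers = kmeanspp_init(boxes, k, rng)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for b in boxes:
            # Assign each box to the center it overlaps most.
            best = max(range(k), key=lambda i: iou_wh(b, centers[i]))
            groups[best].append(b)
        new = [
            (sum(w for w, _ in g) / len(g), sum(h for _, h in g) / len(g))
            if g else centers[i]  # keep the old center for an empty cluster
            for i, g in enumerate(groups)
        ]
        if new == centers:
            break
        centers = new
    return sorted(centers, key=lambda c: c[0] * c[1])


# Toy data: three small and three large boxes (made-up sizes).
boxes = [(10, 10), (12, 11), (11, 12), (100, 90), (95, 105), (105, 100)]
anchors = anchor_kmeans(boxes, k=2)
```

Compared with plain K-means, only the seeding changes: spreading the initial centers apart makes it less likely that two anchors start inside the same cluster of box shapes.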
MDM-YOLO: Research on Object Detection Algorithm Based on Improved YOLOv4 for Marine Organisms
Vision-based underwater object detection is an active research topic. To address the low accuracy and high miss rate of marine life detection, this work proposes MDM-YOLO (Marine Detection Model with YOLO), an object detection algorithm for marine organisms based on an improved YOLOv4. To improve the network's capacity for feature extraction, a multi-branch architecture, CSBM, is integrated into the backbone. Building on this, shuffle attention is introduced into the feature fusion structure to reinforce the focus on important information. Experimental results show that MDM-YOLO increases the mean average precision (mAP) by 2.31% over YOLOv4 on the Underwater Robot Picking Contest (URPC) dataset. Moreover, on the RSOD and PASCAL VOC datasets, MDM-YOLO obtains mAPs of 87.54% and 86.87%, respectively. These improvements make the MDM-YOLO model better suited to identifying objects on the seafloor.
Image Labels Are All You Need for Coarse Seagrass Segmentation
Seagrass meadows serve as critical carbon sinks, but estimating the amount of
carbon they store requires knowledge of the seagrass species present.
Underwater and surface vehicles equipped with machine learning algorithms can
help to accurately estimate the composition and extent of seagrass meadows at
scale. However, previous approaches for seagrass detection and classification
have required supervision from patch-level labels. In this paper, we reframe
seagrass classification as a weakly supervised coarse segmentation problem
where image-level labels are used during training (25 times fewer labels
compared to patch-level labeling) and patch-level outputs are obtained at
inference time. To this end, we introduce SeaFeats, an architecture that uses
unsupervised contrastive pre-training and feature similarity, and SeaCLIP, a
model that showcases the effectiveness of large language models as a
supervisory signal in domain-specific applications. We demonstrate that an
ensemble of SeaFeats and SeaCLIP leads to highly robust performance. Our method
outperforms previous approaches that require patch-level labels on the
multi-species 'DeepSeagrass' dataset by 6.8% (absolute) for the class-weighted
F1 score, and by 12.1% (absolute) for the seagrass presence/absence F1 score on
the 'Global Wetlands' dataset. We also present two case studies for real-world
deployment: outlier detection on the Global Wetlands dataset, and application
of our method on imagery collected by the FloatyBoat autonomous surface
vehicle.
Comment: 10 pages, 4 figures, additional 3 pages of supplementary material
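For readers unfamiliar with the class-weighted F1 score reported above, the sketch below assumes the common definition: per-class F1 averaged with weights proportional to each class's number of true samples. The abstract does not spell out the exact formula, and the labels used here are made up.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 over classes present in y_true."""
    support = Counter(y_true)
    total = 0.0
    for cls, n in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
        fn = n - tp
        # F1 = 2TP / (2TP + FP + FN); the denominator is positive because n > 0.
        f1 = 2 * tp / (2 * tp + fp + fn)
        total += n * f1
    return total / len(y_true)


# Hypothetical patch labels: three of species "a", one of species "b".
y_true = ["a", "a", "a", "b"]
y_pred = ["a", "a", "b", "b"]
score = weighted_f1(y_true, y_pred)
```

Weighting by support keeps a rare class from dominating the average, which matters when seagrass species are unevenly represented.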
Improving accuracy and efficiency in seagrass detection using state-of-the-art AI techniques
Seagrasses provide a wide range of ecosystem services in coastal marine environments. Despite their ecological and economic importance, these species are declining because of human impact. This decline has driven the need for monitoring and mapping to estimate the overall health and dynamics of seagrasses in coastal environments, often based on underwater images. However, seagrass detection from underwater digital images is not a trivial task; it requires taxonomic expertise and is time-consuming and expensive. Recently, automatic approaches based on deep learning have revolutionised object detection performance in many computer vision applications, and there has been interest in applying them to automated seagrass detection from imagery. Deep learning-based techniques reduce the need for the hand-crafted feature extraction by domain experts that traditional machine learning-based techniques require. This study presents a YOLOv5-based one-stage detector and an EfficientDet-D7-based two-stage detector for detecting seagrass, in this case Halophila ovalis, one of the most widely distributed seagrass species. The EfficientDet-D7-based seagrass detector achieves the highest mAP of 0.484 on the ECUHO-2 dataset and an mAP of 0.354 on the ECUHO-1 dataset, which are about 7% and 5% better, respectively, than the state-of-the-art Halophila ovalis detection performance on those datasets. The proposed YOLOv5-based detector achieves average inference times of 0.077 s and 0.043 s, respectively, which are much lower than those of the state-of-the-art approach on the same datasets.
UnitModule: A Lightweight Joint Image Enhancement Module for Underwater Object Detection
Underwater object detection faces the problem of underwater image
degradation, which affects the performance of the detector. Underwater object
detection methods based on noise reduction and image enhancement usually do not
provide images preferred by the detector or require additional datasets. In
this paper, we propose a plug-and-play Underwater joint image enhancement
Module (UnitModule) that provides the input image preferred by the detector. We
design an unsupervised learning loss for the joint training of UnitModule with
the detector without additional datasets to improve the interaction between
UnitModule and the detector. Furthermore, a color cast predictor with the
assisting color cast loss and a data augmentation called Underwater Color
Random Transfer (UCRT) are designed to improve the performance of UnitModule on
underwater images with different color casts. Extensive experiments are
conducted on DUO for different object detection models, where UnitModule
achieves the highest performance improvement of 2.6 AP for YOLOv5-S and gains
an improvement of 3.3 AP on the brand-new test set (URPCtest). UnitModule
also significantly improves the performance of all object detection models we
test, especially models with a small number of parameters. In addition,
UnitModule with a small number of parameters of 31K has little effect on the
inference speed of the original object detection model. Our quantitative and
visual analysis also demonstrates the effectiveness of UnitModule in enhancing
the input image and improving the perception ability of the detector for object
features.