Side-Scan Sonar Underwater Target Detection: Combining the Diffusion Model With an Improved YOLOv7 Model

Xin Wen; Feihu Zhang; Chensheng Cheng; Xujia Hou; Guang Pan

doi:10.1109/JOE.2024.3379481

School of Marine Science and Technology

Northwestern Polytechnical University Xian

Research output: Contribution to journal › Article › peer-review

4 Scopus citations

Abstract

Side-scan sonar (SSS) plays a crucial role in underwater exploration, and autonomous analysis of SSS images is vital for detecting unknown targets in underwater environments. However, the complexity of the underwater environment, the few highlighted areas of the target, blurred feature details, and the difficulty of collecting SSS data make high-precision autonomous target recognition in SSS images challenging. This article addresses the problem by improving the You Only Look Once v7 (YOLOv7) model to achieve high-precision object detection in SSS images. First, because the SSS images obtained from real experiments contain few pictures of the detection target, we enhance and enlarge real and experimental images using the denoising-diffusion model to establish a self-made SSS image data set. Since SSS images have large areas without targets, this article introduces a vision transformer (ViT) for dynamic attention and global modeling, which increases the weight the model assigns to the target region. Second, the convolutional block attention module is adopted to further improve feature expression and reduce floating-point operations. Finally, this article uses Scylla-Intersection over Union (SIoU) as the loss function to increase the accuracy of the model's inference. Experiments on the SSS image data set show that the improved YOLOv7 model outperforms the other methods compared, with mean average precision (mAP0.5 and mAP0.5:0.95) of 78.00% and 48.11%, respectively. These results are 3.47% and 2.9% higher than those of the YOLOv7 model. The improved YOLOv7 algorithm proposed in this article has great potential for object detection and recognition in SSS images.
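For readers unfamiliar with the metrics named in the abstract, the following is a minimal sketch of plain intersection over union (IoU), the box-overlap measure that underlies both the mAP0.5 matching threshold and IoU-based losses. It is not the SIoU loss used in the article, and it is not taken from the authors' code. The box format `(x_min, y_min, x_max, y_max)` and the names `iou` and `is_match` are assumptions made for illustration.

```python
from typing import Tuple

# Assumed box format for this sketch: (x_min, y_min, x_max, y_max)
Box = Tuple[float, float, float, float]


def iou(box_a: Box, box_b: Box) -> float:
    """Plain intersection over union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle (if any)
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Clamp to zero when the boxes do not overlap
    inter_w = max(0.0, ix_max - ix_min)
    inter_h = max(0.0, iy_max - iy_min)
    inter = inter_w * inter_h

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter

    # Guard against degenerate (zero-area) boxes
    return inter / union if union > 0 else 0.0


def is_match(pred: Box, truth: Box, threshold: float = 0.5) -> bool:
    """Hypothetical helper: does a prediction reach the IoU threshold?"""
    return iou(pred, truth) >= threshold
```

Under the usual convention, mAP0.5 counts a detection as correct when its IoU with a ground-truth box reaches 0.5, while mAP0.5:0.95 averases the same measurement over thresholds from 0.5 to 0.95. A call such as `is_match((0, 0, 2, 2), (0, 0, 2, 1))` checks a single prediction against a single ground-truth box at the default threshold.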

Original language	English
Pages (from-to)	976-991
Number of pages	16
Journal	IEEE Journal of Oceanic Engineering
Volume	49
Issue number	3
DOIs	https://doi.org/10.1109/JOE.2024.3379481
State	Published - 2024

Keywords

Attention
diffusion
Scylla-Intersection over Union (SIoU)
side-scan sonar (SSS)
You Only Look Once v7 (YOLOv7)

Cite this

@article{27e915da02074ec9a6834cb4775ebc96,

title = "Side-Scan Sonar Underwater Target Detection: Combining the Diffusion Model With an Improved YOLOv7 Model",

abstract = "Side-scan sonar (SSS) plays a crucial role in underwater exploration, and autonomous analysis of SSS images is vital for detecting unknown targets in underwater environments. However, the complexity of the underwater environment, the few highlighted areas of the target, blurred feature details, and the difficulty of collecting SSS data make high-precision autonomous target recognition in SSS images challenging. This article addresses the problem by improving the You Only Look Once v7 (YOLOv7) model to achieve high-precision object detection in SSS images. First, because the SSS images obtained from real experiments contain few pictures of the detection target, we enhance and enlarge real and experimental images using the denoising-diffusion model to establish a self-made SSS image data set. Since SSS images have large areas without targets, this article introduces a vision transformer (ViT) for dynamic attention and global modeling, which increases the weight the model assigns to the target region. Second, the convolutional block attention module is adopted to further improve feature expression and reduce floating-point operations. Finally, this article uses Scylla-Intersection over Union (SIoU) as the loss function to increase the accuracy of the model's inference. Experiments on the SSS image data set show that the improved YOLOv7 model outperforms the other methods compared, with mean average precision (mAP0.5 and mAP0.5:0.95) of 78.00% and 48.11%, respectively. These results are 3.47% and 2.9% higher than those of the YOLOv7 model. The improved YOLOv7 algorithm proposed in this article has great potential for object detection and recognition in SSS images.",

keywords = "Attention, diffusion, Scylla-Intersection over Union (SIoU), side-scan sonar (SSS), You Only Look Once v7 (YOLOv7)",

author = "Xin Wen and Feihu Zhang and Chensheng Cheng and Xujia Hou and Guang Pan",

note = "Publisher Copyright: {\textcopyright} 2024 IEEE.",

year = "2024",

doi = "10.1109/JOE.2024.3379481",

language = "English",

volume = "49",

pages = "976--991",

journal = "IEEE Journal of Oceanic Engineering",

issn = "0364-9059",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

number = "3",

}

TY - JOUR

T1 - Side-Scan Sonar Underwater Target Detection

T2 - Combining the Diffusion Model With an Improved YOLOv7 Model

AU - Wen, Xin

AU - Zhang, Feihu

AU - Cheng, Chensheng

AU - Hou, Xujia

AU - Pan, Guang

PY - 2024

Y1 - 2024

AB - Side-scan sonar (SSS) plays a crucial role in underwater exploration, and autonomous analysis of SSS images is vital for detecting unknown targets in underwater environments. However, the complexity of the underwater environment, the few highlighted areas of the target, blurred feature details, and the difficulty of collecting SSS data make high-precision autonomous target recognition in SSS images challenging. This article addresses the problem by improving the You Only Look Once v7 (YOLOv7) model to achieve high-precision object detection in SSS images. First, because the SSS images obtained from real experiments contain few pictures of the detection target, we enhance and enlarge real and experimental images using the denoising-diffusion model to establish a self-made SSS image data set. Since SSS images have large areas without targets, this article introduces a vision transformer (ViT) for dynamic attention and global modeling, which increases the weight the model assigns to the target region. Second, the convolutional block attention module is adopted to further improve feature expression and reduce floating-point operations. Finally, this article uses Scylla-Intersection over Union (SIoU) as the loss function to increase the accuracy of the model's inference. Experiments on the SSS image data set show that the improved YOLOv7 model outperforms the other methods compared, with mean average precision (mAP0.5 and mAP0.5:0.95) of 78.00% and 48.11%, respectively. These results are 3.47% and 2.9% higher than those of the YOLOv7 model. The improved YOLOv7 algorithm proposed in this article has great potential for object detection and recognition in SSS images.

KW - Attention

KW - diffusion

KW - Scylla-Intersection over Union (SIoU)

KW - side-scan sonar (SSS)

KW - You Only Look Once v7 (YOLOv7)

UR - http://www.scopus.com/inward/record.url?scp=85194069680&partnerID=8YFLogxK

U2 - 10.1109/JOE.2024.3379481

DO - 10.1109/JOE.2024.3379481

M3 - Article

AN - SCOPUS:85194069680

SN - 0364-9059

VL - 49

SP - 976

EP - 991

JO - IEEE Journal of Oceanic Engineering

JF - IEEE Journal of Oceanic Engineering

IS - 3

ER -
