TY - JOUR
T1 - MFSonar
T2 - Multiscale Frequency Domain Contextual Denoising for Forward-Looking Sonar Image Semantic Segmentation
AU - Li, Jiayuan
AU - Wang, Zhen
AU - Yuan, Shen Ao
AU - You, Zhuhong
N1 - Publisher Copyright: © 2025 IEEE. All rights reserved.
PY - 2025
Y1 - 2025
N2 - Semantic segmentation of forward-looking sonar (FLS) images is crucial for enhancing situational awareness in marine environments. However, FLS images are often degraded by environmental noise, similarity noise, and shading effects, which result in low resolution, poor signal-to-noise ratio, and suboptimal image quality. These issues significantly hinder the accuracy of semantic segmentation in FLS images. To address these challenges, we propose a novel method called MFSonar, which is based on the Transformer-Mamba architecture. MFSonar incorporates a Context Channel Denoising Module that exploits the similarity characteristics of local and global features to effectively suppress similarity noise and enhance target features. Additionally, the Multiscale Frequency-Domain Decoding Module integrates multiscale frequency-domain convolution with visual state space blocks, leveraging frequency-domain characteristics to mitigate environmental noise and occlusion shadows. Furthermore, our approach prioritizes local features before global features to achieve effective fusion and enhancement of global semantic features and multiscale local visual information. Extensive comparative experiments across multiple datasets demonstrate that MFSonar achieves state-of-the-art performance. Moreover, ablation studies and visual comparisons on a primary dataset validate the superiority, effectiveness, and uniqueness of our approach.
KW - contextual channel denoising
KW - Forward-looking sonar (FLS)
KW - multiscale frequency domain
KW - semantic segmentation
UR - http://www.scopus.com/inward/record.url?scp=86000322545&partnerID=8YFLogxK
U2 - 10.1109/JSEN.2025.3545146
DO - 10.1109/JSEN.2025.3545146
M3 - Article
AN - SCOPUS:86000322545
SN - 1530-437X
JO - IEEE Sensors Journal
JF - IEEE Sensors Journal
ER -