Abstract
Forward-looking sonar (FLS) image segmentation is essential for underwater exploration, but it remains challenging because of low contrast, ambient noise, and complex backgrounds, which neither traditional nor deep learning-based methods address effectively. This paper presents TriEncoderNet, a novel model that simultaneously extracts local, global, and edge-related features through three parallel encoders. Specifically, the model integrates a convolutional neural network (CNN) for local feature extraction, a transformer for global context modeling, and a histogram of oriented gradients (HOG) encoder for edge and shape detection. The key innovations of TriEncoderNet include the CrossFusionTransformer (CFT) module, which integrates local and global features to capture both fine details and comprehensive context, and the HOG attention gate (HAG) module, which enhances edge detection and preserves semantic consistency across diverse feature types. Additionally, TriEncoderNet introduces the hierarchical efficient transformer (HETransformer), which uses a lightweight multi-head self-attention mechanism to reduce computational overhead while maintaining global context modeling capability. Experimental results on the marine debris dataset and the UATD dataset demonstrate the superior performance of TriEncoderNet. Specifically, it achieves an mIoU of 0.793 and an mAP of 0.916 on the marine debris dataset, and an mIoU of 0.582 and an mAP of 0.687 on the UATD dataset, outperforming state-of-the-art methods in both segmentation accuracy and robustness in challenging underwater environments.
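To make the reported mIoU figures easier to interpret, here is a minimal sketch of how mean intersection over union is commonly computed for label masks. It is not the paper's evaluation code: the function name `mean_iou`, the toy masks, and the choice to skip classes absent from both masks are assumptions made for illustration, and the abstract does not state the exact protocol used.

```python
from typing import Sequence


def mean_iou(
    pred: Sequence[Sequence[int]],
    target: Sequence[Sequence[int]],
    num_classes: int,
) -> float:
    """Mean IoU over the classes that occur in the prediction or the ground truth."""
    intersection = [0] * num_classes
    pred_count = [0] * num_classes
    target_count = [0] * num_classes

    # Count per-class pixels and per-class agreements in a single pass.
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            pred_count[p] += 1
            target_count[t] += 1
            if p == t:
                intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # assumption: classes absent from both masks are skipped
            ious.append(intersection[c] / union)

    return sum(ious) / len(ious) if ious else 0.0


# Hypothetical 2x3 label masks with three classes (0, 1, 2).
pred = [[0, 1, 1], [0, 0, 2]]
target = [[0, 1, 2], [0, 1, 2]]
score = mean_iou(pred, target, num_classes=3)
print(score)
```

For these toy masks the per-class IoUs are 2/3, 1/3, and 1/2, so the mean is 0.5. A score such as 0.793 therefore means that, averaged over classes, the predicted and ground-truth regions overlap on roughly 79% of their combined area.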
| Original language | English |
|---|---|
| Article number | 2295 |
| Journal | Journal of Marine Science and Engineering |
| Volume | 13 |
| Issue | 12 |
| DOI | |
| Publication status | Published - Dec 2025 |
UN Sustainable Development Goals
This output contributes to the following Sustainable Development Goals:
- SDG 14: Life Below Water
Title: TriEncoderNet: Multi-Stage Fusion of CNN, Transformer, and HOG Features for Forward-Looking Sonar Image Segmentation