
TriEncoderNet: Multi-Stage Fusion of CNN, Transformer, and HOG Features for Forward-Looking Sonar Image Segmentation

  • Jie Liu
  • Yan Dong
  • Guofang Chen
  • Yimin Chen
  • Jian Gao
  • Fubin Zhang
  • Northwestern Polytechnical University Xian

Research output: Contribution to journal › Article › Peer-reviewed

Abstract

Forward-looking sonar (FLS) image segmentation is essential for underwater exploration, yet it remains challenging because of low contrast, ambient noise, and complex backgrounds, which neither traditional nor deep learning-based methods address effectively. This paper presents TriEncoderNet, a novel model that simultaneously extracts local, global, and edge-related features through three parallel encoders. Specifically, the model integrates a convolutional neural network (CNN) for local feature extraction, a transformer for global context modeling, and a histogram of oriented gradients (HOG) encoder for edge and shape detection. The key innovations of TriEncoderNet include the CrossFusionTransformer (CFT) module, which integrates local and global features to capture both fine details and comprehensive context, and the HOG attention gate (HAG) module, which enhances edge detection and preserves semantic consistency across diverse feature types. Additionally, TriEncoderNet introduces the hierarchical efficient transformer (HETransformer), whose lightweight multi-head self-attention mechanism reduces computational overhead while maintaining global context modeling capability. Experiments on the marine debris dataset and the UATD dataset show that TriEncoderNet achieves an mIoU of 0.793 and an mAP of 0.916 on the marine debris dataset, and an mIoU of 0.582 and an mAP of 0.687 on the UATD dataset, outperforming state-of-the-art methods in both segmentation accuracy and robustness in challenging underwater environments.
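For readers unfamiliar with the mIoU figures quoted in the abstract, the following is a minimal sketch of how mean intersection-over-union is commonly computed for a predicted label map against a ground-truth label map. The function name `mean_iou`, the toy 3x3 label maps, and the convention of skipping classes absent from both maps are assumptions made for illustration; the paper's exact evaluation protocol is not described in this record.

```python
from typing import Sequence


def mean_iou(pred: Sequence[Sequence[int]],
             target: Sequence[Sequence[int]],
             num_classes: int) -> float:
    """Mean intersection-over-union across classes for two label maps.

    Classes absent from both prediction and ground truth are skipped so
    they do not distort the average (one common convention, assumed here).
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p == t:
                # Pixel counts once toward both intersection and union.
                intersection[p] += 1
                union[p] += 1
            else:
                # Mismatch enlarges the union of both classes involved.
                union[p] += 1
                union[t] += 1
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


# Toy 3x3 label maps: 0 = background, 1 = object (hypothetical data).
target = [[0, 0, 1],
          [0, 1, 1],
          [0, 0, 1]]
pred = [[0, 0, 1],
        [0, 0, 1],
        [0, 0, 1]]
score = mean_iou(pred, target, num_classes=2)
print(round(score, 3))  # background IoU 5/6, object IoU 3/4, mean 0.792
```

In this toy case a single missed object pixel lowers the object IoU to 3/4 and the background IoU to 5/6, which illustrates why mIoU is sensitive to errors on small foreground regions such as sonar targets.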

Original language: English
Article number: 2295
Journal: Journal of Marine Science and Engineering
Volume: 13
Issue number: 12
DOI
Publication status: Published - Dec 2025

UN Sustainable Development Goals

This output contributes to the following UN Sustainable Development Goals (SDGs):

  1. SDG 14 - Life Below Water
