TY - JOUR
T1 - Hierarchical Grafting Network With Structural Alignment for Ultra-High Resolution Image Segmentation
AU - Liu, Ting
AU - Yang, Jing
AU - Wei, Shikui
AU - Zhang, Yanning
N1 - Publisher Copyright: © 1999-2012 IEEE.
PY - 2025
Y1 - 2025
N2 - Ultra-high resolution (UHR) image segmentation is a challenging task that requires efficient processing of large images while maintaining high accuracy. Existing approaches usually employ both shallow and deep networks to extract high-resolution details and global context from inputs at different resolutions, balancing performance, memory, and speed. However, these methods still rely on preserving relatively high-resolution features within the deep network, which increases time and memory costs. This also indicates that the full potential of the high-resolution information from the shallow network remains underexplored. To address this, we propose a novel framework called the Hierarchical Grafting Network (HGN), wherein the shallow network is hierarchically grafted to the deep network from multiple perspectives, enabling comprehensive utilization of the features from the shallow network. Our framework involves carefully designed global structure aggregated grafting and local structure aligned grafting mechanisms, which progressively integrate semantic details and spatial structure from the shallow network into the deep network. In addition, to enhance the discriminative power of the high-resolution local features extracted by the shallow network, we introduce a shallow-deep contrastive loss that encourages the shallow network to learn features semantically similar to those of the deep network. Extensive experiments on several UHR image segmentation datasets demonstrate that our approach outperforms state-of-the-art UHR methods, with an overall improvement in memory efficiency, accuracy, and speed.
KW - Semantic segmentation
KW - structural alignment
KW - ultra-high resolution semantic segmentation
UR - https://www.scopus.com/pages/publications/105015073185
U2 - 10.1109/TMM.2025.3604913
DO - 10.1109/TMM.2025.3604913
M3 - Article
AN - SCOPUS:105015073185
SN - 1520-9210
VL - 27
SP - 8106
EP - 8117
JO - IEEE Transactions on Multimedia
JF - IEEE Transactions on Multimedia
ER -