TY - JOUR
T1 - SA-MixNet
T2 - Structure-Aware Mixup and Invariance Learning for Scribble-Supervised Road Extraction in Remote Sensing Images
AU - Feng, Jie
AU - Huang, Hao
AU - Zhang, Junpeng
AU - Dong, Weisheng
AU - Zhang, Dingwen
AU - Jiao, Licheng
N1 - Publisher Copyright: © 1980-2012 IEEE.
PY - 2025
Y1 - 2025
AB - Mainstream weakly supervised road extractors rely on highly confident pseudo-labels propagated from scribbles, and their performance often degrades gradually as image scenes vary. We argue that this degradation is due to the model's poor invariance to scenes of different complexities, whereas existing solutions to this problem are commonly based on crafted priors that cannot be derived from scribbles. To eliminate the reliance on such priors, we propose a novel structure-aware mixup and invariance learning framework (SA-MixNet) for weakly supervised road extraction that improves model invariance in a data-driven manner. Specifically, we design a structure-aware mixup (SA-Mix) scheme that pastes road regions from one image onto another to create an image scene of increased complexity while preserving the roads' structural integrity. Then, an invariance regularization is imposed on the predictions for the constructed and original images to minimize their conflicts, which forces the model to behave consistently across scenes. Moreover, a discriminator-based regularization is designed to enhance connectivity while preserving the structure of roads. Combining these designs, our framework demonstrates superior performance on the DeepGlobe, Wuhan, and Massachusetts datasets, outperforming state-of-the-art techniques by 1.47%, 2.12%, and 4.09% in IoU, respectively, and showing its potential as a plug-and-play solution. Our source code is available at https://github.com/xdu-jjgs.
KW - Adversarial learning
KW - cross-view consistency
KW - remote sensing
KW - road extraction
KW - weakly supervised learning
UR - http://www.scopus.com/inward/record.url?scp=85205056144&partnerID=8YFLogxK
U2 - 10.1109/TGRS.2024.3514839
DO - 10.1109/TGRS.2024.3514839
M3 - Article
AN - SCOPUS:85205056144
SN - 0196-2892
VL - 63
JO - IEEE Transactions on Geoscience and Remote Sensing
JF - IEEE Transactions on Geoscience and Remote Sensing
M1 - 5602214
ER -