SA-MixNet: Structure-Aware Mixup and Invariance Learning for Scribble-Supervised Road Extraction in Remote Sensing Images

Jie Feng; Hao Huang; Junpeng Zhang; Weisheng Dong; Dingwen Zhang; Licheng Jiao

doi:10.1109/TGRS.2024.3514839

School of Automation

Xidian University

Research output: Contribution to journal › Article › peer-review

Abstract

Mainstream weakly supervised road extractors rely on highly confident pseudo-labels propagated from scribbles, and their performance often degrades gradually as image scenes vary. We argue that this degradation stems from the model's poor invariance to scenes of differing complexity, whereas existing solutions to this problem commonly rely on crafted priors that cannot be derived from scribbles. To eliminate the reliance on such priors, we propose a novel structure-aware mixup and invariance learning framework (SA-MixNet) for weakly supervised road extraction that improves model invariance in a data-driven manner. Specifically, we design a structure-aware mixup (SA-Mix) scheme that pastes road regions from one image onto another to create an image scene with increased complexity while preserving the roads' structural integrity. Then, an invariance regularization is imposed on the predictions for the constructed and original images to minimize their conflicts, which forces the model to behave consistently across scenes. Moreover, a discriminator-based regularization is designed to enhance connectivity while preserving the structure of roads. Combining these designs, our framework demonstrates superior performance on the DeepGlobe, Wuhan, and Massachusetts datasets, outperforming state-of-the-art techniques by 1.47%, 2.12%, and 4.09% in IoU, respectively, and shows its potential as a plug-and-play solution. Our source code is available at https://github.com/xdu-jjgs.
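
The pasting and consistency ideas described in the abstract can be illustrated with a toy NumPy sketch. This is not the authors' implementation: the function names (`paste_roads`, `consistency_penalty`), the boolean-mask pasting rule, and the mean-squared-error penalty are all assumptions chosen for illustration, and the actual SA-Mix scheme and invariance loss may differ substantially.

```python
import numpy as np


def paste_roads(src_img, src_mask, dst_img):
    """Hypothetical helper: copy the pixels marked in src_mask from src_img
    onto dst_img, leaving every other pixel of dst_img unchanged."""
    mask = src_mask.astype(bool)[..., None]  # add a channel axis for broadcasting
    return np.where(mask, src_img, dst_img)


def consistency_penalty(pred_mixed, pred_src, pred_dst, src_mask):
    """Hypothetical penalty: mean squared gap between the prediction on the
    mixed image and the same mask-based composite of the two original
    predictions. Zero means the model behaved identically in both views."""
    mask = src_mask.astype(bool)
    target = np.where(mask, pred_src, pred_dst)
    return float(np.mean((pred_mixed - target) ** 2))


# Toy data: a 4x4 RGB source whose second row is a "road".
src_img = np.ones((4, 4, 3))
dst_img = np.zeros((4, 4, 3))
src_mask = np.zeros((4, 4), dtype=bool)
src_mask[1, :] = True

mixed = paste_roads(src_img, src_mask, dst_img)

# Stand-in road-probability maps (assumed values, not real model outputs).
pred_src = np.full((4, 4), 0.9)
pred_dst = np.full((4, 4), 0.1)
pred_mixed = np.where(src_mask, pred_src, pred_dst)

penalty = consistency_penalty(pred_mixed, pred_src, pred_dst, src_mask)
```

In this toy case the mixed-image prediction matches the composite exactly, so the penalty is zero; any disagreement between the two views would raise it.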

Original language	English
Article number	5602214
Journal	IEEE Transactions on Geoscience and Remote Sensing
Volume	63
DOIs	https://doi.org/10.1109/TGRS.2024.3514839
State	Published - 2025

Keywords

Adversarial learning
cross-view consistency
remote sensing
road extraction
weakly supervised learning

Access to Document

10.1109/TGRS.2024.3514839

Cite this

@article{805b326fc7c34c91bfefbdf5e76dc636,

title = "SA-MixNet: Structure-Aware Mixup and Invariance Learning for Scribble-Supervised Road Extraction in Remote Sensing Images",

abstract = "Mainstream weakly supervised road extractors rely on highly confident pseudo-labels propagated from scribbles, and their performance often degrades gradually as image scenes vary. We argue that this degradation stems from the model's poor invariance to scenes of differing complexity, whereas existing solutions to this problem commonly rely on crafted priors that cannot be derived from scribbles. To eliminate the reliance on such priors, we propose a novel structure-aware mixup and invariance learning framework (SA-MixNet) for weakly supervised road extraction that improves model invariance in a data-driven manner. Specifically, we design a structure-aware mixup (SA-Mix) scheme that pastes road regions from one image onto another to create an image scene with increased complexity while preserving the roads' structural integrity. Then, an invariance regularization is imposed on the predictions for the constructed and original images to minimize their conflicts, which forces the model to behave consistently across scenes. Moreover, a discriminator-based regularization is designed to enhance connectivity while preserving the structure of roads. Combining these designs, our framework demonstrates superior performance on the DeepGlobe, Wuhan, and Massachusetts datasets, outperforming state-of-the-art techniques by 1.47%, 2.12%, and 4.09% in IoU, respectively, and shows its potential as a plug-and-play solution. Our source code is available at https://github.com/xdu-jjgs.",

keywords = "Adversarial learning, cross-view consistency, remote sensing, road extraction, weakly supervised learning",

author = "Jie Feng and Hao Huang and Junpeng Zhang and Weisheng Dong and Dingwen Zhang and Licheng Jiao",

note = "Publisher Copyright: {\textcopyright} 1980-2012 IEEE.",

year = "2025",

doi = "10.1109/TGRS.2024.3514839",

language = "English",

volume = "63",

journal = "IEEE Transactions on Geoscience and Remote Sensing",

issn = "0196-2892",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

}

TY - JOUR

T1 - SA-MixNet

T2 - Structure-Aware Mixup and Invariance Learning for Scribble-Supervised Road Extraction in Remote Sensing Images

AU - Feng, Jie

AU - Huang, Hao

AU - Zhang, Junpeng

AU - Dong, Weisheng

AU - Zhang, Dingwen

AU - Jiao, Licheng

PY - 2025

Y1 - 2025

N2 - Mainstream weakly supervised road extractors rely on highly confident pseudo-labels propagated from scribbles, and their performance often degrades gradually as image scenes vary. We argue that this degradation stems from the model's poor invariance to scenes of differing complexity, whereas existing solutions to this problem commonly rely on crafted priors that cannot be derived from scribbles. To eliminate the reliance on such priors, we propose a novel structure-aware mixup and invariance learning framework (SA-MixNet) for weakly supervised road extraction that improves model invariance in a data-driven manner. Specifically, we design a structure-aware mixup (SA-Mix) scheme that pastes road regions from one image onto another to create an image scene with increased complexity while preserving the roads' structural integrity. Then, an invariance regularization is imposed on the predictions for the constructed and original images to minimize their conflicts, which forces the model to behave consistently across scenes. Moreover, a discriminator-based regularization is designed to enhance connectivity while preserving the structure of roads. Combining these designs, our framework demonstrates superior performance on the DeepGlobe, Wuhan, and Massachusetts datasets, outperforming state-of-the-art techniques by 1.47%, 2.12%, and 4.09% in IoU, respectively, and shows its potential as a plug-and-play solution. Our source code is available at https://github.com/xdu-jjgs.

AB - Mainstream weakly supervised road extractors rely on highly confident pseudo-labels propagated from scribbles, and their performance often degrades gradually as image scenes vary. We argue that this degradation stems from the model's poor invariance to scenes of differing complexity, whereas existing solutions to this problem commonly rely on crafted priors that cannot be derived from scribbles. To eliminate the reliance on such priors, we propose a novel structure-aware mixup and invariance learning framework (SA-MixNet) for weakly supervised road extraction that improves model invariance in a data-driven manner. Specifically, we design a structure-aware mixup (SA-Mix) scheme that pastes road regions from one image onto another to create an image scene with increased complexity while preserving the roads' structural integrity. Then, an invariance regularization is imposed on the predictions for the constructed and original images to minimize their conflicts, which forces the model to behave consistently across scenes. Moreover, a discriminator-based regularization is designed to enhance connectivity while preserving the structure of roads. Combining these designs, our framework demonstrates superior performance on the DeepGlobe, Wuhan, and Massachusetts datasets, outperforming state-of-the-art techniques by 1.47%, 2.12%, and 4.09% in IoU, respectively, and shows its potential as a plug-and-play solution. Our source code is available at https://github.com/xdu-jjgs.

KW - Adversarial learning

KW - cross-view consistency

KW - remote sensing

KW - road extraction

KW - weakly supervised learning

UR - http://www.scopus.com/inward/record.url?scp=85205056144&partnerID=8YFLogxK

U2 - 10.1109/TGRS.2024.3514839

DO - 10.1109/TGRS.2024.3514839

M3 - Article

AN - SCOPUS:85205056144

SN - 0196-2892

VL - 63

JO - IEEE Transactions on Geoscience and Remote Sensing

JF - IEEE Transactions on Geoscience and Remote Sensing

M1 - 5602214

ER -
