TY - JOUR
T1 - S2Net
T2 - A Multitask Learning Network for Semantic Stereo of Satellite Image Pairs
AU - Liao, Puyun
AU - Zhang, Xiaodong
AU - Chen, Guanzhou
AU - Wang, Tong
AU - Li, Xianwei
AU - Yang, Haobo
AU - Zhou, Wenlin
AU - He, Chanjuan
AU - Wang, Qing
N1 - Publisher Copyright:
© 1980-2012 IEEE.
PY - 2024
Y1 - 2024
N2 - Stereo matching and semantic segmentation are two significant tasks in remote sensing. Recently, deep learning approaches have been applied to these tasks separately. However, the lack of semantic supervision makes the training of stereo matching models susceptible to data disturbance, resulting in inferior generalization ability; foreground objects are sometimes confused with background pixels in RGB images, limiting classification accuracy. By exploring the relationship between these two tasks, semantic stereo solves these problems simultaneously with multitask learning. Previous methods treated semantic stereo as two parallel tasks, so they did not take full advantage of the additional information from both tasks and obtained only slight improvement. In this work, we designed a multitask learning framework, the semantic stereo network (S2Net). The proposed network generates cost volumes from feature maps supervised by semantic information to estimate disparity maps and fuses RGB-D feature maps to predict classification maps, thereby combining multitask learning information. To enhance the performance of the trained model, we also considered the continuity of disparity values and the duality of stereo image pairs in data augmentation. When applied to datasets without training, S2Net obtained a 2.937% D1-Error on the WHU dataset, lower than the 4.297% of the previous best method, demonstrating the generalization improvement from semantic supervision. In terms of semantic segmentation, the introduction of disparity maps increases the mean intersection over union (mIoU) from 61.375% to 69.096% on the US3D dataset. Experiments on the KITTI semantics benchmark show that our proposed method obtains 60.76% mIoU, achieving state-of-the-art performance among multitask learning methods.
AB - Stereo matching and semantic segmentation are two significant tasks in remote sensing. Recently, deep learning approaches have been applied to these tasks separately. However, the lack of semantic supervision makes the training of stereo matching models susceptible to data disturbance, resulting in inferior generalization ability; foreground objects are sometimes confused with background pixels in RGB images, limiting classification accuracy. By exploring the relationship between these two tasks, semantic stereo solves these problems simultaneously with multitask learning. Previous methods treated semantic stereo as two parallel tasks, so they did not take full advantage of the additional information from both tasks and obtained only slight improvement. In this work, we designed a multitask learning framework, the semantic stereo network (S2Net). The proposed network generates cost volumes from feature maps supervised by semantic information to estimate disparity maps and fuses RGB-D feature maps to predict classification maps, thereby combining multitask learning information. To enhance the performance of the trained model, we also considered the continuity of disparity values and the duality of stereo image pairs in data augmentation. When applied to datasets without training, S2Net obtained a 2.937% D1-Error on the WHU dataset, lower than the 4.297% of the previous best method, demonstrating the generalization improvement from semantic supervision. In terms of semantic segmentation, the introduction of disparity maps increases the mean intersection over union (mIoU) from 61.375% to 69.096% on the US3D dataset. Experiments on the KITTI semantics benchmark show that our proposed method obtains 60.76% mIoU, achieving state-of-the-art performance among multitask learning methods.
KW - Convolutional neural network
KW - multitask learning
KW - semantic segmentation
KW - stereo image pairs
KW - stereo matching
UR - http://www.scopus.com/inward/record.url?scp=85179063378&partnerID=8YFLogxK
U2 - 10.1109/TGRS.2023.3335997
DO - 10.1109/TGRS.2023.3335997
M3 - Article
AN - SCOPUS:85179063378
SN - 0196-2892
VL - 62
SP - 1
EP - 13
JO - IEEE Transactions on Geoscience and Remote Sensing
JF - IEEE Transactions on Geoscience and Remote Sensing
M1 - 5601313
ER -