Rethinking Training Strategy in Stereo Matching

Zhibo Rao, Yuchao Dai, Zhelun Shen, Renjie He

Research output: Contribution to journal › Article › peer-review

28 Scopus citations

Abstract

In stereo matching, learning-based approaches have shown impressive performance on traditionally difficult cases across multiple datasets. Most of this progress, however, is obtained on a specific dataset with a dataset-specific network design, and the effect of training strategy on single-dataset and cross-dataset performance is often ignored. In this article, we analyze the relationship between training strategy and performance by retraining representative state-of-the-art methods, e.g., the geometry and context network (GC-Net), the pyramid stereo matching network (PSM-Net), and the guided aggregation network (GA-Net). Surprisingly, we find that pre-training and data augmentation significantly improve the performance of these networks on single and cross datasets without requiring any particular network structure. Based on this finding, we improve our previous non-local context attention network (NLCA-Net) to NLCA-Net v2, train it with the new strategy, and rethink the training strategy of stereo matching. The quantitative experiments demonstrate that: 1) our model reaches top performance on both the single dataset and multiple datasets with the same parameters, and it won 2nd place in the stereo task of the ECCV Robust Vision Challenge 2020 (RVC 2020); and 2) on small datasets (e.g., KITTI, ETH3D, and Middlebury), the model's generalization and robustness are significantly affected by pre-training and data augmentation, in some cases more than by the network structure. These observations challenge the conventional wisdom about network architectures at this stage. We expect these findings to encourage researchers to rethink the current paradigm of 'excessive attention on the performance of a single small dataset' in stereo matching.

Original language: English
Pages (from-to): 7796-7809
Number of pages: 14
Journal: IEEE Transactions on Neural Networks and Learning Systems
Volume: 34
Issue number: 10
State: Published - 1 Oct 2023

Keywords

  • Data augmentations
  • pre-training
  • robust vision challenge
  • stereo matching
