Constraining multimodal distribution for domain adaptation in stereo matching

Zhelun Shen; Zhuo Li; Chenming Wu; Zhibo Rao; Lina Liu; Yuchao Dai; Liangjun Zhang

doi:10.1016/j.patcog.2025.111727

School of Electronics and Information

Research output: Contribution to journal › Article › peer-review

Abstract

Recently, learning-based stereo matching methods have achieved great improvement on public benchmarks, where soft argmin and smooth L1 loss are core contributors to their success. However, in unsupervised domain adaptation scenarios, we observe that these two operations often yield multimodal disparity probability distributions in target domains, resulting in degraded generalization. In this paper, we propose a novel approach, Constrain Multi-modal Distribution (CMD), to address this issue. Specifically, we introduce uncertainty-regularized minimization and anisotropic soft argmin to encourage the network to produce predominantly unimodal disparity distributions in the target domain, thereby improving prediction accuracy. Experimentally, we apply the proposed method to multiple representative stereo-matching networks and conduct domain adaptation from synthetic data to unlabeled real-world scenes. Results consistently demonstrate improved generalization in both top-performing and domain-adaptable stereo-matching models. The code for CMD will be available at: https://github.com/gallenszl/CMD.

Original language	English
Article number	111727
Journal	Pattern Recognition
Volume	167
DOI	https://doi.org/10.1016/j.patcog.2025.111727
Publication status	Published - Nov 2025

Cite this

@article{2a3b955133874634870c863891df97f3,

title = "Constraining multimodal distribution for domain adaptation in stereo matching",

abstract = "Recently, learning-based stereo matching methods have achieved great improvement in public benchmarks, where soft argmin and smooth L1 loss play core contributions to its success. However, in unsupervised domain adaptation scenarios, we observe that these two operations often yield multimodal disparity probability distributions in target domains, resulting in degraded generalization. In this paper, we propose a novel approach, Constrain Multi-modal Distribution (CMD), to address this issue. Specifically, we introduce uncertainty-regularized minimization and anisotropic soft argmin to encourage the network to produce predominantly unimodal disparity distributions in the target domain, thereby improving prediction accuracy. Experimentally, we apply the proposed method to multiple representative stereo-matching networks and conduct domain adaptation from synthetic data to unlabeled real-world scenes. Results consistently demonstrate improved generalization in both top-performing and domain-adaptable stereo-matching models. The code for CMD will be available at: https://github.com/gallenszl/CMD.",

keywords = "Domain adaptation, Stereo matching, Unimodal distribution",

author = "Zhelun Shen and Zhuo Li and Chenming Wu and Zhibo Rao and Lina Liu and Yuchao Dai and Liangjun Zhang",

note = "Publisher Copyright: {\textcopyright} 2025",

year = "2025",

month = nov,

doi = "10.1016/j.patcog.2025.111727",

language = "English",

volume = "167",

journal = "Pattern Recognition",

issn = "0031-3203",

publisher = "Elsevier Ltd",

}

TY - JOUR

T1 - Constraining multimodal distribution for domain adaptation in stereo matching

AU - Shen, Zhelun

AU - Li, Zhuo

AU - Wu, Chenming

AU - Rao, Zhibo

AU - Liu, Lina

AU - Dai, Yuchao

AU - Zhang, Liangjun

PY - 2025/11

Y1 - 2025/11

N2 - Recently, learning-based stereo matching methods have achieved great improvement in public benchmarks, where soft argmin and smooth L1 loss play core contributions to its success. However, in unsupervised domain adaptation scenarios, we observe that these two operations often yield multimodal disparity probability distributions in target domains, resulting in degraded generalization. In this paper, we propose a novel approach, Constrain Multi-modal Distribution (CMD), to address this issue. Specifically, we introduce uncertainty-regularized minimization and anisotropic soft argmin to encourage the network to produce predominantly unimodal disparity distributions in the target domain, thereby improving prediction accuracy. Experimentally, we apply the proposed method to multiple representative stereo-matching networks and conduct domain adaptation from synthetic data to unlabeled real-world scenes. Results consistently demonstrate improved generalization in both top-performing and domain-adaptable stereo-matching models. The code for CMD will be available at: https://github.com/gallenszl/CMD.

AB - Recently, learning-based stereo matching methods have achieved great improvement in public benchmarks, where soft argmin and smooth L1 loss play core contributions to its success. However, in unsupervised domain adaptation scenarios, we observe that these two operations often yield multimodal disparity probability distributions in target domains, resulting in degraded generalization. In this paper, we propose a novel approach, Constrain Multi-modal Distribution (CMD), to address this issue. Specifically, we introduce uncertainty-regularized minimization and anisotropic soft argmin to encourage the network to produce predominantly unimodal disparity distributions in the target domain, thereby improving prediction accuracy. Experimentally, we apply the proposed method to multiple representative stereo-matching networks and conduct domain adaptation from synthetic data to unlabeled real-world scenes. Results consistently demonstrate improved generalization in both top-performing and domain-adaptable stereo-matching models. The code for CMD will be available at: https://github.com/gallenszl/CMD.

KW - Domain adaptation

KW - Stereo matching

KW - Unimodal distribution

UR - http://www.scopus.com/inward/record.url?scp=105004264827&partnerID=8YFLogxK

U2 - 10.1016/j.patcog.2025.111727

DO - 10.1016/j.patcog.2025.111727

M3 - Article

AN - SCOPUS:105004264827

SN - 0031-3203

VL - 167

JO - Pattern Recognition

JF - Pattern Recognition

M1 - 111727

ER -
