Inter-view dual-domain guided stable diffusion for real-world stereo image super-resolution

Research output: Contribution to journal · Article · peer-review

Abstract

Pre-trained text-to-image diffusion models, owing to their powerful generative ability, have recently shown promising potential for restoring texture details lost in low-resolution images. However, directly applying these models to real-world stereo image super-resolution (Real-SSR) neglects the essential consistency between the left and right views, resulting in inconsistent stereo content and amplified visual artifacts. To address this problem, we propose a Complementary Semantic-aware Inter-view Dual-domain Guided Stable Diffusion (CSID-Diff) network for Real-SSR, which leverages complementary texture, structural, and semantic information from low-resolution stereo images to guide the diffusion model toward high-quality results with inter-view consistency. Specifically, we propose a Dual-domain Guided ControlNet that establishes complementary interactions in both the image-feature domain and the disparity domain, and fuses the dual-domain features to enforce inter-view texture and structural consistency. To further address viewpoint-induced discrepancies in semantic information between the left and right images, we introduce a Complementary Semantic Feature Extraction Module (CSFEM) that enforces inter-view semantic consistency. Extensive experiments demonstrate that our approach delivers superior stereo image reconstruction, achieving both high quality and inter-view consistency and outperforming state-of-the-art methods on synthetic and real-world datasets.
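The abstract does not give implementation details, but the dual-domain idea can be illustrated with a toy sketch. This is only a conceptual stand-in, assuming plain NumPy feature maps, a simple horizontal cross-view attention for the disparity-domain branch, and hypothetical 1×1-projection weights `w_img` and `w_disp` for the fusion step; the paper's actual ControlNet operates on learned deep features inside a diffusion backbone.

```python
import numpy as np

def cross_view_attention(feat_left, feat_right, max_disp):
    """Toy disparity-domain branch: for each left-view pixel, attend over
    right-view pixels shifted along the horizontal epipolar line."""
    C, H, W = feat_left.shape
    scores = np.full((max_disp, H, W), -np.inf)
    for d in range(max_disp):
        # Right view shifted right by disparity d aligns with the left view.
        scores[d, :, d:] = np.einsum(
            "chw,chw->hw", feat_left[:, :, d:], feat_right[:, :, : W - d]
        ) / np.sqrt(C)
    # Softmax over candidate disparities (invalid shifts have -inf score).
    weights = np.exp(scores - scores.max(0, keepdims=True))
    weights /= weights.sum(0, keepdims=True)
    # Aggregate right-view features into left-view coordinates.
    warped = np.zeros_like(feat_left)
    for d in range(max_disp):
        warped[:, :, d:] += weights[d, :, d:] * feat_right[:, :, : W - d]
    return warped

def dual_domain_fusion(feat_left, feat_right, w_img, w_disp, max_disp=8):
    """Fuse the image-domain branch (per-view features) with the
    disparity-domain branch (cross-view aggregated features)."""
    warped = cross_view_attention(feat_left, feat_right, max_disp)
    return (np.einsum("oc,chw->ohw", w_img, feat_left)
            + np.einsum("oc,chw->ohw", w_disp, warped))
```

The fused map would then condition the diffusion model (e.g. as a ControlNet guidance signal), so that textures generated for one view stay consistent with structure observed in the other.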

Original language: English
Article number: 129201
Journal: Expert Systems with Applications
Volume: 296
DOIs
State: Published - 15 Jan 2026

Keywords

  • ControlNet
  • Stable diffusion
  • Stereo image super-resolution
