Self-Supervised Learning for Rolling Shutter Temporal Super-Resolution

Bin Fan, Ying Guo, Yuchao Dai, Chao Xu, Boxin Shi

Research output: Contribution to journal › Article › peer-review

1 Scopus citation

Abstract

Most cameras on portable devices adopt a rolling shutter (RS) mechanism, which encodes sufficient temporal dynamic information through sequential readouts. This advantage can be exploited to recover a temporal sequence of latent global shutter (GS) images. Existing methods rely on fully supervised learning, which requires specialized optical devices to collect paired RS-GS images as ground truth and is too costly to scale. In this paper, we propose the first self-supervised learning framework that produces a high frame rate GS video from two consecutive RS images, unleashing the potential of RS cameras. Specifically, we first develop a unified warping model of RS2GS and GS2RS, enabling the complementary conversions of RS2GS and GS2RS to be incorporated into a uniform network model. Then, based on the cycle consistency constraint, given a triplet of consecutive RS frames, we minimize the discrepancy between the input middle RS frame and its cycle reconstruction, which is generated by interpolating back from the two predicted intermediate GS frames. Experiments on various benchmarks show that our approach achieves comparable or better performance than state-of-the-art supervised methods while enjoying stronger generalization capabilities. Moreover, our approach makes it possible to recover smooth and distortion-free videos from two adjacent RS frames in the real-world BS-RSC dataset, surpassing prior limitations.
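
As a rough illustration of the cycle-consistency idea in the abstract, the toy sketch below composes an RS frame from a stack of GS frames by taking each row from the GS frame closest in time to that row's readout, then measures an L1 discrepancy against an input RS frame. This is not the paper's method: the paper uses a learned warping model, whereas the function names `gs_to_rs` and `cycle_l1`, the nearest-in-time row sampling, and the normalized readout interval [0, 1] are all assumptions made here for illustration only.

```python
import numpy as np


def gs_to_rs(gs_frames: np.ndarray, times: np.ndarray) -> np.ndarray:
    """Toy GS2RS conversion (hypothetical helper, not the paper's model).

    gs_frames: array of shape (N, H, W), GS frames captured at `times`.
    times:     array of shape (N,), timestamps normalized so that the RS
               readout of one frame spans [0, 1] (an assumption).
    Returns an (H, W) RS frame whose row y is copied from the GS frame
    nearest in time to the readout time of row y.
    """
    n, h, w = gs_frames.shape
    # Assumed readout model: rows are read top to bottom at a uniform rate.
    row_times = np.linspace(0.0, 1.0, h)
    # For every row, index of the GS frame with the closest timestamp.
    nearest = np.abs(row_times[:, None] - times[None, :]).argmin(axis=1)
    # Pick row y from frame nearest[y].
    return gs_frames[nearest, np.arange(h), :]


def cycle_l1(rs_input: np.ndarray, rs_reconstructed: np.ndarray) -> float:
    """Mean absolute difference between an RS frame and its reconstruction."""
    return float(np.mean(np.abs(rs_input - rs_reconstructed)))


# Tiny demo: 4 GS frames of size 4x3, where frame i is filled with value i.
gs = np.stack([np.full((4, 3), float(i)) for i in range(4)])
t = np.linspace(0.0, 1.0, 4)
rs = gs_to_rs(gs, t)           # row y comes from GS frame y
loss = cycle_l1(rs, rs)        # identical frames give zero discrepancy
```

In a self-supervised setup of the kind the abstract describes, the GS frames would be network predictions and the discrepancy would be minimized with respect to the network parameters; the NumPy version above only shows the data flow.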

Original language: English
Pages (from-to): 769-782
Number of pages: 14
Journal: IEEE Transactions on Circuits and Systems for Video Technology
Volume: 35
Issue number: 1
State: Published - 2025

Keywords

  • cycle consistency
  • rolling shutter
  • self-supervised learning
  • temporal super-resolution
