TY - GEN
T1 - Joint Appearance and Motion Learning for Efficient Rolling Shutter Correction
AU - Fan, Bin
AU - Mao, Yuxin
AU - Dai, Yuchao
AU - Wan, Zhexiong
AU - Liu, Qi
N1 - Publisher Copyright:
© 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - Rolling shutter correction (RSC) is becoming increasingly popular for RS cameras that are widely used in commercial and industrial applications. Despite the promising performance, existing RSC methods typically employ a two-stage network structure that ignores intrinsic information interactions and hinders fast inference. In this paper, we propose a single-stage encoder-decoder-based network, named JAMNet, for efficient RSC. It first extracts pyramid features from consecutive RS inputs, and then simultaneously refines two complementary pieces of information (i.e., the global shutter appearance and the undistortion motion field) to achieve mutual promotion in a joint learning decoder. To inject sufficient motion cues for guiding joint learning, we introduce a transformer-based motion embedding module and propose to pass hidden states across pyramid levels. Moreover, we present a new data augmentation strategy 'vertical flip + inverse order' to unlock the potential of the RSC datasets. Experiments on various benchmarks show that our approach surpasses the state-of-the-art methods by a large margin, especially with a 4.7 dB PSNR leap on real-world RSC. Code is available at https://github.com/GitCVfb/JAMNet.
AB - Rolling shutter correction (RSC) is becoming increasingly popular for RS cameras that are widely used in commercial and industrial applications. Despite the promising performance, existing RSC methods typically employ a two-stage network structure that ignores intrinsic information interactions and hinders fast inference. In this paper, we propose a single-stage encoder-decoder-based network, named JAMNet, for efficient RSC. It first extracts pyramid features from consecutive RS inputs, and then simultaneously refines two complementary pieces of information (i.e., the global shutter appearance and the undistortion motion field) to achieve mutual promotion in a joint learning decoder. To inject sufficient motion cues for guiding joint learning, we introduce a transformer-based motion embedding module and propose to pass hidden states across pyramid levels. Moreover, we present a new data augmentation strategy 'vertical flip + inverse order' to unlock the potential of the RSC datasets. Experiments on various benchmarks show that our approach surpasses the state-of-the-art methods by a large margin, especially with a 4.7 dB PSNR leap on real-world RSC. Code is available at https://github.com/GitCVfb/JAMNet.
KW - Low-level vision
UR - http://www.scopus.com/inward/record.url?scp=85173917100&partnerID=8YFLogxK
U2 - 10.1109/CVPR52729.2023.00549
DO - 10.1109/CVPR52729.2023.00549
M3 - Conference contribution
AN - SCOPUS:85173917100
T3 - Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
SP - 5671
EP - 5681
BT - Proceedings - 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2023
PB - IEEE Computer Society
T2 - 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2023
Y2 - 18 June 2023 through 22 June 2023
ER -