
Curriculum learning-based slimmable cross-component prediction for video coding

  • Chengyi Zou
  • Shuai Wan
  • Marc Gorriz Blanch
  • Luka Murn
  • Juil Sock
  • Fei Yang
  • Luis Herranz

  • Northwestern Polytechnical University, Xi'an
  • BBC
  • Nankai University
  • Technical University of Madrid

Research output: Contribution to journal › Article › peer-review

Abstract

Cross-component prediction plays an important role in video coding: it aims to eliminate redundancy between color components under the guidance of luma information. Recently, learning-based cross-component prediction has made significant strides in performance. However, current cross-component prediction methods typically train models directly on a dataset containing different types of data, which generally results in overfitting on flat-textured data and underfitting on complex-textured data. To improve coding performance without excessively increasing complexity, a cost-effective attention-based slimmable cross-component prediction network (SCCPN) is proposed. Although trained as a single model, SCCPN can be executed at different capacity levels, yielding prediction results tailored to data with different characteristics. To further improve the generalization capability and prediction accuracy of the network, a curriculum learning strategy combined with slimmable convolutions is designed, which classifies prediction difficulty to indicate whether a texture is flat or complex and fits complex data with a small number of additional parameters. An adaptive search strategy is also introduced to speed up the selection of channels for slimmable convolutions. Experimental results demonstrate that, when integrated into H.266/Versatile Video Coding (VVC), SCCPN achieves up to −0.62 %/−3.34 %/−2.68 % BD-rate reductions on the Y/Cb/Cr components, respectively, over the H.266/VVC anchor. This performance gain outperforms state-of-the-art learning-based cross-component prediction methods, while the added complexity in both encoding and decoding is lower than that of the other compared neural-network-based cross-component prediction methods.
Moreover, a performance gain is also observed when SCCPN is integrated into the latest reference software of Beyond VVC, with BD-rate reductions of −0.17 %/−1.00 %/−1.02 % on the Y/Cb/Cr components, respectively.
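The core idea behind a slimmable network, as described in the abstract, is that one shared set of weights can be executed at several capacity levels by using only a leading fraction of its channels. The sketch below is a minimal, hypothetical illustration of that shared-weight slicing for a 1×1 convolution; the class name, parameters, and width choices are illustrative assumptions, not the authors' SCCPN implementation.

```python
import numpy as np

class SlimmableConv1x1:
    """Hypothetical sketch of a slimmable 1x1 convolution:
    a single shared weight tensor whose leading output channels
    form each narrower sub-network (not the authors' SCCPN code)."""

    def __init__(self, in_ch, max_out_ch, seed=0):
        rng = np.random.default_rng(seed)
        # One shared weight matrix for all capacity levels.
        self.weight = rng.standard_normal((max_out_ch, in_ch)) * 0.1

    def forward(self, x, width):
        # x: (in_ch, H, W). Keep only the first round(width * max_out_ch)
        # filters, so narrower widths reuse a slice of the full weights.
        out_ch = max(1, round(width * self.weight.shape[0]))
        w = self.weight[:out_ch]
        return np.einsum('oi,ihw->ohw', w, x)

conv = SlimmableConv1x1(in_ch=3, max_out_ch=8)
x = np.ones((3, 4, 4))
y_small = conv.forward(x, width=0.25)  # 2 output channels
y_full = conv.forward(x, width=1.0)    # all 8 output channels
```

Because the sub-networks are slices of one weight tensor, the narrow output coincides with the first channels of the full output; an adaptive search over widths (as the abstract mentions) then only has to pick a slicing point rather than retrain separate models.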

Original language: English
Article number: 131463
Journal: Neurocomputing
Volume: 656
DOIs
State: Published - 1 Dec 2025

Keywords

  • Cross-component prediction
  • Curriculum learning
  • Slimmable neural network
  • Video coding
