A slimmable framework for practical neural video compression

Zhaocheng Liu, Fei Yang, Defa Wang, Marc Górriz Blanch, Luka Murn, Shuai Wan, Saiping Zhang, Marta Mrak, Luis Herranz

Research output: Contribution to journal › Article › peer-review


Abstract

Deep learning is being increasingly applied to image and video compression in a new paradigm known as neural video compression. While achieving impressive rate–distortion (RD) performance, neural video codecs (NVCs) require heavy neural networks, which in turn have large memory and computational costs and often lack important functionalities such as variable rate. These are significant limitations to their practical application. Addressing these problems, recent slimmable image codecs can dynamically adjust their model capacity to elegantly reduce the memory and computation requirements, without harming RD performance. However, the extension to video is not straightforward due to the non-trivial interplay with complex motion estimation and compensation modules in most NVC architectures. In this paper we propose the slimmable video codec framework (SlimVC), which integrates a slimmable autoencoder and a motion-free conditional entropy model. We show that the slimming mechanism is also applicable to the more complex case of video architectures, providing SlimVC with simultaneous control of computational cost, memory and rate, which are all important requirements in practice. We further provide detailed experimental analysis, and describe application scenarios that can benefit from slimmable video codecs.
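To give a rough sense of the slimming mechanism the abstract refers to, the sketch below shows a width-switchable convolution in PyTorch: a single layer stores full-width weights but executes only a slice of its input/output channels, so switching the active width changes memory and FLOPs at run time. This is a minimal illustrative sketch; the class name, width list and slicing scheme are assumptions for exposition, not the paper's actual SlimVC implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SlimmableConv2d(nn.Module):
    """Hypothetical width-switchable convolution (illustrative, not the paper's code)."""

    def __init__(self, max_in, max_out, kernel_size, width_list=(0.25, 0.5, 0.75, 1.0)):
        super().__init__()
        # Full-capacity weights are allocated once; sub-networks reuse slices of them.
        self.conv = nn.Conv2d(max_in, max_out, kernel_size, padding=kernel_size // 2)
        self.width_list = width_list
        self.width = 1.0  # currently active width fraction

    def set_width(self, width):
        assert width in self.width_list
        self.width = width

    def forward(self, x):
        in_ch = x.shape[1]  # input width is whatever the previous (slimmed) layer produced
        out_ch = max(1, int(self.conv.out_channels * self.width))
        w = self.conv.weight[:out_ch, :in_ch]  # slice a sub-network of the full weights
        b = self.conv.bias[:out_ch]
        return F.conv2d(x, w, b, padding=self.conv.padding)


# Usage: the same layer runs at different capacities, which is the kind of
# run-time trade-off between complexity, memory and rate that a slimmable codec exploits.
layer = SlimmableConv2d(max_in=3, max_out=64, kernel_size=3)
x = torch.randn(1, 3, 64, 64)
for w in (0.25, 1.0):
    layer.set_width(w)
    print(w, layer(x).shape)  # (1, 16, 64, 64) at width 0.25; (1, 64, 64, 64) at full width
```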

Original language: English
Article number: 128525
Journal: Neurocomputing
Volume: 610
DOIs
State: Published - 28 Dec 2024

Keywords

  • Deep learning
  • Feature modulation
  • Neural video compression
  • Slimmable codec
  • Slimmable network
  • Variable rate
