Attention-Based Dual-Scale CNN In-Loop Filter for Versatile Video Coding

Ming Ze Wang; Shuai Wan; Hao Gong; Ming Yang Ma

doi:10.1109/ACCESS.2019.2944473

School of Electronics and Information

Research output: Contribution to journal › Article › peer-review

45 Citations (Scopus)

Abstract

As the upcoming video coding standard, Versatile Video Coding (VVC) achieves up to 30% Bjontegaard delta bit-rate (BD-rate) reduction compared with High Efficiency Video Coding (H.265/HEVC). To eliminate or alleviate compression artifacts such as blocking, ringing, blurring and contouring, VVC includes three in-loop filters: the de-blocking filter (DBF), sample adaptive offset (SAO) and the adaptive loop filter (ALF). Recently, Convolutional Neural Networks (CNNs) have attracted tremendous attention and shown great potential in many image processing tasks. In this work, we design a CNN-based in-loop filter as an integrated single-model solution that adapts to almost any scenario in video coding. An architecture named ADCNN (Attention-based Dual-scale CNN), built around an attention-based processing block, is proposed to reduce artifacts in I frames and B frames; it takes advantage of informative priors such as the quantization parameter (QP) and partitioning information. Unlike existing CNN-based filtering methods, which are mainly designed for the luma component and may need a separate trained model for each QP, the proposed filter adapts to different QPs and different frame types, and all components (both luma and chroma) are processed simultaneously, with feature exchange and fusion between components so that they supplement one another's information. Experimental results show that the proposed ADCNN filter achieves 6.54%, 13.27% and 15.72% BD-rate savings for Y, U and V respectively under the all intra configuration, and 2.81%, 7.86% and 8.60% BD-rate savings under the random access configuration. It can replace all the conventional in-loop filters and outperforms them without an increase in encoding time.

Original language	English
Article number	8852743
Pages (from-to)	145214-145226
Number of pages	13
Journal	IEEE Access
Volume	7
DOI	https://doi.org/10.1109/ACCESS.2019.2944473
Publication status	Published - 2019

Cite this

@article{f5f4c51d8a7d40b88c23350c7c6ea90a,

title = "Attention-Based Dual-Scale CNN In-Loop Filter for Versatile Video Coding",

abstract = "As the upcoming video coding standard, Versatile Video Coding (i.e., VVC) achieves up to 30% Bjontegaard delta bit-rate (BD-rate) reduction compared with High Efficiency Video Coding (H.265/HEVC). To eliminate or alleviate different kinds of compression artifacts like blocking, ringing, blurring and contouring effects, three in-loop filters, i.e. de-blocking filter (DBF), sample adaptive offset (SAO) and adaptive loop filter (ALF), have been involved in VVC. Recently, Convolutional Neural Network (CNN) has attracted tremendous attention and shows great potential in many tasks in image processing. In this work, we design a CNN-based in-loop filter as an integrated single-model solution which is adaptive to almost any scenarios in video coding. An architecture named as ADCNN (i.e., Attention based Dual-scale CNN) with an attention based processing block is proposed to reduce artifacts of I frames and B frames, which take advantage of informative priors such as the quantization parameter (QP) and partitioning information. Different from existing CNN-based filtering methods, which are mainly designed for the luma component and may need to train different models for different QPs, the proposed filter is adapted to different QPs and different frame types, and all the components (i.e., both luma and chroma) are processed simultaneously with feature exchange and fusion between components for information supplementary. Experimental results show that the proposed ADCNN filter can achieve 6.54%, 13.27%, 15.72% BD-rate savings for Y, U, V respectively under the all intra configuration and 2.81%, 7.86%, 8.60% BD-rate savings under the random access configuration. It can be used to replace all the conventional in-loop filters and also outperforms them without increase in encoding time.",

keywords = "Coding artifacts, convolutional neural network (CNN), in-loop filter, versatile video coding (VVC), video coding",

author = "Wang, {Ming Ze} and Shuai Wan and Hao Gong and Ma, {Ming Yang}",

note = "Publisher Copyright: {\textcopyright} 2013 IEEE.",

year = "2019",

doi = "10.1109/ACCESS.2019.2944473",

language = "English",

volume = "7",

pages = "145214--145226",

journal = "IEEE Access",

issn = "2169-3536",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

}

TY - JOUR

T1 - Attention-Based Dual-Scale CNN In-Loop Filter for Versatile Video Coding

AU - Wang, Ming Ze

AU - Wan, Shuai

AU - Gong, Hao

AU - Ma, Ming Yang

PY - 2019

Y1 - 2019

N2 - As the upcoming video coding standard, Versatile Video Coding (i.e., VVC) achieves up to 30% Bjontegaard delta bit-rate (BD-rate) reduction compared with High Efficiency Video Coding (H.265/HEVC). To eliminate or alleviate different kinds of compression artifacts like blocking, ringing, blurring and contouring effects, three in-loop filters, i.e. de-blocking filter (DBF), sample adaptive offset (SAO) and adaptive loop filter (ALF), have been involved in VVC. Recently, Convolutional Neural Network (CNN) has attracted tremendous attention and shows great potential in many tasks in image processing. In this work, we design a CNN-based in-loop filter as an integrated single-model solution which is adaptive to almost any scenarios in video coding. An architecture named as ADCNN (i.e., Attention based Dual-scale CNN) with an attention based processing block is proposed to reduce artifacts of I frames and B frames, which take advantage of informative priors such as the quantization parameter (QP) and partitioning information. Different from existing CNN-based filtering methods, which are mainly designed for the luma component and may need to train different models for different QPs, the proposed filter is adapted to different QPs and different frame types, and all the components (i.e., both luma and chroma) are processed simultaneously with feature exchange and fusion between components for information supplementary. Experimental results show that the proposed ADCNN filter can achieve 6.54%, 13.27%, 15.72% BD-rate savings for Y, U, V respectively under the all intra configuration and 2.81%, 7.86%, 8.60% BD-rate savings under the random access configuration. It can be used to replace all the conventional in-loop filters and also outperforms them without increase in encoding time.

AB - As the upcoming video coding standard, Versatile Video Coding (i.e., VVC) achieves up to 30% Bjontegaard delta bit-rate (BD-rate) reduction compared with High Efficiency Video Coding (H.265/HEVC). To eliminate or alleviate different kinds of compression artifacts like blocking, ringing, blurring and contouring effects, three in-loop filters, i.e. de-blocking filter (DBF), sample adaptive offset (SAO) and adaptive loop filter (ALF), have been involved in VVC. Recently, Convolutional Neural Network (CNN) has attracted tremendous attention and shows great potential in many tasks in image processing. In this work, we design a CNN-based in-loop filter as an integrated single-model solution which is adaptive to almost any scenarios in video coding. An architecture named as ADCNN (i.e., Attention based Dual-scale CNN) with an attention based processing block is proposed to reduce artifacts of I frames and B frames, which take advantage of informative priors such as the quantization parameter (QP) and partitioning information. Different from existing CNN-based filtering methods, which are mainly designed for the luma component and may need to train different models for different QPs, the proposed filter is adapted to different QPs and different frame types, and all the components (i.e., both luma and chroma) are processed simultaneously with feature exchange and fusion between components for information supplementary. Experimental results show that the proposed ADCNN filter can achieve 6.54%, 13.27%, 15.72% BD-rate savings for Y, U, V respectively under the all intra configuration and 2.81%, 7.86%, 8.60% BD-rate savings under the random access configuration. It can be used to replace all the conventional in-loop filters and also outperforms them without increase in encoding time.

KW - Coding artifacts

KW - convolutional neural network (CNN)

KW - in-loop filter

KW - versatile video coding (VVC)

KW - video coding

UR - http://www.scopus.com/inward/record.url?scp=85073699260&partnerID=8YFLogxK

U2 - 10.1109/ACCESS.2019.2944473

DO - 10.1109/ACCESS.2019.2944473

M3 - Article

AN - SCOPUS:85073699260

SN - 2169-3536

VL - 7

SP - 145214

EP - 145226

JO - IEEE Access

JF - IEEE Access

M1 - 8852743

ER -
