A fast block-matching algorithm based on adaptive search area and its VLSI architecture for H.264/AVC

Ying Lai Xi, Chong Yang Hao, Yang Yu Fan, Hong Qi Hu

科研成果: 期刊稿件文章同行评审

21 引用 (Scopus)

摘要

In this paper, we propose a fast block-matching algorithm based on search center prediction and search early termination, called center-prediction and early-termination based motion search algorithm (CPETS). The CPETS satisfies high performance and efficient VLSI implementation. It makes use of the spatial and temporal correlation in motion vector (MV) fields and feature of all-zero blocks to accelerate the searching process. This paper describes the CPETS with three levels. At the coarsest level, which happens when center prediction fails, the search area is defined to enclose all original search range. At the middle level, the search area is defined as a 7×7-pels square area around the predicted center. At the finest level, a 5×5-pels search area around the predicted center is adopted. At each level, 9-points uniformly allocated search pattern is adopted. The experiment results show that the CPETS is able to achieve a reduction of 95.67% encoding time in average compared with full-search scheme, with a negligible peak signal-noise ratio (PSNR) loss and bitrate increase. Also, the efficiency of CPETS outperforms some popular fast algorithms such as: three-step search, new three-step search, four-step search evidently. This paper also describes an efficient four-way pipelined VLSI architecture based on the CPETS for H.264/AVC coding. The proposed architecture divides current block and search area into four sub-regions, respectively, with 4:1 sub-sampling and processes them in parallel. Also, each sub-region is processed by a pipelined structure to ensure the search for nine candidate points is performed simultaneously. By adopting search early-termination strategy, the architecture can compute one MV for 16×16 block in 81 clock cycles in the best case and 901 clock cycles in the poorest case. The architecture has been designed and simulated with VHDL language. Simulation results show that the proposed architecture achieves a high performance for real-time motion estimation. Only 47.3 K gates and 1624×8 bits on-chip RAM are needed for a search range of (-15, +15) with three reference frames and four candidate block modes by using 36 processing elements.

源语言英语
页(从-至)626-646
页数21
期刊Signal Processing: Image Communication
21
8
DOI
出版状态已出版 - 9月 2006

指纹

探究 'A fast block-matching algorithm based on adaptive search area and its VLSI architecture for H.264/AVC' 的科研主题。它们共同构成独一无二的指纹。

引用此