Deep Learning-Based Automated Lip-Reading: A Survey

Souheil Fenghour, Daqing Chen, Kun Guo, Bo Li, Perry Xiao

Research output: Contribution to journal, Literature review, Peer-reviewed

43 citations (Scopus)

Abstract

This paper presents a survey of automated lip-reading approaches, focusing mainly on deep learning-related methodologies, which have proven more fruitful for both feature extraction and classification. The survey also compares the components that make up automated lip-reading systems, including the audio-visual databases, feature extraction, classification networks and classification schemas. The main contributions and unique insights of this survey are: 1) a comparison of Convolutional Neural Networks with other neural network architectures for feature extraction; 2) a critical review of the advantages of Attention-Transformers and Temporal Convolutional Networks over Recurrent Neural Networks for classification; 3) a comparison of different classification schemas used for lip-reading, including ASCII characters, phonemes and visemes; and 4) a review of the most up-to-date lip-reading systems up until early 2021.

Original language: English
Article number: 9522117
Pages (from-to): 121184-121205
Number of pages: 22
Journal: IEEE Access
Volume: 9
DOI
Publication status: Published - 2021
