TY - JOUR
T1 - Challenges and solutions for vision-based hand gesture interpretation
T2 - A review
AU - Gao, Kun
AU - Zhang, Haoyang
AU - Liu, Xiaolong
AU - Wang, Xinyi
AU - Xie, Liang
AU - Ji, Bowen
AU - Yan, Ye
AU - Yin, Erwei
N1 - Publisher Copyright:
© 2024 Elsevier Inc.
PY - 2024/11
Y1 - 2024/11
N2 - Hand gestures are among the most efficient and natural interfaces in current human–computer interaction (HCI) systems. Despite the great progress achieved in hand gesture-based HCI, perceiving or tracking the hand pose from images remains challenging. Over the past decade, several challenges have been identified and explored, such as the incomplete-data issue, the requirement for large-scale annotated datasets, and 3D hand pose estimation from monocular RGB images; however, surveys that comprehensively collect and analyze these challenges and their corresponding solutions are lacking. To this end, this paper focuses on the general challenges of hand gesture interpretation techniques in HCI systems based on visual sensors and elaborates on the corresponding solutions in the current state of the art, providing a systematic account of the practical problems of hand gesture interpretation. Moreover, this paper provides informative cues on recent datasets to highlight the inherent differences and connections among them, such as object annotations and the number of hands, which are important for conducting research yet have been overlooked by previous reviews. Reflecting on recent developments, this paper also conjectures what future work will concentrate on, from the perspectives of both hand gesture interpretation and dataset construction.
AB - Hand gestures are among the most efficient and natural interfaces in current human–computer interaction (HCI) systems. Despite the great progress achieved in hand gesture-based HCI, perceiving or tracking the hand pose from images remains challenging. Over the past decade, several challenges have been identified and explored, such as the incomplete-data issue, the requirement for large-scale annotated datasets, and 3D hand pose estimation from monocular RGB images; however, surveys that comprehensively collect and analyze these challenges and their corresponding solutions are lacking. To this end, this paper focuses on the general challenges of hand gesture interpretation techniques in HCI systems based on visual sensors and elaborates on the corresponding solutions in the current state of the art, providing a systematic account of the practical problems of hand gesture interpretation. Moreover, this paper provides informative cues on recent datasets to highlight the inherent differences and connections among them, such as object annotations and the number of hands, which are important for conducting research yet have been overlooked by previous reviews. Reflecting on recent developments, this paper also conjectures what future work will concentrate on, from the perspectives of both hand gesture interpretation and dataset construction.
KW - Hand gesture interpretation
KW - Hand pose estimation
KW - Human–computer interaction
KW - Visual sensor
UR - http://www.scopus.com/inward/record.url?scp=85201286196&partnerID=8YFLogxK
U2 - 10.1016/j.cviu.2024.104095
DO - 10.1016/j.cviu.2024.104095
M3 - Article
AN - SCOPUS:85201286196
SN - 1077-3142
VL - 248
JO - Computer Vision and Image Understanding
JF - Computer Vision and Image Understanding
M1 - 104095
ER -