TY - JOUR
T1 - Discriminative two-level feature selection for realistic human action recognition
AU - Wu, Qiuxia
AU - Wang, Zhiyong
AU - Deng, Feiqi
AU - Xia, Yong
AU - Kang, Wenxiong
AU - Feng, David Dagan
PY - 2013
Y1 - 2013
N2 - Constructing the bag-of-features model from space-time interest points (STIPs) has been successfully utilized for human action recognition. However, how to eliminate the large number of irrelevant STIPs when representing a specific action in realistic scenarios, as well as how to select discriminative codewords for an effective bag-of-features model, still need to be further investigated. In this paper, we propose to select more representative codewords based on our pruned interest points algorithm so as to reduce computational cost as well as improve recognition performance. By taking human perception into account, an attention-based saliency map is employed to choose salient interest points which fall into salient regions, since visual saliency can provide strong evidence for the location of acting subjects. After salient interest points are identified, each human action is represented with the bag-of-features model. In order to obtain more discriminative codewords, an unsupervised codeword selection algorithm is utilized. Finally, the Support Vector Machine (SVM) method is employed to perform human action recognition. Comprehensive experimental results on the widely used and challenging Hollywood-2 Human Action (HOHA-2) dataset and YouTube dataset demonstrate that our proposed method is computationally efficient while achieving improved performance in recognizing realistic human actions.
KW - Bag-of-features model
KW - Maximal information compression index
KW - Realistic human action recognition
KW - Saliency map
KW - Space-time interest points
KW - Support vector machine
KW - Unsupervised codeword selection
KW - Visual saliency
UR - http://www.scopus.com/inward/record.url?scp=84881193309&partnerID=8YFLogxK
U2 - 10.1016/j.jvcir.2013.07.001
DO - 10.1016/j.jvcir.2013.07.001
M3 - Article
AN - SCOPUS:84881193309
SN - 1047-3203
VL - 24
SP - 1064
EP - 1074
JO - Journal of Visual Communication and Image Representation
JF - Journal of Visual Communication and Image Representation
IS - 7
ER -