TY - JOUR
T1 - Spatiotemporal interest point detector exploiting appearance and motion-variation information
AU - Li, Yanshan
AU - Li, Qingteng
AU - Huang, Qinghua
AU - Xia, Rongjie
AU - Li, Xuelong
N1 - Publisher Copyright:
© 2019 SPIE and IS&T.
PY - 2019/5/1
Y1 - 2019/5/1
N2 - As a local invariant feature of videos, the spatiotemporal interest point (STIP) has been widely used in computer vision and pattern recognition. However, existing STIP detectors are generally extended from detection algorithms designed for local invariant features of two-dimensional images, an approach that does not explicitly exploit the motion information inherent in the temporal domain of videos and thus weakens their performance in a video context. To remedy this, we aim to develop an STIP detector that uniformly captures appearance and motion information in video, yielding a substantial performance improvement. Specifically, under the framework of geometric algebra, we first develop a spatiotemporal unified model of appearance and motion-variation information (UMAMV), and then propose a UMAMV-based scale space of the spatiotemporal domain to jointly analyze appearance and motion information in a video. Based on this model, we propose an STIP detector, UMAMV-SIFT, which embraces both the appearance and the motion-variation information of videos. Three datasets of different sizes are used to evaluate the proposed model and the STIP detector. The experimental results show that UMAMV-SIFT achieves state-of-the-art performance and is particularly effective when the dataset is small.
AB - As a local invariant feature of videos, the spatiotemporal interest point (STIP) has been widely used in computer vision and pattern recognition. However, existing STIP detectors are generally extended from detection algorithms designed for local invariant features of two-dimensional images, an approach that does not explicitly exploit the motion information inherent in the temporal domain of videos and thus weakens their performance in a video context. To remedy this, we aim to develop an STIP detector that uniformly captures appearance and motion information in video, yielding a substantial performance improvement. Specifically, under the framework of geometric algebra, we first develop a spatiotemporal unified model of appearance and motion-variation information (UMAMV), and then propose a UMAMV-based scale space of the spatiotemporal domain to jointly analyze appearance and motion information in a video. Based on this model, we propose an STIP detector, UMAMV-SIFT, which embraces both the appearance and the motion-variation information of videos. Three datasets of different sizes are used to evaluate the proposed model and the STIP detector. The experimental results show that UMAMV-SIFT achieves state-of-the-art performance and is particularly effective when the dataset is small.
KW - Geometric algebra
KW - Spatiotemporal interest point
KW - Spatiotemporal interest point detector
KW - Video
UR - http://www.scopus.com/inward/record.url?scp=85069449505&partnerID=8YFLogxK
U2 - 10.1117/1.JEI.28.3.033002
DO - 10.1117/1.JEI.28.3.033002
M3 - Article
AN - SCOPUS:85069449505
SN - 1017-9909
VL - 28
JO - Journal of Electronic Imaging
JF - Journal of Electronic Imaging
IS - 3
M1 - 033002
ER -