Deep object tracking with multi-modal data

Xuezhi Zhang; Yuan Yuan; Xiaoqiang Lu

doi:10.1109/CITS.2016.7546403

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Scopus citation

Abstract

Object tracking is a challenging topic in the field of computer vision since its performance is easily disturbed by occlusion, illumination change, background clutter, scale variation, etc. In this paper, we introduce a robust tracking algorithm that fuses information from both visible images and infrared (IR) images. The proposed tracking algorithm not only incorporates convolutional feature maps from the visible channel, but also employs a scale pyramid representation from the IR channel. We estimate the target location by fusing multilayer convolutional feature maps, and predict the target scale from a scale pyramid. The pipeline of the proposed method is as follows. First, the hierarchical convolutional feature maps are obtained from visible images using VGG-Nets. Then, the accurate target location is predicted by the maximum response of correlation filters with the visible image feature maps. Finally, we obtain the precise object scale with a scale pyramid from infrared images where the difference between the target and the background is clear. In order to verify the performance of the proposed method, we capture six video sequences under different conditions. Each sequence contains both a visible channel and an IR channel. Ten state-of-the-art tracking algorithms are compared with our method, and the experimental results show the effectiveness of the proposed tracker.
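The two estimation steps named in the abstract (location from the peak of fused correlation responses, scale from a pyramid of IR patches) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the weighted-sum fusion, and the mean-squared-error matching score are all assumptions made for illustration, and the feature maps and correlation filters are assumed to be given rather than extracted with VGG-Nets or learned.

```python
import numpy as np

def correlation_response(feature_map, filt):
    """Circular cross-correlation of a 2-D feature map with a filter, computed via FFT."""
    F = np.fft.fft2(feature_map)
    H = np.fft.fft2(filt, s=feature_map.shape)
    return np.real(np.fft.ifft2(F * np.conj(H)))

def locate_target(layer_features, layer_filters, layer_weights):
    """Fuse per-layer responses by a weighted sum (an assumed fusion rule)
    and return the (row, col) position of the maximum response."""
    fused = sum(w * correlation_response(f, h)
                for f, h, w in zip(layer_features, layer_filters, layer_weights))
    return np.unravel_index(np.argmax(fused), fused.shape)

def crop_resized(image, center, size, out_size):
    """Nearest-neighbour sample an out_size x out_size patch from a
    size x size window centred on `center`."""
    cy, cx = center
    offsets = (np.arange(out_size) + 0.5) / out_size - 0.5
    rows = np.clip(np.round(cy + offsets * size).astype(int), 0, image.shape[0] - 1)
    cols = np.clip(np.round(cx + offsets * size).astype(int), 0, image.shape[1] - 1)
    return image[np.ix_(rows, cols)]

def estimate_scale(ir_image, center, base_size, template, scales):
    """Evaluate each candidate scale factor on the IR image and return the one
    whose resampled patch is closest to the template (negative MSE is an
    assumed stand-in for whatever score the paper uses)."""
    scores = []
    for s in scales:
        patch = crop_resized(ir_image, center, base_size * s, template.shape[0])
        scores.append(-np.mean((patch - template) ** 2))
    return scales[int(np.argmax(scores))]
```

In this sketch the location is found first from the visible-channel feature maps, and `estimate_scale` is then called at that location on the IR frame, mirroring the order of the pipeline described in the abstract.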

Original language	English
Title of host publication	IEEE CITS 2016 - 2016 International Conference on Computer, Information and Telecommunication Systems
Editors	Fei Gao, Zan Li, Daniel Cascado Caballero, Jing Fan, Mohammad S. Obaidat, Petros Nicoploitidis, Kuei Fang Hsiao
Publisher	Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic)	9781509034406
DOIs	https://doi.org/10.1109/CITS.2016.7546403
State	Published - 16 Aug 2016
Externally published	Yes
Event	2016 International Conference on Computer, Information and Telecommunication Systems, CITS 2016 - Kunming, China
Duration	6 Jul 2016 → 8 Jul 2016

Publication series

Name	IEEE CITS 2016 - 2016 International Conference on Computer, Information and Telecommunication Systems

Conference

Conference	2016 International Conference on Computer, Information and Telecommunication Systems, CITS 2016
Country/Territory	China
City	Kunming
Period	6/07/16 → 8/07/16

Access to Document

10.1109/CITS.2016.7546403

Cite this

Zhang, X., Yuan, Y., & Lu, X. (2016). Deep object tracking with multi-modal data. In F. Gao, Z. Li, D. C. Caballero, J. Fan, M. S. Obaidat, P. Nicoploitidis, & K. F. Hsiao (Eds.), IEEE CITS 2016 - 2016 International Conference on Computer, Information and Telecommunication Systems, Article 7546403 (IEEE CITS 2016 - 2016 International Conference on Computer, Information and Telecommunication Systems). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/CITS.2016.7546403

Zhang, Xuezhi ; Yuan, Yuan ; Lu, Xiaoqiang. / Deep object tracking with multi-modal data. IEEE CITS 2016 - 2016 International Conference on Computer, Information and Telecommunication Systems. editor / Fei Gao ; Zan Li ; Daniel Cascado Caballero ; Jing Fan ; Mohammad S. Obaidat ; Petros Nicoploitidis ; Kuei Fang Hsiao. Institute of Electrical and Electronics Engineers Inc., 2016. (IEEE CITS 2016 - 2016 International Conference on Computer, Information and Telecommunication Systems).

@inproceedings{e4b116c0ef7145e185b2a4e23d8e35ce,

title = "Deep object tracking with multi-modal data",

abstract = "Object tracking is a challenging topic in the field of computer vision since its performance is easily disturbed by occlusion, illumination change, background clutter, scale variation, etc. In this paper, we introduce a robust tracking algorithm that fuses information from both visible images and infrared (IR) images. The proposed tracking algorithm not only incorporates convolutional feature maps from the visible channel, but also employs a scale pyramid representation from IR channel. We estimate the target location by fusing multilayer convolutional feature maps, and predict the target scale from a scale pyramid. The pipeline of the proposed method is as follows. First, the hierarchical convolutional feature maps are obtained from visible images using VGG-Nets. Then, the accurate target location is predicted by the maximum response of correlation filters with the visible image feature maps. Finally, we obtain the precise object scale with a scale pyramid from infrared images where the difference between the target and the background is clear. In order to verify the performance of the proposed method, we capture six video sequences under different conditions. These sequences contain both visible channel and IR channel. Ten state-of-the-art tracking algorithms are compared with our method, and the experimental results show the effectiveness of the proposed tracker.",

author = "Xuezhi Zhang and Yuan Yuan and Xiaoqiang Lu",

note = "Publisher Copyright: {\textcopyright} 2016 IEEE.; 2016 International Conference on Computer, Information and Telecommunication Systems, CITS 2016 ; Conference date: 06-07-2016 Through 08-07-2016",

year = "2016",

month = aug,

day = "16",

doi = "10.1109/CITS.2016.7546403",

language = "English",

series = "IEEE CITS 2016 - 2016 International Conference on Computer, Information and Telecommunication Systems",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

editor = "Fei Gao and Zan Li and Caballero, {Daniel Cascado} and Jing Fan and Obaidat, {Mohammad S.} and Petros Nicoploitidis and Hsiao, {Kuei Fang}",

booktitle = "IEEE CITS 2016 - 2016 International Conference on Computer, Information and Telecommunication Systems",

}

Zhang, X, Yuan, Y & Lu, X 2016, Deep object tracking with multi-modal data. in F Gao, Z Li, DC Caballero, J Fan, MS Obaidat, P Nicoploitidis & KF Hsiao (eds), IEEE CITS 2016 - 2016 International Conference on Computer, Information and Telecommunication Systems, 7546403, IEEE CITS 2016 - 2016 International Conference on Computer, Information and Telecommunication Systems, Institute of Electrical and Electronics Engineers Inc., 2016 International Conference on Computer, Information and Telecommunication Systems, CITS 2016, Kunming, China, 6/07/16. https://doi.org/10.1109/CITS.2016.7546403

Deep object tracking with multi-modal data. / Zhang, Xuezhi; Yuan, Yuan; Lu, Xiaoqiang.
IEEE CITS 2016 - 2016 International Conference on Computer, Information and Telecommunication Systems. ed. / Fei Gao; Zan Li; Daniel Cascado Caballero; Jing Fan; Mohammad S. Obaidat; Petros Nicoploitidis; Kuei Fang Hsiao. Institute of Electrical and Electronics Engineers Inc., 2016. 7546403 (IEEE CITS 2016 - 2016 International Conference on Computer, Information and Telecommunication Systems).

TY - GEN

T1 - Deep object tracking with multi-modal data

AU - Zhang, Xuezhi

AU - Yuan, Yuan

AU - Lu, Xiaoqiang

PY - 2016/8/16

Y1 - 2016/8/16

AB - Object tracking is a challenging topic in the field of computer vision since its performance is easily disturbed by occlusion, illumination change, background clutter, scale variation, etc. In this paper, we introduce a robust tracking algorithm that fuses information from both visible images and infrared (IR) images. The proposed tracking algorithm not only incorporates convolutional feature maps from the visible channel, but also employs a scale pyramid representation from IR channel. We estimate the target location by fusing multilayer convolutional feature maps, and predict the target scale from a scale pyramid. The pipeline of the proposed method is as follows. First, the hierarchical convolutional feature maps are obtained from visible images using VGG-Nets. Then, the accurate target location is predicted by the maximum response of correlation filters with the visible image feature maps. Finally, we obtain the precise object scale with a scale pyramid from infrared images where the difference between the target and the background is clear. In order to verify the performance of the proposed method, we capture six video sequences under different conditions. These sequences contain both visible channel and IR channel. Ten state-of-the-art tracking algorithms are compared with our method, and the experimental results show the effectiveness of the proposed tracker.

UR - http://www.scopus.com/inward/record.url?scp=84987673757&partnerID=8YFLogxK

U2 - 10.1109/CITS.2016.7546403

DO - 10.1109/CITS.2016.7546403

M3 - Conference contribution

AN - SCOPUS:84987673757

T3 - IEEE CITS 2016 - 2016 International Conference on Computer, Information and Telecommunication Systems

BT - IEEE CITS 2016 - 2016 International Conference on Computer, Information and Telecommunication Systems

A2 - Gao, Fei

A2 - Li, Zan

A2 - Caballero, Daniel Cascado

A2 - Fan, Jing

A2 - Obaidat, Mohammad S.

A2 - Nicoploitidis, Petros

A2 - Hsiao, Kuei Fang

PB - Institute of Electrical and Electronics Engineers Inc.

T2 - 2016 International Conference on Computer, Information and Telecommunication Systems, CITS 2016

Y2 - 6 July 2016 through 8 July 2016

ER -

Zhang X, Yuan Y, Lu X. Deep object tracking with multi-modal data. In Gao F, Li Z, Caballero DC, Fan J, Obaidat MS, Nicoploitidis P, Hsiao KF, editors, IEEE CITS 2016 - 2016 International Conference on Computer, Information and Telecommunication Systems. Institute of Electrical and Electronics Engineers Inc. 2016. 7546403. (IEEE CITS 2016 - 2016 International Conference on Computer, Information and Telecommunication Systems). doi: 10.1109/CITS.2016.7546403
