Robust Multiobject Tracking Using Vision Sensor With Fine-Grained Cues in Occluded and Dynamic Scenes

Yaoqi Hu, Jinqiu Sun, Hao Jin, Axi Niu, Qingsen Yan, Yu Zhu, Yanning Zhang

Research output: Contribution to journal › Article › peer-review

Abstract

Multiobject tracking (MOT) using vision sensors remains a challenging problem, particularly in dynamic backgrounds and under severe occlusions. Existing methods, relying on holistic appearance or spatial cues, fail to capture detailed information within object regions and the background, resulting in inaccurate and inconsistent tracking. To address these issues, we propose a novel detail-driven MOT (DD-MOT) method that revisits and leverages fine-grained cues to discover the dynamics of both object regions and the background, facilitating the recovery and association of trajectories. Specifically, the proposed method consists of three key modules: 1) points trajectories generator (PTG) module; 2) camera motion and occlusion compensation (CMOC) module; and 3) fine- and coarse-grained association (FCGA) module. The PTG module is responsible for generating fine-grained cues by sampling an initial set of points, generating point trajectories, and refining the initial set of points. The CMOC module utilizes the background and object point trajectories to correct background motion and recover occluded object bounding boxes. Finally, the FCGA module leverages point trajectory cues to establish a more effective association strategy, combining both fine-grained and coarse-grained spatial cues from the object bounding boxes. We evaluate our DD-MOT method on several benchmark datasets, including MOT17 and MOT20, demonstrating that it consistently outperforms state-of-the-art (SOTA) methods in key metrics such as HOTA (64.2, 62.6), MOTA (80.8, 78.1), and IDF1 (78.8, 77.1).
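The abstract does not give implementation details, so the following is only a rough illustration of the two ideas it describes: using background point trajectories to cancel camera motion (here sketched with a RANSAC homography) and blending a fine-grained point-trajectory affinity with coarse box IoU before solving the assignment. All function names (`pairwise_iou`, `compensate_camera_motion`, `associate`), the `alpha` weighting, and the thresholds are hypothetical and are not taken from the paper.

```python
import numpy as np
import cv2
from scipy.optimize import linear_sum_assignment


def pairwise_iou(boxes_a, boxes_b):
    """IoU matrix between two sets of boxes in (x1, y1, x2, y2) format."""
    a = boxes_a[:, None, :]            # (Na, 1, 4)
    b = boxes_b[None, :, :]            # (1, Nb, 4)
    lt = np.maximum(a[..., :2], b[..., :2])
    rb = np.minimum(a[..., 2:], b[..., 2:])
    wh = np.clip(rb - lt, 0.0, None)
    inter = wh[..., 0] * wh[..., 1]
    area_a = (a[..., 2] - a[..., 0]) * (a[..., 3] - a[..., 1])
    area_b = (b[..., 2] - b[..., 0]) * (b[..., 3] - b[..., 1])
    return inter / np.clip(area_a + area_b - inter, 1e-6, None)


def compensate_camera_motion(bg_pts_prev, bg_pts_curr, boxes_prev):
    """Illustrative stand-in for camera motion compensation: fit a homography
    to matched background point trajectories (RANSAC) and warp previous-frame
    boxes into the current frame so box motion reflects objects, not the camera."""
    H, _ = cv2.findHomography(bg_pts_prev.astype(np.float32),
                              bg_pts_curr.astype(np.float32),
                              cv2.RANSAC, 3.0)
    corners = boxes_prev.reshape(-1, 2).astype(np.float32)[:, None, :]
    warped = cv2.perspectiveTransform(corners, H)
    return warped.reshape(-1, 4)


def associate(track_boxes, det_boxes, point_affinity, alpha=0.5, thresh=0.8):
    """Blend a coarse-grained cue (box IoU) with a fine-grained cue
    (point_affinity, e.g., the fraction of a track's points landing inside
    each detection) and solve the assignment with the Hungarian algorithm."""
    iou = pairwise_iou(track_boxes, det_boxes)
    cost = 1.0 - (alpha * iou + (1.0 - alpha) * point_affinity)
    rows, cols = linear_sum_assignment(cost)
    return [(r, c) for r, c in zip(rows, cols) if cost[r, c] < thresh]
```

This sketch only conveys how fine-grained point cues could complement box-level cues; the paper's actual PTG, CMOC, and FCGA modules are defined in the full text.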

Original language: English
Pages (from-to): 20547-20560
Number of pages: 14
Journal: IEEE Sensors Journal
Volume: 25
Issue number: 11
DOI
Publication status: Published - 2025
