TY - JOUR
T1 - Robust Multi-Object Tracking Using Vision Sensor with Fine-Grained Cues in Occluded and Dynamic Scenes
AU - Hu, Yaoqi
AU - Sun, Jinqiu
AU - Jin, Hao
AU - Niu, Axi
AU - Yan, Qingsen
AU - Zhu, Yu
AU - Zhang, Yanning
N1 - Publisher Copyright:
© 2025 IEEE.
PY - 2025
Y1 - 2025
N2 - Multi-object tracking (MOT) using vision sensors remains a challenging problem, particularly in scenes with dynamic backgrounds and severe occlusions. Existing methods, relying on holistic appearance or spatial cues, fail to capture detailed information within object regions and the background, resulting in inaccurate and inconsistent tracking. To address these issues, we propose a novel Detail-Driven Multi-Object Tracking (DD-MOT) method that revisits and leverages fine-grained cues to discover the dynamics of both object regions and the background, facilitating the recovery and association of trajectories. Specifically, the proposed method consists of three key modules: (i) Points Trajectories Generator, (ii) Camera Motion and Occlusion Compensation, and (iii) Fine- and Coarse-grained Association. The Points Trajectories Generator is responsible for generating fine-grained cues by sampling an initial set of points, generating point trajectories, and refining the initial set of points. The Camera Motion and Occlusion Compensation module utilizes the background and object point trajectories to correct background motion and recover occluded object bounding boxes. Finally, the Fine- and Coarse-grained Association module leverages point trajectory cues to establish a more effective association strategy, combining both fine-grained and coarse-grained spatial cues from the object bounding boxes. We evaluate our DD-MOT method on several benchmark datasets, including MOT17 and MOT20, demonstrating that it consistently outperforms state-of-the-art (SOTA) methods in key metrics such as HOTA (64.2, 62.6), MOTA (80.8, 78.1), and IDF1 (78.8, 77.1).
AB - Multi-object tracking (MOT) using vision sensors remains a challenging problem, particularly in scenes with dynamic backgrounds and severe occlusions. Existing methods, relying on holistic appearance or spatial cues, fail to capture detailed information within object regions and the background, resulting in inaccurate and inconsistent tracking. To address these issues, we propose a novel Detail-Driven Multi-Object Tracking (DD-MOT) method that revisits and leverages fine-grained cues to discover the dynamics of both object regions and the background, facilitating the recovery and association of trajectories. Specifically, the proposed method consists of three key modules: (i) Points Trajectories Generator, (ii) Camera Motion and Occlusion Compensation, and (iii) Fine- and Coarse-grained Association. The Points Trajectories Generator is responsible for generating fine-grained cues by sampling an initial set of points, generating point trajectories, and refining the initial set of points. The Camera Motion and Occlusion Compensation module utilizes the background and object point trajectories to correct background motion and recover occluded object bounding boxes. Finally, the Fine- and Coarse-grained Association module leverages point trajectory cues to establish a more effective association strategy, combining both fine-grained and coarse-grained spatial cues from the object bounding boxes. We evaluate our DD-MOT method on several benchmark datasets, including MOT17 and MOT20, demonstrating that it consistently outperforms state-of-the-art (SOTA) methods in key metrics such as HOTA (64.2, 62.6), MOTA (80.8, 78.1), and IDF1 (78.8, 77.1).
KW - Multi-Object Tracking
KW - Online Inference
KW - Point Tracking
KW - Tracking by Detection
UR - http://www.scopus.com/inward/record.url?scp=105003223939&partnerID=8YFLogxK
U2 - 10.1109/JSEN.2025.3558588
DO - 10.1109/JSEN.2025.3558588
M3 - Article
AN - SCOPUS:105003223939
SN - 1530-437X
JO - IEEE Sensors Journal
JF - IEEE Sensors Journal
ER -