TY - JOUR
T1 - LightTrack-ReID
T2 - A lightweight and occlusion-robust framework for multi-object tracking
AU - Khan, Said Baz Jahfar
AU - Zhang, Peng
AU - Kamal, Mian Muhammad
AU - Saudagar, Abdul Khader Jilani
N1 - Publisher Copyright: © 2026 Khan et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
PY - 2026/3
Y1 - 2026/3
AB - This paper presents LightTrack-ReID, a lightweight, occlusion-robust framework for multi-object tracking (MOT), designed for real-time performance in resource-limited environments. The framework combines a Lightweight Appearance Encoder (LAE) based on MobileNetV3-Small, Transformer-Based Similarity Scoring (TBSS), Context Memory for Occlusion Handling (CMOH), and Adaptive Similarity Weighting (ASW) to improve tracklet association under heavy occlusion. Together, these components provide compact 32-dimensional re-identification (ReID) features, adaptive similarity metrics, and continuous tracking within an efficient single-stage detection-to-tracklet association system. The proposed similarity and association model requires approximately 0.6 GFLOPs per frame (approximately 0.5 GFLOPs for the LAE plus approximately 0.1 GFLOPs for TBSS). When integrated with the YOLOX-S detector, which remains the dominant computational cost, the full pipeline runs in real time at approximately 30 FPS on a GTX 1080 GPU. On the MOT17 and MOT20 benchmarks, it achieves Higher Order Tracking Accuracy (HOTA) scores of 66.92 and 66.6 and Identity F1 (IDF1) scores of 82.52 and 82.2, respectively, while significantly reducing identity switches. These results support its robustness and suitability for real-world applications.
UR - https://www.scopus.com/pages/publications/105034390500
U2 - 10.1371/journal.pone.0342246
DO - 10.1371/journal.pone.0342246
M3 - Article
C2 - 41886277
AN - SCOPUS:105034390500
SN - 1932-6203
VL - 21
JO - PLoS ONE
JF - PLoS ONE
IS - 3
M1 - e0342246
ER -