TY - JOUR
T1 - MOSAIC-Tracker
T2 - Mutual-enhanced Occlusion-aware Spatiotemporal Adaptive Identity Consistency network for aerial multi-object tracking
AU - Zou, Jian
AU - Zhang, Wei
AU - Li, Qiang
AU - Wang, Qi
N1 - Publisher Copyright: © 2025
PY - 2025/11
Y1 - 2025/11
N2 - Multi-Object Tracking (MOT) in aerial imagery remains challenging due to small object sizes, occlusions, and dynamic environments. Existing approaches predominantly rely on high-precision detection and Re-ID matching but neglect spatiotemporal cues and global temporal modeling of occlusion. Their static confidence weighting during association cannot adapt to real-time fluctuations in detector confidence, resulting in mismatches and ID switches. To alleviate these limitations, we propose MOSAIC-Tracker, a Mutual-enhanced Occlusion-aware Spatiotemporal Adaptive Identity Consistency network with three key components. First, a Spatiotemporal Occlusion Enhancement (STOE) module integrates multi-frame temporal dependencies to model global motion patterns and local dynamic features, mitigating identity switches during occlusions. Then, an Adaptive Multi-scale Feature Enhancement (AMFE) mechanism combines a Local Enhancement Mechanism with multi-scale feature aggregation to improve small-object discrimination. Finally, a Dynamic Confidence Matrix Adjustment (DCMA) strategy adaptively weights detection confidence in trajectory matching to minimize association errors. Together, the three modules reduce occlusion-induced identity switches. Extensive evaluations on the UAVDT and VisDrone2019 datasets demonstrate advanced performance. The code is released at: https://github.com/aJanm/MOSAIC-Tracker.
AB - Multi-Object Tracking (MOT) in aerial imagery remains challenging due to small object sizes, occlusions, and dynamic environments. Existing approaches predominantly rely on high-precision detection and Re-ID matching but neglect spatiotemporal cues and global temporal modeling of occlusion. Their static confidence weighting during association cannot adapt to real-time fluctuations in detector confidence, resulting in mismatches and ID switches. To alleviate these limitations, we propose MOSAIC-Tracker, a Mutual-enhanced Occlusion-aware Spatiotemporal Adaptive Identity Consistency network with three key components. First, a Spatiotemporal Occlusion Enhancement (STOE) module integrates multi-frame temporal dependencies to model global motion patterns and local dynamic features, mitigating identity switches during occlusions. Then, an Adaptive Multi-scale Feature Enhancement (AMFE) mechanism combines a Local Enhancement Mechanism with multi-scale feature aggregation to improve small-object discrimination. Finally, a Dynamic Confidence Matrix Adjustment (DCMA) strategy adaptively weights detection confidence in trajectory matching to minimize association errors. Together, the three modules reduce occlusion-induced identity switches. Extensive evaluations on the UAVDT and VisDrone2019 datasets demonstrate advanced performance. The code is released at: https://github.com/aJanm/MOSAIC-Tracker.
KW - Data association
KW - Multi-layer feature aggregation
KW - Multi-object tracking
KW - Spatiotemporal fusion
KW - Unmanned aerial vehicle video
UR - https://www.scopus.com/pages/publications/105014603073
U2 - 10.1016/j.isprsjprs.2025.08.013
DO - 10.1016/j.isprsjprs.2025.08.013
M3 - Article
AN - SCOPUS:105014603073
SN - 0924-2716
VL - 229
SP - 138
EP - 154
JO - ISPRS Journal of Photogrammetry and Remote Sensing
JF - ISPRS Journal of Photogrammetry and Remote Sensing
ER -