FocusTrack: Enhancing object detection and tracking for small and ambiguous objects

Research output: Contribution to journal › Article › peer-review

Abstract

Multi-object tracking (MOT) is an essential task in computer vision, but it still faces significant challenges in real-world applications, especially with small, ambiguous, and occluded objects in crowded environments. This study introduces FocusTrack, a robust one-stage multi-object tracking system that improves object detection and trajectory association under these challenging conditions. FocusTrack begins by fine-tuning YOLOv10, a modern high-performance detector, on several datasets (MOT17, MOT20, CityPersons, ETHZ, and CrowdHuman). We apply copy-paste augmentation to essential training datasets to improve the detection of small and distant objects, which significantly improves performance in complex visual environments. To ensure precise and consistent tracking, FocusTrack introduces several key modules: Modified Soft Buffered IoU (MS-BIoU), which adapts IoU matching to object size and detection confidence; Adaptive Similarity Enhancement (ASE), which refines similarity matrices through occlusion-aware, motion-scaled, and size-weighted adjustments; and Spatial-Temporal Confidence Enhancement (STCE), which dynamically improves detection confidence using spatial overlap, motion patterns, and crowd density. Furthermore, the Track Recovery and Association Refinement (TRAR) module recovers missing objects via adaptive re-association techniques, SV-Link ensures motion-aware, occlusion-resistant associations, and SOTS refines trajectories using Gaussian Process Regression tailored to object dimensions and occlusion intensity. Evaluated on the challenging MOT17 and MOT20 benchmarks, FocusTrack achieves HOTA scores of 66.91 and 66.5, MOTA scores of 82.32 and 77.9, and IDF1 scores of 82.96 and 82.1, respectively, exceeding other leading online trackers such as BoostTrack++ and BoT-SORT. The results confirm FocusTrack as an efficient, real-time MOT framework that is especially effective in complex and crowded environments with small or partially hidden objects.
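
To make the idea of size- and confidence-dependent IoU matching more concrete, the following is a minimal Python sketch of a buffered IoU, in which both boxes are enlarged before their overlap is measured. It is an illustration only and not the MS-BIoU formulation from the article: the function names, the `adaptive_buffer` rule, and the constants `base`, `boost`, and `small_area` are all hypothetical choices made for this example.

```python
def iou(a, b):
    """Plain IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def expand(box, ratio):
    """Grow a box on every side by `ratio` times its width/height."""
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    return (x1 - ratio * w, y1 - ratio * h, x2 + ratio * w, y2 + ratio * h)


def adaptive_buffer(box, conf, base=0.1, boost=0.2, small_area=32 * 32):
    """Hypothetical rule (an assumption, not from the article):
    smaller boxes get a larger buffer, scaled by detection confidence."""
    area = (box[2] - box[0]) * (box[3] - box[1])
    size_term = boost * max(0.0, 1.0 - area / small_area)
    return (base + size_term) * conf


def buffered_iou(track_box, det_box, det_conf):
    """IoU computed on boxes expanded by the adaptive buffer ratio."""
    ratio = adaptive_buffer(det_box, det_conf)
    return iou(expand(track_box, ratio), expand(det_box, ratio))


if __name__ == "__main__":
    track = (0.0, 0.0, 10.0, 10.0)
    det = (11.0, 0.0, 21.0, 10.0)  # small box, just beside the track
    print("plain IoU:   ", iou(track, det))
    print("buffered IoU:", buffered_iou(track, det, det_conf=0.9))
```

In this sketch, two small boxes that do not touch have a plain IoU of zero, yet they can still receive a positive buffered score. That is the general reason buffering can help associate small or fast-moving objects across frames.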

Original language: English
Article number: 104549
Journal: Journal of Visual Communication and Image Representation
Volume: 111
State: Published - Sep 2025

Keywords

  • Association
  • MOT
  • Occlusion
  • Tracking
  • Trajectory
  • YOLOv10
