Spatio-Temporal Synergy with ViT: Enhancing Collaborative Perception and Object Detection for Heterogeneous Agents

Yuan Gao, Sicong Liu, Xiangrui Xu, Zhiyang Ding, Bin Guo, Zhiwen Yu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

To address the limitations of traditional cooperative sensing methods for heterogeneous agents, namely feedback latency and spatiotemporal dependencies, this paper proposes a system that enhances cooperative perception and object detection across heterogeneous agents. The system is built on the Vision Transformer (ViT) model and leverages its global context awareness and multimodal data fusion capabilities. It also incorporates two proposed modules, an adaptive delay position sensing module and a spatiotemporal dependency dynamic modeling module, which address data transmission latency and the complex spatiotemporal dependencies between agents. Together, these significantly enhance the accuracy and timeliness of heterogeneous multi-agent collaborative sensing systems.

Original language: English
Title of host publication: RMELS 2024 - Proceedings of the 1st ACM International Workshop on Resource-efficient Mobile and Embedded LLM System in AIoT, Part of
Subtitle of host publication: ACM Sensys 2024
Publisher: Association for Computing Machinery, Inc
Pages: 3-5
Number of pages: 3
ISBN (Electronic): 9798400712951
State: Published - 4 Nov 2024
Event: 1st ACM International Workshop on Resource-efficient Mobile and Embedded LLM System in AIoT, RMELS 2024 - Hangzhou, China
Duration: 4 Nov 2024 → …

Publication series

Name: RMELS 2024 - Proceedings of the 1st ACM International Workshop on Resource-efficient Mobile and Embedded LLM System in AIoT, Part of: ACM Sensys 2024

Conference

Conference: 1st ACM International Workshop on Resource-efficient Mobile and Embedded LLM System in AIoT, RMELS 2024
Country/Territory: China
City: Hangzhou
Period: 4/11/24 → …

Keywords

  • Cooperative perception
  • Deep learning
  • Heterogeneous agents
  • Vision transformer
