Audio denoising and audio-visual localization for unmanned aerial vehicles

  • Zhaojian Li
  • Jianbo Li
  • Dengdian Huang
  • Jianbin Jiao
  • Jingzhu Li
  • Bin Zhao
  • Qingdao University
  • Northwestern Polytechnical University Xian
  • University of Chinese Academy of Sciences
  • National Key Laboratory on Near-Surface Detection

Research output: Contribution to journal, Article, peer-reviewed

Abstract

Unmanned aerial vehicles (UAVs) are drawing increasing attention for their broad potential in disaster response, public safety, and intelligent monitoring. However, noise interference from the UAV itself poses significant challenges to its audio-visual perception capabilities. To address these challenges, we introduce a novel task that integrates UAV audio-visual denoising and localization. To facilitate this research, we collect and construct a UAV audio-visual dataset in real-world environments. The dataset comprises audio and video captured by the UAV, along with synchronized ground-based audio, providing a high-quality audio-visual benchmark for this task. We propose a visually guided audio denoising (VGAD) model, which generates a noise suppression mask through visual guidance to effectively attenuate UAV noise. To alleviate the perceptual similarity bias caused by single-anchor modeling, we propose an audio-visual anchor interaction (AVAI) localization model composed of an audio anchor localization (AAL) module and a visual anchor localization (VAL) module. The two modules leverage unsupervised dual contrastive learning to comprehensively capture perceptual similarities between the audio and visual modalities, thereby enhancing cross-modal semantic consistency and improving audio-visual localization performance. Extensive experiments on UAV audio-visual denoising and localization demonstrate that the proposed models significantly suppress UAV noise and improve localization performance. This work is the first to extend audio-visual localization to UAV scenarios, facilitating the advancement of UAV multimodal perception.

Original language: English
Article number: 113931
Journal: Pattern Recognition
Volume: 179
DOI
Publication status: Published - Nov 2026
