Abstract
Absolute visual localization methods with visible-light sensors have been widely used for Unmanned Aerial Vehicles (UAVs) in GPS-denied environments. However, visible-light real-time images are easily affected by illumination changes in practice, which can render the localization system inoperable. In this paper, an image registration network (NIVnet) is proposed to register near-infrared real-time images against visible-light reference images for the multimodal visual localization of UAVs in GPS-denied environments. In NIVnet, a new feature extraction strategy is first developed to reduce the modality differences between the input images: the inputs are embedded into a common feature space with disentangled representations. Then, a new bidirectional matching layer is proposed that matches a pair of input images twice in one registration pass; this layer can effectively handle large geometric deformations between the images to be registered. Finally, an intensity loss is introduced to further improve performance by measuring similarity between monomodal rather than multimodal images. NIVnet predicts the affine transformation parameters end-to-end, thereby accelerating UAV localization. Extensive experiments on three synthetic datasets demonstrate the validity of NIVnet, and the results show that it effectively improves localization accuracy.
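The abstract describes a bidirectional matching layer that matches each image pair twice (forward and backward) and a network that outputs affine transformation parameters. A minimal sketch of how such paired affine predictions could be sanity-checked for cycle consistency is shown below; the paper's actual layer operates on learned deep features, and all function names and the 2x3 parameter convention here are our assumptions, not NIVnet's API.

```python
import numpy as np

def to_3x3(theta):
    """Lift a 2x3 affine parameter matrix to homogeneous 3x3 form."""
    return np.vstack([theta, [0.0, 0.0, 1.0]])

def round_trip(theta_ab, theta_ba):
    """Compose the forward (A->B) and backward (B->A) affines.

    For a consistent bidirectional prediction this product should be
    close to the 3x3 identity transform.
    """
    return to_3x3(theta_ba) @ to_3x3(theta_ab)

def cycle_consistency_error(theta_ab, theta_ba):
    """Frobenius distance of the round-trip transform from the identity."""
    return np.linalg.norm(round_trip(theta_ab, theta_ba) - np.eye(3))

# Example: a translation by (2, 3) and its exact inverse round-trip
# to the identity, so the consistency error is (numerically) zero.
theta_ab = np.array([[1.0, 0.0, 2.0],
                     [0.0, 1.0, 3.0]])
theta_ba = np.array([[1.0, 0.0, -2.0],
                     [0.0, 1.0, -3.0]])
err = cycle_consistency_error(theta_ab, theta_ba)  # ~0 for consistent pairs
```

In a registration pipeline, a large cycle-consistency error flags image pairs where the two matching directions disagree, e.g. under the large geometric deformations the abstract mentions.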
Original language | English |
---|---|
Pages (from-to) | 16402-16415 |
Number of pages | 14 |
Journal | IEEE Transactions on Vehicular Technology |
Volume | 73 |
Issue number | 11 |
DOIs | |
State | Published - 2024 |
Keywords
- UAV localization
- cross-modality
- deep learning
- image registration
- visual localization