Abstract
Absolute visual localization methods with visible-light sensors have been widely used for Unmanned Aerial Vehicles (UAVs) in GPS-denied environments. However, visible-light real-time images are easily affected by illumination changes in practice, which can render the localization system inoperable. In this paper, an image registration network (NIVnet) is proposed to register near-infrared real-time images against visible-light reference images for the multimodal visual localization of UAVs in GPS-denied environments. In NIVnet, a new feature extraction strategy is first developed to reduce the modality differences between the input images: the inputs are embedded into a common feature space with disentangled representations. Then, a new bidirectional matching layer is proposed that matches a pair of input images twice in one registration pass; this layer can effectively handle large geometric deformations between the images to be registered. Finally, an intensity loss is introduced to further improve performance by measuring similarity between monomodal rather than multimodal images. NIVnet predicts the affine transformation parameters end-to-end, thereby accelerating UAV localization. Extensive experiments on three synthetic datasets demonstrate the validity of NIVnet, and the results show that it effectively improves localization accuracy.
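The abstract describes a bidirectional matching layer that matches each image pair twice (forward and backward) and a network that outputs affine transformation parameters. A minimal sketch of how such paired affine predictions could be sanity-checked for cycle consistency is shown below; the paper's actual layer operates on learned deep features, and all function names and the 2x3 parameter convention here are our assumptions, not NIVnet's API.

```python
import numpy as np

def to_3x3(theta):
    """Lift a 2x3 affine parameter matrix to homogeneous 3x3 form."""
    return np.vstack([theta, [0.0, 0.0, 1.0]])

def round_trip(theta_ab, theta_ba):
    """Compose the forward (A->B) and backward (B->A) affines.

    For a consistent bidirectional prediction this product should be
    close to the 3x3 identity transform.
    """
    return to_3x3(theta_ba) @ to_3x3(theta_ab)

def cycle_consistency_error(theta_ab, theta_ba):
    """Frobenius distance of the round-trip transform from the identity."""
    return np.linalg.norm(round_trip(theta_ab, theta_ba) - np.eye(3))

# Example: a translation by (2, 3) and its exact inverse round-trip
# to the identity, so the consistency error is (numerically) zero.
theta_ab = np.array([[1.0, 0.0, 2.0],
                     [0.0, 1.0, 3.0]])
theta_ba = np.array([[1.0, 0.0, -2.0],
                     [0.0, 1.0, -3.0]])
err = cycle_consistency_error(theta_ab, theta_ba)  # ~0 for consistent pairs
```

In a registration pipeline, a large cycle-consistency error flags image pairs where the two matching directions disagree, e.g. under the large geometric deformations the abstract mentions.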
Original language | English |
---|---|
Pages (from-to) | 16402-16415 |
Number of pages | 14 |
Journal | IEEE Transactions on Vehicular Technology |
Volume | 73 |
Issue number | 11 |
DOIs | |
State | Published - 2024 |
Keywords
- UAV localization
- cross-modality
- deep learning
- image registration
- visual localization