TY - GEN
T1 - LoFLAT: Local Feature Matching using Focused Linear Attention Transformer
T2 - 2024 Asia Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2024
AU - Cao, Naijian
AU - He, Renjie
AU - Dai, Yuchao
AU - He, Mingyi
N1 - Publisher Copyright:
© 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - Local feature matching is an essential technique in image matching and plays a critical role in a wide range of vision-based applications. However, existing Transformer-based detector-free local feature matching methods face the quadratic computational complexity of vanilla attention, which becomes prohibitive at high resolutions. Methods that reduce this cost with linear attention, in turn, struggle to capture detailed local interactions, which limits the accuracy and robustness of precise local correspondences. To enhance the representational power of the attention mechanism while preserving low computational complexity, this paper proposes LoFLAT, a novel Local Feature matching method using a Focused Linear Attention Transformer. LoFLAT consists of three main modules: the Feature Extraction Module, the Feature Transformer Module, and the Matching Module. Specifically, the Feature Extraction Module first uses ResNet and a Feature Pyramid Network to extract hierarchical features. The Feature Transformer Module then employs focused linear attention, which refines the attention distribution with a focused mapping function and enhances feature diversity with a depthwise convolution. Finally, the Matching Module predicts accurate and robust matches through a coarse-to-fine strategy. Extensive experimental evaluations demonstrate that the proposed LoFLAT outperforms LoFTR in terms of both efficiency and accuracy.
AB - Local feature matching is an essential technique in image matching and plays a critical role in a wide range of vision-based applications. However, existing Transformer-based detector-free local feature matching methods face the quadratic computational complexity of vanilla attention, which becomes prohibitive at high resolutions. Methods that reduce this cost with linear attention, in turn, struggle to capture detailed local interactions, which limits the accuracy and robustness of precise local correspondences. To enhance the representational power of the attention mechanism while preserving low computational complexity, this paper proposes LoFLAT, a novel Local Feature matching method using a Focused Linear Attention Transformer. LoFLAT consists of three main modules: the Feature Extraction Module, the Feature Transformer Module, and the Matching Module. Specifically, the Feature Extraction Module first uses ResNet and a Feature Pyramid Network to extract hierarchical features. The Feature Transformer Module then employs focused linear attention, which refines the attention distribution with a focused mapping function and enhances feature diversity with a depthwise convolution. Finally, the Matching Module predicts accurate and robust matches through a coarse-to-fine strategy. Extensive experimental evaluations demonstrate that the proposed LoFLAT outperforms LoFTR in terms of both efficiency and accuracy.
UR - http://www.scopus.com/inward/record.url?scp=85218182155&partnerID=8YFLogxK
U2 - 10.1109/APSIPAASC63619.2025.10848757
DO - 10.1109/APSIPAASC63619.2025.10848757
M3 - Conference contribution
AN - SCOPUS:85218182155
T3 - APSIPA ASC 2024 - Asia Pacific Signal and Information Processing Association Annual Summit and Conference 2024
BT - APSIPA ASC 2024 - Asia Pacific Signal and Information Processing Association Annual Summit and Conference 2024
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 3 December 2024 through 6 December 2024
ER -
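
For readers unfamiliar with the focused linear attention named in the abstract, below is a minimal PyTorch sketch following the general formulation of Han et al. (ICCV 2023), which matches the abstract's description (a focused mapping function plus a depthwise convolution for feature diversity). The focusing power p, projection layout, and module names are illustrative assumptions, not the authors' released implementation.

# Sketch of focused linear attention: attention cost is linear in sequence
# length because (K^T V) is computed before multiplying by Q. The focusing
# power p = 3 is an assumed value, not taken from the LoFLAT paper.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FocusedLinearAttention(nn.Module):
    def __init__(self, dim: int, p: float = 3.0):
        super().__init__()
        self.p = p  # focusing power (assumption)
        self.q_proj = nn.Linear(dim, dim)
        self.k_proj = nn.Linear(dim, dim)
        self.v_proj = nn.Linear(dim, dim)
        # depthwise convolution to restore feature diversity lost by
        # the low-rank linear-attention term
        self.dwc = nn.Conv2d(dim, dim, kernel_size=3, padding=1, groups=dim)

    def focus(self, x: torch.Tensor) -> torch.Tensor:
        # focused mapping: sharpen the feature direction (raise to power p)
        # while preserving the original channel-wise norm
        x = F.relu(x) + 1e-6
        norm = x.norm(dim=-1, keepdim=True)
        x_p = x ** self.p
        return x_p / x_p.norm(dim=-1, keepdim=True) * norm

    def forward(self, x: torch.Tensor, h: int, w: int) -> torch.Tensor:
        # x: (B, N, C) flattened feature map, N = h * w
        q = self.focus(self.q_proj(x))
        k = self.focus(self.k_proj(x))
        v = self.v_proj(x)
        # associate (K^T V) first: O(N * C^2) instead of O(N^2 * C)
        kv = torch.einsum('bnc,bnd->bcd', k, v)          # (B, C, C)
        z = 1.0 / (torch.einsum('bnc,bc->bn', q, k.sum(dim=1)) + 1e-6)
        out = torch.einsum('bnc,bcd->bnd', q, kv) * z.unsqueeze(-1)
        # depthwise conv on V viewed as a 2D map adds back local detail
        v_map = v.transpose(1, 2).reshape(x.size(0), -1, h, w)
        out = out + self.dwc(v_map).flatten(2).transpose(1, 2)
        return out

# Usage on an 8x downsampled coarse feature map, as in LoFTR-style pipelines:
#   attn = FocusedLinearAttention(dim=256)
#   feats = torch.randn(2, 60 * 80, 256)   # (B, N, C)
#   out = attn(feats, h=60, w=80)          # (2, 4800, 256)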