GGBDCA: Scene graph generation based on Global Gradient Balanced Distribution and Compound Attention

Jiajia Liu, Linan Zhao, Guoqing Zhang, Linna Zhang, Yigang Cen

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

Abstract

Scene Graph Generation (SGG) is a key task in computer vision, aimed at automatically extracting objects and their relationships from images. Despite significant advances in SGG, the challenge of long-tail distribution continues to impede accurate prediction of rare relationships. To address this, we propose a novel SGG method based on Global Gradient Balanced Distribution and Compound Attention (GGBDCA). First, we introduce a Transformer-based framework that leverages a compound attention mechanism to extract detailed features. To tackle the long-tail identification problem, we propose a Global Gradient Balanced Distribution (GGBD) algorithm, which converts the long-tail issue into a multi-objective optimization problem. Through Gradient Balance Grouping (GBG), similar categories are clustered to enhance attention on rare classes. Then, the Multi-Gradient Descent Algorithm (MGDA) is employed to solve the multi-objective optimization, while the Adaptive Calibration Function (ACF) dynamically adjusts classification scores to improve the model's generalization. These three core modules - GBG, MGDA, and ACF - work together to balance learning between head and tail classes, focusing more on rare categories and boosting overall performance.
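
The abstract does not give the paper's formulation of the multi-objective step, so the following is only a minimal sketch of the generic idea behind multi-gradient descent, assuming just two objectives (for example, one loss for a head-class group and one for a tail-class group). It uses the well-known closed-form min-norm weighting for two gradients. The names `min_norm_weight`, `common_direction`, `g_head`, and `g_tail` are hypothetical and do not come from the paper.

```python
def min_norm_weight(g_head, g_tail):
    """Return alpha in [0, 1] minimizing ||alpha*g_head + (1 - alpha)*g_tail||.

    Closed form for two objectives; both arguments are flat lists of floats.
    This is an illustrative assumption, not the paper's implementation.
    """
    diff_sq = sum((a - b) ** 2 for a, b in zip(g_head, g_tail))
    if diff_sq == 0.0:
        # Identical gradients: every weighting yields the same direction.
        return 0.5
    # Unconstrained minimizer: (g_tail - g_head) . g_tail / ||g_head - g_tail||^2
    alpha = sum((b - a) * b for a, b in zip(g_head, g_tail)) / diff_sq
    # Clip so the result stays a convex combination of the two gradients.
    return min(1.0, max(0.0, alpha))


def common_direction(g_head, g_tail):
    """Combine the two gradients with the min-norm weight."""
    alpha = min_norm_weight(g_head, g_tail)
    return [alpha * a + (1.0 - alpha) * b for a, b in zip(g_head, g_tail)]


if __name__ == "__main__":
    # Orthogonal gradients of equal length are weighted equally.
    print(common_direction([1.0, 0.0], [0.0, 1.0]))
    # Parallel gradients: the shorter one is the min-norm point.
    print(common_direction([1.0, 0.0], [3.0, 0.0]))
```

A parameter update along the negative of this combined direction does not increase either objective to first order, which is one way such a scheme could keep frequent classes from dominating training.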

Original language: English
Title of host publication: ICSP 2024 - 2024 IEEE 17th International Conference on Signal Processing, Proceedings
Editors: Yuan Baozong, Ruan Qiuqi, Wei Shikui, An Gaoyun
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 526-531
Number of pages: 6
ISBN (Electronic): 9798350387384
Publication status: Published - 2024
Externally published
Event: 17th IEEE International Conference on Signal Processing, ICSP 2024 - Suzhou, China
Duration: 28 Oct 2024 to 31 Oct 2024

Publication series

Name: International Conference on Signal Processing Proceedings, ICSP
ISSN (Print): 2164-5221
ISSN (Electronic): 2164-523X

Conference

Conference: 17th IEEE International Conference on Signal Processing, ICSP 2024
Country/Territory: China
City: Suzhou
Period: 28/10/24 to 31/10/24
