TY - GEN
T1 - GGBDCA: Scene graph generation based on Global Gradient Balanced Distribution and Compound Attention
AU - Liu, Jiajia
AU - Zhao, Linan
AU - Zhang, Guoqing
AU - Zhang, Linna
AU - Cen, Yigang
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - Scene Graph Generation (SGG) is a key task in computer vision, aimed at automatically extracting objects and their relationships from images. Despite significant advances in SGG, the challenge of long-tail distribution continues to impede accurate prediction of rare relationships. To address this, we propose a novel SGG method based on Global Gradient Balanced Distribution and Compound Attention (GGBDCA). First, we introduce a Transformer-based framework that leverages a compound attention mechanism to extract detailed features. To tackle the long-tail identification problem, we propose a Global Gradient Balanced Distribution (GGBD) algorithm, which converts the long-tail issue into a multi-objective optimization problem. Through Gradient Balance Grouping (GBG), similar categories are clustered to enhance attention on rare classes. Then, the Multi-Gradient Descent Algorithm (MGDA) is employed to solve the multi-objective optimization, while the Adaptive Calibration Function (ACF) dynamically adjusts classification scores to improve the model's generalization. These three core modules - GBG, MGDA, and ACF - work together to balance learning between head and tail classes, focusing more on rare categories and boosting overall performance.
KW - Scene graph generation
KW - global gradient balanced distribution
KW - long-tail distribution
UR - http://www.scopus.com/inward/record.url?scp=85218356572&partnerID=8YFLogxK
U2 - 10.1109/ICSP62129.2024.10846100
DO - 10.1109/ICSP62129.2024.10846100
M3 - Conference contribution
AN - SCOPUS:85218356572
T3 - International Conference on Signal Processing Proceedings, ICSP
SP - 526
EP - 531
BT - ICSP 2024 - 2024 IEEE 17th International Conference on Signal Processing, Proceedings
A2 - Baozong, Yuan
A2 - Qiuqi, Ruan
A2 - Shikui, Wei
A2 - Gaoyun, An
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 17th IEEE International Conference on Signal Processing, ICSP 2024
Y2 - 28 October 2024 through 31 October 2024
ER -