TY - JOUR
T1 - Learning Intrinsic Hierarchy for Generalized Category Discovery
AU - Duan, Yu
AU - He, Junzhi
AU - Hu, Zhanxuan
AU - Ji, Mengda
AU - Wang, Rong
AU - Gao, Quanxue
N1 - Publisher Copyright:
© 2026, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
PY - 2026
Y1 - 2026
N2 - Generalized Category Discovery (GCD) aims to classify unlabeled data by leveraging knowledge from labeled categories. While existing methods have achieved remarkable progress, they often treat images as flat feature sets, neglecting the intrinsic hierarchy in which key objects dominate meaning and backgrounds serve as context. For instance, in images of a dog either standing on grass or lying on a bed, the dog remains the central semantic element, whereas the background varies. Motivated by this, we propose LEArning Intrinsic Hierarchy (LEAH), a lightweight module designed to model hierarchical structure within images. LEAH consists of two components: a pruner that filters task-irrelevant tokens to extract key objects, and a constructor that embeds key objects and full images into hyperbolic space using adaptive entailment cones to capture compositional semantics. LEAH can be easily integrated into existing GCD frameworks with minimal modification. When applied to SimGCD, it achieves an accuracy improvement of up to 13.2% on fine-grained benchmarks, demonstrating its effectiveness in discovering subtle inter-class differences through hierarchical modeling.
AB - Generalized Category Discovery (GCD) aims to classify unlabeled data by leveraging knowledge from labeled categories. While existing methods have achieved remarkable progress, they often treat images as flat feature sets, neglecting the intrinsic hierarchy in which key objects dominate meaning and backgrounds serve as context. For instance, in images of a dog either standing on grass or lying on a bed, the dog remains the central semantic element, whereas the background varies. Motivated by this, we propose LEArning Intrinsic Hierarchy (LEAH), a lightweight module designed to model hierarchical structure within images. LEAH consists of two components: a pruner that filters task-irrelevant tokens to extract key objects, and a constructor that embeds key objects and full images into hyperbolic space using adaptive entailment cones to capture compositional semantics. LEAH can be easily integrated into existing GCD frameworks with minimal modification. When applied to SimGCD, it achieves an accuracy improvement of up to 13.2% on fine-grained benchmarks, demonstrating its effectiveness in discovering subtle inter-class differences through hierarchical modeling.
UR - https://www.scopus.com/pages/publications/105034737021
U2 - 10.1609/aaai.v40i25.39236
DO - 10.1609/aaai.v40i25.39236
M3 - Conference article
AN - SCOPUS:105034737021
SN - 2159-5399
VL - 40
SP - 20950
EP - 20958
JO - Proceedings of the AAAI Conference on Artificial Intelligence
JF - Proceedings of the AAAI Conference on Artificial Intelligence
IS - 25
T2 - 40th AAAI Conference on Artificial Intelligence, AAAI 2026
Y2 - 20 January 2026 through 27 January 2026
ER -