TY - JOUR
T1 - Generalized Weakly Supervised Object Localization
AU - Zhang, Dingwen
AU - Guo, Guangyu
AU - Zeng, Wenyuan
AU - Li, Lei
AU - Han, Junwei
N1 - Publisher Copyright:
© 2012 IEEE.
PY - 2024/4/1
Y1 - 2024/4/1
N2 - With the goal of learning to localize specific object semantics using the low-cost image-level annotation, weakly supervised object localization (WSOL) has been receiving increasing attention in recent years. Although existing literatures have studied a number of major issues in this field, one important yet challenging scenario, where the test object semantics may appear in the training phase (seen categories) or never been observed before (unseen categories), is still beyond the exploration of the existing works. We define this scenario as the generalized WSOL (GWSOL) and make a pioneering effort to study it in this article. By leveraging attribute vectors to associate seen and unseen categories, we involve threefold modeling components, i.e., the class-sensitive modeling, semantic-agnostic modeling, and content-aware modeling, into a unified end-to-end learning framework. Such design enables our model to recognize and localize unconstrained object semantics, learn compact and discriminative features that could represent the potential unseen categories, and customize content-aware attribute weights to avoid localizing on misleading attribute elements. To advance this research direction, we contribute the bounding-box manual annotations to the widely used AwA2 dataset and benchmark the GWSOL methods. Comprehensive experiments demonstrate the effectiveness of our proposed learning framework and each of the considered modeling components.
AB - With the goal of learning to localize specific object semantics using the low-cost image-level annotation, weakly supervised object localization (WSOL) has been receiving increasing attention in recent years. Although existing literatures have studied a number of major issues in this field, one important yet challenging scenario, where the test object semantics may appear in the training phase (seen categories) or never been observed before (unseen categories), is still beyond the exploration of the existing works. We define this scenario as the generalized WSOL (GWSOL) and make a pioneering effort to study it in this article. By leveraging attribute vectors to associate seen and unseen categories, we involve threefold modeling components, i.e., the class-sensitive modeling, semantic-agnostic modeling, and content-aware modeling, into a unified end-to-end learning framework. Such design enables our model to recognize and localize unconstrained object semantics, learn compact and discriminative features that could represent the potential unseen categories, and customize content-aware attribute weights to avoid localizing on misleading attribute elements. To advance this research direction, we contribute the bounding-box manual annotations to the widely used AwA2 dataset and benchmark the GWSOL methods. Comprehensive experiments demonstrate the effectiveness of our proposed learning framework and each of the considered modeling components.
KW - Object localization
KW - unseen object category
KW - weakly supervised learning
UR - http://www.scopus.com/inward/record.url?scp=85139425830&partnerID=8YFLogxK
U2 - 10.1109/TNNLS.2022.3204337
DO - 10.1109/TNNLS.2022.3204337
M3 - 文章
C2 - 36129872
AN - SCOPUS:85139425830
SN - 2162-237X
VL - 35
SP - 5395
EP - 5406
JO - IEEE Transactions on Neural Networks and Learning Systems
JF - IEEE Transactions on Neural Networks and Learning Systems
IS - 4
ER -