Generalized Weakly Supervised Object Localization

Dingwen Zhang; Guangyu Guo; Wenyuan Zeng; Lei Li; Junwei Han

doi:10.1109/TNNLS.2022.3204337

School of Automation

Research output: Contribution to journal › Article › Peer-reviewed

19 citations (Scopus)

Abstract

With the goal of learning to localize specific object semantics using low-cost image-level annotations, weakly supervised object localization (WSOL) has received increasing attention in recent years. Although the existing literature has studied a number of major issues in this field, one important yet challenging scenario remains unexplored: the test object semantics may have appeared in the training phase (seen categories) or may never have been observed before (unseen categories). We define this scenario as generalized WSOL (GWSOL) and make a pioneering effort to study it in this article. By leveraging attribute vectors to associate seen and unseen categories, we integrate three modeling components, i.e., class-sensitive modeling, semantic-agnostic modeling, and content-aware modeling, into a unified end-to-end learning framework. This design enables our model to recognize and localize unconstrained object semantics, learn compact and discriminative features that can represent potential unseen categories, and customize content-aware attribute weights to avoid localizing on misleading attribute elements. To advance this research direction, we contribute manual bounding-box annotations to the widely used AwA2 dataset and benchmark GWSOL methods. Comprehensive experiments demonstrate the effectiveness of the proposed learning framework and of each of its modeling components.

Original language	English
Pages (from-to)	5395-5406
Number of pages	12
Journal	IEEE Transactions on Neural Networks and Learning Systems
Volume	35
Issue	4
DOI	https://doi.org/10.1109/TNNLS.2022.3204337
Publication status	Published - 1 Apr 2024

Cite this

@article{d2fa31c9d9df4c318967b3ab75df28a0,

title = "Generalized Weakly Supervised Object Localization",

keywords = "Object localization, unseen object category, weakly supervised learning",

author = "Dingwen Zhang and Guangyu Guo and Wenyuan Zeng and Lei Li and Junwei Han",

note = "Publisher Copyright: {\textcopyright} 2012 IEEE.",

year = "2024",

month = apr,

day = "1",

doi = "10.1109/TNNLS.2022.3204337",

language = "English",

volume = "35",

pages = "5395--5406",

journal = "IEEE Transactions on Neural Networks and Learning Systems",

issn = "2162-237X",

publisher = "IEEE Computational Intelligence Society",

number = "4",

}

TY - JOUR

T1 - Generalized Weakly Supervised Object Localization

AU - Zhang, Dingwen

AU - Guo, Guangyu

AU - Zeng, Wenyuan

AU - Li, Lei

AU - Han, Junwei

PY - 2024/4/1

Y1 - 2024/4/1

KW - Object localization

KW - unseen object category

KW - weakly supervised learning

UR - http://www.scopus.com/inward/record.url?scp=85139425830&partnerID=8YFLogxK

U2 - 10.1109/TNNLS.2022.3204337

DO - 10.1109/TNNLS.2022.3204337

M3 - Article

C2 - 36129872

AN - SCOPUS:85139425830

SN - 2162-237X

VL - 35

SP - 5395

EP - 5406

JO - IEEE Transactions on Neural Networks and Learning Systems

JF - IEEE Transactions on Neural Networks and Learning Systems

IS - 4

ER -
