Advanced Discriminative Co-Saliency and Background Mining Transformer for Co-Salient Object Detection

Long Li, Huichao Xie, Nian Liu, Dingwen Zhang, Rao Muhammad Anwer, Hisham Cholakkal, Junwei Han

Research output: Contribution to journal › Article › peer-review

Abstract

Most existing CoSOD models focus solely on extracting co-saliency cues while neglecting the explicit exploration of background regions, potentially leading to difficulties in handling interference from complex background areas. To address this, this paper proposes a Discriminative co-saliency and background Mining Transformer framework (DMT) that explicitly mines both co-saliency and background information and effectively models their discriminability. DMT first learns two types of tokens by disjointly extracting co-saliency and background information from segmentation features, then uses these well-learned tokens to guide discriminability modeling within the segmentation features. In the first phase, we propose economic multi-grained correlation modules for efficient detection information extraction, including Region-to-Region (R2R), Contrast-induced Pixel-to-Token (CtP2T), and Co-saliency Token-to-Token (CoT2T) correlation modules. In the subsequent phase, we introduce Token-Guided Feature Refinement (TGFR) modules to enhance discriminability within the segmentation features. To further improve the discriminative modeling and practicality of DMT, we first upgrade the original TGFR's intra-image modeling approach to an intra-group one, thus proposing Group TGFR (G-TGFR), which is better suited to the co-saliency task. Subsequently, we design a Noise Propagation Suppression (NPS) mechanism to apply our model to a more practical open-world scenario, ultimately presenting our extended version, i.e., DMT+O. Extensive experimental results on both conventional CoSOD and open-world CoSOD benchmark datasets demonstrate the effectiveness of our proposed model.
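The token-guided refinement idea in the abstract can be illustrated with a toy, dependency-free sketch. This is not the paper's TGFR implementation; the function names, two-token setup, and the simple softmax weighting are all illustrative assumptions. Each pixel feature is reweighted by its affinity to a learned co-saliency token relative to a background token, amplifying co-salient regions and suppressing background-like ones.

```python
import math

def dot(a, b):
    # Inner product of two equal-length feature vectors.
    return sum(x * y for x, y in zip(a, b))

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    e = [math.exp(x - m) for x in xs]
    s = sum(e)
    return [v / s for v in e]

def token_guided_refine(features, co_token, bg_token):
    """Toy analogue of token-guided feature refinement (hypothetical).

    For each pixel feature, compare its affinity to the co-saliency
    token against its affinity to the background token, then scale the
    feature by the co-saliency weight so background-like pixels are
    suppressed.
    """
    refined = []
    for f in features:
        w_co, w_bg = softmax([dot(f, co_token), dot(f, bg_token)])
        refined.append([w_co * x for x in f])
    return refined
```

A pixel aligned with the co-saliency token keeps most of its magnitude, while one aligned with the background token is damped, which is the discriminative effect the abstract attributes to TGFR.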
