NTRENet++: Unleashing the Power of Non-Target Knowledge for Few-Shot Semantic Segmentation

Yuanwei Liu, Nian Liu, Yi Wu, Hisham Cholakkal, Rao Muhammad Anwer, Xiwen Yao, Junwei Han

Research output: Contribution to journal › Article › peer-review

Abstract

Few-shot semantic segmentation (FSS) aims to segment target objects given only a few annotated samples. However, current FSS studies concentrate primarily on extracting information related to the target object, resulting in inadequate identification of ambiguous regions in non-target areas, namely the background (BG) and distracting objects (DOs). To alleviate this problem, we propose a novel framework, NTRENet++, that explicitly mines and eliminates BG and DO regions in the query. First, we introduce a BG Mining Module (BGMM) to extract BG information and generate a comprehensive BG prototype from all images; a BG mining loss, which uses only the known target-object segmentation ground truth, is formulated to supervise its learning. Based on this BG prototype, a BG Eliminating Module then filters BG information out of the query to obtain a BG-free result. The target information is subsequently used in a target matching module to produce the initial segmentation result. Finally, a DO Eliminating Module further mines and eliminates DO regions, yielding a BG- and DO-free target object segmentation result. Moreover, we present a prototypical-pixel contrastive learning algorithm to strengthen the model's ability to distinguish the target object from DOs. Extensive experiments on the PASCAL-5^i and COCO-20^i datasets demonstrate the effectiveness of our approach despite its simplicity. We also extend the method to the few-shot video object segmentation task and improve performance over a baseline model, demonstrating its generalization ability.
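
The abstract gives no implementation details, but the BG-prototype extraction and BG-elimination steps it outlines can be illustrated with a pattern common in few-shot segmentation: masked average pooling to build a prototype, followed by similarity-based filtering of query features. The sketch below is a hypothetical PyTorch illustration, not the authors' code; the function names, the fixed cosine-similarity threshold, and the hard masking are all assumptions made for clarity.

    import torch
    import torch.nn.functional as F


    def masked_average_pooling(features, mask):
        # features: (B, C, H, W) backbone features; mask: (B, 1, H, W) binary mask.
        # Resize the mask to the feature resolution and average the masked
        # features into a single (B, C) prototype vector.
        mask = F.interpolate(mask, size=features.shape[-2:], mode="nearest")
        return (features * mask).sum(dim=(2, 3)) / (mask.sum(dim=(2, 3)) + 1e-6)


    def eliminate_background(query_feat, bg_prototype, threshold=0.5):
        # query_feat: (B, C, H, W); bg_prototype: (B, C).
        # Locations whose cosine similarity to the BG prototype exceeds the
        # (hypothetical) threshold are treated as background and zeroed out.
        sim = F.cosine_similarity(query_feat, bg_prototype[:, :, None, None], dim=1)  # (B, H, W)
        bg_mask = (sim > threshold).float().unsqueeze(1)                              # (B, 1, H, W)
        return query_feat * (1.0 - bg_mask), bg_mask


    if __name__ == "__main__":
        feats = torch.randn(2, 256, 32, 32)                       # toy feature maps
        fg_mask = (torch.rand(2, 1, 128, 128) > 0.5).float()      # toy target mask
        bg_proto = masked_average_pooling(feats, 1.0 - fg_mask)   # BG = complement of target mask
        filtered, est_bg = eliminate_background(feats, bg_proto)
        print(filtered.shape, est_bg.shape)

In this toy usage, the background prototype is built from the complement of the target mask, mirroring the paper's idea of supervising BG mining with only the known target ground truth; the paper's BGMM, BG mining loss, target matching module, DO elimination, and contrastive learning are not reproduced here.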

Original language: English
Pages (from-to): 4314-4328
Number of pages: 15
Journal: IEEE Transactions on Circuits and Systems for Video Technology
Volume: 35
Issue number: 5
DOIs
State: Published - 2025

Keywords

  • Few-shot learning
  • few-shot segmentation
  • semantic segmentation
  • video object segmentation
