
Generative Transformer for Accurate and Reliable Salient Object Detection

  • Yuxin Mao
  • Jing Zhang
  • Zhexiong Wan
  • Xinyu Tian
  • Aixuan Li
  • Yunqiu Lv
  • Yuchao Dai
  • Northwestern Polytechnical University Xian
  • Shaanxi Key Laboratory of Information Acquisition and Processing
  • Australian National University

Research output: Contribution to journal, Article, peer-reviewed

21 citations (Scopus)

Abstract

We explore the impact of transformers on accurate and reliable salient object detection. For accuracy, we integrate the transformer with a deterministic model and delineate its advantages in structural modeling. Regarding reliability, we address the transformer's tendency to produce overly confident, incorrect predictions. To gauge reliability implicitly, we introduce a latent variable model within the transformer framework, termed the inferential generative adversarial network (iGAN). The stochastic nature of the latent variable facilitates the estimation of predictive uncertainty, which serves as an auxiliary measure of the model's prediction reliability. Unlike the conventional GAN, which fixes the distribution of the latent variable as the standard normal distribution N(0, I), the proposed iGAN infers the latent variable by gradient-based Markov Chain Monte Carlo (MCMC), namely Langevin dynamics, leading to an input-dependent latent variable model. We apply the proposed iGAN to fully supervised salient object detection and show that iGAN within the transformer framework leads to both accurate and reliable salient object detection. The source code and experimental results are publicly available via our project page: https://npucvr.github.io/TransformerSOD.
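As a rough illustration of what inferring a latent variable by Langevin dynamics can look like, the sketch below uses a toy linear-Gaussian model rather than the iGAN architecture itself. Everything in it is an assumption made for illustration: the function name `langevin_infer`, the linear "generator" `W`, the noise level `sigma`, and the step size `step` are hypothetical and are not taken from the paper or its released code. Each iteration moves the latent `z` along the gradient of the log posterior and adds Gaussian noise, so the resulting sample depends on the observation `y` instead of being drawn from a fixed N(0, I).

```python
import numpy as np


def langevin_infer(y, W, sigma=0.5, step=0.1, n_steps=200, seed=0):
    """Draw one approximate sample of z from p(z | y) in a toy model.

    Assumed toy model (not the paper's): z ~ N(0, I), y = W z + N(0, sigma^2 I).
    """
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(W.shape[1])  # initialise from the prior N(0, I)
    for _ in range(n_steps):
        # Gradient of log p(z | y) up to a constant:
        # prior term (-z) plus likelihood term W^T (y - W z) / sigma^2.
        grad = -z + W.T @ (y - W @ z) / sigma**2
        # Langevin update: half-step along the gradient plus injected noise.
        z = z + 0.5 * step**2 * grad + step * rng.standard_normal(z.shape)
    return z


if __name__ == "__main__":
    W = np.eye(2)
    y = np.array([1.0, -1.0])
    # Repeated runs with different seeds give different latents for the same y;
    # their spread is one simple proxy for predictive uncertainty.
    samples = np.array([langevin_infer(y, W, seed=s) for s in range(200)])
    print("mean of inferred z:", samples.mean(axis=0))
    print("std of inferred z:", samples.std(axis=0))
```

In this toy setting the posterior is Gaussian with mean 0.8 * y, so the sample mean should land close to that value, and changing `y` shifts the inferred latents accordingly. In a setting like the one the abstract describes, the linear map would be replaced by a learned network and the gradient would typically come from automatic differentiation.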

Original language: English
Pages (from-to): 1041-1054
Number of pages: 14
Journal: IEEE Transactions on Circuits and Systems for Video Technology
Volume: 35
Issue number: 2
DOI
Publication status: Published - 2025
