TY - JOUR
T1 - Smooth fusion of multi-spectral images via total variation minimization for traffic scene semantic segmentation
AU - Li, Ying
AU - Fang, Aiqing
AU - Guo, Yangming
AU - Sun, Wei
AU - Yang, Xiaobao
AU - Wang, Xiaodong
N1 - Publisher Copyright:
© 2023 Elsevier Ltd
PY - 2024/4
Y1 - 2024/4
N2 - Achieving precise semantic segmentation for traffic scenes relies on multi-spectral image fusion techniques that produce high-quality images. Many existing fusion solutions aim to enhance the similarity between the input images and the fusion result at the level of pixel intensity and texture detail. However, this can introduce smoothness issues that limit semantic segmentation performance. To address these issues, we present a smooth representation learning optimization mechanism (SFLM) that conducts image fusion at two levels: inter-image and intra-image. The former overcomes over- or under-smoothing by maximizing the mutual information between the fusion result and image samples (i.e., positive and negative samples). The latter balances under- and over-smoothing in the fusion result by minimizing the total variation in pixel space and maximizing the total variation in gradient space based on contrastive learning. In this way, the proposed method effectively overcomes fusion quality issues and provides better feature representations for semantic segmentation in autonomous vehicles. Experimental results on four public datasets validate our method's effectiveness, robustness, and overall superiority.
AB - Achieving precise semantic segmentation for traffic scenes relies on multi-spectral image fusion techniques that produce high-quality images. Many existing fusion solutions aim to enhance the similarity between the input images and the fusion result at the level of pixel intensity and texture detail. However, this can introduce smoothness issues that limit semantic segmentation performance. To address these issues, we present a smooth representation learning optimization mechanism (SFLM) that conducts image fusion at two levels: inter-image and intra-image. The former overcomes over- or under-smoothing by maximizing the mutual information between the fusion result and image samples (i.e., positive and negative samples). The latter balances under- and over-smoothing in the fusion result by minimizing the total variation in pixel space and maximizing the total variation in gradient space based on contrastive learning. In this way, the proposed method effectively overcomes fusion quality issues and provides better feature representations for semantic segmentation in autonomous vehicles. Experimental results on four public datasets validate our method's effectiveness, robustness, and overall superiority.
KW - Image fusion and segmentation
KW - Neural network
KW - Self-supervised learning
KW - Total variation theory
UR - http://www.scopus.com/inward/record.url?scp=85180555219&partnerID=8YFLogxK
U2 - 10.1016/j.engappai.2023.107741
DO - 10.1016/j.engappai.2023.107741
M3 - Article
AN - SCOPUS:85180555219
SN - 0952-1976
VL - 130
JO - Engineering Applications of Artificial Intelligence
JF - Engineering Applications of Artificial Intelligence
M1 - 107741
ER -