Enhancing Unimodal Features Matters: A Multimodal Framework for Building Extraction

Xiaofeng Shi, Junyu Gao, Yuan Yuan

Research output: Contribution to journal › Article › peer-review

5 Citations (Scopus)

Abstract

In recent years, deep learning and multimodal data have substantially propelled the development of building extraction models. However, prevailing multimodal methods struggle with two challenges: 1) modal laziness: the training error is minimized before the model has learned extensive unimodal patterns; and 2) modal imbalance: the backpropagation process is easily dominated by a certain modality. As a result, unimodal feature learning is insufficient, which limits model performance on the intricate foreground and background contexts surrounding buildings. In this article, we address this problem from the perspectives of algorithm design and model evaluation. At the algorithmic level, we propose a unimodal feature enhancement (UFE) framework. UFE is model-agnostic and comprises two distinct components: adaptive gradient enhancement (AGE) for modal laziness and consistency constraint loss (CCL) for modal imbalance. AGE dynamically modulates the original gradient by monitoring the representation effects of unimodal features and multimodal fusion features. CCL imposes mutual constraints on the different modal branches at the semantic level to reconcile the optimization process. At the model evaluation level, a new metric, named unimodal utilization ratio (UUR), is presented to assess models through the learning efficacy of unimodal features. Experimental results, including variants of UUR, on two building extraction datasets demonstrate a substantial performance improvement from UFE. Moreover, UFE proves adaptable when integrated with various model components and generalizes to other multimodal image-related tasks.
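
The abstract does not give the formulas behind AGE or CCL, so the following is only a minimal sketch of the two general ideas it names, under stated assumptions. Here a symmetric KL divergence stands in for a consistency constraint between two modal branches, and a loss-ratio rule stands in for gradient modulation that boosts the weaker branch. The function names, the `alpha` parameter, and both formulas are hypothetical illustrations and should not be read as the authors' definitions.

```python
import math


def softmax(logits):
    """Convert raw class scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def consistency_loss(logits_a, logits_b):
    """Symmetric KL between two branches' class distributions.

    Assumption: a stand-in for a mutual, semantic-level constraint between
    modal branches; the paper's CCL may be defined differently.
    """
    p, q = softmax(logits_a), softmax(logits_b)
    return 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))


def gradient_scales(loss_a, loss_b, alpha=1.0, eps=1e-12):
    """Hypothetical per-branch gradient multipliers.

    The branch with the higher unimodal loss (the weaker one) gets a
    multiplier above 1; the stronger branch keeps a multiplier of 1.
    Assumption: an illustrative rule, not the paper's AGE formulation.
    """
    ratio_a = loss_a / (loss_b + eps)
    ratio_b = loss_b / (loss_a + eps)
    scale_a = 1.0 + alpha * math.tanh(max(0.0, ratio_a - 1.0))
    scale_b = 1.0 + alpha * math.tanh(max(0.0, ratio_b - 1.0))
    return scale_a, scale_b
```

In a training loop, multipliers of this kind would scale each branch's gradients before the optimizer step, and the consistency term would be added to the task loss with a weighting coefficient; both choices are likewise assumptions here.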

Original language: English
Article number: 5622013
Pages (from-to): 1-13
Number of pages: 13
Journal: IEEE Transactions on Geoscience and Remote Sensing
Volume: 62
DOI
Publication status: Published - 2024
