SL-Seg: A CNN-Transformer Fusion Network for Road Surface and Lane Segmentation in Complex Scenarios

Chenlin Meng, Xin Wang, Qinhao Tu, Zhaoyong Mao, Junge Shen

Research output: Contribution to journal › Article › peer-review

Abstract

Road image segmentation plays a pivotal role in environmental perception for traffic video surveillance. Precise segmentation of roads and lanes is essential for effective traffic monitoring and management. However, unlike the viewpoint in autonomous driving, the surveillance viewpoint covers a wider scope and is more susceptible to complex environments, which makes segmentation in road surveillance videos particularly demanding. To overcome these challenges, we introduce an end-to-end semantic segmentation network that leverages a CNN-Transformer architecture. First, a spatial pyramid attention-style convolution (SP-AttnConv) module, built upon the Transformer, is introduced to ensure accurate segmentation across long distances while preserving fine boundary information. This module enhances local information and fosters a “global-local” feature fusion framework. Second, to tackle scale imbalance during segmentation, a lightweight multi-scale (LMS) module is introduced to capture multi-scale features. Additionally, an occlusion relief branch (ORB) module is integrated into the decoder to address occlusions caused by irrelevant objects. Recognizing the need for a dedicated benchmark dataset for road surface and lane segmentation, we build the surface-lane (SL) dataset for complex scenarios to promote the development of traffic surveillance systems. Comparative experiments demonstrate that our method achieves the best overall performance on the SL dataset.

Original language: English
Journal: IEEE Transactions on Intelligent Transportation Systems
State: Accepted/In press - 2025

Keywords

  • computer vision
  • intelligent traffic systems
  • road segmentation
  • segmentation dataset
