A Joint Convolutional Neural Networks and Context Transfer for Street Scenes Labeling

Research output: Contribution to journal › Article › peer-review

123 Scopus citations

Abstract

Street scene understanding is an essential task for autonomous driving. One important step in this direction is scene labeling, which assigns a class label to each pixel in an image. Although many approaches have been developed, they share three weaknesses. First, many methods rely on hand-crafted features, whose ability to represent images is limited. Second, they cannot label foreground objects accurately because of data set bias. Third, in the refinement stage, traditional Markov random field (MRF) inference is prone to over-smoothing. To address these problems, this paper proposes a joint method that combines priori convolutional neural networks at the superpixel level (called 'priori s-CNNs') with soft restricted context transfer. Our contributions are threefold: 1) a priori s-CNNs model that learns prior location information at the superpixel level is proposed to describe various objects discriminatively; 2) a hierarchical data augmentation method is presented to alleviate data set bias in the priori s-CNNs training stage, which significantly improves the labeling of foreground objects; and 3) a soft restricted MRF energy function is defined to improve the labeling performance of the priori s-CNNs model while reducing over-smoothing. The proposed approach is evaluated on the CamVid data set (11 classes) and the SIFT Flow Street data set (16 classes) and achieves competitive performance.
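
To make the over-smoothing issue mentioned in the abstract concrete, the following is a minimal sketch of a conventional Potts-model MRF refinement over superpixels. It is not the paper's soft restricted energy function, whose form the abstract does not give. The function names (`mrf_energy`, `icm_refine`), the `smooth_weight` parameter, and the choice of iterated conditional modes as the solver are all assumptions made for illustration.

```python
import math

EPS = 1e-12  # guards against log(0)


def mrf_energy(labels, probs, edges, smooth_weight=1.0):
    """Energy of a labeling under a standard Potts MRF (illustrative only).

    labels: one class index per superpixel
    probs:  per-superpixel class probabilities (e.g. classifier softmax output)
    edges:  (i, j) pairs of adjacent superpixels
    """
    # Unary term: negative log-probability of the chosen class.
    unary = sum(-math.log(probs[i][c] + EPS) for i, c in enumerate(labels))
    # Pairwise Potts term: constant penalty for each disagreeing neighbor pair.
    pairwise = sum(smooth_weight for i, j in edges if labels[i] != labels[j])
    return unary + pairwise


def icm_refine(probs, edges, smooth_weight=1.0, sweeps=5):
    """Greedy refinement by iterated conditional modes (hypothetical solver choice)."""
    n, k = len(probs), len(probs[0])
    neighbors = [[] for _ in range(n)]
    for i, j in edges:
        neighbors[i].append(j)
        neighbors[j].append(i)

    # Start from the classifier's most probable class for each superpixel.
    labels = [max(range(k), key=lambda c: p[c]) for p in probs]

    for _ in range(sweeps):
        changed = False
        for i in range(n):
            def local_cost(c):
                disagree = sum(1 for j in neighbors[i] if labels[j] != c)
                return -math.log(probs[i][c] + EPS) + smooth_weight * disagree

            best = min(range(k), key=local_cost)
            if best != labels[i]:
                labels[i] = best
                changed = True
        if not changed:
            break
    return labels


# Three superpixels in a chain; the middle one is confidently class 1.
probs = [[0.9, 0.1], [0.2, 0.8], [0.9, 0.1]]
edges = [(0, 1), (1, 2)]
no_smoothing = icm_refine(probs, edges, smooth_weight=0.0)
with_smoothing = icm_refine(probs, edges, smooth_weight=1.0)
```

In this toy chain, the middle superpixel keeps its own label when `smooth_weight` is zero but is overwritten by its neighbors when the weight is one, even though the classifier favored class 1 with probability 0.8. This is the kind of over-smoothing of small foreground regions that a restricted energy function is meant to limit.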

Original language: English
Pages (from-to): 1457-1470
Number of pages: 14
Journal: IEEE Transactions on Intelligent Transportation Systems
Volume: 19
Issue number: 5
DOIs
State: Published - May 2018

Keywords

  • convolutional neural networks
  • data augmentation
  • deep learning
  • label transfer
  • scene labeling
  • street scenes
