TY - JOUR
T1 - VSSA-NET
T2 - Vertical Spatial Sequence Attention Network for Traffic Sign Detection
AU - Yuan, Yuan
AU - Xiong, Zhitong
AU - Wang, Qi
N1 - Publisher Copyright:
© 1992-2012 IEEE.
PY - 2019/7
Y1 - 2019/7
N2 - Although traffic sign detection has been studied for years and great progress has been made with the rise of deep learning techniques, many problems remain to be addressed. Complicated real-world traffic scenes pose two main challenges. First, traffic signs are usually small objects, which makes them more difficult to detect than large ones; second, without context information, it is hard to distinguish real traffic signs from false targets that resemble them in complex street scenes. To handle these problems, we propose a novel end-to-end deep learning method for traffic sign detection in complex environments. Our contributions are as follows: 1) we propose a multi-resolution feature fusion network architecture that exploits densely connected deconvolution layers with skip connections and can learn more effective features for small objects, and 2) we frame traffic sign detection as a spatial sequence classification and regression task, and propose a vertical spatial sequence attention module to gain more context information for better detection performance. To comprehensively evaluate the proposed method, we experiment on several traffic sign datasets as well as a general object detection dataset, and the results show the effectiveness of the proposed method.
AB - Although traffic sign detection has been studied for years and great progress has been made with the rise of deep learning techniques, many problems remain to be addressed. Complicated real-world traffic scenes pose two main challenges. First, traffic signs are usually small objects, which makes them more difficult to detect than large ones; second, without context information, it is hard to distinguish real traffic signs from false targets that resemble them in complex street scenes. To handle these problems, we propose a novel end-to-end deep learning method for traffic sign detection in complex environments. Our contributions are as follows: 1) we propose a multi-resolution feature fusion network architecture that exploits densely connected deconvolution layers with skip connections and can learn more effective features for small objects, and 2) we frame traffic sign detection as a spatial sequence classification and regression task, and propose a vertical spatial sequence attention module to gain more context information for better detection performance. To comprehensively evaluate the proposed method, we experiment on several traffic sign datasets as well as a general object detection dataset, and the results show the effectiveness of the proposed method.
KW - context modeling
KW - sequence attention model
KW - small object
KW - traffic sign detection
UR - http://www.scopus.com/inward/record.url?scp=85066410855&partnerID=8YFLogxK
U2 - 10.1109/TIP.2019.2896952
DO - 10.1109/TIP.2019.2896952
M3 - Article
C2 - 30716035
AN - SCOPUS:85066410855
SN - 1057-7149
VL - 28
SP - 3423
EP - 3434
JO - IEEE Transactions on Image Processing
JF - IEEE Transactions on Image Processing
IS - 7
M1 - 8632977
ER -