TY - JOUR
T1 - Foreground Fisher vector
T2 - Encoding class-relevant foreground to improve image classification
AU - Pan, Yongsheng
AU - Xia, Yong
AU - Shen, Dinggang
N1 - Publisher Copyright:
© 1992-2012 IEEE.
PY - 2019/10
Y1 - 2019/10
AB - Image classification is an essential and challenging task in computer vision. Despite its prevalence, the combination of the deep convolutional neural network (DCNN) and the Fisher vector (FV) encoding method has limited performance, since the class-irrelevant background used in traditional FV encoding may result in less discriminative image features. In this paper, we propose the foreground FV (fgFV) encoding algorithm and its fast approximation for image classification. We attempt to implicitly separate the class-relevant foreground from the class-irrelevant background during the encoding process by tuning the weights of the partial gradients corresponding to each Gaussian component under the supervision of image labels, and then use only those local descriptors extracted from the class-relevant foreground to estimate FVs. We evaluated fgFV against the widely used FV and improved FV (iFV) under the combined DCNN-FV framework and also compared them to several state-of-the-art image classification approaches on ten benchmark image datasets covering fine-grained recognition of natural species and manufactured objects, categorization of coarse objects, and classification of scenes. Our results indicate that the proposed fgFV encoding algorithm can construct more discriminative image representations from local descriptors than FV and iFV, and that the combined DCNN-fgFV algorithm can improve the performance of image classification.
KW - convolutional neural networks
KW - feature encoding
KW - foreground Fisher vector
KW - image classification
UR - http://www.scopus.com/inward/record.url?scp=85070456147&partnerID=8YFLogxK
U2 - 10.1109/TIP.2019.2908795
DO - 10.1109/TIP.2019.2908795
M3 - Article
C2 - 30946666
AN - SCOPUS:85070456147
SN - 1057-7149
VL - 28
SP - 4716
EP - 4729
JO - IEEE Transactions on Image Processing
JF - IEEE Transactions on Image Processing
IS - 10
M1 - 8678832
ER -