TY - JOUR
T1 - Heterogeneous visual features fusion via sparse multimodal machine
AU - Wang, Hua
AU - Nie, Feiping
AU - Huang, Heng
AU - Ding, Chris
PY - 2013
Y1 - 2013
N2 - To better understand, search, and classify image and video information, many visual feature descriptors have been proposed to describe elementary visual characteristics such as shape, color, and texture. How to integrate these heterogeneous visual features and identify the important ones for specific vision tasks has become an increasingly critical problem. In this paper, we propose a novel Sparse Multimodal Learning (SMML) approach that integrates such heterogeneous features by using joint structured sparsity regularizations to learn feature importance for the vision tasks from both group-wise and individual points of view. A new optimization algorithm is also introduced to solve the non-smooth objective, with rigorously proved global convergence. We applied our SMML method to five broadly used object categorization and scene understanding image data sets for both single-label and multi-label image classification tasks. For each data set, we integrate six different types of popularly used image features. Compared to existing scene and object categorization methods using either a single modality or multiple modalities of features, our approach always achieves better performance.
AB - To better understand, search, and classify image and video information, many visual feature descriptors have been proposed to describe elementary visual characteristics such as shape, color, and texture. How to integrate these heterogeneous visual features and identify the important ones for specific vision tasks has become an increasingly critical problem. In this paper, we propose a novel Sparse Multimodal Learning (SMML) approach that integrates such heterogeneous features by using joint structured sparsity regularizations to learn feature importance for the vision tasks from both group-wise and individual points of view. A new optimization algorithm is also introduced to solve the non-smooth objective, with rigorously proved global convergence. We applied our SMML method to five broadly used object categorization and scene understanding image data sets for both single-label and multi-label image classification tasks. For each data set, we integrate six different types of popularly used image features. Compared to existing scene and object categorization methods using either a single modality or multiple modalities of features, our approach always achieves better performance.
KW - Data Integration
KW - Structured Sparsity
KW - Visual Features Fusion
UR - http://www.scopus.com/inward/record.url?scp=84887363909&partnerID=8YFLogxK
U2 - 10.1109/CVPR.2013.398
DO - 10.1109/CVPR.2013.398
M3 - Conference article
AN - SCOPUS:84887363909
SN - 1063-6919
SP - 3097
EP - 3102
JO - Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
JF - Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
M1 - 6619242
T2 - 26th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2013
Y2 - 23 June 2013 through 28 June 2013
ER -