TY - JOUR
T1 - Optimized graph learning using partial tags and multiple features for image and video annotation
AU - Song, Jingkuan
AU - Gao, Lianli
AU - Nie, Feiping
AU - Shen, Heng Tao
AU - Yan, Yan
AU - Sebe, Nicu
N1 - Publisher Copyright: © 1992-2012 IEEE.
PY - 2016/11
Y1 - 2016/11
N2 - In multimedia annotation, due to the time constraints and the tediousness of manual tagging, it is quite common to utilize both tagged and untagged data to improve the performance of supervised learning when only limited tagged training data are available. This is often done by adding a geometry-based regularization term in the objective function of a supervised learning model. In this case, a similarity graph is indispensable to exploit the geometrical relationships among the training data points, and the graph construction scheme essentially determines the performance of these graph-based learning algorithms. However, most of the existing works construct the graph empirically and are usually based on a single feature without using the label information. In this paper, we propose a semi-supervised annotation approach by learning an optimized graph (OGL) from multi-cues (i.e., partial tags and multiple features), which can more accurately embed the relationships among the data points. Since OGL is a transductive method and cannot deal with novel data points, we further extend our model to address the out-of-sample issue. Extensive experiments on image and video annotation show the consistent superiority of OGL over the state-of-the-art methods.
AB - In multimedia annotation, due to the time constraints and the tediousness of manual tagging, it is quite common to utilize both tagged and untagged data to improve the performance of supervised learning when only limited tagged training data are available. This is often done by adding a geometry-based regularization term in the objective function of a supervised learning model. In this case, a similarity graph is indispensable to exploit the geometrical relationships among the training data points, and the graph construction scheme essentially determines the performance of these graph-based learning algorithms. However, most of the existing works construct the graph empirically and are usually based on a single feature without using the label information. In this paper, we propose a semi-supervised annotation approach by learning an optimized graph (OGL) from multi-cues (i.e., partial tags and multiple features), which can more accurately embed the relationships among the data points. Since OGL is a transductive method and cannot deal with novel data points, we further extend our model to address the out-of-sample issue. Extensive experiments on image and video annotation show the consistent superiority of OGL over the state-of-the-art methods.
KW - Graph learning
KW - image and video annotation
KW - semi-supervised learning
UR - http://www.scopus.com/inward/record.url?scp=84991618952&partnerID=8YFLogxK
U2 - 10.1109/TIP.2016.2601260
DO - 10.1109/TIP.2016.2601260
M3 - Article
AN - SCOPUS:84991618952
SN - 1057-7149
VL - 25
SP - 4999
EP - 5011
JO - IEEE Transactions on Image Processing
JF - IEEE Transactions on Image Processing
IS - 11
M1 - 7547296
ER -