Multimodal learning for multi-label image classification

Yanwei Pang, Zhao Ma, Yuan Yuan, Xuelong Li, Kongqiao Wang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

11 Scopus citations

Abstract

We tackle the challenge of web image classification using additional tag information. Unlike traditional methods that only use a combination of several low-level features, we represent images and their corresponding tags with semantic concepts. We first extract latent topic information with the probabilistic latent semantic analysis (pLSA) algorithm, and then use multi-label multiple kernel learning to combine visual and textual features for better image classification. In experiments on the PASCAL VOC'07 and MIR Flickr sets, we demonstrate the benefit of multimodal features for image classification. In particular, we find that representing images and their associated tags with latent semantic features yields better classification results than approaches that integrate several low-level features.
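The pipeline in the abstract has two stages: fit pLSA on co-occurrence counts to obtain per-document topic distributions, then combine a visual kernel and a textual kernel for classification. The sketch below is illustrative, not the authors' implementation: it fits pLSA with plain EM in NumPy, and stands in for the learned multi-label MKL combination with a fixed-weight sum of two linear kernels (the weight `beta` is a hypothetical parameter; the paper learns the kernel weights).

```python
import numpy as np

def plsa(X, n_topics, n_iter=50, seed=0):
    """Fit pLSA by EM on a count matrix X (docs x words).

    Returns P(z|d) (docs x topics) and P(w|z) (topics x words).
    """
    rng = np.random.default_rng(seed)
    n_docs, n_words = X.shape
    # Random normalized initialization of the two factor distributions.
    p_z_d = rng.random((n_docs, n_topics))
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    p_w_z = rng.random((n_topics, n_words))
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # E-step: P(z|d,w) proportional to P(z|d) * P(w|z).
        joint = p_z_d[:, None, :] * p_w_z.T[None, :, :]  # docs x words x topics
        joint /= joint.sum(axis=2, keepdims=True) + 1e-12
        # M-step: reweight by observed counts n(d,w), then renormalize.
        weighted = X[:, :, None] * joint
        p_w_z = weighted.sum(axis=0).T
        p_w_z /= p_w_z.sum(axis=1, keepdims=True) + 1e-12
        p_z_d = weighted.sum(axis=1)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True) + 1e-12
    return p_z_d, p_w_z

def combined_kernel(feat_visual, feat_text, beta=0.5):
    """Fixed-weight combination of two linear kernels (stand-in for MKL)."""
    k_v = feat_visual @ feat_visual.T
    k_t = feat_text @ feat_text.T
    return beta * k_v + (1.0 - beta) * k_t
```

The topic distributions `P(z|d)` from the visual-word counts and from the tag counts would each serve as a feature matrix here; the combined kernel can then be passed to any kernel classifier (e.g. an SVM with a precomputed kernel).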

Original language: English
Title of host publication: ICIP 2011
Subtitle of host publication: 2011 18th IEEE International Conference on Image Processing
Pages: 1797-1800
Number of pages: 4
DOIs
State: Published - 2011
Externally published: Yes
Event: 2011 18th IEEE International Conference on Image Processing, ICIP 2011 - Brussels, Belgium
Duration: 11 Sep 2011 - 14 Sep 2011

Publication series

Name: Proceedings - International Conference on Image Processing, ICIP
ISSN (Print): 1522-4880

Conference

Conference: 2011 18th IEEE International Conference on Image Processing, ICIP 2011
Country/Territory: Belgium
City: Brussels
Period: 11/09/11 - 14/09/11

Keywords

  • Multilabel learning
  • Multimodal features
  • Multiple kernel learning
