Deep binary reconstruction for cross-modal hashing

Xuelong Li, Di Hu, Feiping Nie

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

41 Citations (Scopus)

Abstract

With the increasing demand for massive multimodal data storage and organization, cross-modal retrieval based on hashing techniques has drawn much attention. It takes the binary codes of one modality as the query to retrieve the relevant hashing codes of another modality. However, the binary constraint makes it difficult to find the optimal cross-modal hashing function: most approaches relax the constraint and apply a thresholding strategy to the real-valued representation instead of directly solving the original objective. In this paper, we first provide a concrete analysis of the effectiveness of multimodal networks in preserving inter- and intra-modal consistency. Based on this analysis, we propose a Deep Binary Reconstruction (DBRC) network that directly learns the binary hashing codes in an unsupervised fashion. Its superiority comes from a proposed simple but efficient activation function, named Adaptive Tanh (ATanh), which adaptively learns the binary codes and can be trained via back-propagation. Extensive experiments on three benchmark datasets demonstrate that DBRC outperforms several state-of-the-art methods in both image2text and text2image retrieval tasks.
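The abstract does not give the exact form of ATanh. A minimal sketch, assuming it behaves like a tanh with a learnable slope trained jointly with the network (so that outputs approach binary values as training proceeds), might look like the following PyTorch module; the parameter name alpha, its initialization, and the sign-based binarization step are illustrative assumptions, not the paper's definitions.

```python
import torch
import torch.nn as nn

class ATanh(nn.Module):
    """Sketch of an adaptive tanh activation: tanh(alpha * x) with a
    learnable slope alpha. As alpha grows during training, the output
    approaches the sign function, so binarizing at retrieval time
    discards little information. Illustrative only; the paper's exact
    formulation and regularization may differ."""

    def __init__(self, init_alpha: float = 1.0):
        super().__init__()
        # alpha is updated via back-propagation along with the network weights
        self.alpha = nn.Parameter(torch.tensor(init_alpha))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.tanh(self.alpha * x)

# Usage: near-binary codes in (-1, 1); threshold with sign() for retrieval
codes = ATanh()(torch.randn(4, 64))
binary_codes = torch.sign(codes)
```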

Original language: English
Title of host publication: MM 2017 - Proceedings of the 2017 ACM Multimedia Conference
Publisher: Association for Computing Machinery, Inc
Pages: 1398-1406
Number of pages: 9
ISBN (Electronic): 9781450349062
DOI
Publication status: Published - 23 Oct 2017
Event: 25th ACM International Conference on Multimedia, MM 2017 - Mountain View, United States
Duration: 23 Oct 2017 → 27 Oct 2017

Publication series

Name: MM 2017 - Proceedings of the 2017 ACM Multimedia Conference

Conference

Conference: 25th ACM International Conference on Multimedia, MM 2017
Country/Territory: United States
City: Mountain View
Period: 23/10/17 → 27/10/17
