
Learning to Predict Eye Fixations via Multiresolution Convolutional Neural Networks

  • Northwestern Polytechnical University Xian
  • University of Georgia
  • CAS - Xi'an Institute of Optics and Precision Mechanics

Research output: Contribution to journal › Article › peer-review

101 Scopus citations

Abstract

Eye movements during free viewing of natural scenes are believed to be guided by local contrast, global contrast, and top-down visual factors. Although many previous works have explored these three saliency cues, there is still considerable room for improvement in how to model and integrate them effectively. This paper proposes a novel computational model for predicting eye fixations, which adopts a multiresolution convolutional neural network (Mr-CNN) to infer all three types of saliency cues simultaneously from raw image data. The proposed Mr-CNN is trained directly on fixation and nonfixation pixels, using multiresolution input image regions that capture different contexts. It takes image pixels as inputs and eye fixation points as labels. Both local and global contrasts are then learned by fusing information across multiple contexts, while various top-down factors are learned in higher layers. Finally, an optimal combination of top-down factors and bottom-up contrasts can be learned to predict eye fixations. The proposed approach significantly outperforms state-of-the-art methods on several publicly available benchmark databases, demonstrating the superiority of Mr-CNN. We also apply our method to the RGB-D image saliency detection problem. By jointly learning, at the pixel level, the saliency cues induced by depth and RGB information together with their interactions, our model achieves better performance in predicting eye fixations in RGB-D images.
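The idea of "multiresolution input image regions with different contexts" can be illustrated with a small sketch. This is not the authors' implementation: the grayscale input, the patch size of 5, the downsampling factors (1, 2, 4), the average-pooling downsampler, and the function names `downsample` and `multires_patches` are all assumptions chosen for illustration. The sketch only shows how equally sized patches centred on the same pixel can cover progressively wider context when cut from coarser copies of the image; in a model of this kind, each patch would presumably feed its own convolutional stream before fusion.

```python
import numpy as np

def downsample(image, factor):
    """Average-pool a 2-D image by an integer factor (any remainder is cropped)."""
    h, w = image.shape
    h2, w2 = h // factor, w // factor
    cropped = image[:h2 * factor, :w2 * factor]
    return cropped.reshape(h2, factor, w2, factor).mean(axis=(1, 3))

def multires_patches(image, row, col, patch=5, factors=(1, 2, 4)):
    """Return one patch-by-patch region per resolution, all centred on (row, col).

    `patch` must be odd. The patch size and factors are illustrative
    assumptions, not values taken from the paper.
    """
    half = patch // 2
    regions = []
    for f in factors:
        small = downsample(image, f)
        # Replicate border pixels so patches near the edge keep full size.
        padded = np.pad(small, half, mode="edge")
        # Map the pixel to the coarser grid, clamping to the valid range.
        r = min(row // f, small.shape[0] - 1)
        c = min(col // f, small.shape[1] - 1)
        # In `padded`, the target pixel sits at (r + half, c + half),
        # so this slice is centred on it.
        regions.append(padded[r:r + patch, c:c + patch])
    return regions

# Hypothetical usage: an 8x8 ramp image and a labelled pixel at (3, 4).
image = np.arange(64, dtype=float).reshape(8, 8)
regions = multires_patches(image, 3, 4)
```

Each returned region has the same shape, so the coarser ones summarise a larger neighbourhood of the original image with the same number of values.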

Original language: English
Article number: 7762165
Pages (from-to): 392-404
Number of pages: 13
Journal: IEEE Transactions on Neural Networks and Learning Systems
Volume: 29
Issue number: 2
State: Published - Feb 2018

Keywords

  • Contrast
  • RGB-D
  • convolutional neural network (CNN)
  • eye fixation prediction
  • saliency detection
