Two-Stream Convolutional Networks for Dynamic Saliency Prediction
arXiv preprint
Abstract
In recent years, visual saliency estimation in images has attracted much attention in the computer vision community. However, predicting saliency in videos has received relatively little attention. Inspired by the recent success of static saliency models based on deep convolutional neural networks, in this work we study two different two-stream convolutional networks for dynamic saliency prediction. To improve the generalization capability of our models, we also introduce a novel, empirically grounded data augmentation technique for this task. We test our models on the DIEM dataset and report superior results against the existing models. Moreover, we perform transfer learning experiments on SALICON, a recently proposed static saliency dataset, by fine-tuning our models on optical flows estimated from static images. Our experiments show that taking motion into account in this way can be helpful for static saliency estimation.
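For readers unfamiliar with the two-stream idea, the sketch below illustrates the general setup the abstract refers to: an appearance stream that processes an RGB frame and a motion stream that processes optical flow, with the two feature maps fused into a single saliency map. This is only a minimal illustrative sketch; the class name, layer sizes, and fusion-by-concatenation choice are assumptions for exposition, not the architecture studied in the paper.

# Minimal two-stream saliency sketch in PyTorch (illustrative only; layer sizes
# and the concatenation-based fusion are assumptions, not the paper's design).
import torch
import torch.nn as nn
import torch.nn.functional as F


def make_stream(in_channels):
    """A small convolutional stream producing coarse feature maps."""
    return nn.Sequential(
        nn.Conv2d(in_channels, 32, kernel_size=3, padding=1), nn.ReLU(inplace=True),
        nn.MaxPool2d(2),
        nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(inplace=True),
        nn.MaxPool2d(2),
        nn.Conv2d(64, 64, kernel_size=3, padding=1), nn.ReLU(inplace=True),
    )


class TwoStreamSaliency(nn.Module):
    """Appearance (RGB frame) and motion (optical flow) streams, fused late."""

    def __init__(self):
        super().__init__()
        self.spatial = make_stream(3)    # RGB frame
        self.temporal = make_stream(2)   # x/y optical flow
        self.fuse = nn.Conv2d(128, 1, kernel_size=1)  # 1x1 conv over concatenated features

    def forward(self, frame, flow):
        feats = torch.cat([self.spatial(frame), self.temporal(flow)], dim=1)
        sal = self.fuse(feats)
        # Upsample back to the input resolution and squash to [0, 1].
        sal = F.interpolate(sal, size=frame.shape[-2:], mode='bilinear', align_corners=False)
        return torch.sigmoid(sal)


if __name__ == "__main__":
    model = TwoStreamSaliency()
    frame = torch.randn(1, 3, 224, 224)  # single RGB frame
    flow = torch.randn(1, 2, 224, 224)   # optical flow between consecutive frames
    print(model(frame, flow).shape)      # -> torch.Size([1, 1, 224, 224])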

BibTeX
@article{Bak2016arxiv,
  title={Two-Stream Convolutional Networks for Dynamic Saliency Prediction},
  author={Cagdas Bak and Aykut Erdem and Erkut Erdem},
  journal={arXiv preprint arXiv:1607.04730},
  month={July},
  year={2016}
}