Publication - Data-Driven Image Captioning via Salient Region Discovery

Type
Journal Article

Authors
Mert Kilickaya, Burak Kerim Akkus, Ruket Cakici, Aykut Erdem, Erkut Erdem, Nazli Ikizler-Cinbis

Published/Presented on
IET Computer Vision

Publication Page
No information

Supplementary Materials
No information

Abstract
In the past few years, automatically generating descriptions for images has attracted a lot of attention in computer vision and natural language processing research. Among the existing approaches, data-driven methods have proven highly effective. These methods compare the given image against a large set of training images to determine a set of relevant images, then generate a description using the associated captions. In this study, we propose to integrate an object-based semantic image representation into a deep feature-based retrieval framework to select the relevant images. Moreover, we present a novel phrase selection paradigm and a sentence generation model that depends on a joint analysis of salient regions in the input and retrieved images within a clustering framework. We demonstrate the effectiveness of our proposed approach on the Flickr8K and Flickr30K benchmark datasets and show that our model gives highly competitive results compared to state-of-the-art models.

BibTeX

@article{erdemIJISAE2016a,
title = {Data-Driven Image Captioning via Salient Region Discovery},
author = {Mert Kilickaya and Burak Kerim Akkus and Ruket Cakici and Aykut Erdem and Erkut Erdem and Nazli Ikizler-Cinbis},
journal = {IET Computer Vision},
year = {2017},
volume = {XXX},
number = {XXX},
pages = {XXX}
}