A Dual Prediction Network for Image Captioning

Conference paper
General captioning practice involves a single forward prediction, with the aim of predicting the word at the next timestep given the word at the current timestep. In this paper, we present a novel captioning framework, the Dual Prediction Network (DPN), which is end-to-end trainable and addresses the captioning problem with dual predictions. Specifically, the dual predictions consist of a forward prediction that generates the next word from the current input word, and a backward prediction that reconstructs the input word from the predicted word. DPN has two appealing properties: 1) by introducing an extra supervision signal on the prediction, DPN better captures the interplay between the input and the target; 2) using the reconstructed input, DPN can make a second prediction. During the test phase, we average both predictions to form the final target sentence. Experimental results on the MS COCO dataset demonstrate that, owing to the reconstruction step, both predictions generated by DPN outperform those of methods based on the general captioning practice (a single forward prediction), and averaging them yields a further accuracy boost. Overall, DPN achieves results competitive with state-of-the-art approaches across multiple evaluation metrics. © 2018 IEEE.
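
The mechanism in the abstract amounts to one shared forward predictor applied twice, linked by a reconstruction step. Below is a minimal, hypothetical PyTorch sketch of a single decoding timestep of this idea; the layer names, sizes, and the soft-embedding trick used to feed a predicted distribution back into an RNN are assumptions for illustration, not the authors' published implementation.

import torch
import torch.nn as nn

class DualPredictionSketch(nn.Module):
    """One timestep of a dual-prediction decoder (illustrative only)."""

    def __init__(self, vocab_size, embed_dim=256, hidden_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # Forward predictor: current word -> next word.
        self.fwd_rnn = nn.LSTMCell(embed_dim, hidden_dim)
        self.fwd_out = nn.Linear(hidden_dim, vocab_size)
        # Backward predictor: predicted word -> reconstructed input word.
        self.bwd_rnn = nn.LSTMCell(embed_dim, hidden_dim)
        self.bwd_out = nn.Linear(hidden_dim, vocab_size)

    def step(self, word_ids, fwd_state, bwd_state):
        # 1) Forward prediction from the current input word.
        x = self.embed(word_ids)                      # (B, E)
        h_f, c_f = self.fwd_rnn(x, fwd_state)
        fwd_logits = self.fwd_out(h_f)                # (B, V)

        # 2) Backward prediction: softly embed the predicted
        #    distribution and try to reconstruct the input word.
        #    A reconstruction loss against word_ids would supply the
        #    extra supervision signal mentioned in the abstract.
        soft_pred = fwd_logits.softmax(-1) @ self.embed.weight
        h_b, c_b = self.bwd_rnn(soft_pred, bwd_state)
        recon_logits = self.bwd_out(h_b)              # (B, V)

        # 3) Second forward prediction from the reconstructed input,
        #    reusing the same forward predictor.
        soft_recon = recon_logits.softmax(-1) @ self.embed.weight
        h_f2, _ = self.fwd_rnn(soft_recon, (h_f, c_f))
        fwd_logits2 = self.fwd_out(h_f2)

        # 4) At test time, average the two predictions.
        avg_probs = 0.5 * (fwd_logits.softmax(-1) + fwd_logits2.softmax(-1))
        return avg_probs, recon_logits, (h_f, c_f), (h_b, c_b)

A quick smoke test of the shapes:

model = DualPredictionSketch(vocab_size=10000)
state = (torch.zeros(4, 512), torch.zeros(4, 512))
words = torch.randint(0, 10000, (4,))
probs, recon, fwd_state, bwd_state = model.step(words, state, state)
print(probs.shape)  # torch.Size([4, 10000])

Averaging probabilities rather than raw logits keeps the two predictions on a comparable scale before decoding.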
TNO Identifier
862252
ISSN
1945-7871
ISBN
978-1-5386-1737-3
Publisher
IEEE Computer Society
Article nr.
8486491
Source title
Proceedings of the 2018 IEEE International Conference on Multimedia and Expo (ICME 2018), 23-27 July 2018
Files
To receive the publication files, please send an e-mail request to TNO Repository.