Multi-Stage Selective Re-Decoding Module for Image Paragraph Captioning

Guozhang Nie; Xian Zhong; Chengming Zou; Qi Cu and Luo Zhong

doi:10.23977/cnci2021.008

Corresponding Author

Xian Zhong

ABSTRACT

Image paragraph captioning describes an image with a paragraph. Existing methods typically train hierarchical networks with a one-stage strategy, meaning that the model generates a description directly, without multi-stage revision. We have observed that, because of exposure bias, the generation process can produce errors and omissions: for example, an object in the image is described incorrectly, or a subregion of the image is neglected. To solve this problem, we present a novel approach for image paragraph captioning, called the multi-stage selective re-decoding (MSSRD) module, which extends conventional one-stage methods to generate richer captions. After obtaining a preliminary caption, our module dynamically selects appropriate words and un-decoded visual features from the previous stage. These selected features are re-decoded into a new caption in the next stage. The new caption is more diverse and finer than the previous one. We conduct extensive experiments to demonstrate the significance of our work.

KEYWORDS

Image Paragraph Captioning, Encoder-Decoder, Multi-Stage Re-Decoding
