Unsupervised image captioning model based on generative adversarial method

Abstract: An unsupervised image captioning model based on a generative adversarial method (GA-based UIC) is proposed for the image captioning task. A convolutional neural network serves as the encoder that extracts image features, a generative adversarial text generation method serves as the decoder that produces captions, and the object detection algorithm YOLOv3 serves as an auxiliary network, so that image captions are generated without supervision. Experimental results show that, compared with traditional models, the GA-based UIC model improves both the scores of the generated captions on text evaluation metrics such as BLEU, METEOR, ROUGE, and CIDEr and the training speed. The proposed model provides a new solution for the image captioning task.
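
The abstract does not state how the YOLOv3 detections enter training. One plausible reading, shown here purely as an illustrative assumption and not as the authors' method, is that the detected object labels provide a weak signal rewarding captions that mention what the detector found. The function name `concept_reward`, the example caption, and the label list below are all hypothetical.

```python
def concept_reward(caption_tokens, detected_labels):
    """Fraction of detected object labels that the caption mentions.

    Hypothetical helper: the labels are assumed to come from an
    object detector such as YOLOv3; no detector is run here.
    """
    labels = {label.lower() for label in detected_labels}
    if not labels:
        # No detections means there is nothing to reward.
        return 0.0
    tokens = {token.lower() for token in caption_tokens}
    return len(labels & tokens) / len(labels)


# Made-up generator output and made-up detector output.
caption = "a dog chases a ball on the grass".split()
labels = ["dog", "ball", "person"]
reward = concept_reward(caption, labels)  # two of the three labels are mentioned
```

In a setup like this, such a score could be combined with the discriminator's feedback when updating the caption generator, which is one way a model can be trained without paired image-caption data.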