Citation: CAI Jin, CAI Guoyong. Visual question answering model enhanced with image emotional information[J]. Journal of Guilin University of Electronic Technology, xxxx, x(x): 1-7.

Visual question answering model enhanced with image emotional information

  • Abstract: Visual question answering (VQA) is a multimedia understanding task in which a computer is given an image and a natural-language question about its content and must produce a correct answer. Early VQA models often ignored the emotional information in images and therefore performed poorly on emotion-related questions. Existing VQA models that do incorporate emotional information make insufficient use of key image regions and textual keywords and do not understand fine-grained questions deeply enough, so their overall answer accuracy is low. To incorporate image emotional information fully into VQA and use it to strengthen question answering, we propose an image-emotion-enhanced visual question answering model (IEVQA). Built on a large-scale pre-trained model framework, IEVQA adds an emotion module to improve its ability to answer emotion-related questions. Experiments on a VQA benchmark dataset show that IEVQA outperforms the compared methods on the overall metrics and confirm that emotional information is an effective aid for VQA models.

     
