Cite this article: ZHOU Gang, LI Handong, CHEN Yeye. Text-to-Image Generation Based on Contrastive Learning[J]. Software Engineering, 2025, 28(2): 38-41.
Text-to-Image Generation Based on Contrastive Learning
ZHOU Gang, LI Handong, CHEN Yeye
(School of Electrical Engineering, Guizhou University, Guiyang 550025, China)
1101808591@qq.com; 470394668@qq.com; zgsrkl@126.com
Abstract: In experiments on the CUB dataset, where the text descriptions involve multiple objects and are highly semantically related, many generated bird images were found to contain "multiple heads" or "multiple feet". To address this, the paper adds contrastive learning to the MA-GAN (Multi-stage Attention Mechanism Generative Adversarial Network) model to optimize image generation. A feature interpolation method is also used to enhance certain image features, improving semantic consistency and text discriminability. Experiments on the CUB and COCO datasets show that the improved model raises the Inception Score (IS) by 0.11 and 2.58, respectively, and the R-precision (R score) by 1.98 and 1.37, respectively, demonstrating that the improved model can address the image quality and semantic consistency problems.
Keywords: text-to-image generation; contrastive learning; text feature representation; feature interpolation
CLC number: TP393    Document code: A
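The abstract names contrastive learning and R-precision but does not give their formulas. As a rough illustration only, the sketch below uses an InfoNCE-style loss and a top-1 retrieval score over paired image and text embeddings. Both are assumptions chosen for illustration, not the authors' formulation: the function names, the temperature value, and the toy embeddings are hypothetical, and the paper's actual loss and evaluation protocol may differ.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(image_embs, text_embs, temperature=0.1):
    """Mean InfoNCE-style loss; pair i is (image_embs[i], text_embs[i]).

    For each image, its own caption is the positive and every other
    caption in the batch is treated as a negative (an assumption here).
    """
    losses = []
    for i, img in enumerate(image_embs):
        logits = [cosine(img, txt) / temperature for txt in text_embs]
        # Log-sum-exp with max subtraction for numerical stability.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        losses.append(log_denom - logits[i])
    return sum(losses) / len(losses)


def r_precision_top1(image_embs, text_embs):
    """Fraction of images whose most similar caption is their own."""
    hits = 0
    for i, img in enumerate(image_embs):
        sims = [cosine(img, txt) for txt in text_embs]
        best = max(range(len(sims)), key=sims.__getitem__)
        if best == i:
            hits += 1
    return hits / len(image_embs)


# Toy embeddings (hypothetical values, not taken from the paper).
images = [[1.0, 0.0], [0.0, 1.0]]
aligned_texts = [[0.9, 0.1], [0.1, 0.9]]
swapped_texts = [[0.1, 0.9], [0.9, 0.1]]

loss_aligned = info_nce(images, aligned_texts)
loss_swapped = info_nce(images, swapped_texts)
```

In this toy setup the loss is small when each image sits closest to its own caption and large when the captions are swapped, which is the behavior a contrastive objective relies on to pull matching text-image pairs together.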


