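Cite this article: GUO Xiaodong, HE Pingan, DAI Qi. Research on 3D Reconstruction Multi-View Stereo Network Based on Attention Weight Mechanism and Guided Cost Volume Excitation[J]. Software Engineering, 2024, (8): 30-36.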
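Research on 3D Reconstruction Multi-View Stereo Network Based on Attention Weight Mechanism and Guided Cost Volume Excitation
GUO Xiaodong1, HE Pingan1,2, DAI Qi1,3
(1. School of Computer Science and Technology, Zhejiang Sci-Tech University, Hangzhou 310000, China;
2. School of Science, Zhejiang Sci-Tech University, Hangzhou 310000, China;
3. School of Life Sciences and Medicine, Zhejiang Sci-Tech University, Hangzhou 310000, China)
1394614018@qq.com; pinganhe@zstu.edu.cn; daiqi@zstu.edu.cn
Abstract: Multi-view stereo networks based on a cost volume pyramid suffer from large depth prediction errors when the initial cost volume is constructed. To address this, a method is proposed that supplements 3D convolution with attention-weighted feature maps. The method introduces an attention mechanism that focuses on the spatial features of the receptive field, computes attention weights for the feature pyramid of the source-view images, and applies those weights to the original feature maps. In addition, a guided cost volume excitation module is designed to enrich the 3D convolution with feature maps. On the DTU (Technical University of Denmark) benchmark dataset, the method reaches an accuracy of 0.291 and improves overall accuracy by 6.55% over CVP-MVSNet (Cost Volume Pyramid Based Depth Inference for Multi-View Stereo), indicating that the modifications to the model are effective.
Keywords: multi-view stereo; 3D reconstruction; attention mechanism; cost volume
CLC number: TP391    Document code: A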
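The two ideas named in the abstract, reweighting a feature map with attention weights and using a feature map to excite a cost volume, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the channel-mean attention, the sigmoid gating, the projection matrix `proj` (standing in for a 1x1 convolution), and all function names and tensor shapes are assumptions chosen only for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def attention_weighted_features(feat):
    """Reweight a feature map (C, H, W) by a per-pixel attention map.

    Assumption: attention is the sigmoid of the channel-mean response;
    the paper's actual attention module is not described in the abstract.
    """
    attn = sigmoid(feat.mean(axis=0, keepdims=True))  # (1, H, W)
    return feat * attn  # broadcast the weights over all channels

def guided_cost_volume_excitation(cost, feat, proj):
    """Gate a cost volume (C, D, H, W) with weights derived from features.

    feat: (F, H, W) image features; proj: (C, F) hypothetical weights of a
    1x1 convolution, applied here as a per-pixel linear projection.
    """
    gate = sigmoid(np.einsum("cf,fhw->chw", proj, feat))  # (C, H, W)
    # The same gate is shared by every depth hypothesis along axis D.
    return cost * gate[:, None, :, :]

rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 4, 5))       # F=8 channels, 4x5 pixels
cost = rng.standard_normal((16, 6, 4, 5))   # C=16 channels, D=6 depths
proj = rng.standard_normal((16, 8))

weighted = attention_weighted_features(feat)
excited = guided_cost_volume_excitation(cost, weighted, proj)
print(weighted.shape, excited.shape)  # (8, 4, 5) (16, 6, 4, 5)
```

Because the gate lies in (0, 1), excitation in this sketch can only attenuate cost volume entries; a learned module would train `proj` jointly with the 3D convolutions rather than drawing it at random.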