Software Engineering (软件工程)

Environment Sound Feature Fusion Method Based on Spectral Dual-Pooling Attention

Tian XuHua (田旭华)

Shaanxi University of Science and Technology (陕西科技大学)

Abstract: To support the deployment of environmental sound classification algorithms on terminal devices, a lightweight multi-feature environmental sound recognition method based on spectral dual-pooling attention is proposed. It addresses the limitations of traditional feature concatenation and fusion methods in capturing inter-feature correlations and local feature information. Starting from a combined feature set consisting of the Log-Mel spectrogram, the linear harmonic spectrum, and the linear percussive spectrum, spectral dual-pooling attention jointly models global frequency information, local frequency information, and inter-channel information, thereby fusing the multi-spectrogram features effectively. Experiments on a classifier that uses a pre-trained ResNet18 backbone and is trained with a hybrid loss show that combining the proposed feature set with spectral dual-pooling attention improves the accuracy of environmental sound classification.

Keywords: environmental sound classification; combined features; spectral dual-pooling attention; pre-trained model
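
The abstract does not specify how the attention module is built, so the following is only a minimal sketch of one plausible reading. It stacks three spectrogram features as channels and derives per-frequency attention weights from two pooling operations over time. Interpreting "dual pooling" as average pooling plus max pooling, the sigmoid gating, the function name `dual_pool_frequency_attention`, and the random stand-in spectrograms of equal shape are all assumptions made for illustration; none of them is taken from the paper.

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def dual_pool_frequency_attention(features):
    """Reweight each frequency bin of each channel.

    features: array of shape (channels, freq_bins, time_frames).
    Returns the reweighted features and the attention weights.
    """
    # Assumption: "dual pooling" is read here as average + max pooling over time.
    avg_pool = features.mean(axis=2, keepdims=True)  # (C, F, 1)
    max_pool = features.max(axis=2, keepdims=True)   # (C, F, 1)
    # Combine both descriptors into one weight per channel and frequency bin.
    weights = sigmoid(avg_pool + max_pool)           # values in [0, 1]
    return features * weights, weights


# Hypothetical stand-ins for the three spectrograms; real inputs would come
# from an audio front end and would need to share one (freq, time) shape.
rng = np.random.default_rng(0)
log_mel = rng.standard_normal((64, 100))
harmonic = rng.standard_normal((64, 100))
percussive = rng.standard_normal((64, 100))

stacked = np.stack([log_mel, harmonic, percussive], axis=0)  # (3, 64, 100)
fused, weights = dual_pool_frequency_attention(stacked)
```

In this sketch the three spectrograms play the role of image channels, so the reweighted tensor `fused` has the three-channel layout that a ResNet18-style backbone expects. The sketch does not model the inter-channel or local-frequency components mentioned in the abstract.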
