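Abstract: For environmental sound classification algorithms deployed on terminal devices, conventional feature concatenation and fusion methods fall short in capturing correlations between features and in expressing local feature information. To address this, a lightweight multi-feature environmental sound recognition method based on spectral dual-pooling attention is proposed. Building on a combined feature set consisting of the Log-Mel spectrogram, the linear harmonic spectrum, and the linear percussive spectrum, the method uses spectral dual-pooling attention to jointly model global frequency information, local frequency information, and inter-channel information, thereby fusing the multiple spectrogram features effectively. Experiments were carried out on a classifier model that uses a pre-trained ResNet18 network as its backbone and incorporates a hybrid loss; the results show that using the combined features together with spectral dual-pooling attention effectively improves environmental sound classification accuracy.
Keywords: Environmental Sound Classification; Combined Features; Spectral Dual-Pooling Attention; Pre-trained Model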
CLC number:
Document code:
Environmental Sound Feature Fusion Method Based on Spectral Dual-Pooling Attention
Tian XuHua
Shaanxi University of Science and Technology
Abstract: In the context of deploying environmental sound classification algorithms on terminal devices, a lightweight multi-feature environmental sound recognition method based on spectral dual-pooling attention is proposed to address the limitations of traditional feature concatenation and fusion methods in representing inter-feature correlations and local feature information. Building on the combined features of the Log-Mel spectrogram, the linear harmonic spectrum, and the linear percussive spectrum, the method uses spectral dual-pooling attention to jointly model global frequency information, local frequency information, and inter-channel information, achieving effective fusion of the multi-spectrogram features. Experiments on a classifier model built on a pre-trained ResNet18 backbone and trained with a hybrid loss show that combining the proposed feature set with spectral dual-pooling attention effectively improves the accuracy of environmental sound classification.
Keywords: Environmental Sound Classification; Combined Features; Spectral Dual-Pooling Attention; Pre-trained Model