Cite this article: ZENG Wenxi, DONG Yuning. A Cascaded Novel Class Recognition Method Based on Information Entropy[J]. Software Engineering, 2023, 26(11): 43-47.
A Cascaded Novel Class Recognition Method Based on Information Entropy
ZENG Wenxi, DONG Yuning
(College of Telecommunications & Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing 210003, China)
673553642@qq.com; 19900011@njupt.edu.cn
Abstract: Traditional machine learning methods for novel class recognition suffer from low classification accuracy and long classification time. To address these problems, this paper proposes a cascaded novel class recognition method based on information entropy. The method uses the voting mechanism of a random forest to compute and statistically analyze the information entropy of each sample. This entropy serves as the basis for novel class detection, separating known-class samples from candidate novel-class samples. Abnormal flow samples are then filtered out of the candidate novel classes to improve classification accuracy. Experiments show that the proposed method achieves a classification accuracy of about 95% on two real-world network datasets, the NJUPT dataset (NDset) and the ISCX dataset, and that classifying a single sample takes only 0.079 ms. In both classification accuracy and time performance, it clearly outperforms representative methods from the literature.
Keywords: network traffic classification; novel class detection; information entropy
CLC number: TP391    Document code: A
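The entropy-based detection step described in the abstract can be illustrated with a minimal sketch. It assumes that each tree in the forest casts one vote per sample, that the Shannon entropy of the vote distribution is compared against a fixed threshold, and that high-entropy samples become candidate novel-class samples. The function names, the class labels, and the threshold value 0.5 are hypothetical. The paper's own statistical analysis for choosing the decision rule and its abnormal-flow filtering stage are not reproduced here.

```python
import math
from collections import Counter


def vote_entropy(votes):
    """Shannon entropy (in bits) of the class labels voted by the trees for one sample."""
    total = len(votes)
    counts = Counter(votes)
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


def split_known_and_candidates(votes_per_sample, threshold):
    """Separate samples by vote entropy.

    Returns (known, candidates), where `known` maps a sample index to its
    majority-vote label and `candidates` lists the indices whose entropy
    exceeds `threshold` (an assumed, illustrative decision rule).
    """
    known, candidates = {}, []
    for i, votes in enumerate(votes_per_sample):
        if vote_entropy(votes) > threshold:
            candidates.append(i)
        else:
            known[i] = Counter(votes).most_common(1)[0][0]
    return known, candidates


# Hypothetical votes from a 10-tree forest for two flow samples.
votes_per_sample = [
    ["video"] * 10,                # trees agree: low entropy
    ["video"] * 5 + ["chat"] * 5,  # trees split evenly: high entropy
]
known, candidates = split_known_and_candidates(votes_per_sample, threshold=0.5)
```

The intuition is that trees trained only on known classes tend to agree on samples from those classes and to disagree on samples from an unseen class, so vote entropy acts as a cheap uncertainty score that needs no extra model.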

