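Cite this article: CHENG Zhonghui, CHEN Ke, CHEN Gang, XU Shize, FU Dingli. Named Entity Recognition Method Based on Co-training of Reinforcement Learning[J]. Software Engineering (软件工程), 2020, 23(1): 7-11.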
Named Entity Recognition Method Based on Co-training of Reinforcement Learning
CHENG Zhonghui, CHEN Ke, CHEN Gang, XU Shize, FU Dingli1,2,3
(1. College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China;
2. Key Laboratory of Big Data Intelligent Computing of Zhejiang Province, Hangzhou 310027, China;
3. Zhejiang Huayun Electric Power Engineering Design & Consultation Co., Ltd., Hangzhou 310027, China)
Abstract: Named entity recognition (NER) is a technique for extracting meaningful entities from large unstructured datasets. NER has a wide range of applications; for example, date, train, platform, and other entity information can be extracted from the massive operation-control logs produced by rail transit trains for advanced data analysis. In recent years, learning-based methods have become mainstream. However, these algorithms rely heavily on manual labeling: when the training set is small, they overfit and fail to achieve the expected generalization. To address these problems, this paper proposes a co-training framework based on reinforcement learning, called Reinforced Co-Training. Given only a small amount of labeled data and without human involvement, it automatically improves the performance of the NER model by using a large amount of unlabeled data. In experiments on corpora from two different domains, the model's F1 value improves by 10% on both, which demonstrates the effectiveness and generality of the method. In addition, compared with traditional co-training methods, the F1 value of the proposed method is 5% higher than that of the other methods, showing that the method is more intelligent.
Keywords: reinforcement learning; co-training; named entity recognition
CLC number: TP391.1    Document code: A
Funding: National Key Research and Development Program of China (2017YFB1201001).
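The abstract reports its results as F1 values. As a reading aid, the following is a minimal sketch of how an entity-level F1 score is commonly computed for NER. It assumes a BIO tagging scheme and exact-match scoring of entity spans, neither of which the abstract confirms. The function names `bio_to_spans` and `entity_f1` and the tag names `DATE`, `TRAIN`, and `PLATFORM` (borrowed from the entity examples in the abstract) are illustrative only, and the paper's actual evaluation protocol may differ.

```python
from typing import List, Set, Tuple

# (start index, end index exclusive, entity type); hypothetical representation
Span = Tuple[int, int, str]


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from a BIO tag sequence (assumed tagging scheme)."""
    spans: Set[Span] = set()
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel "O" closes a trailing entity
        # The current entity continues only on an I- tag of the same type.
        inside = tag.startswith("I-") and label == tag[2:]
        if start is not None and not inside:
            spans.add((start, i, label))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return spans


def entity_f1(gold: List[str], pred: List[str]) -> float:
    """Exact-match entity-level F1: a predicted span counts only if
    its boundaries and its type both match a gold span."""
    gold_spans, pred_spans = bio_to_spans(gold), bio_to_spans(pred)
    correct = len(gold_spans & pred_spans)
    if correct == 0:
        return 0.0
    precision = correct / len(pred_spans)
    recall = correct / len(gold_spans)
    return 2 * precision * recall / (precision + recall)


# Toy example: the prediction misses the PLATFORM entity.
gold = ["B-DATE", "I-DATE", "O", "B-TRAIN", "O", "B-PLATFORM"]
pred = ["B-DATE", "I-DATE", "O", "B-TRAIN", "O", "O"]
score = entity_f1(gold, pred)  # precision 2/2, recall 2/3
```

In this toy case precision is 1 and recall is 2/3, so the harmonic mean is 0.8. The abstract does not say whether its 10% and 5% gains are absolute or relative differences in this score.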

