Software Engineering

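Cite this article:

CHENG Zhonghui, CHEN Ke, CHEN Gang, XU Shize, FU Dingli. Named Entity Recognition Method Based on Co-training of Reinforcement Learning[J]. Software Engineering, 2020, 23(1): 7-11.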


Named Entity Recognition Method Based on Co-training of Reinforcement Learning

CHENG Zhonghui, CHEN Ke, CHEN Gang, XU Shize, FU Dingli^1,2,3

(1. College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China;
2. Key Laboratory of Big Data Intelligent Computing of Zhejiang Province, Hangzhou 310027, China;
3. Zhejiang Huayun Electric Power Engineering Design & Consultation Co., Ltd., Hangzhou 310027, China)

Abstract: Named entity recognition (NER) is a technique for extracting meaningful entities from large unstructured datasets. NER has a wide range of applications; for example, date, train, platform, and other entity information can be extracted from the massive operation-control logs produced by rail transit trains for advanced data analysis. In recent years, learning-based methods have become the mainstream approach to this task. However, these algorithms rely heavily on manual labeling: when the training set is small, they overfit and fail to achieve the expected generalization. To address these problems, this paper proposes Reinforced Co-Training, a co-training framework based on reinforcement learning. Given only a small amount of labeled data, it automatically improves the performance of the NER model using a large amount of unlabeled data, with no human involvement. In experiments on corpora from two different domains, the model's F1 value increased by 10% on both, which demonstrates the effectiveness and generality of the method. Compared with traditional co-training methods, the F1 value of the proposed method is 5% higher, which shows that it is more intelligent.

Keywords: reinforcement learning; co-training; named entity recognition

CLC number: TP391.1 Document code: A

Funding: National Key Research and Development Program of China (2017YFB1201001).
