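Cite this article: WEI Xinyang, TANG Xianghong. Research on Extractive Judgment Document Abstract Generation Method based on BERT[J]. Software Engineering, 2022, 25(5): 1-4.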
Research on Extractive Judgment Document Abstract Generation Method based on BERT
WEI Xinyang1, TANG Xianghong1,2
(1. College of Computer Science and Technology, Guizhou University, Guiyang 550025, China;
2. Guizhou Provincial Key Laboratory of Public Big Data, Guizhou University, Guiyang 550025, China)
604564607@qq.com; xhtang@gzu.edu.cn
Abstract: Civil judgment documents differ from news texts in both their text structure and the distribution of their important information. To address this, this paper proposes a method based on BERT (Bidirectional Encoder Representations from Transformers) for generating structured summaries of civil judgment documents, combining coarse-grained and fine-grained extraction. First, a coarse-grained extraction method extracts the important modules of a judgment document so that the text structure is preserved. Then, a BERT-based sequence labeling method is used to build a fine-grained extractive summarization model, which further extracts information from the important modules at the sentence level to construct the final summary. Experiments show that the proposed method achieves better summary generation performance than either coarse-grained extraction alone or fine-grained extraction alone.
Keywords: judicial field; judgment documents; extractive text summarization; sequence labeling
CLC number: TP399    Document code: A
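The two-stage idea described in the abstract (coarse-grained module extraction followed by sentence-level labeling) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The module markers, the function names, and the toy English text are hypothetical, and the BERT sequence labeler is replaced here by a simple keyword-based stand-in so that the example runs without any model. In practice, a fine-tuned BERT sentence tagger would be plugged in where `keyword_labeler` is used.

```python
import re
from typing import Callable, Dict, List

# Hypothetical module markers; the paper's actual module definitions are not given here.
MARKERS: Dict[str, str] = {
    "claims": "PLAINTIFF CLAIMS:",
    "findings": "COURT FINDINGS:",
    "ruling": "RULING:",
}

# A labeler maps a list of sentences to one 0/1 label per sentence.
SentenceLabeler = Callable[[List[str]], List[int]]


def coarse_extract(document: str, markers: Dict[str, str]) -> Dict[str, str]:
    """Coarse-grained step: keep only the text of each marked module, in document order."""
    hits = sorted(
        (document.find(marker), name, marker)
        for name, marker in markers.items()
        if marker in document
    )
    modules: Dict[str, str] = {}
    for i, (start, name, marker) in enumerate(hits):
        # A module runs from the end of its marker to the start of the next marker.
        end = hits[i + 1][0] if i + 1 < len(hits) else len(document)
        modules[name] = document[start + len(marker):end].strip()
    return modules


def split_sentences(text: str) -> List[str]:
    """Naive splitter on Western and Chinese sentence-final punctuation."""
    parts = re.split(r"(?<=[.!?。！？])\s*", text)
    return [p.strip() for p in parts if p.strip()]


def fine_extract(modules: Dict[str, str], labeler: SentenceLabeler) -> Dict[str, List[str]]:
    """Fine-grained step: keep the sentences that the labeler tags with 1."""
    kept: Dict[str, List[str]] = {}
    for name, text in modules.items():
        sentences = split_sentences(text)
        labels = labeler(sentences)
        kept[name] = [s for s, y in zip(sentences, labels) if y == 1]
    return kept


def keyword_labeler(sentences: List[str]) -> List[int]:
    """Stand-in for a BERT sentence tagger (an assumption, not the paper's model)."""
    cues = ("court", "ordered", "claims")
    return [int(any(c in s.lower() for c in cues)) for s in sentences]


def summarize(document: str, markers: Dict[str, str], labeler: SentenceLabeler) -> str:
    """Run both steps and join the kept sentences per module, preserving structure."""
    kept = fine_extract(coarse_extract(document, markers), labeler)
    return "\n".join(f"{name}: {' '.join(s)}" for name, s in kept.items() if s)
```

Keeping the labeler as a plain callable separates the structural (coarse) step from the model-dependent (fine) step, so each stage can be evaluated on its own, which mirrors the comparison the abstract reports between single-stage and combined extraction.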

