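Li Mingchao, Tian Dan, Shen Yang, Jonathan Shi, Han Shuai. An intelligent extraction and analysis approach of professional technical words for hydraulic engineering by improved Word2vec technology with Attention mechanism [J]. Journal of Hydraulic Engineering (水利学报), 2020, 51(7): 816-826 |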
An intelligent extraction and analysis approach of professional technical words for hydraulic engineering by improved Word2vec technology with Attention mechanism |
Submitted: 2019-12-29 |
DOI:10.13243/j.cnki.slxb.20190920 |
Keywords: hydraulic and hydropower engineering; professional text; natural language processing (NLP); word vector; Word2vec; Attention mechanism; intelligent extraction |
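Funding: National Natural Science Foundation of China (51879185); National Key Research and Development Program of China (2018YFC0406905); National Science Fund for Excellent Young Scholars (51622904) |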
|
Abstract views: 2967 |
Full-text downloads: 524 |
Abstract: |
Text information processing and analysis in hydraulic and hydropower engineering has traditionally relied on manual interaction, which is cumbersome, inefficient, and error-prone. Building on natural language processing (NLP), this study introduces an Attention mechanism to improve the Word2vec technique and proposes an intelligent, efficient method for recognizing, extracting, and analyzing professional technical words in hydraulic and hydropower engineering. The method combines the Attention mechanism with Word2vec to build a word-vector computation model; the resulting word vectors are used to compute the similarity between words, and that similarity serves as the criterion for combining words into professional technical words. The credibility of the extracted words is then verified against existing professional texts, which automates their extraction and yields a framework for the intelligent recognition, extraction, and analysis of professional technical words. Applied to 229 weeks of weekly construction supervision reports from a concrete dam project, the method obtained 9034 professional technical words after three rounds of recognition and analysis, with an accuracy of 87.58%. It effectively improves the efficiency, accuracy, and level of automation of professional text information extraction and analysis in hydraulic and hydropower engineering. |
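The similarity-based combination step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the Attention-improved Word2vec model is not reproduced here, the word vectors are hand-made toy values, cosine similarity is assumed as the similarity measure, and the function names (`cosine_similarity`, `combine_adjacent`) and the threshold of 0.9 are hypothetical choices made only for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def combine_adjacent(tokens, vectors, threshold):
    """Greedily merge neighbouring tokens whose similarity reaches the threshold.

    Assumption: two adjacent tokens belong to the same candidate term when the
    cosine similarity of their vectors is >= threshold. Tokens without a vector
    are never merged.
    """
    if not tokens:
        return []
    merged = []
    current = [tokens[0]]
    for prev, tok in zip(tokens, tokens[1:]):
        similar = (
            prev in vectors
            and tok in vectors
            and cosine_similarity(vectors[prev], vectors[tok]) >= threshold
        )
        if similar:
            current.append(tok)  # extend the current candidate term
        else:
            merged.append(" ".join(current))  # close it and start a new one
            current = [tok]
    merged.append(" ".join(current))
    return merged


# Toy vectors (made up for illustration; a trained model would supply these).
TOY_VECTORS = {
    "concrete": [0.9, 0.1, 0.0],
    "dam": [0.8, 0.2, 0.1],
    "weekly": [0.0, 0.9, 0.3],
    "report": [0.1, 0.3, 0.9],
}

tokens = ["weekly", "report", "concrete", "dam"]
candidates = combine_adjacent(tokens, TOY_VECTORS, threshold=0.9)
print(candidates)  # ['weekly', 'report', 'concrete dam']
```

With these toy values only "concrete" and "dam" are similar enough to be merged into one candidate. In the workflow the abstract describes, such candidates would then be checked against existing professional texts before being accepted, and the recognition would be repeated over several rounds.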