Article abstract
An ontology and NLP-based knowledge graph construction method for dangers of earth-rock dam
Submitted: 2023-12-20  Revised: 2024-07-14
DOI:
Keywords: dangers of earth-rock dam  knowledge graph  ontology  natural language processing
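Funding: President's Science Foundation of the Zhejiang Institute of Hydraulics and Estuary (ZIHE21Z004); Major Science and Technology Special Program of Yunnan Province (202102AF080001)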
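Author  Affiliation  Postal code
张继勋 (Zhang Jixun)  College of Water Conservancy and Hydropower Engineering, Hohai University  210098
王虞清 (Wang Yuqing)*  College of Water Conservancy and Hydropower Engineering, Hohai University  210098
焦修明 (Jiao Xiuming)  Zhejiang Institute of Hydraulics and Estuary
张玉贤 (Zhang Yuxian)  College of Water Conservancy and Hydropower Engineering, Hohai University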
Abstract:
      During operation and maintenance, earth-rock dams may be affected by factors such as poor construction quality and extreme environmental disasters, leading to dangers such as landslides, cracks, and flood overtopping. At present, the large volume of data on earth-rock dam dangers is stored in scattered locations and in diverse structures, which makes it difficult to convert directly into experience and knowledge that can be used effectively to guide rapid danger handling. This study proposes a knowledge graph (KG) construction method for the earth-rock dam danger domain based on ontology and natural language processing (NLP). The schema layer of the graph is built top-down and the data layer is built bottom-up. The schema layer centers on three core concepts (danger types, danger causes, and danger handling measures) and establishes a domain ontology library covering four aspects of earth-rock dams: structure, process, environment, and materials. Together these form the conceptual structure of the KG. The data layer is built through data preprocessing, knowledge extraction, and semantic alignment: NLP is used to process the text, and extraction rules are formulated according to the characteristics of the corpus to obtain the concrete knowledge content of the data layer. Finally, instances of different types and the relationships among them are stored as triples, and the Neo4j graph database is used to visualize and query the KG for the earth-rock dam danger domain. The method turns scattered domain data into integrated knowledge and provides technical and theoretical support for earth-rock dam safety management and danger handling.
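The data-layer steps described in the abstract (rule-based extraction, semantic alignment, triple storage) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the regular-expression rules, the alias table, the node labels (`DangerType`, `Cause`, `Measure`), the relation names, and the two sample sentences are all assumptions made for illustration, and the study itself works on a domain corpus with NLP tooling rather than plain regexes. The sketch only builds parameterized Cypher `MERGE` statements; sending them to a Neo4j instance is left out.

```python
import re

# Hypothetical extraction rules: (pattern, head label, relation type, tail label).
# The labels mirror the three schema-layer concepts named in the abstract.
RULES = [
    (re.compile(r"(?P<head>.+?) is caused by (?P<tail>.+)"),
     "DangerType", "CAUSED_BY", "Cause"),
    (re.compile(r"(?P<head>.+?) is treated by (?P<tail>.+)"),
     "DangerType", "TREATED_BY", "Measure"),
]

# Hypothetical alias table standing in for the semantic alignment step.
ALIASES = {"seepage piping": "piping"}

# Invented sentences, used only to exercise the rules.
SAMPLE = ("Seepage piping is caused by poor compaction. "
          "Piping is treated by an inverted filter.")


def normalize(name):
    """Lower-case an entity name and map known aliases to one canonical form."""
    key = name.strip().lower()
    return ALIASES.get(key, key)


def extract_triples(text):
    """Return ((head_label, head), relation, (tail_label, tail)) triples."""
    triples = []
    for sentence in text.split("."):
        sentence = sentence.strip()
        if not sentence:
            continue
        for pattern, head_label, relation, tail_label in RULES:
            match = pattern.fullmatch(sentence)
            if match is None:
                continue
            triple = ((head_label, normalize(match["head"])),
                      relation,
                      (tail_label, normalize(match["tail"])))
            if triple not in triples:  # skip duplicates, keep order
                triples.append(triple)
    return triples


def to_cypher(triple):
    """Build a parameterized Cypher MERGE statement for one triple.

    Labels and relation types come from RULES (trusted constants);
    entity names are passed as query parameters, not interpolated.
    """
    (head_label, head), relation, (tail_label, tail) = triple
    query = (f"MERGE (h:{head_label} {{name: $head}}) "
             f"MERGE (t:{tail_label} {{name: $tail}}) "
             f"MERGE (h)-[:{relation}]->(t)")
    return query, {"head": head, "tail": tail}


if __name__ == "__main__":
    for triple in extract_triples(SAMPLE):
        print(to_cypher(triple))
```

Using `MERGE` rather than `CREATE` means that an entity mentioned in several sentences maps to a single node, which is the effect the alignment step is meant to achieve in the graph.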