Article Abstract
Research on Multi-GPU Distributed Data-Parallel Computing Methods for a 2D Hydrodynamic Model
(Original title: 二维水动力模型多GPU分布式数据并行计算方法研究)
Submitted: 2025-05-07  Revised: 2025-07-05
DOI:
Keywords: 2D hydrodynamic model; finite volume method; computational domain partitioning; NCCL communication; heterogeneous parallel computing
Funding: National Key Research and Development Program of China (2024YFC3212000)
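Author; Affiliation; Postal code
Ding Wu (丁武); College of Hydrology and Water Resources, Hohai University; 210024
Yang Fang (杨芳)*; Pearl River Hydraulic Research Institute; 510611
Wang Weiguang (王卫光); College of Hydrology and Water Resources, Hohai University;
Wang Hangang (王汉岗); Pearl River Hydraulic Research Institute;
He Yong (何用); Pearl River Hydraulic Research Institute;
Lin Chongzhe (蔺崇哲); Laoshan Reservoir Management Office, Qingdao Hairun Water Supply Group Co., Ltd.;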
Abstract:
      To address the computational bottleneck of 2D hydrodynamic models in flood simulations of complex watersheds, this study develops a multi-GPU heterogeneous parallel architecture for an unstructured triangular-mesh hydrodynamic model. The architecture combines physical-topology-based domain decomposition, distributed data parallelism, and NCCL asynchronous communication. A topology-preserving domain partitioning algorithm keeps the adjacency relationships of the unstructured triangular mesh intact while dynamically balancing the load across GPUs. A distributed data-parallel solver based on the Godunov finite volume method is coupled with an NCCL asynchronous communication strategy so that computation and communication are coordinated. In a 2D idealized dam-break test case, the model accurately captures the propagation of the dam-break shock wave and agrees closely with the theoretical solution. In the Laoshan Reservoir dam-break flood case, a 2 h flood-routing simulation completed in 15.70 s on eight GPUs, a speedup of 308.43×, and data communication between subdomains was 8.67% more efficient than with conventional MPI. The architecture scales from a single node to multi-node GPU clusters. It can serve as a hydrodynamic engine with response times on the order of seconds for digital twin watersheds, and it provides core technical support for moving flood forecasting and dispatching from static contingency plans toward dynamic rehearsal.
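The abstract does not give the partitioning algorithm itself. As a rough illustration of what keeping mesh adjacency intact across subdomains involves, the following minimal Python sketch derives, for each partition, the remote cells it would need as ghost (halo) cells and the matching send lists. The function name `build_halo`, the dictionaries `neighbors` and `owner`, and the four-cell toy mesh are all hypothetical and are not taken from the paper; in a real multi-GPU code the send lists would drive the inter-GPU exchange (for example via NCCL), which is not shown here.

```python
from collections import defaultdict


def build_halo(neighbors, owner):
    """Find the cells each partition must receive from other partitions.

    neighbors: dict cell_id -> list of edge-adjacent cell ids
               (at most three for a triangular mesh)
    owner:     dict cell_id -> partition (GPU) index that owns the cell
    Returns (halo, send):
      halo[p]       = remote cells that partition p needs as ghost cells
      send[(q, p)]  = cells owned by q that must be sent to p
    """
    halo = defaultdict(set)
    send = defaultdict(set)
    for cell, nbrs in neighbors.items():
        p = owner[cell]
        for nb in nbrs:
            q = owner[nb]
            if q != p:
                # The shared edge crosses a partition boundary, so p needs
                # the state of nb to evaluate the flux through that edge.
                halo[p].add(nb)
                send[(q, p)].add(nb)
    return dict(halo), dict(send)


# Hypothetical toy mesh: a strip of four triangles 0-1-2-3,
# with cells 0 and 1 on GPU 0 and cells 2 and 3 on GPU 1.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
owner = {0: 0, 1: 0, 2: 1, 3: 1}
halo, send = build_halo(neighbors, owner)
```

In this toy case only the edge between cells 1 and 2 is cut, so each partition needs exactly one ghost cell. Under this assumption, a partition that cuts fewer edges also exchanges less data per time step, which is one reason the choice of partition matters for communication cost.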