Research on multi-GPU distributed data parallel computing methods for a 2D hydrodynamic model

Ding Wu (丁武); Yang Fang (杨芳); Wang Weiguang (王卫光); Wang Hangang (王汉岗); He Yong (何用); Lin Chongzhe (蔺崇哲)

Article abstract

Submitted: 2025-05-07 Revised: 2025-07-05

DOI:

Keywords: 2D hydrodynamic models; finite volume methods; computational domain partitioning; NCCL communication; heterogeneous parallel computing

Funding: National Key Research and Development Program of China (2024YFC3212000)

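Author	Affiliation	Postal code
Ding Wu (丁武)	College of Hydrology and Water Resources, Hohai University (河海大学水文水资源学院)	210024
Yang Fang (杨芳)^*	Pearl River Water Resources Research Institute (珠江水利科学研究院)	510611
Wang Weiguang (王卫光)	College of Hydrology and Water Resources, Hohai University (河海大学水文水资源学院)
Wang Hangang (王汉岗)	Pearl River Water Resources Research Institute (珠江水利科学研究院)
He Yong (何用)	Pearl River Water Resources Research Institute (珠江水利科学研究院)
Lin Chongzhe (蔺崇哲)	Laoshan Reservoir Management Office, Qingdao Hairun Water Supply Group Co., Ltd. (青岛市海润自来水集团有限公司崂山水库管理处)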

Abstract:

To address the computational bottleneck of 2D hydrodynamic models in flood simulation for complex watersheds, this study develops a multi-GPU heterogeneous parallel architecture that combines physical-topology-based domain decomposition with distributed data parallelism and NCCL asynchronous communication, enabling supercomputing-level acceleration of a hydrodynamic model on unstructured triangular meshes. A topology-preserving domain partitioning algorithm keeps the adjacency relationships of the unstructured triangular mesh intact while balancing the load dynamically across GPUs. A distributed data parallel solver based on the Godunov finite volume method is coupled with an NCCL asynchronous communication strategy to form a heterogeneous acceleration architecture in which computation and communication are coordinated. In a 2D idealized dam-break test case, the model accurately captures the propagation of the dam-break shock wave and agrees closely with the theoretical solution. In the Laoshan Reservoir dam-break flood case, a 2 h flood routing simulation completed in 15.70 s on 8 GPUs, a speedup of 308.43×, and data communication between subdomains was 8.67% more efficient than with traditional MPI. The architecture scales elastically from a single node to multi-node GPU clusters. It can provide digital twin watersheds with a supercomputing-level hydrodynamic engine that responds within seconds, and offers core technical support for moving flood forecasting and dispatching from static contingency plans toward dynamic rehearsal.
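The two performance figures for the Laoshan Reservoir case measure different things, and a short calculation makes the distinction concrete. The sketch below uses only the numbers reported in the abstract. It assumes the speedup is defined as baseline run time divided by the 8-GPU run time for the same case; the abstract does not state what the baseline is, so the implied baseline time is an inference, not a reported result. The variable names are illustrative.

```python
# Figures reported in the abstract for the Laoshan Reservoir case.
simulated_seconds = 2 * 3600   # 2 h of simulated flood evolution
wall_clock_seconds = 15.70     # run time on 8 GPUs
reported_speedup = 308.43      # speedup quoted in the abstract

# Ratio of simulated time to wall-clock time: how much faster than
# real time the 8-GPU run advances the flood.
faster_than_real_time = simulated_seconds / wall_clock_seconds

# ASSUMPTION: speedup = baseline_time / parallel_time for the same case.
# The abstract does not say which baseline was used.
implied_baseline_seconds = reported_speedup * wall_clock_seconds

print(f"faster than real time: {faster_than_real_time:.1f}x")
print(f"implied baseline run time: {implied_baseline_seconds:.0f} s "
      f"({implied_baseline_seconds / 60:.1f} min)")
```

Under that assumption, the 8-GPU run is about 458.6 times faster than real time, while the 308.43× speedup corresponds to a baseline run of roughly 4842 s (about 80.7 min) for the same 2 h event.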
