Research on Multi-GPU Distributed Data-Parallel Computing Methods for a 2D Hydrodynamic Model |
Submitted: 2025-05-07 Revised: 2025-07-05 |
DOI: |
Keywords: 2D hydrodynamic model; finite volume method; computational domain partitioning; NCCL communication; heterogeneous parallel computing |
Funding: National Key Research and Development Program of China (2024YFC3212000) |
Abstract: |
To address the computational bottleneck of 2D hydrodynamic models in flood simulation for complex watersheds, this study develops a multi-GPU heterogeneous parallel architecture that combines physical-topology-based domain decomposition with distributed data parallelism and NCCL asynchronous communication, enabling supercomputing-level acceleration of a hydrodynamic model on unstructured triangular meshes. A physical-topology-preserving domain partitioning algorithm keeps the adjacency relationships of the unstructured triangular mesh intact while dynamically balancing the load across GPUs. A distributed data-parallel solver based on the Godunov finite volume method is coupled with an NCCL asynchronous communication strategy to form a heterogeneous acceleration architecture in which computation and communication work in concert. Experiments show that, in a 2D idealized dam-break test case, the model accurately captures the propagation of the dam-break shock wave and agrees closely with the theoretical solution. In the Laoshan Reservoir dam-break flood case, parallel computation on eight GPUs completed a 2 h dam-break flood routing simulation in 15.70 s, a speedup of 308.43×, and data communication between subdomains was 8.67% more efficient than with traditional MPI. The architecture scales elastically from a single node to multi-node GPU clusters. It can provide digital twin watersheds with a supercomputing-level hydrodynamic engine that responds within seconds, and it offers core technical support for moving flood forecasting and dispatching from static plans toward dynamic rehearsal. |
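The partitioning idea in the abstract (split an unstructured triangular mesh across GPUs while keeping every cell's neighbours reachable) can be illustrated with a small CPU-only sketch. This is not the authors' algorithm: the abstract does not describe it, so the sketch swaps in a simple sort of cell centroids along x as the partitioner, and every function name below (`make_mesh`, `cell_adjacency`, `partition`, `halo_cells`) is hypothetical. The halo lists it produces are the kind of per-subdomain cell sets that a communication layer such as NCCL would exchange each time step; no GPU communication is performed here.

```python
from collections import defaultdict


def make_mesh(nx, ny):
    """Build a toy triangular mesh: an nx-by-ny grid of unit squares,
    each split into two triangles. Returns (points, triangles)."""
    points = [(float(i), float(j)) for j in range(ny + 1) for i in range(nx + 1)]

    def vid(i, j):
        return j * (nx + 1) + i

    tris = []
    for j in range(ny):
        for i in range(nx):
            a, b = vid(i, j), vid(i + 1, j)
            c, d = vid(i, j + 1), vid(i + 1, j + 1)
            tris.append((a, b, c))  # lower-left triangle
            tris.append((b, d, c))  # upper-right triangle
    return points, tris


def cell_adjacency(tris):
    """Two cells are neighbours when they share an edge (a vertex pair)."""
    edge_cells = defaultdict(list)
    for cid, (a, b, c) in enumerate(tris):
        for u, v in ((a, b), (b, c), (c, a)):
            edge_cells[(min(u, v), max(u, v))].append(cid)
    adj = [[] for _ in tris]
    for cells in edge_cells.values():
        if len(cells) == 2:  # interior edge; boundary edges have one cell
            p, q = cells
            adj[p].append(q)
            adj[q].append(p)
    return adj


def partition(points, tris, n_parts):
    """ASSUMPTION: stand-in partitioner. Sort cells by centroid x and cut
    the sorted list into n_parts chunks of near-equal size."""
    def centroid_x(tri):
        return sum(points[v][0] for v in tri) / 3.0

    order = sorted(range(len(tris)), key=lambda c: (centroid_x(tris[c]), c))
    owner = [0] * len(tris)
    for rank, cid in enumerate(order):
        owner[cid] = rank * n_parts // len(tris)
    return owner


def halo_cells(adj, owner, n_parts):
    """For each part, list the cells owned elsewhere that border its own
    cells. These are the values a part must receive before computing
    fluxes across its subdomain boundary."""
    halos = [set() for _ in range(n_parts)]
    for cid, nbrs in enumerate(adj):
        for nb in nbrs:
            if owner[nb] != owner[cid]:
                halos[owner[cid]].add(nb)
    return [sorted(h) for h in halos]


if __name__ == "__main__":
    points, tris = make_mesh(8, 4)
    adj = cell_adjacency(tris)
    owner = partition(points, tris, 4)
    halos = halo_cells(adj, owner, 4)
    for part in range(4):
        print(part, "owned:", owner.count(part), "halo:", len(halos[part]))
```

With owned cells plus halo cells, each subdomain sees every neighbour it needs for edge fluxes, which is one way to read "adjacency integrity" in the abstract. Sorting by centroid balances cell counts only; balancing actual runtime load (for example, wet versus dry cells) would need weights that this sketch does not model.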