Horovod Performance on GPUs with MVAPICH2-GDR

Machine Specifications

CPU Model: AMD EPYC 7713
CPU Core Info: 2x64 @ 2 GHz
Memory: 256 GB
IB Card: Mellanox HDR (200 Gbps)
OS: Rocky Linux 8.5
OFED: MOFED 5.5-1.0.3.2
GPU: NVIDIA A100 (2/Node)

Model: ResNet-50
Batch Size: 256
Benchmark: tensorflow2_synthetic_benchmark
DL Framework: TensorFlow v2.8
CUDA: 11.2

[Figure: Horovod with TensorFlow on NVIDIA GPUs]

Model: ResNet-50
Batch Size: 256
Benchmark: pytorch_synthetic_benchmark
DL Framework: PyTorch v1.12.1
CUDA: 11.3

[Figure: Horovod with PyTorch on NVIDIA GPUs]

Machine Specifications

CPU Model: AMD EPYC 7713
CPU Core Info: 2x64 @ 2 GHz
Memory: 256 GB
IB Card: Mellanox HDR (200 Gbps)
OS: Rocky Linux 8.5
OFED: MOFED 5.5-1.0.3.2
GPU: AMD MI100 32GB (2/Node)

Model: ResNet-50
Batch Size: 32
Benchmark: tensorflow2_synthetic_benchmark
DL Framework: TensorFlow v2.11.0
ROCm: 5.1.1

[Figure: Horovod with TensorFlow on AMD GPUs]

Model: ResNet-50
Batch Size: 32
Benchmark: pytorch_synthetic_benchmark
DL Framework: PyTorch v1.12.1
ROCm: 5.1.1

[Figure: Horovod with PyTorch on AMD GPUs]