Optimized Broadcast for Deep Learning Workloads on Dense-GPU InfiniBand Clusters: MPI or NCCL? A. Awan, C. Chu, H. Subramoni, and D. Panda. The EuroMPI 2018 Conference, Sep 2018.