This page lists the group's publications on designing high-performance deep learning (DL) frameworks and on co-designing MPI runtimes to support scalable DL efficiently.
|1||A. Awan, K. Vadambacheri Manian, C. Chu, H. Subramoni, and DK Panda, Optimized Large-Message Broadcast for Deep Learning Workloads: MPI, MPI+NCCL, or NCCL2?, https://doi.org/10.1016/j.parco.2019.03.005.|
|2||X. Lu, H. Shi, R. Biswas, M. H. Javed, and DK Panda, DLoBD: A Comprehensive Study of Deep Learning over Big Data Stacks on HPC Clusters, IEEE Transactions on Multi-Scale Computing Systems, Jun 2018.|
Conferences & Workshops (12)
M.S. Thesis (1)
|1||R. Biswas, Benchmarking and Accelerating TensorFlow-based Deep Learning on Modern HPC Systems, Jul 2018.|