This page lists the group's publications on designing high-performance deep learning (DL) frameworks and on co-designing MPI runtimes to support scalable DL efficiently.

Journals (4)

1 DK Panda, H. Subramoni, C. Chu, and M. Bayatpour, The MVAPICH project: Transforming Research into High-Performance MPI Library for HPC Community, Journal of Computational Science (JOCS), Special Issue on Translational Computer Science, Oct 2020.
2 Ammar Awan, A. Jain, C. Chu, H. Subramoni, and DK Panda, Communication Profiling and Characterization of Deep Learning Workloads on Clusters with High-Performance Interconnects, IEEE Micro, vol. 40, no. 1, pp. 35-43, 1 Jan.-Feb. 2020.
3 Ammar Awan, K. Vadambacheri Manian, C. Chu, H. Subramoni, and DK Panda, Optimized Large-Message Broadcast for Deep Learning Workloads: MPI, MPI+NCCL, or NCCL2?, Parallel Computing, Volume 85, July 2019, Pages 141-152, https://doi.org/10.1016/j.parco.2019.03.005.
4 X. Lu, H. Shi, R. Biswas, M. H. Javed, and DK Panda, DLoBD: A Comprehensive Study of Deep Learning over Big Data Stacks on HPC Clusters, IEEE Transactions on Multi-Scale Computing Systems, Jun 2018.

Conferences & Workshops (23)

Ph.D. Dissertations (3)

1 C. Chu, Accelerator-enabled Communication Middleware for Large-scale Heterogeneous HPC Systems with Modern Interconnects, Jul 2020
2 J. Hashmi, Designing High Performance Shared-Address-Space and Adaptive Communication Middlewares for Next-Generation HPC Systems, Apr 2020
3 Ammar Awan, Co-designing Communication Middleware and Deep Learning Frameworks for High-Performance DNN Training on HPC Systems, Apr 2020

M.S. Thesis (1)

1 R. Biswas, Benchmarking and Accelerating TensorFlow-based Deep Learning on Modern HPC Systems, Jul 2018