MPI-GDS: High Performance MPI Designs with GPUDirect-aSync for CPU-GPU Control Flow Decoupling. A. Venkatesh, C. Chu, K. Hamidouche, S. Potluri, Davide Rossetti, D. Panda. ICPP 2017: International Conference on Parallel Processing, Aug 2017.