LaPerm: Locality Aware Scheduler for Dynamic Parallelism on GPUs

Jin Wang, Norm Rubin, Albert Sidelnik and Sudhakar Yalamanchili. “LaPerm: Locality Aware Scheduler for Dynamic Parallelism on GPUs.” The 43rd International Symposium on Computer Architecture (ISCA). June 2016.

Abstract

Recent developments in GPU execution models and architectures have introduced dynamic parallelism to facilitate the execution of irregular applications where control flow and memory behavior can be unstructured, time-varying, and hierarchical. The changes brought about by this extension to the traditional bulk synchronous parallel (BSP) model also create new challenges in exploiting the current GPU memory hierarchy. One of the major challenges is that the reference locality that exists between the parent and child thread blocks (TBs) created during dynamic nested kernel and thread block launches cannot be fully leveraged using the current TB scheduling strategies. These strategies were designed for the current implementations of the BSP model but fall short when dynamic parallelism is introduced since they are oblivious to the hierarchical reference locality.
We propose LaPerm, a new locality-aware TB scheduler that exploits such parent-child locality, both spatial and temporal. LaPerm adopts three different scheduling decisions to i) prioritize the execution of the child TBs, ii) bind them to the streaming multiprocessors (SMXs) occupied by their parent TBs, and iii) maintain workload balance across compute units. Experiments with a set of irregular CUDA applications executed on a cycle-level simulator employing dynamic parallelism demonstrate that LaPerm is able to achieve an average of 27% performance improvement over the baseline round-robin TB scheduler commonly used in modern GPUs.
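
The three scheduling decisions can be illustrated with a small software model. The Python sketch below is not the hardware scheduler described in the paper; it is a toy illustration under stated assumptions. The names `ThreadBlock`, `ToyLocalityScheduler`, `parent_smx`, and `steal_threshold` are hypothetical, and the use of per-SMX queues with threshold-based stealing is an assumed way to model the balancing step, not a mechanism taken from the abstract.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class ThreadBlock:
    tb_id: int
    # SMX on which the parent TB ran; None marks a top-level (parent) TB.
    parent_smx: Optional[int] = None


class ToyLocalityScheduler:
    """Toy model (assumption): one shared parent queue, one child queue per SMX."""

    def __init__(self, num_smx: int, steal_threshold: int = 2):
        self.num_smx = num_smx
        self.steal_threshold = steal_threshold  # hypothetical balancing knob
        self.parent_queue = deque()
        self.child_queues = [deque() for _ in range(num_smx)]

    def submit(self, tb: ThreadBlock) -> None:
        # (ii) Bind each child TB to the SMX occupied by its parent.
        if tb.parent_smx is None:
            self.parent_queue.append(tb)
        else:
            self.child_queues[tb.parent_smx].append(tb)

    def next_for(self, smx: int) -> Optional[ThreadBlock]:
        # (i) Child TBs bound to this SMX run before anything else.
        if self.child_queues[smx]:
            return self.child_queues[smx].popleft()
        # (iii) Balance load: take a child from the most backlogged SMX
        # once its queue reaches the threshold.
        victim = max(range(self.num_smx), key=lambda s: len(self.child_queues[s]))
        if len(self.child_queues[victim]) >= self.steal_threshold:
            return self.child_queues[victim].pop()
        # Otherwise start a new parent TB if one is waiting.
        if self.parent_queue:
            return self.parent_queue.popleft()
        # Last resort: run any remaining child rather than leave the SMX idle.
        if self.child_queues[victim]:
            return self.child_queues[victim].pop()
        return None


if __name__ == "__main__":
    sched = ToyLocalityScheduler(num_smx=2)
    sched.submit(ThreadBlock(0))                # top-level TB
    sched.submit(ThreadBlock(1, parent_smx=0))  # child of a TB that ran on SMX 0
    print(sched.next_for(0))
```

In this model, the binding step keeps a child on the SMX whose cache its parent most recently used, while the threshold trades some of that locality for utilization when one SMX accumulates a backlog.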

Download

paper [PDF]
presentation [PDF]

Citation

@inproceedings{wang-isca2016,
author={Jin Wang and Norm Rubin and Albert Sidelnik and Sudhakar Yalamanchili},
booktitle={The 43rd International Symposium on Computer Architecture (ISCA)},
title={LaPerm: Locality Aware Scheduler for Dynamic Parallelism on GPUs},
year={2016},
month={June},
}