The emergence of the multi-core era has led to increased interest in designing effective yet practical parallel programming models. Models based on task graphs that operate on single-assignment data are attractive in several ways: they can support dynamic applications and precisely represent the available concurrency. However, they also require nuanced algorithms for scheduling and memory management for efficient execution. In this paper, we consider memory-efficient dynamic scheduling of task graphs. Specifically, we present a novel approach for dynamically recycling the memory locations assigned to data items as they are produced by tasks. We develop algorithms to identify memory-efficient store recycling functions by systematically evaluating the validity of a set of (user-provided or automatically generated) alternatives. Because recycling function can be input data-dependent, we have also developed support for continued correct execution of a task graph in the presence of a potentially incorrect store recycling function.
Experimental evaluation demonstrates that our approach to automatic store recycling incurs little to no overheads, achieves memory usage comparable to the best manually derived solutions, often produces recycling functions valid across problem sizes and input parameters, and efficiently recovers from an incorrect choice of store recycling functions.
Revised: February 21, 2017 |
Published: December 28, 2016
Citation
Kurt M., S. Krishnamoorthy, G. Agrawal, and B. Ren. 2016.User-Assisted Store Recycling for Dynamic Task Graph Schedulers.ACM Transactions on Architecture and Code Optimization 13, no. 4:55.PNNL-SA-122158.doi:10.1145/3018111