Conference Paper

Generic, Sparse Tensor Core for Neural Networks

Abstract

Sparse neural networks are attracting growing attention for model compression, fast execution, and power reduction. State-of-the-art sparse tensor cores are designed for structured and static sparsity, and do not support generic or dynamic sparsity well. We design a sparse tensor core that supports generic sparsity pruning through a novel hybrid, blocked sparse matrix storage format, HB-ELL, which saves computation and storage while retaining the most significant elements, and which also supports dynamic sparsity in neural network data flow. Preliminary results on an NVIDIA GPU simulator show better performance than the state of the art.
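The abstract does not spell out HB-ELL's exact layout. As a rough, hypothetical sketch of the general idea behind a hybrid/blocked ELL-style format, the Python below tiles a matrix into fixed-size blocks and, for each block row, keeps only the tiles with the largest magnitude (the "most significant elements"), padded ELL-style to a fixed width. The block size, the per-row tile budget, and the L1-norm significance criterion are all assumptions for illustration, not details from the paper.

import numpy as np

def to_blocked_ell(dense, block=4, blocks_per_row=2):
    # Hypothetical blocked ELL-style layout (not the paper's HB-ELL spec):
    # tile the matrix into block x block tiles, and for each block row keep
    # only the blocks_per_row tiles with the largest L1 norm.
    m, n = dense.shape
    assert m % block == 0 and n % block == 0
    bm, bn = m // block, n // block
    # values[i, k] is the k-th kept tile of block row i;
    # col_idx[i, k] is its block-column index (-1 marks ELL padding).
    values = np.zeros((bm, blocks_per_row, block, block), dtype=dense.dtype)
    col_idx = np.full((bm, blocks_per_row), -1, dtype=np.int32)
    for i in range(bm):
        tiles = dense[i*block:(i+1)*block].reshape(block, bn, block)
        norms = np.abs(tiles).sum(axis=(0, 2))          # significance per tile
        keep = np.sort(np.argsort(norms)[-blocks_per_row:])
        for k, j in enumerate(keep):
            values[i, k] = tiles[:, j, :]
            col_idx[i, k] = j
    return values, col_idx

def blocked_ell_matmul(values, col_idx, x, block=4):
    # Multiply the compressed matrix by a dense x, skipping padded tiles;
    # this is where a blocked format saves computation over a dense matmul.
    bm, width = col_idx.shape
    out = np.zeros((bm * block, x.shape[1]), dtype=x.dtype)
    for i in range(bm):
        for k in range(width):
            j = col_idx[i, k]
            if j >= 0:
                out[i*block:(i+1)*block] += values[i, k] @ x[j*block:(j+1)*block]
    return out

Because significance is recomputed per block row at conversion time, such a format could in principle be rebuilt on the fly for dynamic sparsity, which is the property the abstract attributes to HB-ELL.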

Published: June 17, 2021

Citation

Wu X., Y. Yi, D. Tian, and J. Li. 2020. Generic, Sparse Tensor Core for Neural Networks. In 1st International Workshop on Machine Learning for Software Hardware Co-Design (MLSH 2020), in conjunction with the 29th International Conference on Parallel Architectures and Compilation Techniques (PACT 2020), October 2, 2020, Virtual. Cambridge, Massachusetts: Massachusetts Institute of Technology. PNNL-SA-156246.