Ang Li
Ang Li is a senior computer scientist who joined the High-Performance Computing (HPC) group at Pacific Northwest National Laboratory in 2016. His research has been focused on software-hardware co-design for scalable heterogeneous HPC, including graphics processing units, field-programmable gate arrays, coarse-grained reconfigurable arrays, artificial intelligence/machine learning accelerators, and quantum processors. His research covers the full-stack design from circuit level up to architecture, system, library, and applications.
He has published in major HPC venues, including the International Conference for HPC, Networking, Storage, and Analysis (SC), the International Conference on Supercomputing, and the Principles and Practice of Parallel Programming Conference, among others. Li has also published at the IEEE International Symposiums on Microarchitecture, High-Performance Computer Architecture, and Workload Characterization, as well as the IEEE Computer Society Technical Consortium on HPC.
Dr. Li’s research has been funded by multiple sponsors. His website can be found here:
Research Interests
- Quantum computing
- Heterogeneous HPC
- Artificial Intelligence/Machine Learning Acceleration
Education
- PhD in Electrical and Computer Engineering, National University of Singapore
- PhD in Electrical Engineering, Technische Universiteit Eindhoven
- BS in Computer Science, Zhejiang University
Affiliations and Professional Service
- Member, IEEE
- Member, Association for Computing Machinery
- Organizing Committee Member and Program Chair/External Review Committee Member for Principles and Practice of Parallel Programming, SC, the Architectural Support for Programming Languages and Operating Systems Conference, IEEE International Symposiums on Microarchitecture, Parallel Computing Technologies, the International Symposium on Computer Architecture, the International Parallel and Distributed Processing Symposium, and more
Awards and Recognitions
- Best Paper Award, IEEE International Conference on Computer Design (2021)
- Best Paper Nomination, SC (2015, 2017, 2020)
- Best Paper Nomination, HPC Architecture and Workload Characterization (2018)
- Paper Award, European High Performance and Embedded Architecture and Compilation (2017)
- Best Student Paper Nomination, SC (2015)
Publications
- Baheri B., D. Chen, B. Fang, S.A. Stein, V. Chaudhary, Y. Mao, and S. Xu, et al. 2021. "TQEA: Temporal Quantum Error Analysis." In 51st Annual IEEE/IFIP International Conference on Dependable Systems and Networks - Supplemental Volume (DSN-S 2021), June 21-24, 2021, Taipei, Taiwan, 65-67. Piscataway, New Jersey: IEEE. PNNL-SA-159972. doi:10.1109/DSN-S52858.2021.00034
- Feng B., Y. Wang, T. Geng, A. Li, and Y. Ding. 2021. "APNN-TC: Accelerating Arbitrary Precision Neural Networks on Ampere GPU Tensor Cores." In International Conference for High Performance Computing, Networking, Storage and Analysis: Science and Beyond (SC 2021), November 14-19, 2021, Virtual, Online, Art. No. 37. Los Alamitos, California: IEEE Computer Society. PNNL-SA-161389. doi:10.1145/3458817.3476157
- Geng T., A. Li, T. Wang, C. Wu, Y. Li, R. Shi, and W. Wu, et al. 2021. "O3BNN-R: An Out-Of-Order Architecture for High-Performance and Regularized BNN Inference." IEEE Transactions on Parallel and Distributed Systems 32, no. 1:199-213. PNNL-SA-148318. doi:10.1109/TPDS.2020.3013637
- Geng T., C. Wu, C. Tan, C. Xie, A. Guo, P. Haghi, and S. He, et al. 2021. "A Survey: Handling Irregularities in Neural Network Acceleration with FPGAs." In IEEE High Performance Extreme Computing Conference. PNNL-SA-165315. doi:10.1109/HPEC49654.2021.9622877
- Geng T., C. Wu, Y. Zhang, C. Tan, C. Xie, H. You, and M. Herbordt, et al. 2021. "I-GCN: A Graph Convolutional Network Accelerator with Runtime Locality Enhancement through Islandization." In Proceedings of the 54th IEEE/ACM Annual International Symposium on Microarchitecture (MICRO 2021), October 18-22, 2021, Virtual, Online, 1051-1063. Los Alamitos, California: IEEE Computer Society. PNNL-SA-161514. doi:10.1145/3466752.3480113
- Huang R., Y. Chen, T. Yin, X. Li, A. Li, J. Tan, and W. Yu, et al. 2021. "Accelerated Derivative-free Deep Reinforcement Learning for Large-scale Grid Emergency Voltage Control." IEEE Transactions on Power Systems. PNNL-SA-153819. doi:10.1109/TPWRS.2021.3095179
- Li A., and S. Su. 2021. "Accelerating Binarized Neural Networks via Bit-Tensor-Cores in Turing GPUs." IEEE Transactions on Parallel and Distributed Systems 32, no. 7:1878-1891. PNNL-SA-156570. doi:10.1109/TPDS.2020.3045828
- Li A., B. Fang, C.E. Granade, G. Prawiroatmodjo, B. Heim, M. Roetteler, and S. Krishnamoorthy. 2021. "SV-Sim: Scalable PGAS-based State Vector Simulation of Quantum Circuits." In Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis (SC 2021), November 14-19, 2021, Virtual, Online, Art. No. 97. New York, New York: Association for Computing Machinery. PNNL-SA-161181. doi:10.1145/3458817.3476169
- Manu D., Y. Sheng, J. Yang, J. Deng, T. Geng, A. Li, and C. Ding, et al. 2021. "FL-DISCO: Federated Generative Adversarial Network for Graph-based Molecule Drug Discovery." In International Conference On Computer Aided Design. PNNL-SA-166598. doi:10.1109/ICCAD51958.2021.9643440
- Peng H., S. Chen, Z. Wang, J. Yang, S. Weitze, T. Geng, and A. Li, et al. 2021. "Optimizing FPGA-based Accelerator Design for Large-Scale Molecular Similarity Search." In International Conference On Computer Aided Design. PNNL-SA-166147. doi:10.1109/ICCAD51958.2021.9643528
- Peng H., S. Huang, T. Geng, A. Li, W. Jiang, H. Liu, and S. Wang, et al. 2021. "Accelerating Transformer-based Deep Learning Models on FPGAs using Column Balanced Block Pruning." In Proceedings of the 22nd International Symposium on Quality Electronic Design (ISQED 2021), April 7-9, 2021, Santa Clara, CA, 142-148. Piscataway, New Jersey: IEEE. PNNL-SA-159983. doi:10.1109/ISQED51717.2021.9424344
- Peng H., S. Zhou, S. Weitze, J. Li, S. Islam, T. Geng, and A. Li, et al. 2021. "Binary Complex Neural Network Acceleration on FPGA." In IEEE 32nd International Conference on Application-specific Systems, Architectures and Processors (ASAP 2021), July 7-9, 2021, Virtual, 1, 85-92. Los Alamitos, California: IEEE Computer Society. PNNL-SA-165575. doi:10.1109/ASAP52443.2021.00021
- Tan C., N. Bohm Agostini, J. Zhang, M. Minutoli, V.G. Castellana, C. Xie, and T. Geng, et al. 2021. "OpenCGRA: Democratizing Coarse-Grained Reconfigurable Arrays." In The 32nd IEEE International Conference on Application-specific Systems, Architectures and Processors. PNNL-SA-163425. doi:10.1109/ASAP52443.2021.00029
- Tan C., T. Geng, C. Xie, N. Bohm Agostini, J. Li, A. Li, and K.J. Barker, et al. 2021. "DynPaC: Coarse-Grained, Dynamic, and Partially Reconfigurable Array for Streaming Applications." In ICCD: The 39th IEEE International Conference on Computer Design. PNNL-SA-163151. doi:10.1109/ICCD53106.2021.00018
- Tan C., C. Xie, A. Li, K.J. Barker, and A. Tumeo. 2021. "AURORA: Automated Refinement of Coarse-Grained Reconfigurable Accelerators." In Proceedings - Design Automation and Test In Europe (DATE 2021), February 1-5, 2021, Virtual, Online, 2021, 1388-1393; Paper No. 9473955. Piscataway, New Jersey: IEEE. PNNL-SA-156552. doi:10.23919/DATE51398.2021.9473955
- Tan C., C. Xie, T. Geng, A. Marquez, A. Tumeo, K.J. Barker, and A. Li. 2021. "ARENA: Asynchronous Reconfigurable Accelerator Ring to Enable Data-Centric Parallel Computing." IEEE Transactions on Parallel and Distributed Systems 32, no. 12:2880-2892. PNNL-SA-152862. doi:10.1109/TPDS.2021.3081074
- Xie C., J. Chen, J.S. Firoz, J. Li, S. Song, K.J. Barker, and M.V. Raugas, et al. 2021. "Fast and Scalable Sparse Triangular Solver for Multi-GPU Based HPC Architectures." In 50th International Conference on Parallel Processing (ICPP-21), August 9-12, 2021, Lemont, IL, Article No. 53, pages 1-11. New York, New York: Association for Computing Machinery. PNNL-SA-150878. doi:10.1145/3472456.3472478
- Zhang Y., H. You, Y. Fu, T. Geng, A. Li, and Y. Lin. 2021. "G-CoS: GNN-Accelerator Co-Search Towards Both Better Accuracy and Efficiency." In International Conference on Computer-Aided Design. PNNL-SA-164098. doi:10.1109/ICCAD51958.2021.9643549
- Firoz J.S., A. Li, J. Li, and K.J. Barker. 2020. "On the Feasibility of Using Reduced-Precision Tensor Core Operations for Graph Analytics." In IEEE High Performance Extreme Computing Conference (HPEC 2020), September 22-24, 2020, Waltham, MA, 1-7. Piscataway, New Jersey: IEEE. PNNL-SA-153853. doi:10.1109/HPEC43674.2020.9286152
- Geng T., A. Li, R. Shi, C. Wu, T. Wang, Y. Li, and P. Haghi, et al. 2020. "AWB-GCN: A Graph Convolutional Network Accelerator with Runtime Workload Rebalancing." In Proceedings 53rd IEEE/ACM International Symposium on Microarchitecture (MICRO), October 17-21, 2020, Athens, Greece, 922-936. Piscataway, New Jersey: IEEE. PNNL-SA-146537. doi:10.1109/MICRO50266.2020.00079
- Geng T., C. Wu, C. Tan, B. Fang, A. Li, and M. Herbordt. 2020. "CQNN: a CGRA-based QNN Framework." In IEEE High Performance Extreme Computing Conference (HPEC 2020), September 22-24, 2020, Waltham, MA, 1-7. Piscataway, New Jersey: IEEE. PNNL-SA-153940. doi:10.1109/HPEC43674.2020.9286194
- Li A., O. Subasi, X. Yang, and S. Krishnamoorthy. 2020. "Density Matrix Quantum Circuit Simulation via the BSP Machine on Modern GPU Clusters." In International Conference for High Performance Computing, Networking, Storage and Analysis (SC 2020), November 9-19, 2020, Atlanta, GA, 1-15. Piscataway, New Jersey: IEEE. PNNL-SA-143160. doi:10.1109/SC41405.2020.00017
- Li A., S. Song, J. Chen, J. Li, X. Liu, N.R. Tallent, and K.J. Barker. 2020. "Evaluating Modern GPU Interconnect: PCIe, NVLink, NV-SLI, NVSwitch and GPUDirect." IEEE Transactions on Parallel and Distributed Systems 31, no. 1:94-110. PNNL-SA-141707. doi:10.1109/TPDS.2019.2928289
- Li J., M. Lakshminarasimhan, X. Wu, A. Li, C. Olschanowsky, and K.J. Barker. 2020. "A Sparse Tensor Benchmark Suite for CPUs and GPUs." In IEEE International Symposium on Workload Characterization (IISWC 2020), October 27-30, 2020, Beijing, China, 193-204. Piscataway, New Jersey: IEEE. PNNL-SA-142736. doi:10.1109/IISWC50251.2020.00027
- Shi R., P. Dong, T. Geng, Y. Ding, X. Ma, H. So, and M. Herbordt, et al. 2020. "CSB-RNN: A Faster-Than-Realtime RNN Acceleration Framework with Compressed Structured Blocks." In Proceedings of the 34th International Conference on Supercomputing (ICS 2020), June 29-July 2, 2020, Barcelona, Spain, Article No. 24. New York, New York: Association for Computing Machinery. PNNL-SA-150973. doi:10.1145/3392717.3392749
- Tan C., C. Xie, A. Li, K.J. Barker, and A. Tumeo. 2020. "OpenCGRA: An Open-Source Unified Framework for Modeling, Testing, and Evaluating CGRAs." In IEEE 38th International Conference on Computer Design (ICCD 2020), October 18-21, 2020, 381-388. Piscataway, New Jersey: IEEE. PNNL-SA-152863. doi:10.1109/ICCD50377.2020.00070
- Wang T., T. Geng, A. Li, X. Jin, and M. Herbordt. 2020. "FPDeep: Scalable Acceleration of CNN Training on Deeply-Pipelined FPGA Clusters." IEEE Transactions on Computers 69, no. 8:1143-1158. PNNL-SA-140455. doi:10.1109/TC.2020.3000118
- Zou P., A. Li, K.J. Barker, and R. Ge. 2020. "Detecting Anomalous Computation with RNNs on GPU-Accelerated HPC Machines." In Proceedings of the 49th International Conference on Parallel Processing (ICPP 2020), August 17-20, 2020, Online, Article No. 3404435. New York, New York: Association for Computing Machinery. PNNL-SA-148325. doi:10.1145/3404397.3404435
- Zou P., A. Li, K.J. Barker, and R. Ge. 2020. "Indicator-directed Dynamic Power Management for Iterative Workloads on GPU-Accelerated Systems." In The 20th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGrid 2020), May 11-14, 2020, Melbourne, Australia, 559-568. Piscataway, New Jersey: IEEE. PNNL-SA-148280. doi:10.1109/CCGrid49817.2020.00-37
- Geng T., T. Wang, C. Wu, C. Yang, W. Wu, A. Li, and M. Herbordt. 2019. "O3BNN: An Out-Of-Order Architecture for High-Performance Binarized Neural Network Inference with Fine-Grained Pruning." In Proceedings of International Conference on Supercomputing, June 2019, Phoenix, AZ, 461-472. New York, New York: Association for Computing Machinery. PNNL-SA-141065. doi:10.1145/3330345.3330386
- Geng T., T. Wang, C. Wu, S. Song, A. Li, and M. Herbordt. 2019. "LP-BNN: Ultra-low-Latency BNN Inference with Layer Parallelism." In IEEE 30th International Conference on Application-specific Systems, Architectures and Processors (ASAP 2019), July 15-17, 2019, New York, 9-16. Piscataway, New Jersey: IEEE. PNNL-SA-143161. doi:10.1109/ASAP.2019.00-43
- Li A., T. Geng, T. Wang, M. Herbordt, S. Song, and K.J. Barker. 2019. "BSTC: A Novel Binarized-Soft-Tensor-Core Design for Accelerating Bit-Based Approximated Neural Nets." In International Conference for High Performance Computing, Networking, Storage, and Analysis, November 17-22, 2019, Denver, CO, Article No a38. Los Alamitos, California: IEEE Computer Society. PNNL-SA-142851. doi:10.1145/3295500.3356169
- Li J., Y. Ma, X. Wu, A. Li, and K.J. Barker. 2019. "PASTA: A Parallel Sparse Tensor Algorithm Benchmark Suite." CCF Transactions on High Performance Computing 1, no. 2:111-130. PNNL-SA-140675. doi:10.1007/s42514-019-00012-w
- Xie C., X. Zhang, A. Li, X. Fu, and S. Song. 2019. "PIM-VR: Erasing Motion Anomalies In Highly-Interactive Virtual Reality World With Customized Memory Cube." In IEEE International Symposium on High Performance Computer Architecture. PNNL-SA-143513. doi:10.1109/HPCA.2019.00013
- Zou P., A. Li, K.J. Barker, and R. Ge. 2019. "Fingerprinting Anomalous Computation with RNN for GPU-accelerated HPC Machines." In IEEE International Symposium on Workload Characterization (IISWC 2019), November 3-5, 2019, Orlando, FL, 253-256. Piscataway, New Jersey: IEEE. PNNL-SA-144356. doi:10.1109/IISWC47752.2019.9042165
- Li A., W. Liu, L. Wang, K.J. Barker, and S. Song. 2018. "Warp-Consolidation: A Novel Execution Model for GPUs." In The 32nd ACM International Conference on Supercomputing. PNNL-SA-133947. doi:10.1145/3205289.3205294
- Li A., S. Song, J. Chen, X. Liu, N.R. Tallent, and K.J. Barker. 2018. "Tartan: Evaluating Modern GPU Interconnect via a Multi-GPU Benchmark Suite." In IEEE International Symposium on Workload Characterization (IISWC 2018), September 30-October 2, 2018, 191-202. Piscataway, New Jersey: IEEE. PNNL-SA-137642. doi:10.1109/IISWC.2018.8573483
- Shen D., A. Li, S. Song, and X. Liu. 2018. "CUDAAdvisor: LLVM-based Runtime Profiling for Modern GPUs." In International Symposium on Code Generation and Optimization. PNNL-SA-143512. doi:10.1145/3168831
- Wang L., J. Ye, Y. Zhao, W. Wu, A. Li, S. Song, and Z. Xu, et al. 2018. "SuperNeurons: Dynamic GPU Memory Management for Training Deep Neural Networks." In Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPOPP 2018), February 24-28, 2018, Vienna, Austria, 41-53. New York, New York: ACM. PNNL-SA-143407. doi:10.1145/3200691.3178491
- Li A., W. Liu, M. Kristensen, B. Vinter, H. Wang, K. Hou, and A. Marquez, et al. 2017. "Exploring And Analyzing the Real Impact of Modern On-Package Memory on HPC Scientific Kernels." In International Conference for High Performance Computing, Networking, Storage and Analysis. PNNL-SA-143163. doi:10.1145/3126908.3126931
- Li A., S. Song, W. Liu, X. Liu, A. Kumar, and H. Corporaal. 2017. "Locality-Aware CTA Clustering for Modern GPUs." In The 22nd ACM International Conference on Architectural Support for Programming Languages and Operating Systems. PNNL-SA-143164. doi:10.1145/3093337.3037709
- Li A., W. Zhao, and S. Song. 2017. "BVF: Enabling Significant On-Chip Power Savings via Bit-Value-Favor for Throughput Processors." In The 50th Annual IEEE/ACM International Symposium on Microarchitecture. PNNL-SA-130500. doi:10.1145/3123939.3123944
- Liu W., A. Li, J.D. Hogg, I.S. Duff, and B. Vinter. 2017. "Fast synchronization-free algorithms for parallel sparse triangular solves with multiple right-hand sides." Concurrency and Computation: Practice and Experience 29, no. 21:Article No. e4244. PNNL-SA-130501. doi:10.1002/cpe.4244
- Zhao W., A. Li, Y. Wang, and Y. Ha. 2017. "Analysis and Design of Energy-Efficient Data-Dependent SRAM." In IEEE 12th International Conference on ASIC. PNNL-SA-143165. doi:10.1109/ASICON.2017.8252625