Ang Li
Biography
Ang Li is a senior computer scientist who joined the High-Performance Computing (HPC) group at Pacific Northwest National Laboratory in 2016. His research has been focused on software-hardware co-design for scalable heterogeneous HPC, including graphics processing units, field-programmable gate arrays, coarse-grained reconfigurable arrays, artificial intelligence/machine learning accelerators, and quantum processors. His research covers the full-stack design from circuit level up to architecture, system, library, and applications.
He has published in major HPC venues, including the International Conference for HPC, Networking, Storage, and Analysis (SC), the International Conference on Supercomputing, and the Principles and Practice of Parallel Programming Conference, among others. Li has also published at IEEE International Symposiums on Microarchitecture, HPC Architecture, and Workload Characterization, as well as the IEEE Computer Society Technical Consortium on HPC.
Dr. Li’s research has been funded by multiple sponsors. His website can be found here: http://www.angliphd.com/.
Research Interests
- Quantum computing
- Heterogeneous HPC
- Artificial Intelligence/Machine Learning Acceleration
Education
- PhD in Electrical and Computer Engineering, National University of Singapore
- PhD in Electrical Engineering, Technische Universiteit Eindhoven
- BS in Computer Science, Zhejiang University
Affiliations and Professional Service
- Member, IEEE
- Member, Association for Computing Machinery
- Organizing Committee Member and Program Chair/External Review Committee Member for Principles and Practice of Parallel Programming, SC, the Architectural Support for Programming Languages and Operating Systems Conference, IEEE International Symposiums on Microarchitecture, Parallel Computing Technologies, the International Symposium on Computer Architecture, the International Parallel and Distributed Processing Symposium, and more
Awards and Recognitions
- Best Paper Award, IEEE Cluster Conference (2022)
- Best Paper Nomination, International Symposium on Computer Architecture (2022)
- Best Paper Award, IEEE International Conference on Computer Design (2021)
- Best Paper Nomination, SC (2015, 2017, 2020)
- Best Paper Nomination, HPC Architecture and Workload Characterization (2018)
- Paper Award, European High Performance and Embedded Architecture and Compilation (2017)
- Best Student Paper Nomination, SC (2015)
Publications
2025
- Liu C., C. Wu, R. Song, Y. Chen, A. Li, M. Huang, and T. Geng. 2025. "Nature-GL: A Revolutionary Learning Paradigm Unleashing Nature’s Power in Real-World Spatial-Temporal Graph Learning." In 30th Asia and South Pacific Design Automation Conference (ASPDAC). PNNL-SA-205614.
2024
- Li Y., J. Liu, X. Zhao, W. Liu, T. Geng, A. Li, and X. Zhang. 2024. "Accurate and Data-Efficient Micro X-ray Diffraction Phase Identification Using Multitask Learning: Application to Hydrothermal Fluids." Advanced Intelligent Systems 6, no. 12:Art. No. 2400204. PNNL-SA-196045. doi:10.1002/aisy.202400204
- Burns M.X., C. Liu, S.A. Stein, B. Peng, K. Kowalski, and A. Li. 2024. "GALIC: Hybrid Multi-Qubitwise Pauli Grouping for Quantum Computing Measurement." Quantum Science and Technology. PNNL-SA-203031. doi:10.1088/2058-9565/ad9d74
- Zheng M., B. Peng, A. Li, X. Yang, and K. Kowalski. 2024. "Unleashed from Constrained Optimization: Quantum Computing for Quantum Chemistry Employing Generator Coordinate Inspired Method." npj Quantum Information. PNNL-SA-193552. doi:10.1038/s41534-024-00916-8
- Wu C., R. Song, C. Liu, Y. Chen, A. Li, D. Liu, and Y. Wu, et al. 2024. "NP-NDS: A Nature-Powered Nonlinear Dynamical System for Power Grid Forecasting." In Learning on Graphs Conference. PNNL-SA-205914.
- Yin K., X. Fang, T.S. Humble, A. Li, Y. Shi, and Y. Ding. 2024. "FlexiSCD: Flexible Surface Code Deformer for Dynamic Defects." In IEEE/ACM International Symposium on Microarchitecture. PNNL-SA-201279.
- Haghi P., C. Wu, Z. Azad, Y. Li, A. Gui, Y. Hao, and A. Li, et al. 2024. "Bridging the Gap Between LLMs and LNS with Dynamic Data Format and Architecture Codesign." In Proceedings of the 57th IEEE/ACM International Symposium on Microarchitecture (MICRO 2024), November 2-6, 2024, Austin, TX, 1617-1631. Los Alamitos, California: IEEE Computer Society. PNNL-SA-192868. doi:10.1109/MICRO61859.2024.00118
- Zheng M., Y. Chen, X. Yang, and A. Li. 2024. "Early Exploration of a Flexible Framework for Efficient Quantum Linear Solvers in Power Systems." In IEEE Power & Energy Society General Meeting (PESGM 2024), July 21-25, 2024, Seattle, WA, 1-5. Piscataway, New Jersey: IEEE. PNNL-SA-191945. doi:10.1109/PESGM51994.2024.10688916
- Huang Q., R. Huang, T. Yin, S. Datta, X. Sun, Z. Hou, and J. Tan, et al. 2024. "Towards Intelligent Emergency Control for Large-scale Power Systems: Convergence of Learning, Physics, Computing and Control." Electric Power Systems Research 235:Art. No. 110648. PNNL-SA-190811. doi:10.1016/j.epsr.2024.110648
- Fang B., X. Li, H. Dam, C. Tan, S. Hari, T. Tsai, and I. Laguna, et al. 2024. "Understanding Mixed Precision GEMM with MPGemmFI: Insights into Fault Resilience." In IEEE International Conference on Cluster Computing (CLUSTER 2024), September 24-27, 2024, Kobe, Japan, 166-178. Piscataway, New Jersey: IEEE. PNNL-SA-183954. doi:10.1109/CLUSTER59578.2024.00022
- Kan S., Z. Du, M. Palma, S.A. Stein, C. Liu, W. Wei, and J. Chen, et al. 2024. "Scalable Circuit Cutting and Scheduling in a Resource-constrained and Distributed Quantum System." In IEEE International Conference on Quantum Computing and Engineering (QCE). PNNL-SA-204678.
- Ang J.A., G. Carini, Y. Chen, I. Chuang, M.A. Demarco, S. Economou, and A. Eickbusch, et al. 2024. "ARQUIN: Architectures for Multinode Superconducting Quantum Computers." ACM Transactions on Quantum Computing 5, no. 3:Art. No. 19. PNNL-SA-189729. doi:10.1145/3674151
- Kan S., M. Palma, Z. Du, S.A. Stein, C. Liu, J. Chen, and A. Li, et al. 2024. "Benchmarking Optimizers for Qumode State Preparation with Variational Quantum Algorithms." In IEEE International Conference on Quantum Computing and Engineering (QCE). PNNL-SA-204634.
- Senapati P., Z. Wang, W. Jiang, A. Li, B. Fang, and Q. Guan. 2024. "PQML: Enabling the Predictive Reproducibility on NISQ Machines for Quantum ML Applications." In IEEE International Conference on Quantum Computing and Engineering (QCE24). PNNL-SA-180185.
- Song R., C. Wu, C. Liu, A. Li, M. Huang, and T. Geng. 2024. "DS-GL: Advancing Graph Learning via Harnessing the Power of Nature within Dynamic Systems." In ACM/IEEE 51st Annual International Symposium on Computer Architecture (ISCA 2024), June 29-July 3, 2024, Buenos Aires, Argentina, 45-57. Los Alamitos, California: IEEE Computer Society. PNNL-SA-196761. doi:10.1109/ISCA59077.2024.00014
- Hood K., A. Li, and Q. Guan. 2024. "SNNPG: Using Spiking Neural Networks to Detect Attacks in the Power Grid." In International Conference on Neuromorphic Systems (ICONS 2024), July 30-August 2, 2024, Arlington, VA, 224-228. Los Alamitos, California: IEEE Computer Society. PNNL-SA-199477. doi:10.1109/ICONS62911.2024.00039
- Wang Z., Y. Wang, B. Feng, G. Huang, D. Mudigere, B. Muthiah, and A. Li, et al. 2024. "OPER: Optimality-Guided Embedding Table Parallelization for Large-scale Recommendation Model." In USENIX Annual Technical Conference, July 10-12, 2024, Santa Clara, CA, 667-682. Berkeley, California: USENIX: The Advanced Computing Systems Association. PNNL-SA-178386.
- Sun W., A. Li, S. Stuijk, and H. Corporaal. 2024. "How much can we gain from Tensor Kernel Fusion on GPUs?" IEEE Access 12:126135 - 126144. PNNL-SA-186332. doi:10.1109/ACCESS.2024.3411473
- Haghi P., C. Tan, A. Guo, C. Wu, D. Liu, A. Li, and A. Skjellum, et al. 2024. "SmartFuse: Reconfigurable Smart Switches to Accelerate Fused Collectives in HPC Applications." In Proceedings of the 38th ACM International Conference on Supercomputing (ICS 2024), June 4-7, 2024, Kyoto, Japan, 413–425. New York, New York: Association for Computing Machinery. PNNL-SA-197072. doi:10.1145/3650200.3656616
- Wu C., C. Yang, S. Bandara, T. Geng, A. Guo, P. Haghi, and A. Li, et al. 2024. "FPGA-Accelerated Range-Limited Molecular Dynamics." IEEE Transactions on Computers 73, no. 6:1544 - 1558. PNNL-SA-185033. doi:10.1109/TC.2024.3375613
- Li A., A. Baroni, I. Stetcu, and T.S. Humble. 2024. "Deep Quantum Circuit Simulations of Low-energy Nuclear States." European Physical Journal A. Hadrons and nuclei. 60. PNNL-SA-189734. doi:10.1140/epja/s10050-024-01286-7
- Li X., A. Li, B. Fang, I. Laguna, and G. Gopalakrishnan. 2024. "FTTN: Feature-Targeted Testing for Numerical Properties of NVIDIA & AMD Matrix Accelerators." In IEEE 24th International Symposium on Cluster, Cloud and Internet Computing. PNNL-SA-190978.
- Wu C., R. Song, C. Liu, Y. Yang, A. Li, M. Huang, and T. Geng. 2024. "Extending Power of Nature from Binary Problems to Real-Valued Graph Learning in Real World." In Proceedings of the Twelfth International Conference on Learning Representations (ICLR 2024), May 7, 2024, Vienna, Austria, 2334-2339. Appleton, Wisconsin: International Conference on Learning Representations. PNNL-SA-194054.
- Peng H., C. Ding, T. Geng, S. Choudhury, K.J. Barker, and A. Li. 2024. "Evaluating Emerging AI/ML Accelerators: IPU, RDU, and NVIDIA/AMD GPUs." In Companion of the 15th ACM/SPEC International Conference on Performance Engineering (ICPE 2024), May 7-11, 2024, London, 14 - 20. New York, New York: Association for Computing Machinery. PNNL-SA-188060. doi:10.1145/3629527.3651428
- Mao Z., X. Li, S. Hu, G. Gopalakrishnan, and A. Li. 2024. "A GPU accelerated mixed-precision Smoothed Particle Hydrodynamics framework with cell-based relative coordinate." Engineering Analysis with Boundary Elements 161. PNNL-SA-190575. doi:10.1016/j.enganabound.2024.01.020
- Wang Z., Y. Wang, J. Deng, D. Zheng, A. Li, and Y. Ding. 2024. "RAP: Resource-aware Automated GPU Sharing for Multi-GPU Recommendation Model Training and Input Preprocessing." In Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS 2024), April 27-May 1, 2024, San Diego, CA, 2, 964-979. New York, New York: Association for Computing Machinery. PNNL-SA-189479. doi:10.1145/3620665.3640406
- Wang M., B. Fang, A. Li, and P. Nair. 2024. "Red-QAOA: Efficient Variational Optimization through Circuit Reduction." In ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS 2024), April 27-May 1, 2024, La Jolla, CA, 2, 990-998. New York, New York: Association for Computing Machinery. PNNL-SA-192496. doi:10.1145/3620665.3640363
- Li J., A. Li, and W. Jiang. 2024. "Quapprox: A Framework for Benchmarking the Approximability of Variational Quantum Circuit." In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2024), April 14-19, 2024, Seoul, Republic of Korea, 13376-13380. Piscataway, New Jersey: IEEE. PNNL-SA-193669. doi:10.1109/ICASSP48485.2024.10447919
- Helal H., J.S. Firoz, J.A. Bilbrey, H.W. Sprueill, K.M. Herman, M.M. Krell, and T. Murray, et al. 2024. "Acceleration of Graph Neural Network-based Prediction Models in Chemistry via Co-design Optimization on Intelligence Processing Units." Journal of Chemical Information and Modeling 64, no. 5:1568–1580. PNNL-SA-193670. doi:10.1021/acs.jcim.3c01312
- L'Abbate R., A. D'Onofrio, S.A. Stein, S. Chen, A. Li, P. Chen, and J. Chen, et al. 2024. "A Quantum-Classical Collaborative Training Architecture Based on Quantum State Fidelity." IEEE Transactions on Quantum Engineering 5. PNNL-SA-195015. doi:10.1109/TQE.2024.3367234
2023
- Peng H., S. Huang, T. Zhou, Y. Luo, C. Wang, Z. Wang, and J. Zhao, et al. 2023. "AutoReP: Automatic ReLU Replacement for Fast Private Network Inference." In IEEE/CVF International Conference on Computer Vision (ICCV 2023), October 1-6, 2023, Paris, France, 5155-5165. Piscataway, New Jersey: IEEE. PNNL-SA-187458. doi:10.1109/ICCV51070.2023.00478
- D'Onofrio A., A. Hossain, L. Santana, N. Machlovi, S.A. Stein, J. Liu, and A. Li, et al. 2023. "Distributed Quantum Learning with co-Management in a Multi-tenant Quantum System." In IEEE International Conference on Big Data (BigData 2023), December 15-18, 2023, Sorrento, Italy, 221-228. Piscataway, New Jersey: IEEE. PNNL-SA-192282. doi:10.1109/BigData59044.2023.10386676
- Stein S.A., S. Sussman, T.J. Tomesh, C. Guinn, E. Tureci, S.F. Lin, and W. Tang, et al. 2023. "HetArch: Heterogeneous Microarchitectures for Superconducting Quantum Systems." In Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2023), October 28-November 1, 2023, Toronto, Canada, 539 - 554. New York, New York: Association for Computing Machinery. PNNL-SA-185242. doi:10.1145/3613424.3614300
- Wu A., Y. Ding, and A. Li. 2023. "QuComm: Optimizing Collective Communication for Distributed Quantum Computing." In Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2023), October 28-November 1, 2023, Toronto, Canada, 479 - 493. New York, New York: Association for Computing Machinery. PNNL-SA-177285. doi:10.1145/3613424.3614253
- Li J., Z. Wang, Z. Hu, P. Date, A. Li, and W. Jiang. 2023. "A Novel Spatial-Temporal Variational Quantum Circuit to Enable Deep Learning on NISQ Devices." In IEEE International Conference on Quantum Computing and Engineering (QCE 2023), September 17-22, 2023, Bellevue, WA, 272-282. Piscataway, New Jersey: IEEE. PNNL-SA-187540. doi:10.1109/QCE57702.2023.00038
- Shi Y., T. Nguyen, S.A. Stein, T.J. Stavenger, M.G. Warner, M. Roetteler, and T. Hoefler, et al. 2023. "A Reference Implementation for a Quantum Message Passing Interface." In Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis (SC-W 2023), November 12-17, 2023, Denver, CO, 1420–1425. New York, New York: Association for Computing Machinery. PNNL-SA-189779. doi:10.1145/3624062.3624212
- Wang M., F. Hua, C. Liu, N.P. Bauman, K. Kowalski, D. Claudino, and T.S. Humble, et al. 2023. "Enabling Scalable VQE Simulation on Leading HPC Systems." In Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis (SC-W 2023), November 12-17, 2023, Denver, CO, 1460–1467. New York, New York: Association for Computing Machinery. PNNL-SA-189179. doi:10.1145/3624062.3624221
- Hua F., M. Wang, G. Li, B. Peng, C. Liu, M. Zheng, and S.A. Stein, et al. 2023. "QASMTrans: A QASM Quantum Transpiler Framework for NISQ Devices." In Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis (SC-W 2023), November 12-17, 2023, Denver, CO, 1468–1477. New York, New York: Association for Computing Machinery. PNNL-SA-188499. doi:10.1145/3624062.3624222
- Wu C., T. Geng, A. Guo, S. Bandara, P. Haghi, C. Liu, and A. Li, et al. 2023. "FASDA: An FPGA-Aided, Scalable and Distributed Accelerator for Range-Limited Molecular Dynamics." In Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis (SC 2023), November 12-17, 2023, Denver, CO, 1-14. New York, New York: Association for Computing Machinery. PNNL-SA-186622. doi:10.1145/3581784.3607100
- Chen Y., S.A. Stein, A. Li, and Z. Huang. 2023. "Is it coming soon to power systems: Quantum Computing and its early exploration." In IEEE Power & Energy Society General Meeting (PESGM 2023), July 16-20, 2023, Orlando, FL, 1-5. Piscataway, New Jersey: IEEE. PNNL-SA-179849. doi:10.1109/PESGM52003.2023.10252721
- Liu Z., Y. Yang, Z. Pan, A. Sharma, A. Hasan, C. Ding, and A. Li, et al. 2023. "Ising-CF: A Pathbreaking Collaborative Filtering Method Through Efficient Ising Machine Learning." In Proceedings of the 60th Design Automation Conference (DAC 2023), July 9-13, 2023, San Francisco, CA, 1-6. Piscataway, New Jersey: IEEE. PNNL-SA-182186. doi:10.1109/DAC56929.2023.10247860
- Luo Y., C. Tan, N. Bohm Agostini, A. Li, A. Tumeo, N. Dave, and T. Geng. 2023. "ML-CGRA: An Integrated Compilation Framework to Enable Efficient Machine Learning Acceleration on CGRAs." In Proceedings of the 60th ACM/IEEE Design Automation Conference (DAC 2023), July 9-13, 2023, San Francisco, CA, 1-6. Piscataway, New Jersey: IEEE. PNNL-SA-180015. doi:10.1109/DAC56929.2023.10247873
- Wang M., B. Fang, A. Li, and P. Nair. 2023. "Efficient QAOA Optimization using Directed Restarts and Graph Lookup." In QCCC '23: Proceedings of the 2023 International Workshop on Quantum Classical Cooperative (QCCC 2023), June 20, 2023, Orlando, FL, 5–8. New York, New York: Association for Computing Machinery. PNNL-SA-189163. doi:10.1145/3588983.3596680
- Zhang B., B. Fang, Q. Guan, A. Li, and D. Tao. 2023. "HQ-Sim: High-performance State Vector Simulation of Quantum Circuits on Heterogeneous HPC Systems." In Proceedings of the 2023 International Workshop on Quantum Classical Cooperative (QCCC 2023), June 20, 2023, Orlando, FL, 1–4. New York, New York: Association for Computing Machinery. PNNL-SA-189164. doi:10.1145/3588983.3596679
- Baheri B., V. Chaudhary, A. Li, S. Xu, B. Fang, and Q. Guan. 2023. "Quantum Noise Mitigation: Introducing the Robust Quantum Circuit Scheduler for Enhanced Fidelity and Throughput." In Proceedings of the 2023 International Workshop on Quantum Classical Cooperative (QCCC 2023), June 20, 2023, Orlando, FL, 21–24. New York, New York: Association for Computing Machinery. PNNL-SA-189165. doi:10.1145/3588983.3596688
- Zhang B., B. Fang, Q. Guan, A. Li, and D. Tao. 2023. "MEMQSIM: Highly Memory-Efficient and Modularized Quantum State-Vector Simulation." In Fourth International Workshop on Quantum Computing Software (with SC23). PNNL-SA-189199.
- Li X., I. Laguna, B. Fang, K. Swirydowicz, A. Li, and G. Gopalakrishnan. 2023. "Design and Evaluation of GPU-FPX: A Low-Overhead tool for Floating-Point Exception Detection in NVIDIA GPUs." In Proceedings of the 32nd International Symposium on High-Performance Parallel and Distributed Computing (HPDC 2023), June 16-23, 2023, Orlando, FL, 59–71. New York, New York: Association for Computing Machinery. PNNL-SA-171816. doi:10.1145/3588195.3592991
- Zhao X., Y. Luo, J. Liu, W. Liu, K.M. Rosso, X. Guo, and T. Geng, et al. 2023. "Machine Learning Automated Analysis of Enormous Synchrotron X-Ray Diffraction Datasets." Journal of Physical Chemistry C 127, no. 30:14830–14838. PNNL-SA-183442. doi:10.1021/acs.jpcc.3c03572
- Zheng M., B. Peng, N.O. Wiebe, A. Li, X. Yang, and K. Kowalski. 2023. "Quantum algorithms for generator coordinate methods." Physical Review Research 5, no. 2:Art. No. 023200. PNNL-SA-180637. doi:10.1103/PhysRevResearch.5.023200
- Wang Y., B. Feng, T. Geng, X. Pengfei, K.J. Barker, A. Li, and Y. Ding. 2023. "MGG: Accelerating Graph Neural Networks via Multi-GPU Shared Memory." In USENIX Symposium on Operating Systems Design and Implementation. PNNL-SA-165668.
- Chen J., H. Sung, X. Shen, N.R. Tallent, K.J. Barker, and A. Li. 2023. "Accelerating Matrix-Centric Graph Processing on GPUs through Bit-Level Optimizations." Journal of Parallel and Distributed Computing 177. PNNL-SA-179122. doi:10.1016/j.jpdc.2023.02.013
- Guo A., Y. Hao, C. Wu, P. Haghi, Z. Pan, M. Si, and D. Tao, et al. 2023. "Software-Hardware Co-design of Heterogeneous SmartNIC System for Recommendation Models Inference and Training." In Proceedings of the 37th International Conference on Supercomputing (ICS-2023), June 21-23, 2023, Orlando, FL, 336–347. New York, New York: Association for Computing Machinery. PNNL-SA-181666. doi:10.1145/3577193.3593724
- Chen J., H. Sung, X. Shen, S. Choudhury, and A. Li. 2023. "BitGNN: Unlocking the Performance Potential of Binary Graph Neural Networks on GPUs." In Proceedings of the 37th International Conference on Supercomputing (ICS-2023), June 21-23, 2023, Orlando, FL, 264 - 276. New York, New York: Association for Computing Machinery. PNNL-SA-171811. doi:10.1145/3577193.3593725
- Haghi P., W. Krska, C. Tan, T. Geng, P. Chen, C. Greenwood, and A. Guo, et al. 2023. "FLASH: FPGA-Accelerated Smart Switches with GCN Case Study." In Proceedings of the 37th International Conference on Supercomputing (ICS-2023), June 21-23, 2023, Orlando, FL, 450–462. New York, New York: Association for Computing Machinery. PNNL-SA-167625. doi:10.1145/3577193.3593739
- Stein S.A., N.O. Wiebe, Y. Ding, J.A. Ang, and A. Li. 2023. "Q-BEEP: Quantum Bayesian Error Mitigation Employing Poisson Modeling over the Hamming Spectrum." In Proceedings of the 50th Annual International Symposium on Computer Architecture (ISCA 2023), June 17-21, 2023, Orlando, FL, 1–13; Art. No. 8. New York, New York: Association for Computing Machinery. PNNL-SA-175133. doi:10.1145/3579371.3589043
- Mao Y., V. Sharma, W. Zheng, L. Cheng, Q. Guan, and A. Li. 2023. "Elastic Resource Management for Deep Learning Applications in a Container Cluster." IEEE Transactions on Cloud Computing 11, no. 2:2204 - 2216. PNNL-SA-166431. doi:10.1109/TCC.2022.3194128
- Chen Y., Y. Jin, F. Hua, A. Hayes, A. Li, Y. Shi, and Z. Zhang. 2023. "A Pulse Generation Framework with Augmented Program-aware Basis Gates and Criticality Analysis." In IEEE 29th International Symposium on High-Performance Computer Architecture (HPCA 2023), February 25-March 1, 2023, Montreal, Canada, 773-786. Piscataway, New Jersey: IEEE. PNNL-SA-168657. doi:10.1109/HPCA56546.2023.10070990
- Li A., S.A. Stein, S. Krishnamoorthy, and J.A. Ang. 2023. "QASMBench: A Low-Level Quantum Benchmark Suite for NISQ Evaluation and Simulation." ACM Transactions on Quantum Computing 4, no. 2:10, pp 1-26. PNNL-SA-162867. doi:10.1145/3550488
- Zheng M., A. Li, T. Terlaky, and X. Yang. 2023. "A Bayesian Approach for Characterizing and Mitigating Gate and Measurement Errors." ACM Transactions on Quantum Computing 4, no. 2:Art. No. 11. PNNL-SA-176720. doi:10.1145/3563397
- Pan Z., A. Sharma, J. Hu, Z. Liu, A. Li, H. Liu, and M. Huang, et al. 2023. "Ising-Traffic: Using Ising Machine Learning to Predict Traffic Congestion under Uncertainty." In Proceedings of the AAAI Conference on Artificial Intelligence, February 7-14, 2023, Washington, D.C., 37, 9354-9363. Washington, District Of Columbia: AAAI Press. PNNL-SA-176751. doi:10.1609/aaai.v37i8.26121
- Li Y., T. Geng, S.A. Stein, A. Li, and H. Yu. 2023. "GAAF: Searching Activation Functions for Binary Neural Networks through Genetic Algorithm." Tsinghua Science and Technology 28, no. 1:207 - 220. PNNL-SA-165349. doi:10.26599/TST.2021.9010084
- Sun W., A. Li, T. Geng, S. Stuijk, and H. Corporaal. 2023. "Dissecting Tensor Cores via Microbenchmarks: Latency, Throughput and Numerical Behaviors." IEEE Transactions on Parallel and Distributed Systems 34, no. 1:246 - 261. PNNL-SA-173565. doi:10.1109/TPDS.2022.3217824
2022
- Stein S.A., Y. Mao, J.A. Ang, and A. Li. 2022. "QuCNN: A Quantum Convolutional Neural Network with Entanglement Based Backpropagation." In Proceedings of the 7th ACM/IEEE Symposium on Edge Computing (SEC 2022), December 5-8, 2022, Seattle, WA, 368-374. Piscataway, New Jersey: IEEE. PNNL-SA-178064. doi:10.1109/SEC54971.2022.00054
- Fang B., S. Hari, T. Tsai, X. Li, G. Gopalakrishnan, I. Laguna, and K.J. Barker, et al. 2022. "Towards Precision-Aware Fault Tolerance Approaches for Mixed-Precision Applications." In IEEE/ACM 12th Workshop on Fault Tolerance for HPC at eXtreme Scale (FTXS 2022), November 13-18, 2022, Dallas, TX, 47-52. Piscataway, New Jersey: IEEE. PNNL-SA-177005. doi:10.1109/FTXS56515.2022.00010
- Baheri B., J. Tronge, B. Fang, A. Li, V. Chaudhary, and Q. Guan. 2022. "MARS: Malleable Actor-Critic Reinforcement Learning Scheduler." In Proceedings of the 41st International Performance Computing and Communications Conference (IPCCC 2022), November 11-13, 2022, Austin, TX, 217-226. Piscataway, New Jersey: IEEE. PNNL-SA-170367. doi:10.1109/IPCCC55026.2022.9894315
- Huang R., Y. Chen, T. Yin, Q. Huang, J. Tan, W. Yu, and X. Li, et al. 2022. "Learning and Fast Adaptation for Grid Emergency Control via Deep Meta Reinforcement Learning." IEEE Transactions on Power Systems 37, no. 6:4168-4178. PNNL-SA-179153. doi:10.1109/TPWRS.2022.3155117
- Fang B., Y. Ozkaya, A. Li, U. Catalyurek, and S. Krishnamoorthy. 2022. "Efficient Hierarchical State Vector Simulation of Quantum Circuits via Acyclic Graph Partitioning." In IEEE International Conference on Cluster Computing (CLUSTER 2022), September 5-8, 2022, Heidelberg, Germany, 289-300. Piscataway, New Jersey: IEEE. PNNL-SA-170183. doi:10.1109/CLUSTER51413.2022.00041
- Guo A., T. Geng, Y. Zhang, P. Haghi, C. Wu, C. Tan, and Y. Lin, et al. 2022. "A Framework for Neural Network Inference on FPGA-Centric SmartNICs." In Proceedings of the 32nd International Conference on Field-Programmable Logic and Applications (FPL 2022), August 29-September 2, 2022, Belfast, United Kingdom, 01-08. Piscataway, New Jersey: IEEE. PNNL-SA-169702. doi:10.1109/FPL57034.2022.00071
- Chen Y., Z. Huang, S. Jin, and A. Li. 2022. "Computing for Power System Operation and Planning: Then, Now, and the Future." iEnergy 1, no. 3:315 - 324. PNNL-SA-174682. doi:10.23919/IEN.2022.0037
- Zhang C., T. Geng, A. Guo, J. Tian, M. Herbordt, A. Li, and D. Tao. 2022. "H-GCN: A Graph Convolutional Network Accelerator on Versal ACAP Architecture." In 32nd International Conference on Field Programmable Logic and Applications (FPL 2022), August 29-September 2, 2022, Belfast, UK, 200-208. Piscataway, New Jersey: IEEE. PNNL-SA-169703. doi:10.1109/FPL57034.2022.00040
- Wan C., L. Youjie, A. Li, N. Kim, and Y. Lin. 2022. "BNS-GCN: Efficient Full-Graph Training of Graph Convolutional Networks with Partition-Parallelism and Random Boundary Node Sampling." In Fifth Conference on Machine Learning and Systems (MLSys). PNNL-SA-169701.
- Stein S.A., B. Baheri, D. Chen, Y. Mao, Q. Guan, S. Xu, and C. Ding, et al. 2022. "QuClassi: A Hybrid Deep Neural Network Architecture based on Quantum State Fidelity." In Fifth Conference on Machine Learning and Systems. PNNL-SA-169700.
- Peng H., S. Huang, S. Chen, B. Li, T. Geng, A. Li, and W. Jiang, et al. 2022. "A Length Adaptive Algorithm-Hardware Co-design of Transformer on FPGA Through Sparse Attention and Dynamic Pipelining." In Proceedings of the 59th ACM/IEEE Design Automation Conference (DAC 2022), July 10-14, 2022, San Francisco, CA, 1135–1140. New York, New York: Association for Computing Machinery. PNNL-SA-170686. doi:10.1145/3489517.3530585
- Stein S.A., N.O. Wiebe, J.A. Ang, and A. Li. 2022. "Benchmarking Quantum Processor Performance through Quantum Distance Metrics Over An Algorithm Suite." In IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW 2022), May 30-June 3, 2022, Lyon, France, 618-624. Piscataway, New Jersey: IEEE. PNNL-SA-172019. doi:10.1109/IPDPSW55747.2022.00106
- Chen J., N.R. Tallent, K.J. Barker, X. Shen, H. Sung, and A. Li. 2022. "Bit-GraphBLAS: Bit-Level Optimizations of Matrix-Centric Graph Processing on GPU." In IEEE International Parallel and Distributed Processing Symposium (IPDPS 2022), May 30-June 03, 2022, Virtual, Online, 515-525. Los Alamitos, California: IEEE Computer Society. PNNL-SA-161317. doi:10.1109/IPDPS53621.2022.00056
- Tan C., T. Tambe, J. Zhang, B. Fang, T. Geng, G. Wei, and D. Brooks, et al. 2022. "ASAP: Automatic Synthesis of Area-Efficient and Precision-Aware CGRAs." In Proceedings of the 36th ACM International Conference on Supercomputing (ICS 2022), June 28-30, 2022, Virtual, Online, Paper No. 4. New York, New York: Association for Computing Machinery. PNNL-SA-172791. doi:10.1145/3524059.3532359
- Tan C., T. Tambe, J. Zhang, B. Fang, T. Geng, G. Wei, and D. Brooks, et al. 2022. "ASAP: Automatic Synthesis of Area-Efficient and Precision-Aware CGRAs." In ACM International Conference on Supercomputing 2022. PNNL-SA-170181.
- Zhang C., S. Jin, T. Geng, J. Tian, A. Li, and D. Tao. 2022. "CEAZ: Accelerating Parallel I/O Via Hardware-Algorithm Co-Designed Adaptive Lossy Compression." In Proceedings of the 36th ACM International Conference on Supercomputing (ICS 2022), June 28-30, 2022, Virtual, Online, Paper No.: 12. New York, New York: Association for Computing Machinery. PNNL-SA-161283. doi:10.1145/3524059.3532362
- Tumeo A., N. Bohm Agostini, S. Curzel, A.M. Limaye, C. Tan, M. Minutoli, and V.C. Amatya, et al. 2022. "SO(DA)^2: End-to-end Generation of Specialized Reconfigurable Architectures." In PARMA-DITAM: 13th Workshop on Parallel Programming and Run-Time Management Techniques for Many-core Architectures and 11th Workshop on Design Tools and Architectures for Multicore Embedded Computing Platforms. PNNL-SA-171765.
- Stein S.A., N.O. Wiebe, Y. Ding, B. Peng, K. Kowalski, N.A. Baker, and J.A. Ang, et al. 2022. "EQC: Ensembled Quantum Computing for Variational Quantum Algorithms." In Proceedings of the 49th Annual International Symposium on Computer Architecture (ISCA 2022), June 18-22, 2022, New York, NY, 59–71. New York, New York: Association for Computing Machinery. PNNL-SA-165576. doi:10.1145/3470496.3527434
- You H., T. Geng, Y. Zhang, A. Li, and Y. Lin. 2022. "GCoD: Graph Convolutional Network Acceleration via Dedicated Algorithm and Accelerator Co-Design." In The 28th IEEE International Symposium on High-Performance Computer Architecture (HPCA 2022), April 2-6, 2022, Virtual, Online, 460-474. Los Alamitos, California: IEEE Computer Society. PNNL-SA-161518. doi:10.1109/HPCA53966.2022.00041
- Stein S.A., R. L'Abbate, W. Mu, Y. Cui, B. Baheri, Y. Mao, and Q. Guan, et al. 2022. "A Hybrid System for Learning Classical Data in Quantum Series." In 2021 IEEE International Performance, Computing, and Communications Conference (IPCCC). Piscataway, New Jersey: IEEE. PNNL-SA-179084.
- Tan C., N. Bohm Agostini, T. Geng, C. Xie, J. Li, A. Li, and K.J. Barker, et al. 2022. "DRIPS: Dynamic Rebalancing of Pipelined Streaming Applications on CGRAs." In IEEE International Symposium on High-Performance Computer Architecture (HPCA 2022), April 2-6, 2022, Seoul, Korea, 304-316. Piscataway, New Jersey: IEEE. PNNL-SA-165149. doi:10.1109/HPCA53966.2022.00030
2021
- Stein S.A., B. Baheri, D. Chen, Y. Mao, Q. Guan, A. Li, and B. Fang, et al. 2021. "QuGAN: A Quantum State Fidelity based Generative Adversarial Network." In IEEE International Conference on Quantum Computing and Engineering (QCE 2021), October 17-22, 2021, Broomfield, CO, edited by H.A. Müller, et al., 71-81. Piscataway, New Jersey: IEEE. PNNL-SA-156090. doi:10.1109/QCE52317.2021.00023
- Gopalakrishnan G., I. Laguna, A. Li, P. Panchekha, C. Rubio-Gonzalez, and Z. Tatlock. 2021. "Guarding Numerics Amidst Rising Heterogeneity." In Correctness 2021: 5th International Workshop on Software Correctness for HPC Applications. PNNL-SA-166342.
- Li Y., T. Geng, A. Li, and H. Yu. 2021. "BCNN: Binary Complex Neural Network." Microprocessors and Microsystems 87. PNNL-SA-161063. doi:10.1016/j.micpro.2021.104359
- Stein S.A., R. L'Abbate, W. Mu, Y. Liu, B. Baheri, Y. Mao, and Q. Guan, et al. 2021. "A Hybrid System for Learning Classical Data in Quantum States." In IEEE International Performance Computing and Communications Conference (IPCCC 2021), October 29-31, 2021, Austin, TX, 1-7. Piscataway, New Jersey: IEEE. PNNL-SA-165589. doi:10.1109/IPCCC51483.2021.9679430
- Baheri B., D. Chen, B. Fang, S.A. Stein, V. Chaudhary, Y. Mao, and S. Xu, et al. 2021. "TQEA: Temporal Quantum Error Analysis." In 51st Annual IEEE/IFIP International Conference on Dependable Systems and Networks - Supplemental Volume (DSN-S 2021), June 21-24, 2021, Taipei, Taiwan, 65-67. Piscataway, New Jersey: IEEE. PNNL-SA-159972. doi:10.1109/DSN-S52858.2021.00034
- Feng B., Y. Wang, T. Geng, A. Li, and Y. Ding. 2021. "APNN-TC: Accelerating Arbitrary Precision Neural Networks on Ampere GPU Tensor Cores." In International Conference for High Performance Computing, Networking, Storage and Analysis: Science and Beyond (SC 2021), November 14-19, 2021, Virtual, Online, Art. No. 37. Los Alamitos, California: IEEE Computer Society. PNNL-SA-161389. doi:10.1145/3458817.3476157
- Geng T., A. Li, T. Wang, C. Wu, Y. Li, R. Shi, and W. Wu, et al. 2021. "O3BNN-R: An Out-Of-Order Architecture for High-Performance and Regularized BNN Inference." IEEE Transactions on Parallel and Distributed Systems 32, no. 1:199-213. PNNL-SA-148318. doi:10.1109/TPDS.2020.3013637
- Geng T., C. Wu, C. Tan, C. Xie, A. Guo, P. Haghi, and S. He, et al. 2021. "A Survey: Handling Irregularities in Neural Network Acceleration with FPGAs." In IEEE High Performance Extreme Computing Conference. PNNL-SA-165315. doi:10.1109/HPEC49654.2021.9622877
- Geng T., C. Wu, Y. Zhang, C. Tan, C. Xie, H. You, and M. Herbordt, et al. 2021. "I-GCN: A Graph Convolutional Network Accelerator with Runtime Locality Enhancement through Islandization." In Proceedings of the 54th IEEE/ACM Annual International Symposium on Microarchitecture (MICRO 2021), October 18-22, 2021, Virtual, Online, 1051-1063. Los Alamitos, California: IEEE Computer Society. PNNL-SA-161514. doi:10.1145/3466752.3480113
- Huang R., Y. Chen, T. Yin, X. Li, A. Li, J. Tan, and W. Yu, et al. 2021. "Accelerated Derivative-free Deep Reinforcement Learning for Large-scale Grid Emergency Voltage Control." IEEE Transactions on Power Systems. PNNL-SA-153819. doi:10.1109/TPWRS.2021.3095179
- Li A., and S. Su. 2021. "Accelerating Binarized Neural Networks via Bit-Tensor-Cores in Turing GPUs." IEEE Transactions on Parallel and Distributed Systems 32, no. 7:1878-1891. PNNL-SA-156570. doi:10.1109/TPDS.2020.3045828
- Li A., B. Fang, C.E. Granade, G. Prawiroatmodjo, B. Heim, M. Roetteler, and S. Krishnamoorthy. 2021. "SV-Sim: Scalable PGAS-based State Vector Simulation of Quantum Circuits." In Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis (SC 2021), November 14-19, 2021, Virtual, Online, Art. No. 97. New York, New York: Association for Computing Machinery. PNNL-SA-161181. doi:10.1145/3458817.3476169
- Manu D., Y. Sheng, J. Yang, J. Deng, T. Geng, A. Li, and C. Ding, et al. 2021. "FL-DISCO: Federated Generative Adversarial Network for Graph-based Molecule Drug Discovery." In International Conference On Computer Aided Design. PNNL-SA-166598. doi:10.1109/ICCAD51958.2021.9643440
- Peng H., S. Chen, Z. Wang, J. Yang, S. Weitze, T. Geng, and A. Li, et al. 2021. "Optimizing FPGA-based Accelerator Design for Large-Scale Molecular Similarity Search." In International Conference On Computer Aided Design. PNNL-SA-166147. doi:10.1109/ICCAD51958.2021.9643528
- Peng H., S. Huang, T. Geng, A. Li, W. Jiang, H. Liu, and S. Wang, et al. 2021. "Accelerating Transformer-based Deep Learning Models on FPGAs using Column Balanced Block Pruning." In Proceedings of the 22nd International Symposium on Quality Electronic Design (ISQED 2021), April 7-9, 2021, Santa Clara, CA, 142-148. Piscataway, New Jersey: IEEE. PNNL-SA-159983. doi:10.1109/ISQED51717.2021.9424344
- Peng H., S. Zhou, S. Weitze, J. Li, S. Islam, T. Geng, and A. Li, et al. 2021. "Binary Complex Neural Network Acceleration on FPGA." In IEEE 32nd International Conference on Application-specific Systems, Architectures and Processors (ASAP 2021), July 7-9, 2021, Virtual, 1, 85-92. Los Alamitos, California: IEEE Computer Society. PNNL-SA-165575. doi:10.1109/ASAP52443.2021.00021
- Tan C., N. Bohm Agostini, J. Zhang, M. Minutoli, V.G. Castellana, C. Xie, and T. Geng, et al. 2021. "OpenCGRA: Democratizing Coarse-Grained Reconfigurable Arrays." In The 32nd IEEE International Conference on Application-specific Systems, Architectures and Processors. PNNL-SA-163425. doi:10.1109/ASAP52443.2021.00029
- Tan C., T. Geng, C. Xie, N. Bohm Agostini, J. Li, A. Li, and K.J. Barker, et al. 2021. "DynPaC: Coarse-Grained, Dynamic, and Partially Reconfigurable Array for Streaming Applications." In ICCD: The 39th IEEE International Conference on Computer Design. PNNL-SA-163151. doi:10.1109/ICCD53106.2021.00018
- Tan C., C. Xie, A. Li, K.J. Barker, and A. Tumeo. 2021. "AURORA: Automated Refinement of Coarse-Grained Reconfigurable Accelerators." In Proceedings - Design Automation and Test In Europe (DATE 2021), February 1-5, 2021, Virtual, Online, 1388-1393; Paper No. 9473955. Piscataway, New Jersey: IEEE. PNNL-SA-156552. doi:10.23919/DATE51398.2021.9473955
- Tan C., C. Xie, T. Geng, A. Marquez, A. Tumeo, K.J. Barker, and A. Li. 2021. "ARENA: Asynchronous Reconfigurable Accelerator Ring to Enable Data-Centric Parallel Computing." IEEE Transactions on Parallel and Distributed Systems 32, no. 12:2880-2892. PNNL-SA-152862. doi:10.1109/TPDS.2021.3081074
- Xie C., J. Chen, J.S. Firoz, J. Li, S. Song, K.J. Barker, and M.V. Raugas, et al. 2021. "Fast and Scalable Sparse Triangular Solver for Multi-GPU Based HPC Architectures." In 50th International Conference on Parallel Processing (ICPP-21), August 9-12, 2021, Lemont, IL, Article No. 53, pages 1-11. New York, New York: Association for Computing Machinery. PNNL-SA-150878. doi:10.1145/3472456.3472478
- Zhang Y., H. You, Y. Fu, T. Geng, A. Li, and Y. Lin. 2021. "G-CoS: GNN-Accelerator Co-Search Towards Both Better Accuracy and Efficiency." In International Conference on Computer-Aided Design. PNNL-SA-164098. doi:10.1109/ICCAD51958.2021.9643549
2020
- Firoz J.S., A. Li, J. Li, and K.J. Barker. 2020. "On the Feasibility of Using Reduced-Precision Tensor Core Operations for Graph Analytics." In IEEE High Performance Extreme Computing Conference (HPEC 2020), September 22-24, 2020, Waltham, MA, 1-7. Piscataway, New Jersey: IEEE. PNNL-SA-153853. doi:10.1109/HPEC43674.2020.9286152
- Geng T., A. Li, R. Shi, C. Wu, T. Wang, Y. Li, and P. Haghi, et al. 2020. "AWB-GCN: A Graph Convolutional Network Accelerator with Runtime Workload Rebalancing." In Proceedings 53rd IEEE/ACM International Symposium on Microarchitecture (MICRO), October 17-21, 2020, Athens, Greece, 922-936. Piscataway, New Jersey: IEEE. PNNL-SA-146537. doi:10.1109/MICRO50266.2020.00079
- Geng T., C. Wu, C. Tan, B. Fang, A. Li, and M. Herbordt. 2020. "CQNN: a CGRA-based QNN Framework." In IEEE High Performance Extreme Computing Conference (HPEC 2020), September 22-24, 2020, Waltham, MA, 1-7. Piscataway, New Jersey: IEEE. PNNL-SA-153940. doi:10.1109/HPEC43674.2020.9286194
- Li A., O. Subasi, X. Yang, and S. Krishnamoorthy. 2020. "Density Matrix Quantum Circuit Simulation via the BSP Machine on Modern GPU Clusters." In International Conference for High Performance Computing, Networking, Storage and Analysis (SC2020), November 9-19, 2020, Atlanta, GA, 1-15. Piscataway, New Jersey: IEEE. PNNL-SA-143160. doi:10.1109/SC41405.2020.00017
- Li A., S. Song, J. Chen, J. Li, X. Liu, N.R. Tallent, and K.J. Barker. 2020. "Evaluating Modern GPU Interconnect: PCIe, NVLink, NV-SLI, NVSwitch and GPUDirect." IEEE Transactions on Parallel and Distributed Systems 31, no. 1:94-110. PNNL-SA-141707. doi:10.1109/TPDS.2019.2928289
- Li J., M. Lakshminarasimhan, X. Wu, A. Li, C. Olschanowsky, and K.J. Barker. 2020. "A Sparse Tensor Benchmark Suite for CPUs and GPUs." In IEEE International Symposium on Workload Characterization (IISWC 2020), October 27-30, 2020, Beijing, China, 193-204. Piscataway, New Jersey: IEEE. PNNL-SA-142736. doi:10.1109/IISWC50251.2020.00027
- Shi R., P. Dong, T. Geng, Y. Ding, X. Ma, H. So, and M. Herbordt, et al. 2020. "CSB-RNN: A Faster-Than-Realtime RNN Acceleration Framework with Compressed Structured Blocks." In Proceedings of the 34th International Conference on Supercomputing (ICS 2020), June 29-July 2, 2020, Barcelona, Spain, Article No. 24. New York, New York: Association for Computing Machinery. PNNL-SA-150973. doi:10.1145/3392717.3392749
- Tan C., C. Xie, A. Li, K.J. Barker, and A. Tumeo. 2020. "OpenCGRA: An Open-Source Unified Framework for Modeling, Testing, and Evaluating CGRAs." In IEEE 38th International Conference on Computer Design (ICCD 2020), October 18-21, 2020, 381-388. Piscataway, New Jersey: IEEE. PNNL-SA-152863. doi:10.1109/ICCD50377.2020.00070
- Wang T., T. Geng, A. Li, X. Jin, and M. Herbordt. 2020. "FPDeep: Scalable Acceleration of CNN Training on Deeply-Pipelined FPGA Clusters." IEEE Transactions on Computers 68, no. 8:1143-1158. PNNL-SA-140455. doi:10.1109/TC.2020.3000118
- Zou P., A. Li, K.J. Barker, and R. Ge. 2020. "Detecting Anomalous Computation with RNNs on GPU-Accelerated HPC Machines." In Proceedings of the 49th International Conference on Parallel Processing (ICPP 2020), August 17-20, 2020, Online, Article No. 3404435. New York, New York: Association for Computing Machinery. PNNL-SA-148325. doi:10.1145/3404397.3404435
- Zou P., A. Li, K.J. Barker, and R. Ge. 2020. "Indicator-directed Dynamic Power Management for Iterative Workloads on GPU-Accelerated Systems." In The 20th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGrid 2020), May 11-14, 2020, Melbourne, Australia, 559-568. Piscataway, New Jersey: IEEE. PNNL-SA-148280. doi:10.1109/CCGrid49817.2020.00-37
2019
- Geng T., T. Wang, C. Wu, C. Yang, W. Wu, A. Li, and M. Herbordt. 2019. "O3BNN: An Out-Of-Order Architecture for High-Performance Binarized Neural Network Inference with Fine-Grained Pruning." In Proceedings of International Conference on Supercomputing, June 2019, Phoenix, AZ, 461-472. New York, New York: Association for Computing Machinery. PNNL-SA-141065. doi:10.1145/3330345.3330386
- Geng T., T. Wang, C. Wu, S. Song, A. Li, and M. Herbordt. 2019. "LP-BNN: Ultra-low-Latency BNN Inference with Layer Parallelism." In IEEE 30th International Conference on Application-specific Systems, Architectures and Processors (ASAP 2019), July 15-17, 2019, New York, 9-16. Piscataway, New Jersey: IEEE. PNNL-SA-143161. doi:10.1109/ASAP.2019.00-43
- Li A., T. Geng, T. Wang, M. Herbordt, S. Song, and K.J. Barker. 2019. "BSTC: A Novel Binarized-Soft-Tensor-Core Design for Accelerating Bit-Based Approximated Neural Nets." In International Conference for High Performance Computing, Networking, Storage, and Analysis, November 17-22, 2019, Denver, CO, Article No. a38. Los Alamitos, California: IEEE Computer Society. PNNL-SA-142851. doi:10.1145/3295500.3356169
- Li J., Y. Ma, X. Wu, A. Li, and K.J. Barker. 2019. "PASTA: A Parallel Sparse Tensor Algorithm Benchmark Suite." CCF Transactions on High Performance Computing 1, no. 2:111-130. PNNL-SA-140675. doi:10.1007/s42514-019-00012-w
- Xie C., X. Zhang, A. Li, X. Fu, and S. Song. 2019. "PIM-VR: Erasing Motion Anomalies In Highly-Interactive Virtual Reality World With Customized Memory Cube." In IEEE International Symposium on High Performance Computer Architecture. PNNL-SA-143513. doi:10.1109/HPCA.2019.00013
- Zou P., A. Li, K.J. Barker, and R. Ge. 2019. "Fingerprinting Anomalous Computation with RNN for GPU-accelerated HPC Machines." In IEEE International Symposium on Workload Characterization (IISWC 2019), November 3-5, 2019, Orlando, FL, 253-256. Piscataway, New Jersey: IEEE. PNNL-SA-144356. doi:10.1109/IISWC47752.2019.9042165
2018
- Li A., W. Liu, L. Wang, K.J. Barker, and S. Song. 2018. "Warp-Consolidation: A Novel Execution Model for GPUs." In The 32nd ACM International Conference on Supercomputing. PNNL-SA-133947. doi:10.1145/3205289.3205294
- Li A., S. Song, J. Chen, X. Liu, N.R. Tallent, and K.J. Barker. 2018. "Tartan: Evaluating Modern GPU Interconnect via a Multi-GPU Benchmark Suite." In IEEE International Symposium on Workload Characterization (IISWC 2018), September 30-October 2, 2018, 191-202. Piscataway, New Jersey: IEEE. PNNL-SA-137642. doi:10.1109/IISWC.2018.8573483
- Shen D., A. Li, S. Song, and X. Liu. 2018. "CUDAAdvisor: LLVM-based Runtime Profiling for Modern GPUs." In International Symposium on Code Generation and Optimization. PNNL-SA-143512. doi:10.1145/3168831
- Wang L., J. Ye, Y. Zhao, W. Wu, A. Li, S. Song, and Z. Xu, et al. 2018. "SuperNeurons: Dynamic GPU Memory Management for Training Deep Neural Networks." In Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPOPP 2018), February 24-28, 2018, Vienna, Austria, 41-53. New York, New York: ACM. PNNL-SA-143407. doi:10.1145/3200691.3178491
2017
- Li A., W. Liu, M. Kristensen, B. Vinter, H. Wang, K. Hou, and A. Marquez, et al. 2017. "Exploring And Analyzing the Real Impact of Modern On-Package Memory on HPC Scientific Kernels." In International Conference for High Performance Computing, Networking, Storage and Analysis. PNNL-SA-143163. doi:10.1145/3126908.3126931
- Li A., S. Song, W. Liu, X. Liu, A. Kumar, and H. Corporaal. 2017. "Locality-Aware CTA Clustering for Modern GPUs." In The 22nd ACM International Conference on Architectural Support for Programming Languages and Operating Systems. PNNL-SA-143164. doi:10.1145/3093337.3037709
- Li A., W. Zhao, and S. Song. 2017. "BVF: Enabling Significant On-Chip Power Savings via Bit-Value-Favor for Throughput Processors." In The 50th Annual IEEE/ACM International Symposium on Microarchitecture. PNNL-SA-130500. doi:10.1145/3123939.3123944
- Liu W., A. Li, J.D. Hogg, I.S. Duff, and B. Vinter. 2017. "Fast synchronization-free algorithms for parallel sparse triangular solves with multiple right-hand sides." Concurrency and Computation: Practice and Experience 29, no. 21:Article No. e4244. PNNL-SA-130501. doi:10.1002/cpe.4244
- Zhao W., A. Li, Y. Wang, and Y. Ha. 2017. "Analysis and Design of Energy-Efficient Data-Dependent SRAM." In IEEE 12th International Conference on ASIC. PNNL-SA-143165. doi:10.1109/ASICON.2017.8252625