U.S. Department of Energy
Fundamental and Computational Sciences Directorate

Staff information


Leon Song

High Performance Computing
Pacific Northwest National Laboratory
PO Box 999
MSIN: J4-30
Richland, WA 99352


Dr. Shuaiwen Leon Song is currently a staff research scientist in the High Performance Computing Group at Pacific Northwest National Laboratory (PNNL). He received his Ph.D. from the Computer Science department at Virginia Tech in May 2013. He has previously interned with several government and industrial labs, including the Center for Applied Scientific Computing (CASC) at Lawrence Livermore National Laboratory (LLNL), the Performance Analysis Lab (PAL) at PNNL, and the Architecture Research Division at NEC Research America in Princeton.

He was a 2011 Livermore ISCR scholar and a recipient of the 2011 Paul E. Torgersen excellent research award. He has published in major HPC conferences, including HPDC, ICS, SC, PACT, and IPDPS. His SC'15 paper was nominated for best student paper. He has served as a PC member, session chair, or publicity chair for several major HPC venues, including SC, IPDPS, and HPDC. His past and current research has been funded by several major government agencies, including DOE ASCR, DoD, and DARPA.

Research Interests

  • Performance and energy modeling and analysis for HPC systems
  • Fault tolerance and system reliability
  • Multi-core and many-core architectures (e.g., emergent many-core accelerators)
  • Power-aware computing and energy-efficient design for large-scale distributed systems
  • Big data analytics, deep learning, and dynamic modeling techniques (e.g., machine learning)
  • Approximate computing
  • Runtime systems

Education and Credentials

  • Ph.D. in Computer Science and Applications, Virginia Tech, May 2013
  • Master's in Computer Science and Applications, Virginia Tech, May 2009

Affiliations and Professional Service

  • IEEE professional member
  • ACM professional member
  • Upsilon Pi Epsilon

Awards and Recognitions

  • PNNL staff research highlight award 2015
  • PNNL research award 2015
  • Best student paper nominee, SC'15
  • Recipient of the 2011 Paul E. Torgersen excellent research award
  • 2011 ISCR scholar, Lawrence Livermore National Lab
  • PACT'12 ACM SRC travel award by Microsoft Research
  • Selected for the SC'11 Ph.D. showcase

PNNL Publications


  • Li A, S Song, A Kumar, E Zhang, D Chavarría-Miranda, and H Corporaal. 2016. "Critical Points Based Register-Concurrency Autotuning for GPUs." In Proceedings of the Design, Automation and Test in Europe Conference (DATE 2016), March 14-18, 2016, Dresden, Germany, pp. 1273-1278.  IEEE, Piscataway, NJ. 
  • Tan L, Z Chen, and S Song. 2016. "Scalable Energy Efficiency with Resilience for High Performance Computing Systems: A Quantitative Methodology." ACM Transactions on Architecture and Code Optimization 12(4):Article No. 35.  doi:10.1145/2822893
  • Tan L, Z Chen, and S Song. 2016. "Scalable Energy Efficiency with Resilience for High Performance Computing Systems: A Quantitative Methodology." In 11th International Conference on High-Performance Embedded Architectures and Compilers (HiPEAC 2016), January 18-20, 2016, Prague, Czech Republic.  ACM, New York, NY.


  • Li C, S Song, H Dai, A Sidelnik, S Hari, and H Zhou. 2015. "Locality-Driven Dynamic GPU Cache Bypassing." In Proceedings of the 29th ACM on International Conference on Supercomputing (ICS 2015), June 8-11, 2015, Newport Beach, California, pp. 66-77.  ACM, New York, NY.  doi:10.1145/2751205.2751237
  • Sengupta D, S Song, K Agarwal, and K Schwan. 2015. "GraphReduce: Processing Large-Scale Graphs on Accelerator-Based Systems." In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC'15), November 15-20, 2015, Austin, Texas, Paper No. 28.  ACM, New York, NY.  doi:10.1145/2807591.2807655
  • Sengupta D, K Agarwal, S Song, and K Schwan. 2015. "GraphReduce: Large-Scale Graph Analytics on Accelerator-Based HPC Systems." In IEEE International Parallel and Distributed Processing Symposium Workshop (IPDPSW 2015), May 25-29, 2015, Hyderabad, India, pp. 604-609.  IEEE, Piscataway, NJ.  doi:10.1109/IPDPSW.2015.16
  • Shrestha S, JB Manzano Franco, A Marquez, S Zuckerman, S Song, and GR Gao. 2015. "Gregarious Data Re-structuring in a Many Core Architecture." In IEEE 17th International Conference on High Performance Computing and Communications (HPCC), 2015 IEEE 7th International Symposium on Cyberspace Safety and Security (CSS), 2015 IEEE 12th International Conference on Embedded Software and Systems (ICESS), August 24-26, 2015, New York, pp. 712-720.  IEEE, Piscataway, NJ.  doi:10.1109/HPCC-CSS-ICESS.2015.291
  • Tan L, S Song, P Wu, Z Chen, R Ge, and DJ Kerbyson. 2015. "Investigating the Interplay between Energy Efficiency and Resilience in High Performance Computing." In IEEE International Parallel and Distributed Processing Symposium (IPDPS 2015), May 25-29, 2015, Hyderabad, India, pp. 786-796.  IEEE Computer Society, Los Alamitos, CA.  doi:10.1109/IPDPS.2015.108
  • You Y, H Fu, S Song, A Randles, DJ Kerbyson, A Marquez, G Yang, and A Hoisie. 2015. "Scaling Support Vector Machines On Modern HPC Platforms." Journal of Parallel and Distributed Computing 76:16-31.  doi:10.1016/j.jpdc.2014.09.005


  • Li B, HC Chang, S Song, CY Su, T Meyer, J Mooring, and K Cameron. 2014. "Extending PowerPack for Profiling and Analysis of High Performance Accelerator-Based Systems." Parallel Processing Letters 24(4):Article No. 144200.  doi:10.1142/S0129626414420018
  • Li B, HC Chang, S Song, CY Su, T Meyer, J Mooring, and K Cameron. 2014. "The Power-Performance Tradeoffs of the Intel Xeon Phi on HPC Applications." In IEEE International Parallel & Distributed Processing Symposium Workshops (IPDPSW 2014), May 19-23, 2014, Phoenix, Arizona, pp. 1448-1456.  IEEE, Piscataway, NJ.  doi:10.1109/IPDPSW.2014.162
  • Marquez A, JB Manzano Franco, S Song, B Meister, S Shrestha, T St. John, and GR Gao. 2014. "ACDT: Architected Composite Data Types Trading-in Unfettered Data Access for Improved Execution." In The 20th IEEE International Conference on Parallel and Distributed Systems (ICPADS 2014), December 16-19, 2014, Hsinchu, Taiwan, pp. 289-297.  IEEE, Piscataway, NJ.  doi:10.1109/PADSW.2014.7097820
  • You Y, S Song, and DJ Kerbyson. 2014. "An Adaptive Cross-Architecture Combination Method for Graph Traversal." In Proceedings of the 28th ACM International Conference on Supercomputing (ICS'14), June 10-13, 2014, Munich, Germany, p. 169.  Association for Computing Machinery, New York, NY.  doi:10.1145/2597652.2600110
  • You Y, H Fu, S Song, M Mehri Dehanavi, L Gan, X Huang, and G Yang. 2014. "Evaluating Multi-core Architectures through Accelerating the Three-Dimensional Lax–Wendroff Correction." International Journal of High Performance Computing Applications 28(3):301-318.  doi:10.1177/1094342014524807
  • You Y, S Song, H Fu, A Marquez, M Mehri Dehanavi, KJ Barker, K Cameron, A Randles, and G Yang. 2014. "MIC-SVM: Designing A Highly Efficient Support Vector Machine For Advanced Modern Multi-Core and Many-Core Architectures." In IEEE 28th International Parallel and Distributed Processing Symposium (IPDPS 2014), May 19-23, 2014, Phoenix, Arizona, pp. 809-818.  IEEE Computer Society, Los Alamitos, CA.  doi:10.1109/IPDPS.2014.88


  • Vishnu A, S Song, A Marquez, KJ Barker, DJ Kerbyson, K Cameron, and P Balaji. 2013. "Designing Energy Efficient Communication Runtime Systems: A View from PGAS Models." Journal of Supercomputing 63(3):691-709.  doi:10.1007/s11227-011-0699-9
  • Li B, S Song, I Bezakova, and K Cameron. 2013. "EDR: An Energy-Aware Runtime Load Distribution System for Data-Intensive Applications in the Cloud." In IEEE International Conference on Cluster Computing (CLUSTER 2013), September 23-27, 2013, Indianapolis, IN, pp. 1-8.  Institute of Electrical and Electronics Engineers, Piscataway, NJ.  doi:10.1109/CLUSTER.2013.6702674
  • Song S, KJ Barker, and DJ Kerbyson. 2013. "Unified Performance and Power Modeling of Scientific Workloads." In E2SC '13 Proceedings of the 1st International Workshop on Energy Efficient Supercomputing, November 17-21, 2013, Denver, Colorado, Article No. 4.  Association for Computing Machinery, New York, NY.  doi:10.1145/2536430.2536435
  • Song S, NR Tallent, and A Vishnu. 2013. "Exploring Machine Learning Techniques For Dynamic Modeling on Future Exascale Systems." In Modeling & Simulation of Exascale Systems & Applications: Workshop on Modeling & Simulation of Exascale Systems & Applications, September 18-19, 2013, Seattle, Washington.  US Department of Energy, Office of Advanced Scientific Computing Research, Washington DC. 


  • Song S, C Si Yu, R Ge, A Vishnu, and K Cameron. 2011. "Iso-Energy-Efficiency: An Approach to Power Constrained Parallel Computation." In IEEE International Parallel & Distributed Processing Symposium (IPDPS 2011), May 16-20, 2011, Anchorage, Alaska, pp. 128-139.  IEEE, Piscataway, NJ.  doi:10.1109/IPDPS.2011.22


  • Vishnu A, HJJ van Dam, WA De Jong, P Balaji, and S Song. 2010. "Fault Tolerant Communication Runtime Support for Data-Centric Programming Models." In International Conference on High Performance Computing (HiPC 2010), December 19-22, 2010, Goa, India.  Institute of Electrical and Electronics Engineers, Piscataway, NJ.  doi:10.1109/HIPC.2010.5713195
  • Vishnu A, S Song, A Marquez, KJ Barker, DJ Kerbyson, K Cameron, and P Balaji. 2010. "Designing Energy Efficient Communication Runtime Systems for Data Centric Programming Models." In IEEE/ACM International Conference on Green Computing and Communications (GreenCom 2010) and the International Conference on Cyber, Physical and Social Computing (CPSCom 2010), December 18-20, 2010, Hangzhou, China, ed. P Zhu, et al, pp. 229-236.  Institute of Electrical and Electronics Engineers, Inc., Piscataway, NJ.  doi:10.1109/GreenCom-CPSCom.2010.133
