December 1, 2016
Feature

Changing the Game

Song's work makes the cut in two major high-performance computing conferences

Shuaiwen Leon Song

Congratulations to Shuaiwen Leon Song, a research scientist with PNNL’s High Performance Computing group, who recently had two papers accepted at premier conferences focused on diverse, leading-edge research in high-performance computer architecture.

First, the paper “Processing-in-Memory Enabled Graphics Processors for 3D Rendering” was accepted for presentation at the 23rd IEEE International Symposium on High-Performance Computer Architecture, or HPCA-23, to be held in February 2017 in Austin, Texas. HPCA is a leading forum for scientists and engineers to present diverse work on computer architectures. Only 50 papers were selected for this year’s main-session presentations.

The work, coauthored with scientists from the University of Houston and the Beijing Advanced Innovation Center for Imaging Technology, examines the three-dimensional (3D) gaming experience on modern computer systems, which gives today’s gamers immersive graphics and intense imaging at the cost of heavy memory bandwidth demands. Song and his coauthors developed two architectural designs that enable processing-in-memory-based graphics processing units (GPUs) to render 3D scenes efficiently. Using well-known games, including Doom 3 and Half-Life 2, the team demonstrated that their designs improve texture filtering performance by up to 6.4 times and overall 3D rendering performance by up to 65 percent over baseline GPUs. The designs also significantly reduce memory traffic and energy consumption without sacrificing rendering quality.

“As part of this work, we explored the Hybrid Memory Cube, or HMC, a type of stacked memory in the GPU, to efficiently process high-performance graphics applications and alter their pipeline to significantly reduce memory access,” Song explained. “We also enabled an approximate computing strategy with the new design.”

In April 2017, Song will present his work “Locality-Aware CTA Clustering For Modern GPUs,” coauthored with Ang Li, a Ph.D. intern also with PNNL’s High Performance Computing group, at the 22nd ACM International Conference on Architectural Support for Programming Languages and Operating Systems, known as ASPLOS 2017, being held in Xi’an, China. ASPLOS showcases “groundbreaking research at the intersection of at least two disciplines: architecture, programming languages, operating systems, and related areas,” making this acceptance especially noteworthy for Song.

Ang Li, a Ph.D. intern at PNNL, co-authored the paper accepted by ASPLOS.

“This year, the conference only accepted 56 out of 321 papers submitted, which is a 17.4 percent acceptance rate,” he noted. “Both the number of papers submitted and accepted are ASPLOS records. It is exciting for this paper, which engages the intersection of architecture and runtime/compiler techniques, to be counted among them.”

For ASPLOS, Li and Song will describe their novel clustering technique for tapping into the performance potential of a largely ignored type of locality: inter-cooperative thread array (CTA) locality. On a GPU, CTAs are groups of threads that each execute the same program on a portion of an input data set to produce an output data set. CTAs can run concurrently or in parallel, and the threads within a CTA can communicate with one another, as the sketch below illustrates.
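To make the CTA idea concrete, the following minimal CUDA sketch (not taken from the paper; the kernel name, tile size, and data are illustrative) shows how each thread block, i.e., each CTA, runs the same program on its slice of the input, stages data in shared memory, and synchronizes so that threads within the block can exchange values before writing the output.

    // Illustrative example of a cooperative thread array (CTA): every thread in
    // a block runs the same program on its own element, and the block's shared
    // memory plus __syncthreads() provide intra-CTA communication.
    #include <cstdio>
    #include <cuda_runtime.h>

    #define TILE 256   // threads per CTA (illustrative size)

    __global__ void cta_example(const float *in, float *out, int n)
    {
        __shared__ float tile[TILE];                 // visible to all threads in this CTA
        int gid = blockIdx.x * blockDim.x + threadIdx.x;

        tile[threadIdx.x] = (gid < n) ? in[gid] : 0.0f;  // each thread loads its element
        __syncthreads();                                 // intra-CTA synchronization point

        if (gid < n) {
            int neighbor = (threadIdx.x + 1) % blockDim.x;   // read a value loaded by another thread
            out[gid] = 0.5f * (tile[threadIdx.x] + tile[neighbor]);
        }
    }

    int main()
    {
        const int n = 1 << 20;
        float *in, *out;
        cudaMallocManaged(&in, n * sizeof(float));
        cudaMallocManaged(&out, n * sizeof(float));
        for (int i = 0; i < n; ++i) in[i] = (float)i;

        // Many CTAs are launched at once; the hardware schedules them
        // concurrently across the GPU's streaming multiprocessors.
        cta_example<<<(n + TILE - 1) / TILE, TILE>>>(in, out, n);
        cudaDeviceSynchronize();

        printf("out[0] = %f\n", out[0]);
        cudaFree(in);
        cudaFree(out);
        return 0;
    }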

Their paper describes the concept, method, and design of a “CTA Clustering” framework that automatically exploits inter-CTA locality for general applications. Song and his colleagues designed the framework to be integrated into the compiler and immediately deployable on commodity GPUs. The paper also describes how they validated their method on NVIDIA GPU architectures (Fermi, Kepler, Maxwell, and Pascal), showing significant performance speedups achieved by improving L1 cache hit rates and reducing L2 transactions.
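The clustering framework itself lives in the compiler, but the kind of reuse it targets is easy to picture. The hypothetical stencil kernel below (again illustrative, not the authors' code) shows neighboring CTAs loading the same halo elements; when such CTAs execute close together on the same streaming multiprocessor, those shared loads can hit in the L1 cache rather than generating additional L2 transactions, which is the inter-CTA locality the framework is designed to capture.

    // Hypothetical 1D three-point stencil used only to show inter-CTA locality:
    // the halo elements loaded by block b (in[start-1] and in[start+TILE]) are
    // the same elements loaded by blocks b-1 and b+1, so CTAs scheduled near
    // each other can reuse them from L1 instead of going back to L2/DRAM.
    #include <cstdio>
    #include <cuda_runtime.h>

    #define TILE 256

    __global__ void stencil3(const float *in, float *out, int n)
    {
        __shared__ float s[TILE + 2];                    // tile plus left/right halo
        int gid = blockIdx.x * blockDim.x + threadIdx.x;

        // Interior load: each thread brings in its own element.
        s[threadIdx.x + 1] = (gid < n) ? in[gid] : 0.0f;

        // Halo loads: these elements also belong to the neighboring CTAs' tiles.
        if (threadIdx.x == 0)
            s[0] = (gid > 0) ? in[gid - 1] : 0.0f;
        if (threadIdx.x == blockDim.x - 1)
            s[TILE + 1] = (gid + 1 < n) ? in[gid + 1] : 0.0f;
        __syncthreads();

        if (gid < n)
            out[gid] = (s[threadIdx.x] + s[threadIdx.x + 1] + s[threadIdx.x + 2]) / 3.0f;
    }

    int main()
    {
        const int n = 1 << 20;
        float *in, *out;
        cudaMallocManaged(&in, n * sizeof(float));
        cudaMallocManaged(&out, n * sizeof(float));
        for (int i = 0; i < n; ++i) in[i] = (float)(i % 7);

        stencil3<<<(n + TILE - 1) / TILE, TILE>>>(in, out, n);
        cudaDeviceSynchronize();
        printf("out[1] = %f\n", out[1]);

        cudaFree(in);
        cudaFree(out);
        return 0;
    }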

“CENATE [PNNL’s Center for Advanced Technology Evaluation] gave me a lot of flexibility to pursue this line of important research,” Song added. “I look forward to showcasing it at ASPLOS.”

References:

  • Li A and SL Song. 2017. “Locality-Aware CTA Clustering For Modern GPUs.” To be presented at: 22nd ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS 2017), April 8-12, 2017, Xi’an, China.
  • Xie C, SL Song, J Wang, W Zhang, and X Fu. 2017. “Processing-in-Memory Enabled Graphics Processors for 3D Rendering.” To be presented at: 23rd IEEE International Symposium on High-Performance Computer Architecture (HPCA-23), February 4-8, 2017, Austin, Texas.

###

About PNNL

Pacific Northwest National Laboratory draws on its distinguishing strengths in chemistry, Earth sciences, biology and data science to advance scientific knowledge and address challenges in sustainable energy and national security. Founded in 1965, PNNL is operated by Battelle for the Department of Energy’s Office of Science, which is the single largest supporter of basic research in the physical sciences in the United States. DOE’s Office of Science is working to address some of the most pressing challenges of our time. For more information, visit https://www.energy.gov/science/. For more information on PNNL, visit PNNL's News Center. Follow us on Twitter, Facebook, LinkedIn and Instagram.

Published: December 1, 2016