Exascale Computing
What is exascale computing?
Exascale computing refers to the latest milestone in the capability of the world’s fastest supercomputers. These systems operate at the exaflop scale, performing one quintillion (10¹⁸) calculations per second. Without the world’s highest-performance supercomputers, large simulations would take years or decades of computing to complete the required number of operations.
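To put the scale in perspective, here is a minimal back-of-the-envelope sketch in Python. The workload size of 10²³ floating-point operations is an assumed, illustrative figure, not a benchmark from any particular application:

```python
# Illustrative only: time to finish a hypothetical simulation requiring
# 1e23 floating-point operations at petascale versus exascale speeds.
EXAFLOPS = 1e18    # operations per second at exascale
PETAFLOPS = 1e15   # operations per second at petascale
OPERATIONS = 1e23  # assumed workload size

print(f"Exascale:  {OPERATIONS / EXAFLOPS / 86400:.1f} days")          # ~1.2 days
print(f"Petascale: {OPERATIONS / PETAFLOPS / 86400 / 365:.1f} years")  # ~3.2 years
```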
Through the combination of high-performance computing and simulation, exascale computing is positioned to tackle some of the world’s greatest challenges in areas such as national security, climate, medicine, energy, and water. It brings information technology and applied science and engineering together to advance modeling, simulation, data analytics, machine learning, and artificial intelligence.
The history of exascale computing
In a first step beyond the previous milestone, the first petascale (10¹⁵ FLOPS) computer was put into operation at Los Alamos National Laboratory in 2008. That year, two U.S. government organizations within the Department of Energy (DOE), the Office of Science and the National Nuclear Security Administration, provided initial funding for the development of an exascale supercomputer. By 2012, the United States had allotted $126 million for exascale computing development.
In 2015, the National Strategic Computing Initiative was created by executive order of then-U.S. President Barack Obama. The initiative funded research to accelerate the development of an exascale system.
With the creation of the Exascale Computing Initiative in 2016, DOE’s Office of Science and National Nuclear Security Administration began work to establish exascale computing capability for the DOE national laboratories. This strategic partnership led to the Exascale Computing Project, the prioritization of facilities and projects for the exascale system, and the development of programmatic mission applications. The Exascale Computing Project is devoted to creating an exascale computing foundation of applications, software, and hardware technologies.
In 2019, DOE and Cray, now part of Hewlett Packard Enterprise, announced plans to build the Frontier exascale computer with the power to perform at greater than 1.5 exaFLOPS. In 2022, Frontier became the world’s first public exascale computer.
Also in 2019, DOE and Intel announced the development of Aurora, a U.S. supercomputer designed to deliver exascale performance, with operation initially scheduled for late 2022. The computer is being developed at Argonne National Laboratory, is expected to exceed 2 exaFLOPS at peak, and will be used for a range of breakthrough research projects.
In 2020, DOE contracted with Hewlett Packard Enterprise to build the $600 million El Capitan supercomputer for simulations that keep the nuclear stockpile safe, secure, and reliable. El Capitan is expected to be operational in early 2023 with a performance of 2 exaFLOPS.
The importance of exascale computing
With a computing power of 1 exaflop, or 1 quintillion operations per second, exascale supercomputers can analyze vast amounts of data extremely quickly. Their immense processing power expedites scientific discovery, completing in seconds or minutes work that would otherwise take hundreds of hours or even years. Exascale computing’s large-scale simulation resources are being used to address challenges in scientific discovery, clean energy, nuclear reactors, and stewardship of the nation’s nuclear stockpile.
Through the Exascale Computing Project, computer scientists and engineers are developing exascale applications, new system software, and new co-designed hardware to address challenges of national importance that affect quality of life and national security. The project’s goal is to provide DOE’s high-performance computing facilities with applications and hardware technologies for use on exascale systems.
The benefits of exascale computing
Exascale computing enables researchers to use modeling and simulation to analyze data more quickly and to help resolve challenges in national security, energy, economics, health care, and science. With vastly more memory, storage, and computing power, it opens opportunities for discoveries in energy production, energy storage, materials science, artificial intelligence, cancer research, and manufacturing.
Exascale computing systems offer resources to improve research that affects the world. With them, researchers can develop next-generation tools to assess the performance of nuclear weapons, respond to threats, and steward the stockpile. The systems can accelerate processes used in additive manufacturing, improve urban planning, assess seismic risks from earthquakes, and support efficient operation of the power grid. Predictive modeling on exascale systems is used to test drug response in support of cancer research. In the energy and agriculture sectors, exascale computing aids in the design and commercialization of modular reactors, the analysis of stress-resistant crops, and the evaluation of wind plant efficiency, and it supports carbon capture, petroleum extraction, and waste disposal.
Limitations of exascale computing
A report released by the Exascale Study Group in 2008 detailed four major challenges for exascale computing: power consumption, data movement, fault tolerance, and extreme parallelism, in which many processors each handle separate parts of an overall task. In response, DOE spent more than $300 million over eight years working with six companies to improve chip reliability, reduce power consumption, and advance technology and equipment. As of 2021, advances in chip reliability and overall system design had driven power consumption down; the Frontier exascale computer, for example, consumes less than 20 megawatts per exaflop, compared with early predictions of 600 megawatts per exaflop. Hardware advances, such as stacked high-bandwidth memory packaged onto GPUs, have sped up data movement, and vendor partnerships for increased chip reliability have reduced failure rates.
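The parallelism challenge is easiest to see in miniature. The sketch below, using only Python’s standard library, splits one computation into independent chunks that separate worker processes handle simultaneously; exascale systems apply the same decomposition idea across millions of cores and GPU accelerators, typically with frameworks such as MPI rather than this simplified setup:

```python
# Minimal sketch of parallel decomposition: split an overall task into
# independent pieces, let workers compute them at the same time, then
# combine the partial results.
from multiprocessing import Pool

def partial_sum(bounds):
    lo, hi = bounds
    return sum(i * i for i in range(lo, hi))  # each worker handles one piece

if __name__ == "__main__":
    n, workers = 10_000_000, 8
    step = n // workers
    chunks = [(w * step, n if w == workers - 1 else (w + 1) * step)
              for w in range(workers)]
    with Pool(workers) as pool:
        total = sum(pool.map(partial_sum, chunks))  # combine partial results
    print(total)
```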
While improvements have been made to both hardware and software, computer scientists and engineers continue to identify, research, and develop other improvements that support further advances in exascale computing.
New research developments in exascale computing
In June 2022, Frontier debuted as the world’s first public exascale computer and the fastest computer in the world, performing at 1.102 exaFLOPS.
Several other exascale computers are being developed in the United States. Development of the Aurora exascale computer at Argonne National Laboratory is in progress, and the El Capitan exascale computer at Lawrence Livermore National Laboratory, which will be the National Nuclear Security Administration’s first exascale supercomputer, is also under development.
Pacific Northwest National Laboratory research in exascale computing
For more than 25 years, Pacific Northwest National Laboratory (PNNL) has led research in supercomputing. As this research has grown, PNNL and the Environmental Molecular Sciences Laboratory (EMSL), a DOE user facility, have expanded their capabilities. EMSL, for example, hosts the supercomputer Tahoma, which has a peak performance of 1,015 teraflops.
At PNNL, researchers are looking at how to redesign and reinvent the hardware, system software, and applications that will be used for supercomputers. They are also developing applications for future exascale computing systems, creating code that simulates the electrical grid, as well as advancing the frontiers of machine learning to glean insight from supercomputer simulations. An example of this is the ExaSGD project, led by PNNL, which is developing algorithms that can optimize the grid’s response to a large number of disruption events to compute a risk profile for grid operations.
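The general idea behind such a risk profile can be sketched as follows. This is a heavily simplified stand-in with an invented per-scenario cost function, not ExaSGD’s actual formulation; in practice each scenario involves a large grid optimization problem, and the scenarios are evaluated in parallel on exascale hardware:

```python
# Simplified sketch: score the grid's response under many hypothetical
# disruption scenarios and summarize the results as a risk profile.
# scenario_cost() is a made-up placeholder for the real per-scenario
# grid optimization.
import random

def scenario_cost(seed):
    rng = random.Random(seed)
    return sum(rng.random() for _ in range(1000))  # placeholder "response cost"

costs = sorted(scenario_cost(s) for s in range(10_000))
print("median scenario cost:", round(costs[len(costs) // 2], 1))
print("95th-percentile scenario cost:", round(costs[int(0.95 * len(costs))], 1))
```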
In 2017, DOE chose PNNL to lead the multi-lab Exascale Computing Project (ECP) ExaGraph co-design center, which was focused on graph analytics. Researchers developed methods and techniques to implement important combinatorial algorithms for smart grids, computational biology, computational chemistry, and climate science. ExaGraph, a unified software framework, captured these methods and techniques for future extreme-scale computing.
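To give a flavor of the combinatorial kernels involved, the snippet below greedily colors a tiny, made-up graph, assigning each vertex the smallest color not used by its neighbors. Graph coloring is offered here only as an example of this class of algorithm; production implementations run distributed, on graphs with billions of edges:

```python
# Greedy graph coloring: assign each vertex the smallest color
# not already used by one of its neighbors.
def greedy_coloring(adjacency):
    colors = {}
    for vertex in adjacency:
        used = {colors[n] for n in adjacency[vertex] if n in colors}
        color = 0
        while color in used:
            color += 1
        colors[vertex] = color
    return colors

tiny_graph = {  # made-up example graph
    "a": ["b", "c"],
    "b": ["a", "c", "d"],
    "c": ["a", "b"],
    "d": ["b"],
}
print(greedy_coloring(tiny_graph))  # {'a': 0, 'b': 1, 'c': 2, 'd': 0}
```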
PNNL researchers are using graph-based machine learning, detailed molecular modeling, and artificial intelligence to create models that answer scientific questions about COVID-19 treatment outcomes. Through what is called counterfactual reasoning, researchers use data to estimate how patient outcomes would differ under alternative treatments.
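One common way to frame counterfactual outcome prediction, shown here purely as an illustrative sketch rather than PNNL’s actual graph-based approach, is to fit separate outcome models for treated and untreated patients and compare the two predicted outcomes for each patient. The synthetic data and model choice below are assumptions:

```python
# Illustrative "two-model" counterfactual sketch (not PNNL's method):
# fit one outcome model per treatment arm, then predict both potential
# outcomes for every patient and compare them.
import numpy as np
from sklearn.linear_model import LogisticRegression

def estimated_treatment_effect(features, treated, outcome):
    treated = treated.astype(bool)
    m_treat = LogisticRegression(max_iter=1000).fit(features[treated], outcome[treated])
    m_control = LogisticRegression(max_iter=1000).fit(features[~treated], outcome[~treated])
    p_if_treated = m_treat.predict_proba(features)[:, 1]
    p_if_untreated = m_control.predict_proba(features)[:, 1]
    return p_if_treated - p_if_untreated  # per-patient counterfactual difference

# Tiny synthetic example (random data, for illustration only)
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
t = rng.integers(0, 2, size=200)
y = (X[:, 0] + 0.5 * t + rng.normal(size=200) > 0).astype(int)
print(estimated_treatment_effect(X, t, y)[:5])
```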
PNNL also uses data analytics, visualization, and computational modeling and simulation in predictive approaches to the biodesign of biofuels and bioproducts. This accelerates research that clarifies the molecular mechanisms underlying biological and hydro-biogeochemical processes in the environment.
Researchers at PNNL aid in the development of sophisticated models of biological and environmental processes, including electronic structure and quantum chemistry methods, classical molecular dynamics, continuum models, systems biology and metabolic models, and bioinformatics.
Over 25 years ago, PNNL developed NWChem, an open-source molecular modeling software tool for studying large computational chemistry problems using parallel computing resources. The ECP NWChemEx project, in which PNNL is one of several collaborators, is now redesigning and reimplementing NWChem for pre-exascale and exascale computers.
Software codes developed at PNNL are used on DOE’s leadership computing facility systems and at the National Energy Research Scientific Computing Center (NERSC).