September 23, 2025
Feature

At PNNL, AI Is Accelerating the U.S. Bioeconomy

AI-powered innovations are helping bring bioproducts from the lab to industry


Jeff Czajka accelerates his bioprocess research using a fusion of mechanistic insights and machine learning tools.

(Photo by Andrea Starr | Pacific Northwest National Laboratory)

Biodiesel, bioplastics, penicillin: “If you look at important industrial bioprocesses, most of them use organisms that naturally make a product,” said Jeff Czajka, a bioprocess scientist and Linus Pauling Fellow at the Department of Energy’s (DOE’s) Pacific Northwest National Laboratory (PNNL). “There hasn’t been as much success engineering strains for bioprocesses to make new or better products—instead, people just optimize what nature has started.”

The reason for that is an all-too-familiar story for many researchers: as they aim to improve a bioproduct, they spend years meticulously engineering a promising microbial strain. It works perfectly in the laboratory, but when it’s scaled up for industrial application, it suddenly fails.

Accurately predicting how organisms’ characteristics respond not only to their genes but also to their environments (a nascent field called “predictive phenomics”) is extraordinarily challenging. Even for microbes, a staggering number of factors influence their traits; for bioproduct innovation, that means that when a strain fails, it is often almost impossible for researchers to isolate the problem and course-correct.

“It’s a huge challenge facing synthetic biology researchers and companies,” Czajka said.

At PNNL, researchers are using AI to tackle that challenge. 

The arts and sciences of AI-accelerated bioprocesses

Often, bioproducts researchers want to engineer bioprocesses to increase their outputs. Higher outputs mean more biofuel and more antibiotics at lower costs.

In one case, PNNL researchers had already engineered Lipomyces starkeyi, a yeast commonly used in biofuel production, to produce malic acid, a platform chemical with a wide range of applications. The next task: increasing the output.

“We were trying to optimize the media—what the yeast was growing in—to produce as much malic acid as possible,” Czajka said. “But there are a lot of parameters: these yeasts are sensitive to all sorts of environmental conditions, and it would have taken a very long time to explore all the options.”

“Currently, the main approach to optimization is trial and error,” he said. “Which leads a lot of people to say that the bioprocess field is an art more than a science.”

But Czajka and his colleagues applied a different kind of art: a machine learning model called the Automated Recommendation Tool (ART). The model was developed by researchers at Lawrence Berkeley National Laboratory and Sandia National Laboratories through the Joint BioEnergy Institute (a DOE Bioenergy Research Center) and DOE’s Agile BioFoundry (a consortium of national laboratories dedicated to accelerating biomanufacturing). PNNL is a member of both groups.

“ART basically designed the experiment for us,” Czajka said. “As we iterated the experiment, ART would learn from the results and suggest which conditions to test next and which conditions it expected would produce the most malic acid.”

“In the end, we improved production by around 20 percent in a short time, which was very exciting.”
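
The loop Czajka describes, in which a model proposes conditions, the lab tests them, and the model learns from the results, can be sketched in a few lines of Python. This is a minimal illustration, not ART itself: the `simulated_titer` function, the two media parameters, and the distance-weighted surrogate are all assumptions made up for the example, standing in for real experiments and for ART's actual statistical model.

```python
import math

# Hypothetical stand-in for a wet-lab measurement: malic acid titer (arbitrary
# units) as a function of two made-up media parameters, each scaled to 0..1.
def simulated_titer(carbon, nitrogen):
    return math.exp(-((carbon - 0.7) ** 2 + (nitrogen - 0.3) ** 2) / 0.1)

def distance(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def predict(point, observations):
    """Inverse-distance-weighted average of the titers measured so far."""
    weighted = [
        (1.0 / (distance(point, cond) + 1e-9) ** 2, titer)
        for cond, titer in observations.items()
    ]
    total = sum(w for w, _ in weighted)
    return sum(w * t for w, t in weighted) / total

def recommend(candidates, observations, explore=0.5):
    """Pick the untested condition with the highest predicted titer,
    plus a bonus for being far from anything already tested."""
    def score(point):
        nearest = min(distance(point, cond) for cond in observations)
        return predict(point, observations) + explore * nearest

    untested = [c for c in candidates if c not in observations]
    return max(untested, key=score)

def run_campaign(rounds=10):
    grid = [i / 10 for i in range(11)]
    candidates = [(c, n) for c in grid for n in grid]
    # Seed round: three arbitrary starting media (an assumption of this sketch).
    seeds = [(0.0, 0.0), (1.0, 1.0), (0.5, 0.5)]
    observations = {p: simulated_titer(*p) for p in seeds}
    initial_best = max(observations.values())
    for _ in range(rounds):
        next_condition = recommend(candidates, observations)
        # "Run the experiment" and feed the result back to the model.
        observations[next_condition] = simulated_titer(*next_condition)
    return initial_best, observations
```

Calling `run_campaign()` returns the best titer among the seed media along with the full record of tested conditions, so the two can be compared. The `explore` bonus is one simple way to keep the recommender from only retesting media close to those it already knows.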

The best of both worlds

While AI models can excel at predicting strains’ performance, they are often limited by low availability of the precise molecular measurements that inform them—a phenomenon known as sparse data. Furthermore, even highly predictive AI models often produce “black box” solutions: lots of “what,” but not much “why.”

In these cases, PNNL researchers apply a combination of AI and more traditional genome-scale models, which incorporate researchers’ scientific and mechanistic understandings of an organism’s functions.

“Informing machine learning models with mechanistic insights helps them perform better and improves explainability,” Czajka said. 

Machine learning, in turn, offers advantages over genome-scale models.

“It’s challenging to build accurately predictive genome-scale models because of the complexity of gene-environment interactions—and because there are a lot of genes that we simply don’t know what they do,” Czajka said. “Machine learning models can implicitly capture all that complexity.”

By coupling them, researchers get the best of both worlds: the mechanistic precision of genome-scale modeling and the predictive punch of machine learning. 

Czajka and his colleagues, for instance, applied this combined approach to predict outputs from various strains of Yarrowia lipolytica, a yeast with a wide range of applications in the production of biofuels and other bioproducts. The team first used a genome-scale model to fill in the gaps of the sparse data, then used the resulting dataset to train a machine learning model that accurately predicted output concentrations.
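
The two-step idea, using a mechanistic model to fill gaps and then training on the completed dataset, can be illustrated with a toy Python sketch. Everything here is an assumption for the sake of the example: the fixed-ratio `mechanistic_flux` function is a hypothetical stand-in for a genome-scale model, the strain records are made up, and the nearest-neighbor predictor is simply one easy-to-read learner, not the method the team used.

```python
import math

# Hypothetical mechanistic stand-in: a fixed stoichiometric ratio between
# substrate uptake and an internal pathway flux. A real genome-scale model
# would derive this from a full metabolic network.
def mechanistic_flux(uptake):
    return 0.6 * uptake

def fill_gaps(records):
    """Return copies of the records with missing flux values estimated
    by the mechanistic stand-in; the input records are left unchanged."""
    filled = []
    for rec in records:
        rec = dict(rec)
        if rec["flux"] is None:
            rec["flux"] = mechanistic_flux(rec["uptake"])
        filled.append(rec)
    return filled

def knn_predict(train, uptake, flux, k=2):
    """Predict titer as the mean titer of the k nearest training strains."""
    nearest = sorted(
        train,
        key=lambda r: math.hypot(r["uptake"] - uptake, r["flux"] - flux),
    )[:k]
    return sum(r["titer"] for r in nearest) / len(nearest)

# Made-up strain data; None marks a flux that was never measured.
strains = [
    {"uptake": 1.0, "flux": 0.5, "titer": 2.0},
    {"uptake": 2.0, "flux": None, "titer": 4.1},
    {"uptake": 3.0, "flux": 1.9, "titer": 6.2},
    {"uptake": 4.0, "flux": None, "titer": 7.9},
]

training_set = fill_gaps(strains)
estimate = knn_predict(training_set, uptake=2.5, flux=mechanistic_flux(2.5))
```

Without the gap-filling step, two of the four strains would have to be dropped before training; with it, the learner sees every strain, at the cost of inheriting whatever error the mechanistic estimate carries.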

A brighter future for bioproduct development

“Right now, you can spend years going through and picking out targets, learning or engineering the strain, seeing how it performs, and trying to go back through with those insights to engineer them,” Czajka said. “Ideally, these AI-boosted modeling efforts will help shorten those cycles, producing new and improved bioproducts while saving time and reducing costs.”

The AI-accelerated processes and tools being developed at PNNL have broad potential to enhance development of bioenergy molecules and countless other bioproducts. Better-engineered strains could lead to better medicine, better fuels, better materials, and more.

But for Czajka, it’s not about any individual bioproduct.

“In my mind, this is all under one umbrella: transitioning strains from the lab to industry,” he said.

PNNL’s work in AI-accelerated bioproducts development is supported by the Department of Energy, Office of Energy Efficiency and Renewable Energy’s Bioenergy Technologies Office and by PNNL’s internally funded Predictive Phenomics Initiative, which is focused on understanding the inner workings of complex biological systems.

###

About PNNL

Pacific Northwest National Laboratory draws on its distinguishing strengths in chemistry, Earth sciences, biology and data science to advance scientific knowledge and address challenges in energy resiliency and national security. Founded in 1965, PNNL is operated by Battelle and supported by the Office of Science of the U.S. Department of Energy. The Office of Science is the single largest supporter of basic research in the physical sciences in the United States and is working to address some of the most pressing challenges of our time. For more information, visit the DOE Office of Science website. For more information on PNNL, visit PNNL's News Center.