Pathogens, the culprits behind infectious diseases, are wily and complex. Infectious diseases like COVID-19, influenza, and malaria have proven to be deadly. Climate change, urbanization, and globalization have enabled these diseases to spread easily and quickly across continents and around the world.
To thwart pathogens, researchers in the epidemiology field of infectious disease (ID) prediction are continuously trying to forecast when, where, and how an ID event will occur. ID prediction is a crucial tool that can provide early warnings to officials, who are making decisions on the prevention of disease occurrence and mitigation of its spread.
Yet the challenges facing accurate ID prediction are daunting. Infectious diseases are undeterred by any border or boundary, and prediction is further hampered by a lack of reporting systems, incomplete and delayed sharing of epidemiological data, and inadequate and biased disease surveillance initiatives.
Also, disease case counts are not the only early warning signs that can be used to predict IDs. For an ID to occur, the pathogen, the hosts (whether humans, animals, or both), and the environment must all be in the right conditions. The researchers found that, with a One Health approach, available datasets and state-of-the-art modeling approaches can overcome traditional pitfalls and drastically improve ID predictions. However, there were questions about how frequently this approach was used in the literature and to what effect.
To answer these questions, a team of researchers at Pacific Northwest National Laboratory (PNNL) conducted a systematic review to investigate whether advances in machine learning (ML) and deep learning (DL) techniques are being applied in effective and operational ways to improve ID prediction and strengthen biopreparedness against human and animal diseases.
The researchers examined the quality of ID prediction capabilities, focusing on ML and DL techniques applied during the past two decades. Of the 16,148 journal articles reviewed, the researchers selected 237 for in-depth analysis of the top approaches, strategies, and gaps in the field of ID prediction modeling. The results from this systematic review can be used as a guide to improve future research studies, better address operational needs for model deployment, and inform areas where public health and veterinary policies can improve predictive capabilities. Increasing the accuracy, understanding, and operational deployment of ID prediction will help to prevent disease occurrence and spread, saving lives.
Public health has become a growing concern. Globalization, technology, and the interconnectivity of the world have increased vulnerabilities to the spread of IDs, but they have also created large volumes of information. With the prevalence of machine learning, researchers have the data and the methodology to model all factors that influence ID prediction.
PNNL researchers saw a need to evaluate how the scientific community leveraged these new capabilities. They reviewed recent scientific contributions to epidemiology that used ML or DL approaches for ID prediction, with an emphasis on identifying trends and uncovering gaps in the use of these approaches.
The researchers found serious gaps. More than 50% of the literature meeting their criteria for inclusion specifically forecasted zoonotic diseases, which infect both humans and animals. However, fewer than 5% of the papers included more than one species. Also, fewer than 10% of the papers considered model characteristics essential for operational use, such as uncertainty quantification, proper handling of missing data, and computational efficiency.
As observed during the early stages of COVID-19, the quality of both the data and models is critical to combating the disease during a pandemic. By highlighting the gaps in ML and DL models, the authors hope their analysis will result in greater discussions on the need for improved operational aspects of epidemiology for biopreparedness and response, ultimately resulting in data-driven support for actionable decision-making.
One Health is a collaborative, multisectoral, and transdisciplinary approach—working at the local, regional, national, and global levels—with the goal of achieving optimal health outcomes that recognize the interconnection between people, animals, plants, and their shared environment.
Pacific Northwest National Laboratory
Karl.Pazdernik@pnnl.gov, (509) 372-4978
This research was funded by the Defense Threat Reduction Agency Department of Defense Sector. PNNL is a multi-program national laboratory operated by Battelle for the Department of Energy under contract No. DE-AC05-76RL01830.
Published: December 1, 2022
Keshavamurthy, R., Dixon, S., Pazdernik, K., Charles, L. Predicting infectious disease for biopreparedness and response: A systematic review of machine learning and deep learning approaches. One Health 15, 100439 (2022). https://doi.org/10.1016/j.onehlt.2022.100439.