Abstract—Counterfactual inference is a useful tool for comparing
outcomes of interventions on complex systems. It requires
us to represent the system in the form of a structural causal model,
complete with a causal diagram, probabilistic assumptions on exogenous
variables, and functional assignments. Specifying such
models can be extremely difficult in practice. The process requires
substantial domain expertise, and does not scale easily to
large systems, multiple systems, or novel system modifications.
At the same time, many application domains, such as molecular
biology, are rich in structured causal knowledge that is qualitative
in nature. This manuscript proposes a general approach
for querying a causal knowledge graph with a causal question
and converting the qualitative result into a quantitative structural
causal model that can learn from data to answer the question.
We demonstrate the feasibility, accuracy and versatility of this
approach using two case studies in systems biology. The first
demonstrates the appropriateness of the underlying assumptions
and the accuracy of the results. The second demonstrates
the versatility of the approach by querying a knowledge base
for the molecular determinants of a severe acute respiratory
syndrome coronavirus 2 (SARS-CoV-2)-induced cytokine storm
and performing counterfactual inference to predict the causal
effect of medical countermeasures for severely ill COVID-19
patients.
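
To make the ingredients named above concrete (a causal diagram, exogenous variables, and functional assignments), the following is a minimal sketch of counterfactual inference on a toy structural causal model. It is not the model or the software used in the paper: the two-variable diagram X -> Y, the linear assignment, the coefficient `beta`, and the class and method names are all assumptions chosen for illustration. It follows the standard three steps of abduction, action, and prediction.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class LinearSCM:
    """Toy SCM with causal diagram X -> Y (hypothetical, for illustration).

    Functional assignments:
        X := U_X
        Y := beta * X + U_Y
    U_X and U_Y are independent exogenous noise terms.
    """

    beta: float

    def abduct(self, x_obs: float, y_obs: float) -> Tuple[float, float]:
        # Abduction: recover the exogenous values consistent with the observation.
        u_x = x_obs
        u_y = y_obs - self.beta * x_obs
        return u_x, u_y

    def predict(
        self, u_x: float, u_y: float, do_x: Optional[float] = None
    ) -> Tuple[float, float]:
        # Action: if do_x is given, replace the assignment for X with X := do_x.
        x = u_x if do_x is None else do_x
        # Prediction: propagate through the unchanged assignment for Y.
        y = self.beta * x + u_y
        return x, y

    def counterfactual_y(self, x_obs: float, y_obs: float, do_x: float) -> float:
        # "Given that we observed (x_obs, y_obs), what would Y have been
        # had X been set to do_x?"
        u_x, u_y = self.abduct(x_obs, y_obs)
        _, y_cf = self.predict(u_x, u_y, do_x=do_x)
        return y_cf


scm = LinearSCM(beta=2.0)
# Observed unit: X = 1.0, Y = 2.5, so the abducted noise is U_Y = 0.5.
y_cf = scm.counterfactual_y(x_obs=1.0, y_obs=2.5, do_x=0.0)
print(y_cf)  # 0.5
```

In this sketch the unit-specific noise term is held fixed across the factual and counterfactual worlds, which is what distinguishes a counterfactual query from a purely interventional one. In a realistic setting the assignments would be nonlinear and their parameters learned from data rather than fixed by hand.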
Published: August 12, 2021
Citation
Zucker J.D., K. Paneri, S. Mohammad-Taheri, S. Bhargava, P. Kolambkar, C. Bakker, and J.R. Teuton, et al. 2021. Leveraging Structured Biological Knowledge for Counterfactual Inference: A Case Study of Viral Pathogenesis. IEEE Transactions on Big Data 7, no. 1:25-37. PNNL-SA-154544. doi:10.1109/TBDATA.2021.3050680