September 21, 2022
Journal Article

Do-calculus enables estimation of causal effects in partially observed biomolecular pathways


Motivation: Estimating causal queries, such as changes in protein abundance in response to a perturbation, is a fundamental task in the analysis of biomolecular pathways. The estimation requires experimental measurements on the pathway components. However, in practice many pathway components are left unobserved (latent) because they are either unknown or difficult to measure. Latent variable models (LVMs) are well-suited for such estimation. Unfortunately, LVM-based estimation of causal queries can be inaccurate when parameters of the latent variables are not uniquely identified, or when the number of latent variables is misspecified. This has limited the use of LVMs for causal inference in biomolecular pathways.

Results: In this manuscript, we propose a general and practical approach for LVM-based estimation of causal queries. We prove that, despite the challenges above, LVM-based estimators of causal queries are accurate if the queries are identifiable according to Pearl’s do-calculus, and describe an algorithm for their estimation. We illustrate the breadth and the practical utility of this approach for estimating causal queries in four synthetic and two experimental case studies, where structures of biomolecular pathways challenge the existing methods for causal query estimation.

Availability: The code and the data documenting all the case studies are available at


Mohammad-Taheri S., J.D. Zucker, C.T. Hoyt, K. Sachs, V. Tewari, R. Ness, and O. Vitek. 2022. Do-calculus enables estimation of causal effects in partially observed biomolecular pathways. Bioinformatics 38, no. Supplement_1:i350-i358. PNNL-SA-172208. doi:10.1093/bioinformatics/btac251