Abstract

Detecting and anticipating the global evolution of proliferation expertise and capability from unstructured, noisy, and incomplete public data streams is a highly desirable but extremely challenging task. In this paper, we present our pioneering data-driven approach to supporting the nonproliferation mission: detecting and explaining the evolution of proliferation expertise and capability development worldwide from terabytes of publicly available information (PAI). We first discuss how we fuse nine open-source data streams, including multilingual data, to convert four terabytes of unstructured data into structured knowledge and to encode dynamically evolving representations of proliferation expertise as content and context knowledge graphs. To do so, we rely on Natural Language Processing (NLP) and Deep Learning (DL) models for information extraction, topic modeling, and learning distributed text representations (embeddings). We then present interactive, usable, and explainable descriptive analytics that refine domain knowledge and present it in a human-understandable form. Finally, we discuss future work on using our dynamic knowledge representations to enable predictive and prescriptive inference for real-time domain understanding and contextual reasoning. Our AI-driven descriptive, predictive, and prescriptive analytics are designed to supplement traditional nonproliferation efforts by automatically detecting, forecasting, and reasoning about the global evolution of illicit proliferation expertise in real time. They combine strong multilingual domain knowledge representation with multitask predictive and prescriptive inference components.
Published: January 13, 2023