February 1, 2017
Conference Paper

Deep Learning for Unsupervised Insider Threat Detection in Structured Cybersecurity Data Streams

Abstract

Situational awareness data is a prototypical example of structured, heterogeneous streaming data with asynchronous timing that can quickly scale beyond the cognitive capacity of a human analyst. We present deep learning methods that train in real time on unlabeled streams and provide probabilistic anomaly scores. We compare our novel deep and recurrent neural network models to three baseline systems using threat detection recall on the CERT Insider Threat Dataset v6.2. In our experiments, both our deep neural network and our recurrent neural network systems outperformed all baselines, and aggregated activity count features yielded the best performance. For our best model, insider threat activity receives, on average, an anomaly score in the 95.53 percentile for the day, suggesting the approach can greatly reduce analyst workload. Our models are also interpretable: they break down the anomaly score into individual features of user behavior, which could further aid analysts in reviewing potential cases of insider threat.
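
As an illustration of what a per-day percentile of an anomaly score means, the following minimal Python sketch ranks each user's score against all scores from the same day. It is not the authors' implementation: the function name `daily_percentiles`, the record layout, and the "share of scores less than or equal to this one" definition of percentile are all assumptions made for this example, and the sample scores are invented.

```python
from collections import defaultdict


def daily_percentiles(records):
    """Map (day, user) to the percentile of that user's score within the day.

    `records` is an iterable of (day, user, score) tuples. This layout is a
    hypothetical one chosen for the example, not taken from the paper.
    """
    by_day = defaultdict(list)
    for day, user, score in records:
        by_day[day].append((user, score))

    result = {}
    for day, rows in by_day.items():
        scores = [score for _, score in rows]
        for user, score in rows:
            # Assumed definition: percentage of that day's scores that are
            # less than or equal to this user's score.
            rank = sum(1 for s in scores if s <= score)
            result[(day, user)] = 100.0 * rank / len(scores)
    return result


# Invented example scores; higher means more anomalous.
records = [
    ("2010-01-04", "u1", 0.2),
    ("2010-01-04", "u2", 0.9),
    ("2010-01-04", "u3", 0.4),
    ("2010-01-04", "u4", 0.1),
    ("2010-01-05", "u1", 0.3),
    ("2010-01-05", "u2", 0.7),
]
percentiles = daily_percentiles(records)
```

Under this definition, an analyst who reviews users in descending order of score would reach a user at the 95th percentile after examining roughly the top 5% of that day's users.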

Revised: November 1, 2019 | Published: February 1, 2017

Citation

Tuor A.R., S.P. Kaplan, B.J. Hutchinson, N.M. Nichols, and S.M. Robinson. 2017. Deep Learning for Unsupervised Insider Threat Detection in Structured Cybersecurity Data Streams. In The AAAI Workshop on Artificial Intelligence for Cyber Security, 224-231; WS-17-04. Palo Alto, California: Association for the Advancement of Artificial Intelligence. PNNL-SA-122088.