Stream Adaptive Foraging for Evidence Generalized Data Subsetting

Abstract

Some of the most pressing machine learning applications, such as cyber security and object recognition, lack enough ground-truth training data to build a classifier. We address this problem by using unsupervised deep learning techniques to determine when data are anomalous, that is, when they deviate from the norm. Traditional methods require either context-specific expertise to construct models, or prior examples of events and large amounts of well-labeled data to perform machine learning. Some cutting-edge deep learning approaches have been used to characterize complex events, but they require a great deal of data and have not been applied to streaming environments. We demonstrated the use of an autoencoder and relevant featurization techniques to learn a feature space and then identify anomalous portions of it, without labeled data, tagged examples, or domain knowledge, and we delivered the resulting anomalous features to users.
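The general idea of autoencoder-based anomaly detection can be illustrated with a minimal sketch. This is not the authors' architecture or featurization, which the abstract does not specify. It assumes a single tanh hidden layer, synthetic data, and a reconstruction-error threshold set at the 99th percentile of the unlabeled training data. The names train_autoencoder, reconstruction_error, and make_toy_data are hypothetical and exist only for this illustration.

```python
import numpy as np

def train_autoencoder(X, hidden=2, epochs=500, lr=0.05, seed=0):
    """Fit a one-hidden-layer autoencoder with full-batch gradient descent.

    Illustrative only: layer size, learning rate, and epoch count are assumptions.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0.0, 0.1, (d, hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.1, (hidden, d))
    b2 = np.zeros(d)
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)            # encode into the learned feature space
        R = H @ W2 + b2                     # decode back to the input space
        G = 2.0 * (R - X) / n               # gradient of mean squared error w.r.t. R
        GH = (G @ W2.T) * (1.0 - H ** 2)    # backpropagate through tanh
        W2 -= lr * (H.T @ G)
        b2 -= lr * G.sum(axis=0)
        W1 -= lr * (X.T @ GH)
        b1 -= lr * GH.sum(axis=0)
    return W1, b1, W2, b2

def reconstruction_error(X, params):
    """Per-sample squared reconstruction error; larger means more anomalous."""
    W1, b1, W2, b2 = params
    R = np.tanh(X @ W1 + b1) @ W2 + b2
    return ((R - X) ** 2).sum(axis=1)

def make_toy_data(seed=1):
    """Synthetic stand-in data: 'normal' points lie near a 2-D subspace of an
    8-D space; 'anomalies' are large points drawn off that subspace."""
    rng = np.random.default_rng(seed)
    A = rng.normal(size=(2, 8))
    Z = rng.normal(size=(500, 2))
    normal = 0.3 * (Z @ A) + 0.05 * rng.normal(size=(500, 8))
    anomalies = 5.0 * rng.normal(size=(5, 8))
    return normal, anomalies

if __name__ == "__main__":
    normal, anomalies = make_toy_data()
    params = train_autoencoder(normal)      # no labels are used anywhere
    threshold = np.quantile(reconstruction_error(normal, params), 0.99)
    flagged = reconstruction_error(anomalies, params) > threshold
    print("anomalies flagged:", int(flagged.sum()), "of", len(flagged))
```

In a streaming setting, the same scoring function could be applied to each incoming batch, with the threshold recomputed periodically from recent data; that adaptation is likewise an assumption rather than something the abstract describes.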

Author

Jasper, Rob

Dernbach, Stefan

Tuor, Aaron

Hilliard, Nathan C

Nichols, Nicole M

Robinson, Sean M

Kaplan, Sam

Exploratory License

Eligible for exploratory license

Market Sector

Data Sciences