Twenty-four analysts from several U.S. intelligence organizations met in August for a machine learning activity with PNNL researchers Nicole Nichols, Jeremiah Rounds, Lawrence Phillips, and Brian Kritzstein at the Laboratory for Analytic Sciences (LAS) at North Carolina State University. The immersive five-day focused-discovery activity represented the culmination of months of close coordination and collaboration between PNNL and LAS. It also created an opportunity for analysts to interact with emerging machine learning technology and to provide real-life feedback that will strengthen future research and applications.
The PNNL experts are experienced at creating and delivering highly technical content. The LAS mathematicians and statisticians have a clear understanding of mission problems. Together, these teams of rivals set the stage for an engaging week of collaborative learning.
The LAS analysts had mixed computer skills, and some had no coding experience at all. So Nichols and Phillips spent the first two days of the workshop making fundamental machine learning concepts accessible. Their goal was to dispel the assumption that machine learning, as Nichols said, is “only for programmers with a capital P.” They began by teaching the analysts that machine learning is not all-powerful.
“With things like targeted ads, people are starting to think machine learning is magically successful,” Nichols said. “But showing the limitations under the hood helps analysts to start thinking about the gaps that adversaries could possibly exploit.”
Practice Makes Perfect
The limitations of machine learning and how they can be exploited shaped the practicum developed with LAS staff. For the last three days of the activity, with Twitter as the playing field, participants took on an Adversarial Machine Learning challenge.
Adversarial Machine Learning is fairly new but has immediate applications for the intelligence community. It's the art of intentional misclassification, in which one machine learning model seeks to understand how a rival model is trained, then finds gaps and creates false information to trick that rival model. In the attacker's best-case scenario, these tricks sneak past the rival model undetected and destroy its effectiveness, unless that rival model has been programmed to learn from, and respond to, such tricks.
In a focused-discovery challenge, the LAS practicum pitted teams of analysts and practitioners against a PNNL-created "fixed" model that could differentiate between human- and nonhuman-generated tweets. The teams attacked the fixed model by progressively generating more diversionary synthetic tweets.
The teams were equipped with Jupyter notebooks, living records preloaded with initial code and a means to log its output. These notebooks, which were developed with technical contributions from PNNL, recorded the teams’ endeavors using visualizations, equations, and narratives.
First, as a group, the analysts listed ways to stymie the synthetic text detection model. Next, broken into teams, the analysts worked to discover and exploit the weaknesses of that fixed model. Teams made up of both coders and novices used a number of strategies to expose the fixed model to problematic adversarial approaches. These strategies included character-level, synonym, and nearest-neighbor substitutions, as well as analysis of the feature importance of phrases. Problematic aspects were recorded in the Jupyter notebooks and applied to other text models the analysts encountered.
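To make those strategies concrete, here is a minimal Python sketch of character-level and synonym substitution guided by feature importance. It is an illustration under stated assumptions, not the PNNL model or the LAS notebooks: the frozen word-weight detector, its threshold, the look-alike table, and the synonym table are all hypothetical stand-ins, and nearest-neighbor substitution is left out for brevity.

```python
# Toy illustration only. The detector, its weights, and the substitution
# tables are hypothetical; they are not the PNNL model or the LAS notebooks.

# A "fixed" detector: a frozen bag-of-words scorer. Positive weights push a
# tweet toward the "synthetic" label. The weights never change under attack.
FIXED_WEIGHTS = {"click": 1.5, "free": 1.2, "winner": 1.0, "now": 0.4, "thanks": -0.8}
THRESHOLD = 1.0


def score(text):
    """Sum the weights of the tokens the detector knows about."""
    return sum(FIXED_WEIGHTS.get(tok, 0.0) for tok in text.lower().split())


def is_flagged(text):
    """Return True when the detector labels the text as synthetic."""
    return score(text) >= THRESHOLD


# Character-level substitution: swap letters for look-alike characters so the
# token no longer matches the detector's vocabulary.
LOOKALIKES = {"i": "1", "e": "3", "o": "0"}


def char_substitute(word):
    return "".join(LOOKALIKES.get(ch, ch) for ch in word)


# Synonym substitution: swap a word for one with similar meaning that the
# detector has no weight for.
SYNONYMS = {"free": "complimentary", "click": "tap", "winner": "champion"}


def synonym_substitute(word):
    return SYNONYMS.get(word, word)


def feature_importance(text):
    """Rank tokens by how much each one contributes to the detector's score."""
    tokens = text.lower().split()
    ranked = [(tok, FIXED_WEIGHTS.get(tok, 0.0)) for tok in tokens]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


def attack(text, strategy):
    """Perturb the most important tokens until the detector stops flagging."""
    tokens = text.lower().split()
    for tok, weight in feature_importance(text):
        if weight <= 0 or not is_flagged(" ".join(tokens)):
            break
        tokens[tokens.index(tok)] = strategy(tok)
    return " ".join(tokens)


tweet = "click now free winner"
evaded_chars = attack(tweet, char_substitute)
evaded_synonyms = attack(tweet, synonym_substitute)
print(is_flagged(tweet), evaded_chars, is_flagged(evaded_chars))
print(evaded_synonyms, is_flagged(evaded_synonyms))
```

In this sketch the attack stops as soon as the text slips under the threshold, so lower-weight words such as "now" are left unchanged. A detector that retrains on such examples, as the article notes, would close this particular gap.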
The general safety and security of machine learning rely on the ability to detect a model’s risks and to account for those risks and deficiencies when machine learning is deployed. Intelligence community analysts helped their technical counterparts learn how to communicate vulnerabilities better. At the same time, the technicians gave analysts a vocabulary for talking to model developers to make sure they were choosing the most robust model for their needs.
Opening pathways between analysts and technicians helps scientists build more robust models and prioritize resources in continued applications.
Kritzstein, a PNNL staff member based at LAS, looks forward to organizing similar activities and future PNNL/LAS collaborations. He and others saw the event as an opportunity to build on the success of PNNL’s integration with the LAS. They called that integration a natural fit for the kind of national laboratory engagement that identifies areas of mutual interest and works collaboratively to meet the needs of a partner community.
Ideally, future partnerships would include scientists, analysts, and members of the public. Creating technical relationships and unconventional pairings for analysis could help tackle national security challenges. Uniquely, LAS is an overt and collaborative intelligence community element on a college campus. Because of that, researchers can reach out to individuals without government clearance, including partners in industry and students.
“Around 90 percent of the work done at LAS is unclassified,” said Alyson Wilson, a North Carolina State University professor and LAS principal investigator.
The program, founded in 2013, includes faculty and students at eight universities, seven industry partners, and counterparts from national laboratories and government. PNNL has been an LAS collaborator since 2013 and has been fully integrated with LAS operations since 2017.