Cache

Pacific Northwest National Laboratory partnered with the Department of the Treasury and Amazon Web Services to develop Cache, a cloud-based tool that allows the Treasury’s disparate data to be easily searched, translated, extracted, linked, and analyzed.

Collab

The U.S. Department of the Treasury is the leading entity in the U.S. government dealing with financial and monetary issues. Its powers have been effectively and strategically used when dealing with terrorists, nuclear proliferators, drug cartels, rogue leaders, and nations or individuals engaged in unwanted behaviors. 

To combat these threats, the Treasury relies on a wealth of data collected from various sources, including the following: 

  • The Financial Crimes Enforcement Network collects information from banks when suspicious activities are detected.
  • The Office of Terrorist Financing and Financial Crime deploys technical experts worldwide to collect intelligence from partner nations.
  • The Office of Foreign Assets Control is responsible for implementing and enforcing sanctions.

While the Treasury's mission extends to areas like research and analysis conducted by the Office of Intelligence and Analysis, a common thread uniting all Treasury entities is their need for fast, reliable, and accessible data.

Adopting AWS Technology

In 2019, Treasury leadership sought to enhance its employees' effectiveness in their national security roles by improving their access to IT systems, data, and tools. This initiative sparked discussions with Congress, Pacific Northwest National Laboratory (PNNL), and Amazon Web Services (AWS). Congress supported the modernization of the Treasury’s processes and authorized the necessary funding. AWS was tasked with centralizing the data so that it could be put to better use with AWS tools.

Cache-ing in on a New Tool

After assessing the needs of Treasury employees, PNNL created a tool called Cache. Developed through a multi-year collaboration between Treasury’s Office of Terrorism and Financial Intelligence and PNNL, Cache allows efficient exploration of vast amounts of both structured and unstructured data, including PDFs, images, web files, and emails. It also supports entity resolution across datasets, near-instantaneous document translation, and graphical representations of entity connections.
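To make "entity resolution across datasets" concrete, here is a minimal Python sketch of the general idea: names that refer to the same entity are reduced to a shared comparison key, and records from different datasets are grouped by that key. This is an illustration only, not Cache's actual method. The function names, dataset names, and sample records are all hypothetical, and exact matching on a normalized key stands in for whatever matching logic a production system would use.

```python
import re
import unicodedata
from collections import defaultdict

# Hypothetical list of corporate suffixes to ignore when comparing names.
SUFFIXES = {"inc", "llc", "ltd", "co", "corp"}


def normalize_name(name: str) -> str:
    """Reduce a name to a comparison key: drop accents, case, punctuation, suffixes."""
    text = unicodedata.normalize("NFKD", name)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9 ]", " ", text.lower())
    tokens = [t for t in text.split() if t not in SUFFIXES]
    return " ".join(tokens)


def resolve_entities(records):
    """Group (dataset, record_id, name) tuples that share a normalized key."""
    clusters = defaultdict(list)
    for dataset, record_id, name in records:
        clusters[normalize_name(name)].append((dataset, record_id))
    return dict(clusters)


# Made-up sample records from three made-up datasets.
records = [
    ("bank_reports", "r1", "Acme Trading, Ltd."),
    ("sanctions_list", "s7", "ACME TRADING LTD"),
    ("bank_reports", "r2", "Bórealis Export Co."),
    ("emails", "e3", "Borealis Export"),
    ("emails", "e4", "Unrelated Person"),
]

clusters = resolve_entities(records)

# Keep only entities that appear in more than one dataset.
linked = {
    key: refs
    for key, refs in clusters.items()
    if len({dataset for dataset, _ in refs}) > 1
}
```

In this sketch, `linked` holds the entities that appear in more than one dataset. Cross-dataset links of this kind are what a graphical view of entity connections would draw as edges.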

These features allow more than 1,000 users to efficiently search and analyze more than six terabytes of data across secure cloud environments within the Treasury system. By visualizing complex relationships and patterns, Cache supports investigations by highlighting essential links that guide enforcement actions, inform sanction decisions, and enhance analytical judgment.

Built with a scalable architecture, Cache manages increasing data sizes without performance loss and seamlessly integrates domain-specific data. It ensures data security through access controls at the user or group level and employs advanced analytical tools, like network diagrams, to explore interconnections in user-uploaded data.
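Access control at the user or group level can be pictured with a short Python sketch. It assumes a simple model in which each document carries a set of permitted users and a set of permitted groups. The `User` and `Document` classes and the `can_read` and `visible_documents` functions are hypothetical names chosen for illustration, and nothing here reflects how Cache implements its controls.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class User:
    name: str
    groups: frozenset = frozenset()


@dataclass
class Document:
    doc_id: str
    allowed_users: set = field(default_factory=set)
    allowed_groups: set = field(default_factory=set)


def can_read(user: User, doc: Document) -> bool:
    """Grant access if the user is named directly or belongs to a permitted group."""
    return user.name in doc.allowed_users or bool(user.groups & doc.allowed_groups)


def visible_documents(user: User, docs) -> list:
    """Return the IDs of documents the user may see, in their original order."""
    return [doc.doc_id for doc in docs if can_read(user, doc)]


# Made-up users, groups, and documents.
docs = [
    Document("doc-1", allowed_groups={"sanctions-team"}),
    Document("doc-2", allowed_users={"alex"}),
    Document("doc-3", allowed_groups={"analysis-team"}),
]
alex = User("alex", frozenset({"analysis-team"}))
sam = User("sam", frozenset({"sanctions-team"}))
```

Under these assumptions, `visible_documents(alex, docs)` returns `doc-2` (a user-level grant) and `doc-3` (a group-level grant), while `sam` sees only `doc-1`.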

Cache exemplifies PNNL’s rich history of collaborating with sponsors to advance operational effectiveness, methodologies, and skills crucial for national security.

Training the Next Generation

PNNL has developed several training programs to prepare the next generation of data scientists with practical experience in this methodology. Specifically, the Data Science Training program and the Distinguished Graduate Research program were established in partnership with the University of Washington and North Carolina State University, respectively.

Both programs share similar goals: to assist data science students in tackling real-world problems while pursuing their advanced degrees and to familiarize them with national security challenges. Participants have the opportunity to work on actual Treasury challenges and may go on to intern or work with Treasury or other U.S. government entities.