Data essential for examining hard scientific and national security questions are often large, messy, or incomplete. Frequently, these data exist in networks or graphs that grow more complex as the amount of data grows.
Take the energy sector, for example. Technologies continually evolve, and new sources of energy generation come to fruition. These rapid changes create challenges for data integration and understanding.
Likewise, related data carried over digital networks can be nearly impossible to connect. Graphs can capture or convey these data, but only at a very high level.
Our researchers are pioneering data and graph analytics using novel visualization and machine learning techniques to tease out data connections.
Data integration across devices, networks, and data sets
Pacific Northwest National Laboratory (PNNL) researchers developed the VOLTTRON™ software platform to integrate smart device operations and communications over the power grid. The platform’s analytical environment seamlessly and securely connects a wide range of data and devices to make automatic decisions based on user needs and preferences. When used in building systems to manage energy consumption, VOLTTRON improves overall system performance and creates a more flexible and reliable grid.
Our researchers are also developing scalable graph platforms to understand network structures and extract actionable information embedded in diverse data sets. Ripples, a first-of-its-kind network analysis tool, can solve complex graph analytics problems in less than a minute on a high-performance computing platform. Grappolo performs fast graph clustering (community detection) on graphs with millions to billions of nodes. Vite, a distributed version of Grappolo, scales computation on leadership-class machines to graphs with tens to hundreds of billions of edges, with demonstrated performance on modern multi-GPU systems.
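To make the idea of community detection concrete, the sketch below scores a candidate clustering with modularity, one widely used quality measure for graph partitions. This is an illustration only: the text above does not say which objective Grappolo or Vite optimize, and the toy graph, the `modularity` function, and the community labels are all hypothetical.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman modularity Q for a partition of an undirected, unweighted graph.

    edges: list of (u, v) pairs; community: dict mapping node -> label.
    """
    m = len(edges)
    internal = defaultdict(int)      # edges with both endpoints in one community
    total_degree = defaultdict(int)  # sum of node degrees per community
    for u, v in edges:
        total_degree[community[u]] += 1
        total_degree[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    return sum(
        internal[c] / m - (total_degree[c] / (2 * m)) ** 2
        for c in total_degree
    )


# Hypothetical toy graph: two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]

split = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
lumped = {node: "A" for node in range(6)}

q_split = modularity(edges, split)    # one community per triangle
q_lumped = modularity(edges, lumped)  # everything in one community
print(q_split > q_lumped)
```

A clustering that keeps each triangle together scores higher than one that lumps all nodes into a single group, which is the kind of signal a clustering method can use to compare candidate partitions.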
Networks frequently present data of higher complexity than can be captured natively in graphs. Hypergraphs are networks in which relationships (hyperedges) can group any number of entities, one or more, rather than only pairs of entities as in graphs. In a hypergraph, paths and walks have not only a length but also a width, or strength of interaction. Methods from network science extend naturally to hypergraphs, and their complex multidimensional structure yields representations from computational topology, including abstract simplicial complexes. HyperNetX (HNX) is an open-source Python library developed to analyze and visualize multi-way relationships modeled as hypergraphs. These relationships can be found in cyber data, protein pathways, bibliographic networks, and social media, where interactions involve multiple entities simultaneously.
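The notion of width can be illustrated with a small plain-Python sketch. It assumes one common reading of width: a walk has width s when every pair of consecutive hyperedges shares at least s entities. The sketch deliberately does not use the HyperNetX API; the toy data and the helper names `s_adjacency` and `s_distance` are hypothetical.

```python
from collections import deque
from itertools import combinations

# Hypothetical bibliographic data: each hyperedge (a paper) groups its authors.
hyperedges = {
    "paper1": {"ana", "bo", "cy"},
    "paper2": {"bo", "cy", "di"},
    "paper3": {"di", "eve"},
    "paper4": {"eve"},  # a hyperedge may contain a single entity
}


def s_adjacency(edges, s):
    """Map each hyperedge to the hyperedges sharing at least s entities with it."""
    neighbors = {name: set() for name in edges}
    for a, b in combinations(edges, 2):
        if len(edges[a] & edges[b]) >= s:
            neighbors[a].add(b)
            neighbors[b].add(a)
    return neighbors


def s_distance(edges, source, target, s):
    """Length of the shortest width-s walk between two hyperedges, or None."""
    neighbors = s_adjacency(edges, s)
    seen = {source}
    queue = deque([(source, 0)])
    while queue:  # breadth-first search over s-adjacent hyperedges
        current, dist = queue.popleft()
        if current == target:
            return dist
        for nxt in sorted(neighbors[current]):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


narrow = s_distance(hyperedges, "paper1", "paper3", s=1)
wide = s_distance(hyperedges, "paper1", "paper3", s=2)
print(narrow, wide)
```

Here paper1 and paper3 are connected through paper2 when a single shared author suffices, but no walk of width 2 links them, because paper2 and paper3 share only one author. Raising s filters out weaker interactions in this way.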
The Exascale Computing Project’s co-design centers, ExaGraph and ExaLearn, are other examples of our work in this area. Through these projects, we are developing scalable graph algorithms targeting exascale systems, integration with domain science applications, generative models for molecular structures, and scalable implementations of deep reinforcement learning algorithms.
Additionally, PNNL researchers develop mathematical foundations for data-driven decision control for complex systems, such as high-energy physics facilities and smart buildings. This work is part of our partnership with Oak Ridge National Laboratory, the University of Arizona, and the University of California, Santa Barbara.
PNNL has a deep understanding of the mathematical principles that govern the digital landscape. Our data and graph analytics technologies have been deployed in domains including threat detection for national security, cyber analytics, scientific computing, intellectual property portfolio analysis, energy grid reliability, environmental safety, training, and law enforcement. Further, in collaboration with joint institutes such as the University of Washington, our researchers are discovering innovative solutions to complex analytic challenges.