PNNL @ NeurIPS 2022

Researchers will address a variety of topics related to artificial intelligence and machine learning

November 28–December 9, 2022

Pacific Northwest National Laboratory (PNNL) will be at the Thirty-sixth Conference on Neural Information Processing Systems (NeurIPS), November 28–December 9, 2022. PNNL data scientists and engineers will present 10 posters and workshop papers, participate in a competition, and serve on a panel.

NeurIPS is an annual conference that brings together researchers from a wide variety of disciplines, including machine learning, neuroscience, life sciences, and statistics, among many others. This year's conference will be hybrid, with an in-person component held at the New Orleans Convention Center during the first week followed by a virtual component during the second week. Below is a current list of PNNL presenters who will be sharing their research in artificial intelligence and machine learning at NeurIPS. Read more about the PNNL work being highlighted at NeurIPS here.

PNNL Presentations, Workshops, and Competitions

Characteristics of White Helmets Disinformation vs COVID-19 Misinformation

WOMEN IN MACHINE LEARNING WORKSHOP

Anika Halappanavar | Maria Glenski

Disinformation campaigns and misinformation undermine the very foundations of democracy and can negatively influence public opinion. It is critical to understand how they spread so that mitigations can be developed. In this analysis, we contrast shared and unique misinformation spread patterns in different settings, comparing the rapid spread of multiple misinformation narratives about the global COVID-19 pandemic with disinformation campaigns against a specific organization (the White Helmets).

We use two datasets for our analyses. The first is a Twitter dataset collected from March 7 to April 19, 2020, to observe the early response to the COVID-19 pandemic; it includes 40 narratives, with 250,202 posts from 197,715 users. As we observe in this dataset, COVID-19 misinformation has spread through many narratives: false cures, the origin of the virus, weaponization of the virus, the nature of the virus, emergency responses, and more. The second is a dataset collected from April 1, 2018, to June 6, 2019, that encompasses 48 unique narratives across two social media platforms, Twitter (167,017 posts from 56,679 users) and YouTube (15,567 posts from 9,176 users), targeting the reputation of the White Helmets (Syrian Civil Defense), an organization that has been the subject of disinformation campaigns intended to shift public opinion against it.

Curriculum Based Reinforcement Learning to Avert Cascading Failures in the Electric Grid

CLIMATE CHANGE AI WORKSHOP

December 9 | 7:00 a.m. PST

Kishan Prudhvi Guddanti | Amarsagar Reddy Ramapuram Matavalam (Arizona State University) | Yang Weng (Arizona State University)

We present an approach that integrates domain knowledge of electric power grid operations into reinforcement learning (RL) frameworks to effectively train RL agents that prevent cascading failures. This helps maximize utilization of existing grid infrastructure while keeping grid operations within limits. A curriculum-based approach with reward tuning is incorporated into the training procedure by modifying the environment using the network physics. Our procedure is tested with an actor-critic agent on the IEEE 14-bus test system using the RL environment developed by RTE, the French transmission system operator (TSO). We observed that naively training the RL agent without the curriculum failed to prevent cascading in most test scenarios, while the curriculum-based RL agents succeeded in most test scenarios, illustrating the importance of properly integrating domain knowledge of physical systems for real-world RL applications.
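
The workshop paper itself is not reproduced here, but the curriculum idea can be illustrated with a minimal sketch: train the agent on progressively harder grid scenarios before exposing it to full-difficulty operation. Everything below is a hypothetical stand-in (the toy environment, the random agent, and the difficulty knob); the actual work uses an actor-critic agent in RTE's grid operation environment.

```python
import random

class ToyGridEnv:
    """Toy stand-in for a Gym-style grid environment (not RTE's environment);
    `difficulty` mimics harsher loading and contingency scenarios."""
    def __init__(self, difficulty):
        self.difficulty = difficulty

    def reset(self):
        self.t = 0
        return [0.0]

    def step(self, action):
        self.t += 1
        # Higher difficulty -> higher chance an overload cascades and ends the episode early.
        done = self.t >= 10 or random.random() < 0.1 * self.difficulty
        reward = -1.0 if (done and self.t < 10) else 1.0
        return [float(self.t)], reward, done, {}

class RandomAgent:
    """Placeholder agent; a real actor-critic learner would go here."""
    def act(self, obs):
        return random.choice([0, 1])      # e.g., do nothing vs. adjust topology

    def update(self, obs, reward, done):
        pass                              # learning step omitted in this sketch

def curriculum_train(agent, stages=(0.2, 0.5, 1.0), episodes_per_stage=100):
    """Train on progressively harder scenarios before the full-difficulty grid."""
    for difficulty in stages:
        env = ToyGridEnv(difficulty)
        for _ in range(episodes_per_stage):
            obs, done = env.reset(), False
            while not done:
                action = agent.act(obs)
                obs, reward, done, _ = env.step(action)
                agent.update(obs, reward, done)
    return agent

agent = curriculum_train(RandomAgent())
```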

Do Neural Networks Trained with Topological Features Learn Different Internal Representations?

SYMMETRY AND GEOMETRY IN NEURAL REPRESENTATIONS WORKSHOP

Tegan Emerson | Henry Kvinge | Sarah McGuire (Michigan State University) | Shane Jackson (University of Southern California) 

There is a growing body of work that leverages features extracted via topological data analysis to train machine learning models. While this field, sometimes known as topological machine learning (TML), has seen some notable successes, an understanding of how the process of learning from topological features differs from the process of learning from raw data is still limited. In this work, we begin to address one component of this larger issue by asking whether a model trained with topological features learns internal representations of data that are fundamentally different than those learned by a model trained with the original raw data. To quantify "different", we exploit two popular metrics that can be used to measure the similarity of the hidden representations of data within neural networks, neural stitching and centered kernel alignment. From these we draw a range of conclusions about how training with topological features does and does not change the representations that a model learns. Perhaps unsurprisingly, we find that structurally, the hidden representations of models trained and evaluated on topological features differ substantially compared to those trained and evaluated on the corresponding raw data. On the other hand, our experiments show that in some cases, these representations can be reconciled (at least to the degree required to solve the corresponding task) using a simple affine transformation. We conjecture that this means that neural networks trained on raw data may extract some limited topological features in the process of making predictions.
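
One of the two similarity measures named in the abstract, centered kernel alignment (CKA), has a compact closed form. The sketch below shows the standard linear variant as a point of reference; it is not the authors' code, and the activation matrices in the trailing comment are hypothetical.

```python
import numpy as np

def linear_cka(X, Y):
    """Linear centered kernel alignment between two activation matrices.

    X: (n_examples, d1) hidden activations from one model
    Y: (n_examples, d2) hidden activations from another model
    Returns a similarity in [0, 1]; values near 1 mean the representations
    agree up to rotation and isotropic scaling.
    """
    X = X - X.mean(axis=0, keepdims=True)   # center each feature
    Y = Y - Y.mean(axis=0, keepdims=True)
    hsic = np.linalg.norm(Y.T @ X, "fro") ** 2
    norm = np.linalg.norm(X.T @ X, "fro") * np.linalg.norm(Y.T @ Y, "fro")
    return hsic / norm

# Hypothetical usage: compare a layer of a model trained on raw data with the
# corresponding layer of a model trained on topological features, evaluated on
# the same examples.
# similarity = linear_cka(acts_raw_model, acts_topological_model)
```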

Domain-specific Metrics for Evaluation and Integration of AI

PANEL AT THE TACKLING CLIMATE CHANGE WITH MACHINE LEARNING WORKSHOP

December 9 | 8:00 a.m. PST

Veronica Adetola | David Dao (ETH Zurich, Switzerland) | Antoine Marot (Réseau de Transport d'Électricité (RTE), France)

Graph Transformer Networks for Nuclear Proliferation Detection in Urban Environments

WOMEN IN MACHINE LEARNING WORKSHOP

December 5 | 11:00 a.m. PST

Anastasiya Usenko | Yasanka Horawalavithana | Ellyn Ayton | Svitlana Volkova | Joon-Seok Kim (Oak Ridge National Laboratory)

A network of sensors deployed in urban environments continuously monitors for the presence of radioactive isotopes, whether routine (e.g., medical procedures) or nefarious (e.g., nuclear proliferation). Unattended radiological sensor networks must take advantage of contextual data (open-source and historical sensor signals) to anticipate background isotope signatures across locations and sensors and to mitigate nuisance alarms. In our approach, we develop novel graph transformer networks to predict radiological sensor and isotope alerts using signals extracted from historical time series and context from nearby radiation sources.

In What Ways Are Deep Neural Networks Invariant and How Should We Measure This?

MAIN TRACK POSTER

Henry Kvinge | Tegan Emerson | Grayson Jorgenson | Scott Vasquez | Tim Doster | Jesse Lew (New York University)

It is often said that a deep learning model is "invariant" to some specific type of transformation. However, what is meant by this statement strongly depends on the context in which it is made. In this paper we explore the nature of invariance and equivariance of deep learning models with the goal of better understanding the ways in which they actually capture these concepts on a formal level. We introduce a family of invariance and equivariance metrics that allows us to quantify these properties in a way that disentangles them from other metrics such as loss or accuracy. We use our metrics to better understand the two most popular methods used to build invariance into networks: data augmentation and equivariant layers. We draw a range of conclusions about invariance and equivariance in deep learning models, ranging from whether initializing a model with pretrained weights has an effect on a trained model's invariance, to the extent to which invariance learned via training can generalize to out-of-distribution data.
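
The paper defines its own invariance and equivariance metrics; as a rough illustration of the general idea of measuring invariance independently of loss or accuracy, the sketch below scores how little a (hypothetical) embedding function moves when its inputs are transformed, relative to the overall spread of the embeddings. It is an assumption-laden proxy, not the paper's metric.

```python
import numpy as np

def representation_invariance(embed, inputs, transform, eps=1e-12):
    """Crude invariance proxy (not the paper's metric).

    embed:     function mapping a batch of inputs to an (n, d) feature array
    inputs:    array of inputs, shape (n, ...)
    transform: function applying a transformation (e.g., rotation) to the batch
    Returns a score near 1 when embeddings barely move under the transform.
    """
    z = embed(inputs)
    z_t = embed(transform(inputs))
    drift = np.linalg.norm(z - z_t, axis=1).mean()              # movement under the transform
    spread = np.linalg.norm(z - z.mean(axis=0), axis=1).mean()  # typical feature scale
    return 1.0 - drift / (spread + eps)
```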

Lessons from Developing Multimodal Models with Code and Developer Interactions

I CAN'T BELIEVE IT'S NOT BETTER WORKSHOP: UNDERSTANDING DEEP LEARNING THROUGH EMPIRICAL FALSIFICATION

Nicholas Botzer | Yasanka Horawalavithana | Svitlana Volkova | Tim Weninger (University of Notre Dame) 

Recent advances in natural language processing have seen the rise of language models trained on code. Of great interest is the ability of these models to find and classify defects in existing code bases. These models have been applied to defect detection, but improvements across models have been minor. Literature from cybersecurity highlights how developer behaviors are often the cause of these defects. In this work we propose to approach the defect detection problem in a multimodal manner using weakly aligned code and developer workflow data.

We find that models trained on code and developer interactions tend to overfit and do not generalize because of the weak alignment between the code and developer workflow data.

On the Symmetries of Deep Learning Models and their Internal Representations

MAIN TRACK POSTER

Charles Godfrey | Davis Brown | Tegan Emerson | Henry Kvinge 

Symmetry is a fundamental tool in the exploration of a broad range of complex systems. In machine learning, symmetry has been explored in both models and data. In this paper we seek to connect the symmetries arising from the architecture of a family of models with the symmetries of that family's internal representations of data. We do this by calculating a set of fundamental symmetry groups, which we call the intertwiner groups of the model. We connect intertwiner groups to a model's internal representations of data through a range of experiments that probe similarities between hidden states across models with the same architecture. Our work suggests that the symmetries of a network are propagated into that network's representation of data, providing us with a better understanding of how architecture affects the learning and prediction process. Finally, we speculate that for ReLU networks, the intertwiner groups may provide a justification for the common practice of concentrating model-interpretability exploration on the activation basis in hidden layers rather than on arbitrary linear combinations thereof.
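
For intuition, the simplest example of such an architecture-induced symmetry is a permutation of hidden units in a ReLU layer: permuting the units and compensating in the next layer's weights leaves the network's function unchanged. The snippet below demonstrates this on a toy two-layer network; the paper's intertwiner groups are a more general construction.

```python
import numpy as np

rng = np.random.default_rng(0)
relu = lambda z: np.maximum(z, 0)

# Two-layer ReLU network f(x) = W2 @ relu(W1 @ x)
W1, W2 = rng.normal(size=(16, 8)), rng.normal(size=(4, 16))
x = rng.normal(size=8)

# A permutation of the hidden units: applying P after the first layer and
# P^{-1} (= P.T) before the second layer leaves the output unchanged,
# because ReLU commutes with permutations.
P = np.eye(16)[rng.permutation(16)]
original = W2 @ relu(W1 @ x)
permuted = (W2 @ P.T) @ relu(P @ W1 @ x)
print(np.allclose(original, permuted))  # True
```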

Petri Nets Enable Causal Reasoning in Dynamical Systems

CAUSAL DYNAMICS WORKSHOP

December 3 | 9:30 a.m.-10:30 a.m. PST

Jeremy Zucker | Ritwik Anand (Northeastern University) | Vartika Tewari (Northeastern University) | Karen Sachs (Next Generation Analytics) | Olga Vitek (Northeastern University)

Dynamical systems, e.g., economic systems or biomolecular signaling networks, are processes composed of states that evolve in time. Causal models represent these processes and support causal queries that infer the outcomes of system perturbations. Unfortunately, Structural Causal Models, the traditional causal models of choice, require the system to be in steady state and do not extend to dynamical systems. Recent formulations of causal models with a compatible dynamic syntax, such as Probability Trees, lack a semantics for representing both the states and the transitions of a system, limiting their ability to fully represent the system and to encode the underlying causal assumptions. In contrast, Petri Nets are well-studied models of dynamical systems, with the ability to encode both states and transitions. However, their use for causal reasoning has so far been under-explored. This manuscript expands the scope of causal reasoning in dynamical systems by proposing a causal semantics for Petri Nets. We define a pipeline for constructing a Petri Net model and calculating the fundamental causal queries: conditioning, interventions, and counterfactuals. A novel aspect of the proposed causal semantics is an unwrapping procedure, which allows for a dichotomy of Petri Net models when calculating a query. On one hand, a base Petri Net model visually represents the system, implicitly encodes the traces defined by the system, and models the underlying causal assumptions. On the other hand, an unwrapped Petri Net explicitly represents traces and answers the causal queries of interest. We demonstrate the utility of the proposed approach in a case study of a dynamical system where Structural Causal Models fail.
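
As background for readers unfamiliar with the formalism, the sketch below implements only the base Petri Net machinery (places, tokens, and transition firing) on a toy signaling example. It is not the paper's causal semantics or unwrapping procedure, which are built on top of models like this.

```python
from dataclasses import dataclass

@dataclass
class PetriNet:
    """Minimal Petri Net: places hold integer token counts, and a transition
    fires when every input place has enough tokens, consuming and producing
    tokens accordingly."""
    marking: dict       # place -> token count (the system state)
    transitions: dict   # name -> (input place counts, output place counts)

    def enabled(self, name):
        inputs, _ = self.transitions[name]
        return all(self.marking.get(p, 0) >= n for p, n in inputs.items())

    def fire(self, name):
        assert self.enabled(name), f"{name} is not enabled"
        inputs, outputs = self.transitions[name]
        for p, n in inputs.items():
            self.marking[p] -= n
        for p, n in outputs.items():
            self.marking[p] = self.marking.get(p, 0) + n

# Toy signaling example: an active kinase phosphorylates a substrate.
net = PetriNet(
    marking={"kinase_active": 1, "substrate": 1, "substrate_phos": 0},
    transitions={"phosphorylate": ({"kinase_active": 1, "substrate": 1},
                                   {"kinase_active": 1, "substrate_phos": 1})},
)
net.fire("phosphorylate")
print(net.marking)  # {'kinase_active': 1, 'substrate': 0, 'substrate_phos': 1}
```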

Reducing Down(stream)time: Pretraining Molecular GNNs Using Heterogeneous AI Accelerators

MACHINE LEARNING AND THE PHYSICAL SCIENCES WORKSHOP

December 3

Jenna A Bilbrey | Henry Sprueill | Sotiris Xantheas | Sutanay Choudhury | Kristina Herman (University of Washington) | Payel Das (IBM Research) | Manuel Lopez Roldan (Graphcore) | Mike Kraus (Graphcore) | Hatem Helal (Graphcore) 

Recent advances in self-supervised learning and transfer learning have popularized approaches that pretrain models on massive data sources and subsequently finetune them for a specific task. While such approaches have become the norm in fields such as natural language processing, implementation and evaluation of transfer learning approaches for chemistry are in the early stages. In this work, we demonstrate finetuning for downstream tasks on a graph neural network (GNN) trained over a molecular database containing 2.7 million water clusters. The use of Graphcore IPUs as an AI accelerator for training molecular GNNs reduces training time from a reported 2.7 days on 0.5M clusters to 92 minutes on 2.7M clusters. Finetuning the pretrained model for the downstream tasks of molecular dynamics and level-of-theory transfer took only 8.3 hours and 28 minutes, respectively, on a single GPU.
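
The general pretrain-then-finetune pattern described above can be sketched in a few lines of PyTorch. The backbone, checkpoint path, and data below are placeholders for illustration only, not the authors' molecular GNN or water-cluster dataset.

```python
import torch
import torch.nn as nn

class PretrainedRegressor(nn.Module):
    """Pretrained backbone plus a task-specific readout head."""
    def __init__(self, encoder: nn.Module, hidden_dim: int):
        super().__init__()
        self.encoder = encoder                    # pretrained layers (frozen below)
        self.readout = nn.Linear(hidden_dim, 1)   # new head for the downstream task

    def forward(self, x):
        return self.readout(self.encoder(x))

# Stand-in backbone; a real molecular GNN with message-passing layers would go here.
encoder = nn.Sequential(nn.Linear(64, 128), nn.SiLU(), nn.Linear(128, 128))
model = PretrainedRegressor(encoder, hidden_dim=128)
# model.encoder.load_state_dict(torch.load("pretrained_water_clusters.pt"))  # hypothetical checkpoint

# Freeze the pretrained backbone; only the new readout head is trained.
for p in model.encoder.parameters():
    p.requires_grad = False

optimizer = torch.optim.Adam((p for p in model.parameters() if p.requires_grad), lr=1e-4)
loss_fn = nn.MSELoss()

# One finetuning step on a placeholder downstream batch.
x, y = torch.randn(32, 64), torch.randn(32, 1)
optimizer.zero_grad()
loss_fn(model(x), y).backward()
optimizer.step()
```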

Tackling Climate Change with Machine Learning Workshop

CLIMATE CHANGE AI WORKSHOP

December 9

Co-organizer: Jan Drgona

The focus of this workshop is the use of machine learning to help address climate change, encompassing mitigation efforts (reducing greenhouse gas emissions), adaptation measures (preparing for unavoidable consequences), and climate science (our understanding of the climate and future climate predictions).

The CityLearn Challenge 2022

MAIN TRACK COMPETITION

Co-organizer: Jan Drgona

Reinforcement learning (RL) has gained popularity as a model-free and adaptive controller for the built environment in demand-response applications. However, a lack of standardization in previous research has made it difficult to compare different RL algorithms with each other. It is also unclear how much effort is required to solve each specific problem in the building domain and how well a trained RL agent will scale up to new environments. The CityLearn Challenge 2022 provides an avenue to address these problems by leveraging CityLearn, an OpenAI Gym environment for the implementation of RL agents for demand response. The challenge uses operational electricity demand data to develop an equivalent digital twin model of 20 buildings. Participants develop energy management agents for battery charge and discharge control in each building with the goal of minimizing electricity demand from the grid, electricity bills, and greenhouse gas emissions. We provide a baseline rule-based control (RBC) agent for evaluating RL agent performance and rank participants according to their solution's ability to outperform the baseline.
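
To make the setup concrete, here is a minimal, generic sketch of a rule-based battery controller and an evaluation loop of the kind the challenge uses as its baseline. The environment interface, observation layout, and scalar reward shown here are simplified assumptions; the real CityLearn environment and the challenge's multi-objective scoring differ.

```python
def rule_based_action(hour: int) -> float:
    """Toy RBC policy: charge the battery off-peak at night, discharge during
    the evening peak, stay idle otherwise (action in [-1, 1] = discharge/charge rate)."""
    if 1 <= hour <= 6:
        return 0.5     # charge overnight
    if 17 <= hour <= 21:
        return -0.5    # discharge to shave the evening peak
    return 0.0

def evaluate(env, policy, episodes=1):
    """Score a policy with a Gym-style loop; challenge entries are ranked by how
    much they improve on the RBC baseline across cost, emissions, and grid objectives."""
    total = 0.0
    for _ in range(episodes):
        obs, done = env.reset(), False
        while not done:
            hour = int(obs[0])   # assumes hour-of-day is the first observation feature
            obs, reward, done, _ = env.step([policy(hour)])
            total += reward
    return total
```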