## PNNL @ NeurIPS 2023

*PNNL data scientists and engineers will be at NeurIPS 2023*

Pacific Northwest National Laboratory (PNNL) will be at the Thirty-seventh Conference on Neural Information Processing Systems (NeurIPS), Sunday, December 10 through Saturday, December 16. PNNL data scientists and engineers will present posters, lead workshops, and participate in competitions.

The annual NeurIPS conference brings together researchers from many fields, including machine learning (ML), neuroscience, the life sciences, and statistics. Remarkable advancements in ML and artificial intelligence (AI) have paved the way for a new era of applications, revolutionizing many aspects of daily life. From situational awareness and threat detection to interpreting online signals to ensure system reliability, PNNL researchers are at the forefront of scientific exploration and national security, harnessing the power of AI to tackle complex scientific problems.

This year's conference will take place at the New Orleans Ernest N. Morial Convention Center, featuring virtual competitions alongside the in-person event. The PNNL presenters who will be showcasing their research in the fields of AI and ML at NeurIPS are identified below.

## PNNL Presentations, Workshops, and Competitions

### The CityLearn Challenge 2023

December 15 | 8:00 a.m. – 12:00 p.m. PT

Reinforcement learning (RL) has gained popularity as a model-free, adaptive controller for the built environment in demand-response applications. However, a lack of standardization in previous research has made it difficult to compare RL algorithms with one another. It is also unclear how much effort is required to solve each specific problem in the building domain and how well a trained RL agent will scale to new environments. The CityLearn Challenge 2023 provides an avenue for addressing these problems by leveraging CityLearn, an OpenAI Gym environment for the implementation of RL agents for demand response. The challenge uses a novel data set based on the US end-use load profile database. Participants develop energy management agents for battery charge and discharge control in each building, with the goal of minimizing (1) electricity demand from the grid, (2) the electricity bill, and (3) greenhouse gas emissions. A baseline rule-based control (RBC) agent is provided for evaluating the RL agents' performance, and participants are ranked according to their solution's ability to outperform the baseline.
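As a rough sketch of what such a baseline controller looks like, the example below implements a toy rule-based battery policy in plain Python. The hour thresholds, charge rate, and action convention are illustrative assumptions, not the challenge's actual RBC baseline or the CityLearn API.

```python
# Toy rule-based control (RBC) sketch for battery management, in the spirit
# of the CityLearn challenge. Thresholds and rates are illustrative only.

def rbc_action(hour: int) -> float:
    """Return a battery action in [-1, 1]: positive charges, negative discharges."""
    if 9 <= hour <= 12:    # late-morning solar surplus: charge
        return 0.5
    if 18 <= hour <= 21:   # evening demand peak: discharge
        return -0.5
    return 0.0             # otherwise hold

def simulate(hours, capacity=1.0):
    """Roll the RBC over a sequence of hours, clipping state of charge to [0, capacity]."""
    soc, schedule = 0.0, []
    for h in hours:
        a = rbc_action(h)
        soc = min(capacity, max(0.0, soc + 0.25 * a))  # 0.25 = per-step charge rate
        schedule.append((h, a, round(soc, 3)))
    return schedule
```

Challenge entries replace `rbc_action` with a learned policy; the ranking then compares the learned agent's cost metrics against this kind of fixed schedule.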

### Spectral Evolution and Invariance in Linear-width Neural Networks

###### Main Track Poster

December 14 | 8:45 a.m. – 10:45 a.m. PT

**Tony Chiang** | **Andrew Engel**

We investigate the spectral properties of linear-width feed-forward neural networks, where the sample size is asymptotically proportional to network width. Empirically, we show that the spectra of weight matrices in this high-dimensional regime are invariant when trained by gradient descent with small constant learning rates; we provide a theoretical justification for this observation and prove the invariance of the bulk spectra for both conjugate and neural tangent kernels. We demonstrate similar characteristics when training with stochastic gradient descent with small learning rates. When the learning rate is large, we exhibit the emergence of an outlier whose corresponding eigenvector is aligned with the training data structure. We also show that after adaptive gradient training, where a lower test error and feature learning emerge, both weight and kernel matrices exhibit heavy-tail behavior. Simple examples are provided to explain when heavy tails can lead to better generalization. We exhibit different spectral properties, such as an invariant bulk, spikes, and heavy-tailed distributions, from a two-layer neural network trained with different strategies, and then correlate them with feature learning. Analogous phenomena also appear when we train conventional neural networks with real-world data. We conclude that monitoring the evolution of the spectra during training is an essential step toward understanding the training dynamics and feature learning.

### Faster Approximate Subgraph Counts with Privacy

###### Main Track Poster

December 13 | 8:45 a.m. – 10:45 a.m. PT

One of the most common problems studied in the context of differential privacy for graph data is counting the number of non-induced embeddings of a subgraph in a given graph. These counts have very high global sensitivity. Therefore, adding noise based on powerful alternative techniques, such as smooth sensitivity and higher-order local sensitivity, has been shown to give significantly better accuracy. However, all these alternatives to global sensitivity are computationally very expensive, and to date efficient polynomial-time algorithms are known only for a few selected subgraphs, such as triangles, k-triangles, and k-stars. In this paper, we show that good approximations to these sensitivity metrics can still be used to obtain private algorithms. Using this approach, we give the first quasilinear-time and parallel algorithms for privately counting the number of triangles. We also give a private polynomial-time algorithm for counting any constant-size subgraph using less noise than the global sensitivity; we show this can be improved significantly for counting paths in special classes of graphs.
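For context, here is a minimal sketch of the global-sensitivity baseline this line of work improves on: count triangles exactly, then add Laplace noise scaled to the worst case (one edge change can alter an n-node graph's triangle count by up to n - 2). The adjacency-dict format and the Laplace sampler are illustrative assumptions; the paper's smooth- and local-sensitivity algorithms are considerably more involved.

```python
# Global-sensitivity baseline for private triangle counting (illustrative).
import random

def triangle_count(adj):
    """Exact triangle count from an adjacency dict {node: set(neighbors)}."""
    count = 0
    for u in adj:
        for v in adj[u]:
            if u < v:
                # count each triangle once via its smallest-to-largest ordering
                count += sum(1 for w in adj[u] & adj[v] if w > v)
    return count

def private_triangle_count(adj, epsilon=1.0, seed=0):
    """Exact count plus Laplace noise at global-sensitivity scale (n - 2) / epsilon."""
    rng = random.Random(seed)
    scale = max(len(adj) - 2, 1) / epsilon
    # Laplace(b) sampled as the difference of two Exponential(mean b) draws
    noise = rng.expovariate(1 / scale) - rng.expovariate(1 / scale)
    return triangle_count(adj) + noise
```

The (n - 2)/epsilon noise scale is what makes this baseline inaccurate on large graphs, which is why the smooth- and local-sensitivity approximations the paper studies matter.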

### ClimSim: A Large Multi-scale Data Set for Hybrid Physics-ML Climate Emulation

###### Poster at the Datasets and Benchmarks Track

December 13 | 3:00 p.m. – 5:00 p.m. PT

Modern climate projections lack adequate spatial and temporal resolution because of computational constraints. A consequence is inaccurate and imprecise predictions of critical processes such as storms. Hybrid methods that combine physics with ML have introduced a new generation of higher-fidelity climate simulators that can sidestep Moore's Law by outsourcing compute-hungry, short, high-resolution simulations to ML emulators. However, this hybrid ML-physics simulation approach requires domain-specific treatment and has been inaccessible to ML experts because of the lack of training data and relevant, easy-to-use workflows. We present ClimSim, the largest-ever data set designed for hybrid ML-physics research. It comprises multi-scale climate simulations developed by a consortium of climate scientists and ML researchers. It consists of 5.7 billion pairs of multivariate input and output vectors that isolate the influence of locally nested, high-resolution, high-fidelity physics on a host climate simulator's macro-scale physical state. The data set is global in coverage, spans multiple years at high sampling frequency, and is designed such that resulting emulators are compatible with downstream coupling into operational climate simulators. We implement a range of deterministic and stochastic regression baselines to highlight the ML challenges and their scoring. The data (https://huggingface.co/datasets/LEAP/ClimSim_high-res) and code (https://leap-stc.github.io/ClimSim) are released openly to support the development of hybrid ML-physics and high-fidelity climate simulations for the benefit of science and society.

### Contextual Reinforcement Learning for Offshore Wind Farm Bidding

###### Spotlight Talk and Poster at the Tackling Climate Change with Machine Learning workshop

December 16 | 7:00 a.m. PT

**Himanshu Sharma** | **Wei Wang**

We propose a framework for applying reinforcement learning (RL) to contextual two-stage stochastic optimization and apply this framework to the energy market bidding problem of an offshore wind farm. RL could potentially be used to learn near-optimal solutions for the first-stage variables of a two-stage stochastic program in different contexts. Under the proposed framework, these solutions would be learned without having to solve the full two-stage stochastic program. We present initial results of training with the deep deterministic policy gradient algorithm and outline the steps we intend to take to improve performance.

### Surrogate Model Training Data for FIDVR-related Voltage Control in Large-scale Power Grids

###### Poster at the Machine Learning and the Physical Sciences workshop

December 15 | 7:00 a.m. PT

**Tianzhixi (Tim) Yin** | **Ramij Raja Hossain**

This work presents an effective ML data set related to the short-term voltage dynamics in power systems. Power system dynamics are highly nonlinear and intricate. Model designs/specifications in power systems need expertise to capture dynamic phenomena. ML has become an important tool for analyzing the complex behaviors of physical systems, but ML models need quality data sets for training and testing. Learning surrogate models to replicate certain dynamic behaviors of power systems is a growing area of interest; however, building required data sets can be challenging. We use the high-performance computing (HPC)-based grid simulator GridPACK to create the voltage dynamics of a bulk power system, namely the IEEE 300 bus test system, and capture the fault-induced delayed voltage recovery (FIDVR) phenomenon. This FIDVR phenomenon is generally mitigated by the under-voltage load shedding (UVLS)-based control strategy. The data set created here contains the trajectory data of voltage dynamics under different control actions generated by standard UVLS strategy and random noise. We present the structure of the data set and its application in learning a dynamic surrogate model. Finally, other suitable ML-based applications of the given data set are discussed, thereby helping to strengthen reusable science practices.

### Haldane Bundles: A Dataset for Learning to Predict the Chern Number of Line Bundles on the Torus

###### Poster at the AI for Materials workshop

December 15 | 7:00 a.m. PT

**Cody Tipton** | **Elizabeth Coda** | **Davis Brown** | **Alyson Bittner** | **Jung Lee** | **Grayson Jorgenson** | **Tegan Emerson** | **Henry Kvinge**

Characteristic classes, which are abstract topological invariants associated with vector bundles, have become an important notion in modern physics and have surprising real-world consequences. As a representative example, the incredible properties of topological insulators, which are insulators in their bulk but conductors on their surface, can be completely characterized by a specific characteristic class associated with their electronic band structure, the first Chern class. Given their importance to next-generation computing and the computational challenge of calculating them using first principles approaches, there is a need to develop machine learning approaches to predict the characteristic classes associated with a material system. To aid in this program, we introduce the Haldane bundle data set, which consists of synthetically generated complex line bundles on the 2-torus. We envision this data set, which is not as challenging as noisy and sparsely measured real-world data sets but (as we show) still difficult for off-the-shelf architectures, to be a testing ground for architectures that incorporate the rich topological and geometric priors underlying characteristic classes.

### Internal Representations of Vision Models Through the Lens of Frames on Data Manifolds

###### Oral Presentation at the Workshop on Symmetry and Geometry in Neural Representations

December 16 | 7:00 a.m. PT

**Henry Kvinge** | **Grayson Jorgenson** | **Davis Brown** | **Tegan Emerson**

While the last five years have seen considerable progress in understanding the internal representations of deep learning models, many questions remain. This is especially true when trying to understand the impact of model design choices, such as model architecture or training algorithm, on hidden representation geometry and dynamics. In this work we present a new approach to studying such representations, inspired by the idea of a frame on the tangent bundle of a manifold. Our construction, which we call a neural frame, is formed by assembling a set of vectors representing specific types of perturbations of a data point (for example, infinitesimal augmentations, noise perturbations, or perturbations produced by a generative model) and studying how these change as they pass through a network. Using neural frames, we make observations about the way models process, layer by layer, specific modes of variation within a small neighborhood of a data point. Our results provide new perspectives on a number of phenomena, such as the manner in which training with augmentation produces model invariance and the proposed trade-off between adversarial training and model generalization. Finally, we use neural frames to propose a new application of centered kernel alignment (CKA), called frame CKA, that can be used to compare the way different models capture infinitesimal changes in specific directions around a data point.
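For readers unfamiliar with CKA, the following is a hedged sketch of plain linear centered kernel alignment, the similarity measure that frame CKA builds on; the frame variant applied to perturbation directions is the paper's contribution and is not reproduced here.

```python
# Linear CKA between two representation matrices (samples x features).
import numpy as np

def linear_cka(x: np.ndarray, y: np.ndarray) -> float:
    """||X^T Y||_F^2 / (||X^T X||_F ||Y^T Y||_F) after column-centering."""
    x = x - x.mean(axis=0)   # center each feature
    y = y - y.mean(axis=0)
    num = np.linalg.norm(x.T @ y, "fro") ** 2
    den = np.linalg.norm(x.T @ x, "fro") * np.linalg.norm(y.T @ y, "fro")
    return float(num / den)
```

Linear CKA is invariant to orthogonal transformations and isotropic scaling of either representation, which is what makes it useful for comparing layers of different models.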

### Can We Count on Deep Learning: Exploring and Characterizing Combinatorial Structures Using Machine Learning

###### Poster at the Workshop on Mathematical Reasoning and AI

December 15 | 7:00 a.m. PT

**Helen Jenne** | **Davis Brown** | **Jackson Warley** | **Timothy Doster** | **Henry Kvinge**

With its exceptional pattern-matching ability, deep learning has proven to be a powerful tool in a range of scientific domains. This is increasingly true in research mathematics, where recent work has demonstrated deep learning's ability to highlight subtle connections between mathematical objects that might escape the notice of a human expert. In this work, we describe a simple method for helping domain experts characterize a set of mathematical objects using deep learning. Such *characterization problems* often occur when some particular class of function, space, linear representation, etc. naturally emerges in calculations or other means but lacks a simple description. The goal is to find simple rules that also ideally shed light on the underlying mathematics. Our method, which we call *Feature Attribution Clustering for Exploration (FACE)*, clusters the feature attribution representations extracted from a trained model, arriving at a short list of prototype attributions that the domain expert can then try to convert into formal and rigorous rules. As a case study, we use our method to derive a new result in combinatorics by characterizing a subset of 0-1 matrices that corresponds to certain representations of permutations known as two-sided ordered words.
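The clustering step can be pictured with a small stand-in: group per-example attribution vectors with plain k-means, then surface, for each cluster, the real attribution nearest its centroid as a prototype for the expert to interpret. The use of k-means and squared-Euclidean distance here is an illustrative assumption, not necessarily FACE's actual clustering choice.

```python
# Stand-in for the FACE clustering step: k-means over attribution vectors,
# returning one prototype (nearest real example) per cluster.
import numpy as np

def prototype_attributions(attrs: np.ndarray, k: int, iters: int = 20, seed: int = 0):
    """Return indices of k prototype rows of `attrs` (n_examples x n_features)."""
    rng = np.random.default_rng(seed)
    centers = attrs[rng.choice(len(attrs), size=k, replace=False)]
    for _ in range(iters):
        # assign each attribution to its nearest center, then recompute centers
        labels = np.argmin(((attrs[:, None] - centers) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = attrs[labels == j].mean(axis=0)
    # prototype = the actual attribution vector nearest each centroid
    return [int(np.argmin(((attrs - c) ** 2).sum(-1))) for c in centers]
```

Returning real examples rather than centroids matters: an averaged attribution map may not correspond to any mathematically meaningful object, while a prototype example can be inspected directly.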

### Attributing Learned Concepts in Neural Networks to Training Data

###### Oral Presentation at the Workshop on Attributing Model Behavior at Scale

December 15 | 7:00 a.m. – 3:00 p.m. PT

**Nicholas Konz** | **Madelyn Shapiro** | **Jonathan Tu** | **Henry Kvinge** | **Davis Brown**

By now there is substantial evidence that deep learning models learn certain human-interpretable features as part of their internal representations of data. Because having the right (or wrong) concepts is critical to trustworthy machine learning systems, it is natural to ask which inputs from the model's original training set were most important for learning a concept at a given layer. To answer this, we combine data attribution methods with methods for probing the concepts learned by a model. Training network and probe ensembles for two concept data sets on a range of network layers, we use the recently developed TRAK method for large-scale data attribution. We find some evidence for convergence, where removing the 10,000 top attributing images for a concept and retraining the model does not change the location of the concept in the network nor the probing sparsity of the concept. This suggests that rather than being highly dependent on a few specific examples, the features that inform the development of a concept are spread in a more diffuse manner across its exemplars, implying robustness in concept formation.

### Evaluating Physically Motivated Loss Functions for Photometric Redshift Estimation

###### Poster at the Machine Learning and the Physical Sciences workshop

December 15 | 7:00 a.m. PT

**Andrew Engel** | **Jan Strube**

Physical constraints have been suggested as a way to make neural network models more generalizable, more scientifically plausible, and more data-efficient than unconstrained baselines. In this report, we present preliminary work on evaluating the effects of adding soft physical constraints to computer vision neural networks trained to estimate the conditional density of redshift from input galaxy images for the Sloan Digital Sky Survey. We introduce physically motivated soft constraint terms that are not implemented with differential or integral operators. We frame this work as a simple ablation study in which the effect of including soft physical constraints is compared to an unconstrained baseline. We compare networks using standard point estimate metrics for photometric redshift estimation, as well as metrics that evaluate how faithfully our conditional density estimates represent the probability over the ensemble of our test data set. We find no evidence that the implemented soft physical constraints are more effective regularizers than augmentation.

### Unsupervised Segmentation of Irradiation-induced Order–disorder Phase Transitions in Electron Microscopy

###### Poster at the Machine Learning and the Physical Sciences workshop

December 15 | 7:00 a.m. PT

**Arman Ter-Petrosyan** | **Jenna A. Bilbrey** | **Christina Doty** | **Bethany Matthews** | **Le Wang** | **Yingge Du** | **Steven Spurgeon**

We present a method for the unsupervised segmentation of electron microscopy images, which are powerful descriptors of material and chemical systems. Images are oversegmented into overlapping chips, and similarity graphs are generated from embeddings extracted with a domain-pretrained convolutional neural network. The Louvain method for community detection is then applied to perform segmentation. The graph representation provides an intuitive way of presenting the relationship between chips and communities. We demonstrate our method by tracking irradiation-induced amorphous fronts in thin films used for catalysis and electronics. This method has potential for "on-the-fly" segmentation to guide emerging automated electron microscopes.

### Understanding and Visualizing Droplet Distributions in Simulations of Shallow Clouds

###### Poster at the Machine Learning and the Physical Sciences workshop

December 15 | 7:00 a.m. PT

**Colleen Kaul** | **Po-Lun Ma** | **Jacob Shpund** | **Kyle Pressel**

Thorough analysis of local droplet-level interactions is crucial to better understanding the microphysical processes in clouds and their effects on the global climate. High-accuracy simulations of relevant droplet size distributions from large-eddy simulations of bin microphysics challenge current analysis techniques due to their high dimensionality involving three spatial dimensions, time, and a continuous range of droplet sizes. Using the compact latent representations from Variational Autoencoders, we produce novel and intuitive visualizations for the organization of droplet sizes and their evolution over time, beyond what is possible using clustering techniques. This greatly improves interpretation and allows us to examine aerosol-cloud interactions by contrasting simulations with different aerosol concentrations. We find that the evolution of the droplet spectrum is similar across aerosol levels but occurs at different paces. This similarity suggests that precipitation initiation processes are alike, despite variations in onset times.

### Assessing the Impact of Distribution Shift on Reinforcement Learning Performance

###### Poster at the Regulatable ML Workshop

December 16 | 7:00 a.m. PT

**Ted Fujimoto** | **Joshua Sutterlein** | **Sam Chatterjee** | **Auroop Ganguly**

Research in machine learning is making progress in fixing its own reproducibility crisis. Reinforcement learning (RL), in particular, faces its own set of unique challenges. Comparisons of point estimates, and plots that show successful convergence to the optimal policy during training, may obfuscate overfitting or dependence on the experimental setup. Although researchers in RL have proposed reliability metrics that account for uncertainty to better understand each algorithm's strengths and weaknesses, the recommendations of past work do not assume the presence of out-of-distribution observations. We propose a set of evaluation methods that measure the robustness of RL algorithms under distribution shifts. The tools presented here argue for the need to account for performance over time while the agent is acting in its environment. In particular, we recommend time-series analysis as a method of observational RL evaluation. We also show that the unique properties of RL and simulated dynamic environments allow us to make stronger assumptions to justify the measurement of causal impact in our evaluations. We then apply these tools to single-agent and multi-agent environments to show the impact of introducing distribution shifts during test time. We present this methodology as a first step toward rigorous RL evaluation in the presence of distribution shifts.

### Impacts of Data and Models on Unsupervised Pre-training for Molecular Property Prediction

###### Paper at the AI for Accelerated Materials Design workshop

**Elizabeth Coda** | **Gihan Uthpala Panapitiya** | **Emily Saldanha**

The available labeled data to support molecular property prediction are limited in size due to experimental time and cost requirements. However, unsupervised learning techniques can leverage vast databases of molecular structures, thus significantly expanding the scope of training data. We compare the effectiveness of pre-training data and modeling choices to support the downstream task of molecular aqueous solubility prediction. We also compare the global and local structure of the learned latent spaces to probe the properties of effective pre-training approaches. We find that the pre-training modeling choices affect predictive performance and the latent space structure much more than the data choices.

### 18th Women in Machine Learning Workshop (WiML 2023)

December 11 | 12:00 p.m. – 2:00 p.m. UTC

**PNNL Mentor: Maria Glenski**

WiML 2023 is happening in person on Monday, December 11, co-located with the NeurIPS 2023 conference. Mentorship roundtable sessions will tentatively take place from 12:00 p.m. – 2:00 p.m.; lunch will be served.

We will have a diverse set of discussion groups in the mentorship roundtables, with mentors leading the discussion on a particular topic within each group. WiML attendees will rotate between tables roughly every 20 minutes. This allows attendees to gain exposure to different topics and interact with multiple mentors.