Secretary of Energy Advisory Board (SEAB) Report Recognizes PNNL Contributions
Report features how PNNL’s computing capabilities are affecting the nation’s security, science, and energy missions
Contributions from researchers across Pacific Northwest National Laboratory (PNNL) were recognized in the preliminary findings of a Secretary of Energy Advisory Board (SEAB) report from a working group dedicated to the U.S. Department of Energy’s (DOE’s) capabilities and future in artificial intelligence (AI) and machine learning. PNNL researchers’ expertise is prominent throughout DOE’s AI efforts, particularly in the areas of data sciences and national security.
Based largely on input from DOE sponsors, the report features how PNNL’s computing capabilities are affecting the nation’s security, science, and energy missions. Key highlights include:
- Studying how AI affects the global landscape for securing nuclear materials, potentially using deep learning to enhance physical and digital protections against material concealment, delivery, theft, and sabotage.
- Describing how the United States and its partners might employ deep learning to counter attack efforts and enhance nuclear security.
- Designing advanced deep learning models that characterize operations within buildings using electrical signatures on power lines, enabling new designs for energy-efficient buildings as well as enhanced security features for nuclear facilities.
- Leading the nuclear explosive monitoring project with data scientists working to significantly lower detection thresholds of low-yield, evasive underground nuclear explosions without increasing time-to-detection or the amount of human analysis.
- Co-designing advanced accelerator, memory, and data-movement concepts to support the convergence of AI and machine learning methods with other forms of data analytics and traditional scientific high-performance computing (HPC).
The report highlights PNNL’s support to the National Nuclear Security Administration, featuring joint laboratory collaborations between PNNL and others, including the Y-12 National Security Complex, Sandia National Laboratories, Lawrence Livermore National Laboratory, Los Alamos National Laboratory, and Oak Ridge National Laboratory. Additionally, PNNL contributes to DOE’s comparative advantages in AI by providing the Office of Energy Efficiency and Renewable Energy access to AI subject matter experts.
Deep learning to generate in silico chemical property libraries and candidate molecules for small molecule identification in complex samples
Comprehensive and unambiguous identification of small molecules in complex samples will revolutionize our understanding of the role of metabolites in biological systems. Existing and emerging technologies have enabled measurement of chemical properties of molecules in complex mixtures and, in concert, are sensitive enough to resolve even stereoisomers. Despite these experimental advances, small molecule identification is inhibited by (i) chemical reference libraries (e.g. mass spectra, collision cross section, and other measurable property libraries) representing <1% of known molecules, limiting the number of possible identifications, and (ii) the lack of a method to generate candidate matches directly from experimental features (i.e. without a library). To this end, we developed a variational autoencoder (VAE) to learn a continuous numerical, or latent, representation of molecular structure to expand reference libraries for small molecule identification. We extended the VAE to include a chemical property decoder, trained as a multitask network, in order to shape the latent representation such that it assembles according to desired chemical properties. The approach is unique in its application to metabolomics and small molecule identification, with its focus on properties that can be obtained from experimental measurements (m/z, CCS) paired with its training paradigm, which involved a cascade of transfer learning iterations. First, molecular representation is learned from a large dataset of structures with m/z labels. Next, in silico property values are used to continue training, as experimental property data is limited. Finally, the network is further refined by being trained with the experimental data. This allows the network to learn as much as possible at each stage, enabling success with progressively smaller datasets without overfitting. 
Once trained, the network can be used to predict chemical properties directly from structure, as well as generate candidate structures with desired chemical properties. Our approach is orders of magnitude faster than first-principles simulation for CCS property prediction. Additionally, the ability to generate novel molecules along manifolds, defined by chemical property analogues, positions DarkChem as highly useful in a number of application areas, including metabolomics and small molecule identification, drug discovery and design, chemical forensics, and beyond.
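The cascaded transfer-learning schedule described above can be illustrated with a minimal sketch. The model, data, and parameter values below are hypothetical stand-ins (a one-parameter linear model rather than a VAE); the sketch only shows how weights are warm-started across progressively smaller, higher-accuracy datasets:

```python
import random

random.seed(0)

# Illustrative stand-in for the cascade: a tiny model y = w*x + b is
# trained in three stages, carrying its weights forward each time,
# from a large coarse dataset down to a small "experimental" one.
def make_data(n, w, b, noise):
    xs = [random.uniform(-1, 1) for _ in range(n)]
    return [(x, w * x + b + random.gauss(0, noise)) for x in xs]

def train(data, w, b, lr=0.1, epochs=200):
    """Plain gradient descent on mean squared error."""
    for _ in range(epochs):
        gw = gb = 0.0
        for x, y in data:
            e = (w * x + b) - y
            gw += e * x
            gb += e
        w -= lr * gw / len(data)
        b -= lr * gb / len(data)
    return w, b

# Stage 1: plentiful, noisy labels near the true relation (w=2, b=1).
w, b = train(make_data(1000, 2.0, 1.0, 0.5), 0.0, 0.0)
# Stage 2: fewer, more accurate in-silico-style labels.
w, b = train(make_data(100, 2.0, 1.0, 0.1), w, b)
# Stage 3: a handful of "experimental" points refines the same weights.
w, b = train(make_data(10, 2.0, 1.0, 0.02), w, b)

print(round(w, 1), round(b, 1))  # ≈ 2.0 1.0 (true values w=2, b=1)
```

Each stage starts from the previous stage's weights, so the smallest dataset only needs to nudge an already-reasonable model rather than train one from scratch, which is the overfitting-avoidance argument made in the abstract.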
Colby S.M., J. Nunez, N.O. Hodas, C.D. Corley, and R.S. Renslow. 2020. "Deep learning to generate in silico chemical property libraries and candidate molecules for small molecule identification in complex samples." Analytical Chemistry 92, no. 2:1720-1729. PNNL-SA-144150. doi:10.1021/acs.analchem.9b02348
A composite neural network that learns from multi-fidelity data: Application to function approximation and inverse PDE problems
Currently, the training of neural networks relies on data of comparable accuracy, but in real applications only a very small set of high-fidelity data is available, while inexpensive lower-fidelity data may be plentiful. We propose a new composite neural network (NN) that can be trained based on multi-fidelity data. It comprises three NNs, with the first NN trained using the low-fidelity data and coupled to two high-fidelity NNs, one with activation functions and one without, in order to discover and exploit nonlinear and linear correlations, respectively, between the low-fidelity and the high-fidelity data. We first demonstrate the accuracy of the new multi-fidelity NN for approximating some standard benchmark functions, as well as a 20-dimensional function that is not easy to approximate with other methods, e.g., Gaussian process regression. Subsequently, we extend the recently developed physics-informed neural networks (PINNs) to be trained with multi-fidelity data sets (MPINNs). MPINNs contain four fully connected neural networks: the first approximates the low-fidelity data, while the second and third construct the correlation between the low- and high-fidelity data and produce the multi-fidelity approximation, which is then used in the last NN that encodes the partial differential equations (PDEs). Specifically, by decomposing the correlation into a linear and a nonlinear part, the present model is capable of adaptively learning both the linear and complex nonlinear correlations between the low- and high-fidelity data. By training the MPINNs, we can (1) obtain the correlation between the low- and high-fidelity data, (2) infer the quantities of interest based on a few scattered data points, and (3) identify the unknown parameters in the PDEs. In particular, we employ the MPINNs to learn the hydraulic conductivity field for unsaturated flows as well as the reactive models for reactive transport.
The results demonstrate that MPINNs can achieve relatively high accuracy based on a very small set of high-fidelity data. Despite the relatively low dimension and limited number of fidelities (two-fidelity levels) for the benchmark problems in the present study, the proposed model can be readily extended to very high-dimensional regression and classification problems involving multi-fidelity data.
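The core multi-fidelity idea, correcting a cheap low-fidelity model with a correlation learned from a few expensive high-fidelity samples, can be sketched in a toy setting. The functions and sample points below are illustrative assumptions, and the linear correlation is fit in closed form rather than with the paper's neural networks:

```python
import math

# Hypothetical toy problem: the high-fidelity function happens to be
# an exact linear transform of the low-fidelity one.
def f_low(x):          # cheap, plentiful surrogate
    return math.sin(x)

def f_high(x):         # expensive ground truth
    return 1.8 * math.sin(x) + 0.3

# Only a handful of high-fidelity samples are available.
xs = [0.3, 1.1, 2.0, 2.8]
pairs = [(f_low(x), f_high(x)) for x in xs]

# Least-squares fit of the linear correlation y_H ~ a*y_L + b, standing
# in for the linear (no-activation) sub-network of the composite model.
n = len(pairs)
sx = sum(p[0] for p in pairs); sy = sum(p[1] for p in pairs)
sxx = sum(p[0] ** 2 for p in pairs); sxy = sum(p[0] * p[1] for p in pairs)
a = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
b = (sy - a * sx) / n

def f_multi(x):
    """Multi-fidelity prediction: correct the cheap model everywhere."""
    return a * f_low(x) + b

err = max(abs(f_multi(x) - f_high(x)) for x in [0.1 * i for i in range(32)])
print(round(a, 3), round(b, 3), err < 1e-9)  # → 1.8 0.3 True
```

Because the correlation here is exactly linear, four high-fidelity points recover it perfectly; the paper's nonlinear sub-network handles the cases where a linear correction is not enough.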
Meng X., and G.E. Karniadakis. 2020. "A composite neural network that learns from multi-fidelity data: Application to function approximation and inverse PDE problems." Journal of Computational Physics 401. PNNL-SA-152903. doi:10.1016/j.jcp.2019.109020
Extraction of mechanical properties of materials through deep learning from instrumented indentation
Instrumented indentation has been developed and widely utilized as one of the most versatile and practical means of extracting mechanical properties of materials. This method is particularly desirable for those applications where it is difficult to experimentally determine the mechanical properties using stress–strain data obtained from coupon specimens. Such applications include material processing and manufacturing of small and large engineering components and structures involving the following: three-dimensional (3D) printing, thin-film and multilayered structures, and integrated manufacturing of materials for coupled mechanical and functional properties. Here, we utilize the latest developments in neural networks, including a multifidelity approach whereby deep-learning algorithms are trained to extract elastoplastic properties of metals and alloys from instrumented indentation results using multiple datasets for desired levels of improved accuracy. We have established algorithms for solving inverse problems by recourse to single, dual, and multiple indentation and demonstrate that these algorithms significantly outperform traditional brute-force computations and function-fitting methods. Moreover, we present several multifidelity approaches specifically for solving the inverse indentation problem which (1) significantly reduce the number of high-fidelity datasets required to achieve a given level of accuracy, (2) utilize known physical and scaling laws to improve training efficiency and accuracy, and (3) integrate simulation and experimental data for training disparate datasets to learn and minimize systematic errors. The predictive capabilities and advantages of these multifidelity methods have been assessed by direct comparisons with experimental results for indentation of different commercial alloys, including two wrought aluminum alloys and several 3D-printed titanium alloys.
Lu L., M. Dao, P. Kumar, U. Ramamurty, G.E. Karniadakis, and S. Suresh. 2020. "Extraction of mechanical properties of materials through deep learning from instrumented indentation." Proceedings of the National Academy of Sciences (PNAS) 117, no. 13:7052–7062. PNNL-SA-152699. doi:10.1073/pnas.1922210117
Adaptive Activation Functions Accelerate Convergence in Deep and Physics-informed Neural Networks
We employ adaptive activation functions for regression in deep and physics-informed neural networks (PINNs) to approximate smooth and discontinuous functions as well as solutions of linear and nonlinear partial differential equations. In particular, we solve the nonlinear Klein-Gordon equation, which has smooth solutions, the nonlinear Burgers equation, which can admit high-gradient solutions, and the Helmholtz equation. We introduce a scalable hyperparameter in the activation function, which can be optimized to achieve the best performance of the network as it dynamically changes the topology of the loss function involved in the optimization process. The adaptive activation function has better learning capabilities than the traditional (fixed) activation, as it greatly improves the convergence rate, especially during early training, as well as the solution accuracy. To better understand the learning process, we plot the neural network solution in the frequency domain to examine how the network successively captures the different frequency bands present in the solution. We consider both forward problems, where the approximate solutions are obtained, and inverse problems, where parameters involved in the governing equation are identified. Our simulation results show that the proposed method is a very simple and effective approach for increasing the efficiency, robustness, and accuracy of the neural network approximation of nonlinear functions as well as solutions of partial differential equations, especially for forward problems. We theoretically prove that in the proposed method, gradient descent algorithms are not attracted to suboptimal critical points or local minima.
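The effect of a trainable activation-slope hyperparameter can be seen in a deliberately minimal sketch: a single tanh unit whose slope parameter `a` is optimized by gradient descent to match a steeper target. The target function, learning rate, and iteration count are assumptions for illustration, not the paper's PINN setup:

```python
import math

# One tanh "neuron" f(x) = tanh(a*x) with a trainable slope a,
# fit to a steeper target tanh(2x) on a small grid.
xs = [-1.0 + 0.1 * i for i in range(21)]
ys = [math.tanh(2.0 * x) for x in xs]

a = 0.5            # adaptive slope, initialized too shallow
lr = 2.0
for _ in range(1000):
    grad = 0.0
    for x, y in zip(xs, ys):
        f = math.tanh(a * x)
        # d/da of (f - y)^2, using d tanh(a*x)/da = x * (1 - f^2)
        grad += 2.0 * (f - y) * (1.0 - f * f) * x
    a -= lr * grad / len(xs)

print(round(a, 3))  # → 2.0
```

Making the slope a trainable parameter lets the optimizer reshape the activation itself, which is the mechanism the abstract credits for the faster early-training convergence.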
Jagtap A., K. Kawaguchi, and G.E. Karniadakis. 2020. "Adaptive Activation Functions Accelerate Convergence in Deep and Physics-informed Neural Networks." Journal of Computational Physics 404. PNNL-SA-152708. doi:10.1016/j.jcp.2019.109136
Enhancing Neutrino Event Reconstruction with Pixel-Based 3D Readout for Liquid Argon Time Projection Chambers
In this paper we explore the potential improvements in neutrino event reconstruction that a 3D pixelated readout could offer over a 2D projective wire readout for liquid argon time projection chambers. We simulate and study events in two generic, idealized detector configurations for these two designs, classifying events in each sample with deep convolutional neural networks to compare the best 2D results to the best 3D results. In almost all cases we find that the 3D readout provides better reconstruction efficiency and purity than the 2D projective wire readout, with the advantages of 3D being particularly evident in more complex topologies, such as electron neutrino charged current events. We conclude that the use of a 3D pixelated detector could significantly enhance the reach and impact of the physics programs of future liquid argon TPC experiments, such as DUNE.
Adams C., M. Del Tutto, J. Asaadi, M. Bernstein, E.D. Church, R. Guenette, and J.M. Rojas, et al. 2020. "Enhancing Neutrino Event Reconstruction with Pixel-Based 3D Readout for Liquid Argon Time Projection Chambers." Journal of Instrumentation 15, no. 4:Article No. P04009. PNNL-SA-150347. doi:10.1088/1748-0221/15/04/P04009
How Do Visual Explanations Foster End Users' Appropriate Trust in Machine Learning?
We investigated the effects of different visual explanations on users' trust in machine learning classification. We proposed three forms of visual explanation of a classification based on identifying relevant training instances. We conducted a user study to evaluate these visual explanations as well as a no-explanation condition. We measured users' trust of a classifier, quantified the effects of the three forms of explanation, and assessed the changes in users' trust. We found that participants trust a classifier appropriately when an explanation is available. The combination of human, classification algorithm, and understandable explanation makes better decisions than either the classifier or the human alone. This work advances the state of the art closer to building trust-able machine learning models and informs the design and appropriate use of automated systems.
Yang F., Z. Huang, J. Scholtz, and D.L. Arendt. 2020. "How Do Visual Explanations Foster End Users' Appropriate Trust in Machine Learning?." In Proceedings of the 25th International Conference on Intelligent User Interfaces (IUI 2020), March 17-20, 2020, Cagliari, Italy, 189–201. New York, New York:Association for Computing Machinery (ACM). PNNL-SA-138276. doi:10.1145/3377325.3377480
Learning Koopman Operators for Systems with Isolated Critical Points
The Koopman operator provides a way to transform a (potentially) nonlinear finite-dimensional dynamical system into an infinite-dimensional linear system by lifting the nonlinear state dynamics into a functional space of observables, where the dynamics are linear. Previous literature has claimed that it is not possible to represent nonlinear dynamics with multiple isolated critical points if the set of observables is finite and contains the state; more precisely, such a set cannot be invariant under the Koopman operator. In this paper, we investigate this claim in more detail and provide an analytical counterexample to disprove it. We also consider the convergence of the discrete-time Koopman approximation error to the continuous-time error: we show both how this convergence occurs in general and how it can fail for systems with multiple isolated critical points. In particular, discontinuities in Koopman observables at the boundaries of basins of attraction may cause the continuous-time error to diverge; the discrete-time error also suffers from this as the sampling time step goes to zero.
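The lifting idea behind the Koopman operator can be seen in a small, self-contained example (an illustration of the general construction, not the paper's counterexample): a discrete nonlinear map with a quadratic coupling becomes exactly linear once the observable x² is appended to the state:

```python
# Illustrative discrete-time system: y's update depends nonlinearly
# on x through the term c*x**2.
a, b, c = 0.9, 0.5, 0.3

def step(x, y):
    """Nonlinear dynamics on the original state (x, y)."""
    return a * x, b * y + c * x * x

def lifted_step(g):
    """Exactly linear dynamics on the observables g = (x, y, x**2):
    the lifted matrix is [[a,0,0],[0,b,c],[0,0,a*a]]."""
    x, y, x2 = g
    return (a * x, b * y + c * x2, a * a * x2)

x, y = 1.2, -0.7
g = (x, y, x * x)
for _ in range(20):
    x, y = step(x, y)
    g = lifted_step(g)

# The finite set of observables {x, y, x**2} is invariant here, so the
# linear lifted dynamics track the nonlinear ones exactly.
print(abs(g[0] - x) < 1e-12 and abs(g[1] - y) < 1e-12)  # → True
```

This system has a single critical point at the origin; the paper's contribution concerns when such finite invariant observable sets can (or cannot) exist for systems with multiple isolated critical points.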
Bakker C., K.E. Nowak, and W.S. Rosenthal. 2019. "Learning Koopman Operators for Systems with Isolated Critical Points." In Proceedings of the IEEE 58th Conference on Decision and Control, December 11-13, 2019, Nice, France, 7733-7739. Piscataway, New Jersey:IEEE. PNNL-SA-141977. doi:10.1109/CDC40024.2019.9029818
Cyclotron Radiation Emission Spectroscopy Signal Classification with Machine Learning in Project 8
The Cyclotron Radiation Emission Spectroscopy (CRES) technique pioneered by Project 8 measures cyclotron radiation from individual electrons in a background magnetic field to construct a highly precise energy spectrum for beta decay studies and other applications. The detector, magnetic trap geometry, and electron dynamics give rise to a multitude of complex electron signal structures which carry information about distinguishing physical traits. We develop machine learning models to classify CRES signals with high accuracy based on these traits, improve the resultant frequency spectrum, and offer the potential for a sophisticated analysis that will help Project 8 achieve its tritium endpoint measurement in the future.
Ashtari Esfahani A., S. Boser, N.G. Buzinsky, R. Cervantes, C. Claessens, L. De Viveiros, and M. Fertl, et al. 2020. "Cyclotron Radiation Emission Spectroscopy Signal Classification with Machine Learning in Project 8." New Journal of Physics 22, no. 3:Article No. 033004. PNNL-SA-146046. doi:10.1088/1367-2630/ab71bd
Path-Based Dictionary Augmentation: A Framework for Improving k-Sparse Image Processing
We augment orthogonal matching pursuit (OMP) by introducing an additional step in the identification stage of each pursuit iteration. At each iteration, a “path,” or geodesic, is generated between the two dictionary atoms that are most correlated with the residual, and from this path a new atom is selected that has a greater correlation to the residual than either of the two bracketing atoms. Two methods of constructing a path are investigated: the Euclidean geodesic formed by a linear combination of the two atoms and the 2-Wasserstein geodesic corresponding to the optimal transport map between the atoms. The existence of a higher-correlation atom is proven in the Euclidean case under assumptions on the two bracketing atoms. In addition, we provide computational results illustrating improvements in sparse coding and denoising relative to baseline OMP. Although we demonstrate our augmentation on OMP alone, in general it may be applied to any reconstruction algorithm that relies on the selection and sorting of high-similarity atoms during an analysis or identification phase.
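The Euclidean-path step can be sketched as follows. The atoms and residual below are made-up three-dimensional vectors, and a coarse grid search stands in for whatever path-search rule an implementation might use:

```python
import math

# Two bracketing dictionary atoms and a residual, all unit-normalized.
def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def norm(u):
    n = math.sqrt(dot(u, u))
    return [p / n for p in u]

a1 = norm([1.0, 0.2, 0.0])        # most correlated atom
a2 = norm([0.2, 1.0, 0.0])        # second most correlated atom
residual = norm([1.0, 1.0, 0.1])  # current pursuit residual

# Walk the normalized Euclidean path t*a1 + (1-t)*a2 and keep the
# point with the highest correlation to the residual.
best_t, best_corr = 0.0, abs(dot(a1, residual))
for i in range(101):
    t = i / 100
    atom = norm([t * p + (1 - t) * q for p, q in zip(a1, a2)])
    corr = abs(dot(atom, residual))
    if corr > best_corr:
        best_t, best_corr = t, corr

bracket = max(abs(dot(a1, residual)), abs(dot(a2, residual)))
print(round(best_t, 2), best_corr > bracket)  # → 0.5 True
```

In this symmetric toy case the midpoint of the path beats both bracketing atoms, which is the kind of higher-correlation atom the Euclidean-case proof guarantees under its assumptions.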
Emerson T.H., C.C. Olson, and T.J. Doster. 2020. "Path-Based Dictionary Augmentation: A Framework for Improving k-Sparse Image Processing." IEEE Transactions on Image Processing 29. PNNL-SA-148884. doi:10.1109/TIP.2019.2927331
Multiple social platforms reveal actionable signals for software vulnerability awareness: A study of GitHub, Twitter and Reddit
Software vulnerabilities are flaws in computer systems that leave users open to attack. In many cases, these vulnerabilities go unnoticed and remain unresolved in codebases. Thus, awareness of software vulnerabilities among the public is crucial to ensure effective cybersecurity practices, the development of high-quality software, and ultimately national security. This awareness can be better understood by studying the spread and evolution of software vulnerability discussions in online communities. This work is the first to evaluate and contrast how discussions about software vulnerabilities spread on three social platforms -- Twitter, GitHub, and Reddit. To lay the groundwork, we showcase a novel fundamental framework for measuring information spread that identifies the spread mechanisms and observables across platforms, the units of information, and the groups of measurements that can be applied to focus on a specific phenomenon, e.g., information cascades. We then analyze and contrast social network topologies for three example social networks and measure the scale and speed of the spread of discussion of specific vulnerabilities to understand how far and how widely they spread, how many users participate in discussions, and the duration of their spread. To demonstrate awareness of more impactful software vulnerabilities, a subset of our analysis focuses on vulnerabilities targeted during recent major cyber attacks as well as vulnerabilities exploited by advanced persistent threat groups. We discover that vulnerability discussions usually start on GitHub before occurring on Twitter and Reddit. While studying how some user-level and content-level characteristics influence vulnerability spread, we observe that Twitter discussions started by users predicted to be humans have larger size, breadth, depth, adoption rate, lifetime, and structural virality compared to those started by users predicted to be bots.
On Reddit, we contrast the differences in thread structure that originate from posts with positive, negative, and neutral polarity. We find that positive posts generate larger, deeper, and wider discussions than negative and neutral posts. We anticipate that the results of our analysis will not only increase understanding of software vulnerability awareness but also inform models for simulating information spread across multiple online social environments.
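Several of the cascade measurements named above (size, depth, breadth, and structural virality) can be computed on a toy discussion tree. The tree below is an invented example, and structural virality follows the mean pairwise shortest-path (Wiener index) definition:

```python
from collections import defaultdict

# Toy cascade: each post ID maps to the post it replies to (root -> None).
parent = {1: None, 2: 1, 3: 1, 4: 2, 5: 2, 6: 3, 7: 6}

def depth_of(node):
    d = 0
    while parent[node] is not None:
        node = parent[node]
        d += 1
    return d

depths = {n: depth_of(n) for n in parent}
size = len(parent)                    # number of posts in the cascade
depth = max(depths.values())          # longest reply chain
level_counts = defaultdict(int)
for d in depths.values():
    level_counts[d] += 1
breadth = max(level_counts.values())  # widest level of the tree

def dist(u, v):
    """Tree distance via the lowest common ancestor."""
    anc = set()
    node = u
    while node is not None:
        anc.add(node)
        node = parent[node]
    node = v
    while node not in anc:
        node = parent[node]
    return depths[u] + depths[v] - 2 * depths[node]

# Structural virality: mean distance over all distinct node pairs.
nodes = sorted(parent)
pairs = [(u, v) for i, u in enumerate(nodes) for v in nodes[i + 1:]]
virality = sum(dist(u, v) for u, v in pairs) / len(pairs)

print(size, depth, breadth, round(virality, 2))  # → 7 3 3 2.48
```

A star-shaped cascade (everyone replying to one post) has low structural virality, while long multi-hop reply chains drive it up, which is why the measure separates broadcast-style spread from genuinely viral diffusion.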
Shrestha P., A. Visweswara Sathanur, S. Maharjan, E.G. Saldanha, D.L. Arendt, and S. Volkova. 2020. "Multiple social platforms reveal actionable signals for software vulnerability awareness: A study of GitHub, Twitter and Reddit." PLoS One 15, no. 3:Article No. e0230250. PNNL-SA-143971. doi:10.1371/journal.pone.0230250