February 11, 2025
Journal Article

Analyzing Inference Workloads for Spatiotemporal Modeling

Abstract

Ensuring power grid resiliency, forecasting climate conditions, and optimizing transportation infrastructure are some of the many application areas where data is collected in both space and time. Spatiotemporal modeling captures the patterns in such data to forecast future trends and support critical decision-making using machine learning and deep learning. Once models are trained offline, deploying them in the field for near-real-time inference can be challenging because performance can vary significantly depending on the environment, the available compute, and the tolerance to ambiguity in results. To facilitate the co-design of next-generation hardware architectures for field deployment of trained models, it is critical to characterize the workloads of these deep learning (DL) applications during inference and assess their computational patterns at different levels of the execution stack. In this paper, we develop several variants of deep learning applications that use spatiotemporal data from dynamical systems. We study the associated computational patterns for inference workloads at different levels, considering relevant models (Long Short-Term Memory, Convolutional Neural Network, and Spatio-Temporal Graph Convolution Network), DL frameworks (TensorFlow and PyTorch), precision (FP64, FP32, and AMP), inference runtimes (ONNX and AI Template), post-training quantization (TensorRT), and platforms (NVIDIA DGX A100, SambaNova SN10 RDU, and SODA High-Level Synthesizer). Overall, our findings indicate that although mixed-precision models and post-training quantization show potential for spatiotemporal modeling, extracting efficiency from contemporary GPGPU systems might be challenging. Instead, co-designing custom accelerators with optimized High-Level Synthesis frameworks allows workload-specific adjustments that improve sustainable performance.

Published: February 11, 2025

Citation

Jain M., N. Bohm Agostini, S. Ghosh, and A. Tumeo. 2025. Analyzing Inference Workloads for Spatiotemporal Modeling. Future Generation Computer Systems 163: Art. No. 107513. PNNL-SA-187612. doi:10.1016/j.future.2024.107513