April 8, 2015
Journal Article

An Approach to Enhance pnetCDF Performance in Environmental Modeling Applications

Abstract

I/O has long been considered a bottleneck in parallel applications. The software package pnetCDF, which works with parallel file systems, was developed to address this issue and provide parallel I/O capability. This study examines the performance of a novel approach that aggregates data along either the row or the column dimension of the spatial domain and then applies the pnetCDF parallel I/O paradigm. The test was done with three domain sizes, representing small, moderately large, and large data domains, using a small-scale mock-up code of the Community Multi-scale Air Quality model (CMAQ). The examination compares the I/O performance of the traditional serial I/O technique, the straight application of pnetCDF, and data aggregation along the row or column dimension before applying pnetCDF. From this comparison, "optimal" I/O configurations for the new approach were quantified. Data aggregation along the row dimension (pnetCDFcr) works better than aggregation along the column dimension (pnetCDFcc), although it may perform slightly worse than the straight pnetCDF method with a small number of processors. As the number of processors becomes larger, pnetCDFcr outperforms pnetCDF significantly. If the number of processors keeps increasing, straight pnetCDF reaches a point where its performance is even worse than that of the serial I/O technique. The new approach has also been tested on a real application, where it performs two times better than the straight pnetCDF paradigm.
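The aggregation step described in the abstract can be illustrated with a small sketch. This is not the authors' code and uses neither MPI nor pnetCDF: it only simulates, in plain Python, one plausible reading of the idea, in which tiles of a 2-D domain held by a grid of processors are first gathered along one dimension of that grid so that fewer, larger strips are handed to the parallel write. The function names, the 6 x 8 domain, and the 3 x 4 processor grid are all assumptions made for illustration.

```python
# Illustrative sketch only: simulates data aggregation on a processor grid
# with nested lists. All names and sizes here are hypothetical.

def split_domain(field, prow, pcol):
    """Split a 2-D field into a prow x pcol grid of tiles, one per simulated processor.

    Assumes the field dimensions divide evenly by prow and pcol.
    """
    rstep = len(field) // prow
    cstep = len(field[0]) // pcol
    return [
        [
            [line[c * cstep:(c + 1) * cstep] for line in field[r * rstep:(r + 1) * rstep]]
            for c in range(pcol)
        ]
        for r in range(prow)
    ]

def aggregate_along_rows(tiles):
    """Gather the tiles of each processor row into one full-width strip."""
    strips = []
    for row_tiles in tiles:
        nlines = len(row_tiles[0])
        # Join line i of every tile in this processor row, left to right.
        strips.append([sum((tile[i] for tile in row_tiles), []) for i in range(nlines)])
    return strips

def aggregate_along_columns(tiles):
    """Gather the tiles of each processor column into one full-height strip."""
    pcol = len(tiles[0])
    # Stack the lines of every tile in processor column c, top to bottom.
    return [[line for row_tiles in tiles for line in row_tiles[c]] for c in range(pcol)]

# A 6 x 8 field decomposed over a 3 x 4 processor grid (12 tiles).
field = [[r * 8 + c for c in range(8)] for r in range(6)]
tiles = split_domain(field, prow=3, pcol=4)

row_strips = aggregate_along_rows(tiles)     # 3 strips: one writer per processor row
col_strips = aggregate_along_columns(tiles)  # 4 strips: one writer per processor column

print(len(row_strips), len(col_strips))
```

In this toy setup the 12 tiles collapse to 3 strips under row aggregation or 4 strips under column aggregation, so fewer processes would take part in the subsequent parallel write, each with a larger block of data. Whether that trade-off pays off depends on the domain size and processor count, which is what the study measures.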

Revised: May 21, 2015 | Published: April 8, 2015

Citation

Wong D., C. Yang, J.S. Fu, K. Wong, and Y. Gao. 2015. An Approach to Enhance pnetCDF Performance in Environmental Modeling Applications. Geoscientific Model Development 8, no. 4:1033-1046. PNNL-SA-103315. doi:10.5194/gmd-8-1033-2015