As high-performance computing (HPC) infrastructures continue to grow in capability and complexity, so do the applications that they serve. HPC and distributed-area computing (DAC) (e.g., grid and cloud) users are looking increasingly toward workflow solutions to orchestrate their complex application coupling and pre- and post-processing needs. To gain insight and a more quantitative understanding of a workflow’s performance, our method includes not only the capture of traditional provenance information but also the capture and integration of system environment metrics, which help give context and explanation for a workflow’s execution. In this paper, we describe IPPD’s provenance management solution (ProvEn) and its hybrid data store, which combines both of these data provenance perspectives.
Revised: February 15, 2017
Published: November 21, 2016
Citation
Elsethagen T.O., E.G. Stephan, B. Raju, M. Schram, M.C. Macduff, D.J. Kerbyson, and K. Kleese-Van Dam, et al. 2016. Data Provenance Hybridization Supporting Extreme-Scale Scientific Workflow Applications. In New York Scientific Data Summit (NYSDS 2016), August 14-17, 2016, New York. Piscataway, New Jersey: IEEE. PNNL-SA-119959. doi:10.1109/NYSDS.2016.7747819