CDI Project: Integration and Infrastructure Toolsets

PI: Mathew Thomas
Project Team: Malachi Schram, Noah Oblath, Kevin Fox, Elvis Offor
Project Term: October 2017 to October 2020
Key Science Questions:
- Can we provide a scalable computing framework that addresses the needs of the multidisciplinary effort to study chemical dynamics?
- Can we enable scientists to process and store experimental data, run large-scale, computationally expensive, high-fidelity physical simulations, and analyze the results with state-of-the-art data analytics, machine learning, and uncertainty quantification methods on heterogeneous computing resources?
Project Description: The project goal is to develop a common framework across projects to study chemical signature dynamics. This will improve the integration of models and measurements within a coherent mathematical framework. This framework would then assist in identifying sensitivity parameters that can guide the design of new measurements.
The project team's approach includes:
- using DIRAC (Distributed Infrastructure with Remote Agent Control) INTERWARE, a software framework for distributed computing that supports user communities needing access to distributed resources
- working closely with modelers and experimentalists, adapting algorithms and workflows into the DIRAC framework and identifying metadata triggers for the workflow
- setting up separate containers for the workflow components that can talk to each other as needed
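
As an illustration of the metadata-trigger idea in the approach above, the following is a minimal sketch in plain Python. It does not use the DIRAC API and is not the project's implementation. The names Trigger and select_steps, the metadata keys (kind, status), and the step names are all hypothetical, chosen only to show how metadata attached to a dataset could decide which workflow component runs next.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Trigger:
    """Hypothetical rule: run `step` when every required metadata field matches."""
    step: str
    required: Dict[str, str]

    def matches(self, metadata: Dict[str, str]) -> bool:
        # A trigger fires only if all required key/value pairs are present.
        return all(metadata.get(key) == value for key, value in self.required.items())


def select_steps(metadata: Dict[str, str], triggers: List[Trigger]) -> List[str]:
    """Return the names of the workflow steps whose triggers match the metadata."""
    return [trigger.step for trigger in triggers if trigger.matches(metadata)]


# Hypothetical trigger table; real keys and step names would come from
# the modelers and experimentalists who own each workflow.
TRIGGERS = [
    Trigger("process_experiment", {"kind": "experiment", "status": "raw"}),
    Trigger("run_simulation", {"kind": "model_input"}),
    Trigger("compare_model_to_data", {"kind": "experiment", "status": "processed"}),
]

if __name__ == "__main__":
    new_dataset = {"kind": "experiment", "status": "raw"}
    print(select_steps(new_dataset, TRIGGERS))
```

In a containerized setup, each selected step name could map to a separate container, with the trigger table acting as the contract between components. That mapping is an assumption of this sketch.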

The scientific objectives include developing:
- infrastructure to provide automated and reproducible workflows
- analytical tools to facilitate model and experimental comparisons
- analytical tools to provide a statistically meaningful answer to problems
Key Design Considerations:
- data management
- job execution
- collaborative workflow design
- ease of integration of existing scientific workflows
- provenance tracking
- resource management