Advanced Comput, Math & Data
PNNL Rolls Out the MeDICi Integration Framework
Computer scientists at Pacific Northwest National Laboratory have rolled out the MeDICi Integration Framework, a middleware platform (computer software that connects software components or applications) that makes it easy to integrate separate codes into complex applications that operate as a data analysis pipeline. "The framework is the first step in an evolving development project to create an underlying architecture for high performance analytical applications," said Ian Gorton, chief architect of the MeDICi project.
Building such analytical pipelines is typically fraught with difficulties, as the codes and components that process and transform the data are written in different programming languages, requiring the user to invest time in translating the data for each application. Data-intensive computing applications pose two additional problems: processing the high-volume data streams produced by sensors or instruments, and moving large data sets from one application to another. The MeDICi Integration Framework is designed to address these issues and provides the following functions:
Pipeline creation. The MeDICi Integration Framework makes it easy to translate data as it moves from one application to another, turning a set of distributed heterogeneous components into an integrated pipeline.
Data handling. The MeDICi Integration Framework provides features that give pipeline designers choices on how to pass data through pipelines to maximize the performance of the applications.
Component libraries. The MeDICi Integration Framework enables analytical codes written in any language and running on any platform to be plugged into a MeDICi pipeline by writing a few lines of code, without changing the analysis code itself.
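The three functions above can be illustrated with a minimal, generic sketch. This is not the MeDICi API: the names Stage, external_component, and run_pipeline are hypothetical and exist only for this example, and the "analysis code" is a stand-in one-line program run through the Python interpreter. The sketch assumes each stage accepts one payload and returns the next, and that the external code reads text on standard input and writes text on standard output.

```python
import subprocess
import sys
from typing import Callable, List

# Hypothetical type for this sketch: a stage takes one payload and returns the next.
Stage = Callable[[object], object]


def external_component(argv: List[str]) -> Stage:
    """Wrap an unmodified external program as a pipeline stage.

    Assumption: the program reads text on stdin and writes text on stdout.
    """
    def run(payload: object) -> object:
        completed = subprocess.run(
            argv,
            input=str(payload),
            capture_output=True,
            text=True,
            check=True,  # raise if the external program exits with an error
        )
        return completed.stdout
    return run


def run_pipeline(stages: List[Stage], payload: object) -> object:
    """Pass the payload through each stage in order."""
    for stage in stages:
        payload = stage(payload)
    return payload


def readings_to_text(readings: object) -> object:
    """Translate a list of numbers into the newline-separated text the external code expects."""
    return "\n".join(str(r) for r in readings)


# Stand-in for an analysis code written elsewhere; it is launched as-is, not modified.
SUM_PROGRAM = [
    sys.executable,
    "-c",
    "import sys; print(sum(float(x) for x in sys.stdin.read().split()))",
]

pipeline = [
    readings_to_text,                 # translate data for the next component
    external_component(SUM_PROGRAM),  # unchanged external analysis code
    float,                            # translate its text output back to a number
]

total = run_pipeline(pipeline, [1.5, 2.5, 2.0])
print(total)
```

In this sketch the translation steps sit between components rather than inside them, which is why the external program needs no changes. A real integration framework would add concerns this example ignores, such as distribution across machines and the choice of how large data sets are passed between stages.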
Several applications have been built using the MeDICi Integration Framework, demonstrating its robustness and high performance in a variety of challenging, data-intensive computing problems:
- A distributed (running on multiple interconnected computers) cyber-security application deployed in production at a major conference
- Integration of bio-informatics tools into an analytic application
- A distributed fusion and triage architecture, which integrates the components that make up an enterprise analytics application, for analysis of simulated shipment and cargo data
- High speed video processing in a security context
- Data stream filtering, to identify and select needed information, and processing for real-time scientific instrument control
- Integrating several independently developed computational models into a workflow
The MeDICi Integration Framework is publicly available for free download.
Sponsor: The development of MeDICi is funded by Pacific Northwest National Laboratory's Data Intensive Computing Initiative.
Research team: The MeDICi development team was led by chief architect Ian Gorton, assisted by Justin Almquist, Jack Chatterton, Adam Wynne, Alan Chappell, George Chin, Jared Chase, Karen Schuchardt, Eric Stephan, and Tara Gibson.