Parasol Takes its Place in the Cloud
Architecture for cross-cloud federated graph queries to be featured at ACM SIGMOD workshop
Dr. Sutanay Choudhury
Working with a team from the Johns Hopkins University Applied Physics Laboratory, Sutanay Choudhury, a computer scientist with PNNL’s Scientific Data Management group (Computational Sciences & Mathematics Division), will present “Parasol: An Architecture for Cross-Cloud Federated Graph Querying” at the upcoming DanaC: Workshop on Data Analytics in the Cloud on June 22, 2014.
Parasol, a flexible architecture for performing effective cross-cloud federated graph queries, offers a way to query, analyze, and fuse data at scale when the data is held in multiple data sets spread across numerous data centers. Such integration can be hampered when data sits in physically distant centers on technologically different cloud platforms. Data fusion can also be stymied by regulations and policies surrounding privacy or data sharing.
Parasol’s architecture features a central coordinator process, running in a coordinator cloud, that manages query execution over multiple client clouds. Its flexibility, which extends to Parasol’s query optimization design, allows it to make minimal assumptions about each participant cloud’s architecture and capabilities, making cloud integration easier.
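The coordinator-and-clients arrangement described above can be illustrated with a small sketch. This is not Parasol’s implementation: the class names, the triple-pattern interface, and the sample data are all hypothetical, and the query optimization mentioned above is not modeled. The sketch only assumes that each client cloud can answer a single triple pattern, and that the coordinator fans a pattern out to every client and joins the partial results itself.

```python
from dataclasses import dataclass, field
from typing import Optional

# A graph edge as (subject, predicate, object); None in a pattern is a wildcard.
Triple = tuple[str, str, str]
Pattern = tuple[Optional[str], Optional[str], Optional[str]]


@dataclass
class ClientCloud:
    """Hypothetical participant cloud holding its own local set of triples."""
    name: str
    triples: set[Triple] = field(default_factory=set)

    def match(self, pattern: Pattern) -> set[Triple]:
        # The only capability the coordinator assumes of a client:
        # return the local triples that match one pattern.
        return {
            t for t in self.triples
            if all(p is None or p == v for p, v in zip(pattern, t))
        }


@dataclass
class Coordinator:
    """Hypothetical central coordinator managing queries over client clouds."""
    clients: list[ClientCloud]

    def fan_out(self, pattern: Pattern) -> set[Triple]:
        # Send the same pattern to every client and union the answers.
        results: set[Triple] = set()
        for client in self.clients:
            results |= client.match(pattern)
        return results

    def join_object_to_subject(self, first: Pattern, second: Pattern):
        # Join at the coordinator: the object of a triple matching `first`
        # must equal the subject of a triple matching `second`.
        left = self.fan_out(first)
        right = self.fan_out(second)
        by_subject: dict[str, list[Triple]] = {}
        for t in right:
            by_subject.setdefault(t[0], []).append(t)
        return sorted((l, r) for l in left for r in by_subject.get(l[2], []))


# Made-up sample data: one cloud holds film facts, another holds actor facts.
movies = ClientCloud("movies", {
    ("film:1", "starring", "actor:7"),
    ("film:2", "starring", "actor:9"),
})
wiki = ClientCloud("wiki", {("actor:7", "bornIn", "city:3")})
coordinator = Coordinator([movies, wiki])

# Which films star an actor born in city:3?
pairs = coordinator.join_object_to_subject(
    (None, "starring", None), (None, "bornIn", "city:3")
)
print(pairs)
```

Because the coordinator relies only on the `match` call, a client in this sketch could be backed by any storage technology, which loosely mirrors the idea of making minimal assumptions about each participant cloud.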
As part of their presentation, Choudhury and the paper’s co-authors will describe the experiments used to test Parasol, which involved executing queries on clusters and collecting several metrics for each query. The experiments used two data sets: the Linked Movie Database, a semantic database based on the Internet Movie Database that contains information about films, actors, and related topics, and a semantic graph database extracted from structured information in Wikipedia.
DanaC, which will be held in conjunction with the 2014 Association for Computing Machinery Special Interest Group on Management of Data/Principles of Database Systems symposium (SIGMOD/PODS) in Snowbird, Utah (June 22-27), unites business analysts and scientists as they tackle complexities involving data and analysis in cloud computing environments.