While it is impossible to predict exactly what the future will hold, one thing is certain: big data is here to stay. As datasets grow larger and more complex, computer scientists and computational researchers are seeking ways to make this data more useful and understandable. In a recent contributed article featured on the cover of Communications of the ACM, a diverse group of data management and large-scale systems researchers, including Pacific Northwest National Laboratory (PNNL) computer scientist Antonino Tumeo, discusses the future of graph processing systems that make sense of this data.
Graph analytics underlie many of our daily activities, from browsing social media to shopping online. Whenever a website ‘recommends’ products, pages, or people to you, those recommendations are produced by graph analytics applied to your past browsing history or your social networks. This type of analysis can do much more than personalize advertisements, however. The Graphs 4 COVID-19 initiative is helping combat the pandemic by supporting contact tracing, connecting COVID-19 publications to drug repurposing efforts, and tracking misinformation to educate communities.
The authors of the Communications of the ACM article met at a 2019 Dagstuhl seminar on Big Graph Processing Systems. Their discussions of the opportunities and challenges in graph processing inspired the collaborative manuscript.
“It was an honor to participate in this seminar connecting two very different but interconnected communities of researchers and presenting the point of views of the high-performance graph processing community,” said Tumeo. “As reflected in the vision paper that summarizes our discussions, the group agreed that solutions to existing challenges and new opportunities to enable efficient large graph processing can only arise by considering the whole computing system stack.”
At PNNL, Tumeo leads the Software-Defined Architecture for Data Analysis (SO(DA)²) project for the Data-Model Convergence (DMC) Initiative. Together with other researchers from PNNL and the Polytechnic University of Milan, he developed Svelto, a way to generate graph analytics accelerators directly from the code of algorithms. Svelto is available on GitHub.
Tumeo believes that within the next 5 to 10 years, researchers will need to address what performance really means in the context of graph processing. They will also need to determine how to make the most effective use of hardware and software specializations for graph processing while still providing high productivity and portability. Researchers will further need to consider whether it is possible to design data analytics systems that support complex data models in which graphs are one of many ways to visualize the data. Soon, graph processing systems will need to interoperate with machine learning and scientific simulation to support a new generation of converged applications.