February 5, 2016
Conference Paper

Effective Tooling for Linked Data Publishing in Scientific Research

Abstract

Challenges that make it difficult to find, share, and combine published data, such as data heterogeneity and resource discovery, have led to increased adoption of semantic data standards and data publishing technologies. To make data more accessible, interconnected, and discoverable, some domains are being encouraged to publish their data as Linked Data. Consequently, this trend greatly increases the amount of data that semantic web tools are required to process, store, and interconnect. In attempting to process and manipulate large data sets, tools (ranging from simple text editors to modern triplestores) eventually break down upon reaching undefined thresholds. This paper offers a systematic approach that data publishers can use to categorize suitable tools to meet their data publishing needs. We present a real-world use case, the Resource Discovery for Extreme Scale Collaboration (RDESC), which features a scientific dataset (maximum size of 1.4 billion triples) used to evaluate a toolbox for data publishing in climate research. This paper also introduces a semantic data publishing software suite developed for the RDESC project.

Revised: May 13, 2016 | Published: February 5, 2016

Citation

Purohit S., W.P. Smith, A.R. Chappell, P. West, B. Lee, E.G. Stephan, and P. Fox. 2016. Effective Tooling for Linked Data Publishing in Scientific Research. In 10th IEEE International Conference on Semantic Computing (ICSC 2016), February 4-6, 2016, Laguna Hills, California, 24-31. Piscataway, New Jersey: IEEE. PNNL-SA-113974. doi:10.1109/ICSC.2016.87