June 14, 2011
Conference Paper

Combinatorial Information Theoretical Measurement of the Semantic Significance of Semantic Graph Motifs

Abstract

Given an arbitrary semantic graph dataset, perhaps one lacking explicit ontological information, we wish first to identify its significant semantic structures and then to measure the extent of their significance. Casting a semantic graph dataset as an edge-labeled, directed graph, we build this task on the ability to mine frequent labeled subgraphs in edge-labeled, directed graphs. We begin by considering the fundamentals of the enumerative combinatorics of subgraph motif structures in edge-labeled, directed graphs. We then identify the graph's frequent labeled, directed subgraph motif patterns and measure the significance of each resulting motif by its information gain relative to the motif's expected value, computed from the empirical frequency distribution of the link types that compose it under an assumption of independence. We illustrate the method on a small test graph and discuss results obtained for small linear motifs (link type bigrams and trigrams) in a larger graph structure.
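As a rough illustration of the link type bigram case, the sketch below counts two-edge directed paths in a toy edge-labeled graph and compares each bigram's observed share with the share expected if link types were independent. It assumes a pointwise log-ratio score, which is only one plausible reading of "information gain" and may differ from the measure defined in the paper. The function name `bigram_significance`, the `(source, label, target)` triple format, and the toy labels are all hypothetical.

```python
from collections import Counter
from math import log2


def bigram_significance(edges):
    """Score link type bigrams in an edge-labeled, directed graph.

    edges: iterable of (source, label, target) triples (assumed format).
    Returns {(label1, label2): log2(observed / expected)}, where expected
    is the product of the empirical label probabilities (independence).
    """
    edges = list(edges)

    # Empirical frequency distribution of link types.
    label_counts = Counter(label for _, label, _ in edges)
    total_edges = sum(label_counts.values())
    p_label = {lab: n / total_edges for lab, n in label_counts.items()}

    # Index outgoing edges by source node to find two-edge paths.
    out_edges = {}
    for source, label, target in edges:
        out_edges.setdefault(source, []).append((label, target))

    # Count every directed path x -[l1]-> y -[l2]-> z as the bigram (l1, l2).
    bigram_counts = Counter()
    for _, first_label, middle in edges:
        for second_label, _ in out_edges.get(middle, []):
            bigram_counts[(first_label, second_label)] += 1
    total_bigrams = sum(bigram_counts.values())

    scores = {}
    for (l1, l2), n in bigram_counts.items():
        observed = n / total_bigrams
        expected = p_label[l1] * p_label[l2]  # independence assumption
        scores[(l1, l2)] = log2(observed / expected)
    return scores


# Toy graph: both two-edge paths are "knows" followed by "worksAt".
toy_edges = [
    ("a", "knows", "b"),
    ("b", "worksAt", "c"),
    ("d", "knows", "e"),
    ("e", "worksAt", "f"),
]
toy_scores = bigram_significance(toy_edges)
print(toy_scores)  # {('knows', 'worksAt'): 2.0}
```

A positive score means the bigram occurs more often than its link type frequencies alone would predict. In the toy graph each label has probability 0.5, so the expected share is 0.25 against an observed share of 1.0, giving log2(4) = 2 bits.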

Revised: September 5, 2013 | Published: June 14, 2011

Citation

Joslyn C.A., S. al-Saffar, D.J. Haglin, and L. Holder. 2011. Combinatorial Information Theoretical Measurement of the Semantic Significance of Semantic Graph Motifs. In Mining Data Semantics Workshop (MDS 2011) in conjunction with SIGKDD 2011, August 21-24, 2011, San Diego, California. New York, New York: Association for Computing Machinery. PNNL-SA-80237.