Abstract
We present example measurements computed on approximately 3 years of GitHub data (the GitHub training data), covering January 2015 to August 2017. The original GitHub graph has 52,260,372 nodes and 870,532,947 edges. We first subsampled this graph to remove inactive user and repo nodes, using weekly connected components computed with NetworkX. The subsampled GitHub graph has 50,677,259 nodes and 773,974,620 edges. We present node-level measurement examples for popular repos (e.g., tensorflow) and for rockstar users, as well as population-level measurement examples. See: https://confluence.pnnl.gov/confluence/display/SOCIALSIM/Implementing+Measurements
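The subsampling step can be illustrated with a minimal sketch. The abstract does not state the exact activity criterion, so everything below is an assumption made for illustration: the `(user, repo, timestamp)` event schema, the function name `active_nodes`, the grouping by ISO week, and the `min_component_size` threshold are all hypothetical. The sketch builds one user-repo graph per week, computes its connected components with NetworkX, and keeps the nodes that fall in a sufficiently large component in at least one week.

```python
from collections import defaultdict
from datetime import datetime

import networkx as nx


def active_nodes(events, min_component_size=3):
    """Return nodes that appear in a large-enough weekly connected component.

    events: iterable of (user, repo, timestamp) tuples (hypothetical schema).
    min_component_size: assumed activity threshold, not taken from the source.
    """
    # Group user-repo edges by (ISO year, ISO week).
    weekly_edges = defaultdict(list)
    for user, repo, ts in events:
        iso = ts.isocalendar()
        # Tag each node with its type so a user and a repo never collide.
        weekly_edges[(iso[0], iso[1])].append((("user", user), ("repo", repo)))

    keep = set()
    for edges in weekly_edges.values():
        week_graph = nx.Graph()
        week_graph.add_edges_from(edges)
        # Keep every node in a component that meets the size threshold.
        for component in nx.connected_components(week_graph):
            if len(component) >= min_component_size:
                keep.update(component)
    return keep


# Toy data: all three events fall in the same ISO week.
events = [
    ("alice", "tensorflow", datetime(2015, 1, 5)),
    ("bob", "tensorflow", datetime(2015, 1, 6)),
    ("carol", "lonely-repo", datetime(2015, 1, 7)),
]
keep = active_nodes(events)
```

In this toy example, alice, bob, and tensorflow form a component of three nodes and are kept, while carol and lonely-repo form a component of two and are dropped. Given a full graph `G` built with the same node labels, `G.subgraph(keep)` would then give the subsampled graph.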
Exploratory License
Eligible for exploratory license
Market Sector
Data Sciences