October 11, 2010
Conference Paper

Massive Social Network Analysis: Mining Twitter for Social Good

Abstract

Social networks produce an enormous quantity of data. Facebook consists of over 400 million active users sharing over 5 billion pieces of information each month. Analyzing this vast quantity of unstructured data presents challenges for software and hardware. We present GraphCT, a Graph Characterization Toolkit for massive graphs representing social network data. On a 128-processor Cray XMT, GraphCT estimates the betweenness centrality of an artificially generated (R-MAT) 537 million vertex, 8.6 billion edge graph in 55 minutes. We use GraphCT to analyze public data from Twitter, a microblogging network. Twitter’s message connections appear primarily tree-structured as a news dissemination system. Within the public data, however, are clusters of conversations. Using GraphCT, we can rank actors within these conversations and help analysts focus attention on a much smaller data subset.

Revised: December 30, 2011 | Published: October 11, 2010

Citation

Ediger D., K. Jiang, E.J. Riedy, D.A. Bader, C.D. Corley, et al. 2010. Massive Social Network Analysis: Mining Twitter for Social Good. In 39th International Conference on Parallel Processing (ICPP 2010), September 13-16, 2010, San Diego, California, 583-593. Los Alamitos, California: IEEE Computer Society. PNNL-SA-71335. doi:10.1109/ICPP.2010.66