April 6, 2011
Conference Paper

A Highly Parallel Implementation of K-Means for Multithreaded Architecture

Abstract

We present a parallel implementation of the popular k-means clustering algorithm for massively multithreaded computer systems, as well as a parallelized version of the KKZ seed selection algorithm. We demonstrate that as system size increases, sequential seed selection can become a bottleneck. We also present an early attempt at parallelizing k-means that highlights critical performance issues when programming massively multithreaded systems. For our case studies, we used data collected from electric power simulations and ran our experiments on the Cray XMT.

Revised: December 2, 2011 | Published: April 6, 2011

Citation

Mackey P.S., J.T. Feo, P.C. Wong, and Y. Chen. 2011. A Highly Parallel Implementation of K-Means for Multithreaded Architecture. In Proceedings of the 19th High Performance Computing Symposia (HPC 2011): SCS Spring Simulation Multiconference (SpringSim 2011), April 3-7, 2011, Boston, MA. San Diego, California: Society for Computer Simulation International. PNNL-SA-76703.