August 20, 2025
Journal Article

G-Mapper: Learning a Cover in the Mapper Construction

Abstract

The Mapper algorithm is a visualization technique in topological data analysis (TDA) that outputs a graph reflecting the structure of a given dataset. The Mapper algorithm requires tuning several parameters in order to generate a "nice" Mapper graph. This paper focuses on selecting the cover parameter. We present an algorithm that optimizes the cover of a Mapper graph by repeatedly splitting a cover according to a statistical test for normality. Our algorithm is based on $G$-means clustering, which searches for the optimal number of clusters in $k$-means by iteratively conducting the Anderson-Darling test. Our splitting procedure employs a Gaussian mixture model in order to carefully choose the cover based on the distribution of the given data. Experiments on synthetic and real-world datasets demonstrate that our algorithm generates covers so that the Mapper graphs retain the essence of the datasets.
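
The splitting idea described in the abstract can be illustrated with a minimal, standard-library-only Python sketch for one-dimensional lens values. It is not the authors' implementation. The function names, the threshold of 2.0 on the uncorrected Anderson-Darling statistic, the overlap rule, and the small EM routine standing in for a Gaussian mixture model are all assumptions made for illustration.

```python
import math
import random
import statistics


def anderson_darling_normal(values):
    """Uncorrected Anderson-Darling statistic A^2 against a fitted normal."""
    n = len(values)
    mu = statistics.fmean(values)
    sigma = statistics.stdev(values)
    if sigma == 0:
        return 0.0  # constant sample: treat as nothing to split
    xs = sorted(values)
    cdf = []
    for x in xs:
        p = 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))
        cdf.append(min(max(p, 1e-12), 1.0 - 1e-12))  # keep the logs finite
    total = sum((2 * i + 1) * (math.log(cdf[i]) + math.log(1.0 - cdf[n - 1 - i]))
                for i in range(n))
    return -n - total / n


def fit_two_gaussians(values, iterations=50):
    """Tiny EM fit of a two-component 1D Gaussian mixture (illustrative only)."""
    xs = sorted(values)
    half = len(xs) // 2
    parts = [xs[:half], xs[half:]]  # initialize from lower and upper halves
    means = [statistics.fmean(p) for p in parts]
    stds = [max(statistics.pstdev(p), 1e-6) for p in parts]
    weights = [0.5, 0.5]
    for _ in range(iterations):
        resp = []  # E-step: responsibility of each component for each point
        for x in xs:
            dens = [w * math.exp(-0.5 * ((x - m) / s) ** 2) / s
                    for w, m, s in zip(weights, means, stds)]
            norm = sum(dens) or 1e-300
            resp.append([d / norm for d in dens])
        for k in range(2):  # M-step: update mean, spread, and weight
            nk = max(sum(r[k] for r in resp), 1e-12)
            means[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, xs)) / nk
            stds[k] = max(math.sqrt(var), 1e-6)
            weights[k] = nk / len(xs)
    return sorted(zip(means, stds))  # components ordered by mean


def refine_cover(values, interval, threshold=2.0, overlap=0.2,
                 min_points=20, max_intervals=16):
    """Split intervals whose points fail the normality check (assumed rule)."""
    cover = [interval]
    changed = True
    while changed and len(cover) < max_intervals:
        changed = False
        for idx, (a, b) in enumerate(cover):
            inside = [v for v in values if a <= v <= b]
            if len(inside) < min_points:
                continue
            if anderson_darling_normal(inside) <= threshold:
                continue  # looks Gaussian enough: keep the interval
            (m0, s0), (m1, s1) = fit_two_gaussians(inside)
            # Assumed split point: equal standardized distance from both means.
            c = (m0 * s1 + m1 * s0) / (s0 + s1)
            half_width = overlap * (m1 - m0) / 2.0
            left = (a, min(c + half_width, b))
            right = (max(c - half_width, a), b)
            cover[idx:idx + 1] = [left, right]
            changed = True
            break
    return cover


# Usage: a clearly bimodal sample should be split into at least two intervals.
random.seed(0)
data = ([random.gauss(-5.0, 1.0) for _ in range(200)]
        + [random.gauss(5.0, 1.0) for _ in range(200)])
cover = refine_cover(data, (min(data), max(data)))
print(cover)
```

The cap on the number of intervals and the minimum point count are safeguards chosen for this sketch so that the loop always terminates; the paper's own stopping criteria and overlap parameters may differ.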

Published: August 20, 2025

Citation

Alvarado E., R. Belton, E. Fischer, K. Lee, S. Palande, S. Percival, and E. Purvine. 2025. G-Mapper: Learning a Cover in the Mapper Construction. SIAM Journal on Mathematics of Data Science 7, no. 2:572-596. PNNL-SA-189983. doi:10.1137/24M1641312