While geospatial libraries abound, no single library supports classic geospatial coordinate transforms, geometric calculations, time-series analysis, machine-learning methodologies, and ingestion of open-source data from multiple sources without requiring users to switch between application programming interfaces and formats.
Through GeoBOSS, the Geospatial Analysis team at PNNL provides an interface to a growing number of analytics and data sources, including Global Administrative Boundaries, U.S. Census, Data.Gov, and others.
The initial use case for establishing the library focused on simplifying geospatial analysis that could scale to large datasets.
Today, PNNL’s library helps analysts answer these key questions:
PNNL developed GeoBOSS with the goal of making it broadly applicable to the geospatial community.
Within the library, individual data sources are transformed to abstract concepts of points, paths, and polygons. This helps users create analytics on a common framework that can be applied to many datasets and easily extended by the user. These analytics range from simple feature generation (e.g., speed and bearing) to more advanced methods (e.g., group movement identification, clustering, and other functions). Functionality is also included to simplify data cleanup, hashing, and plotting.
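To make the idea of simple feature generation concrete, the sketch below derives speed and bearing for each segment of a path of timestamped points. It is an illustration only and does not use the GeoBOSS API: the `Point` class and the functions `haversine_m`, `initial_bearing_deg`, and `path_features` are hypothetical names, and the calculation assumes a spherical Earth with a mean radius of 6,371 km.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

EARTH_RADIUS_M = 6_371_000.0  # assumed mean Earth radius (spherical model)


@dataclass(frozen=True)
class Point:
    """Hypothetical timestamped point; not a GeoBOSS type."""
    lat: float  # degrees
    lon: float  # degrees
    t: float    # seconds since an arbitrary epoch


def haversine_m(a: Point, b: Point) -> float:
    """Great-circle distance between two points, in meters."""
    phi1, phi2 = math.radians(a.lat), math.radians(b.lat)
    dphi = phi2 - phi1
    dlam = math.radians(b.lon - a.lon)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def initial_bearing_deg(a: Point, b: Point) -> float:
    """Initial bearing from a to b, in degrees clockwise from north."""
    phi1, phi2 = math.radians(a.lat), math.radians(b.lat)
    dlam = math.radians(b.lon - a.lon)
    y = math.sin(dlam) * math.cos(phi2)
    x = math.cos(phi1) * math.sin(phi2) - math.sin(phi1) * math.cos(phi2) * math.cos(dlam)
    return (math.degrees(math.atan2(y, x)) + 360.0) % 360.0


def path_features(path: List[Point]) -> List[Tuple[float, float]]:
    """Return (speed in m/s, bearing in degrees) for each consecutive pair of points."""
    features = []
    for a, b in zip(path, path[1:]):
        dt = b.t - a.t
        # Speed is undefined when timestamps do not advance.
        speed = haversine_m(a, b) / dt if dt > 0 else float("nan")
        features.append((speed, initial_bearing_deg(a, b)))
    return features
```

Because the features are computed from abstract points rather than from any one source format, the same function could in principle be applied to any dataset that has been mapped to a path.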
The biggest benefit of this library is that it is designed to work with data at multiple scales, from small problems on a laptop to large database queries on a cluster, and in multiple environments, including local, cloud, and Databricks. This unified interface simplifies algorithm development and testing across datasets and environments.
GeoBOSS provides the necessary flexible architecture to minimize data preparation for analytic development and mission needs.
PNNL’s approach offers:
The library features an automated build and deployment pipeline, and its pipelines are composed of building blocks that can be easily swapped out. The library also works within Databricks, local, or cloud environments.
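One way to picture swappable building blocks is as small functions that share a common signature and are chained in order. The sketch below reads "pipelines" as data-processing pipelines, which is an assumption, and it is not GeoBOSS code: `build_pipeline`, `drop_invalid`, `dedupe_consecutive`, and `round_coords` are hypothetical names chosen for illustration.

```python
from typing import Callable, List, Tuple

Record = Tuple[float, float]                   # (lat, lon) in degrees
Step = Callable[[List[Record]], List[Record]]  # every block shares this signature


def drop_invalid(records: List[Record]) -> List[Record]:
    """Keep only records whose coordinates fall in valid lat/lon ranges."""
    return [(lat, lon) for lat, lon in records
            if -90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0]


def dedupe_consecutive(records: List[Record]) -> List[Record]:
    """Remove records that repeat the immediately preceding record."""
    out: List[Record] = []
    for rec in records:
        if not out or out[-1] != rec:
            out.append(rec)
    return out


def round_coords(ndigits: int) -> Step:
    """Build a block that rounds coordinates to a fixed number of decimals."""
    def step(records: List[Record]) -> List[Record]:
        return [(round(lat, ndigits), round(lon, ndigits)) for lat, lon in records]
    return step


def build_pipeline(*steps: Step) -> Step:
    """Chain blocks in order; the result is itself a block."""
    def run(records: List[Record]) -> List[Record]:
        for step in steps:
            records = step(records)
        return records
    return run


# Two pipelines that differ by a single swapped-in block.
basic = build_pipeline(drop_invalid, dedupe_consecutive)
coarse = build_pipeline(drop_invalid, round_coords(1), dedupe_consecutive)
```

Since each block takes and returns the same record type, adding, removing, or reordering a block changes only the call to `build_pipeline`, not the blocks themselves.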