The Data Sciences and Analytics group operates on the data-to-knowledge continuum, distilling large, fast, distributed, and messy data into knowledge to support decision processes. We apply expertise in data engineering, semantic and human language technologies, machine learning, data architectures, systems integration, and software development to create advanced computational solutions that address our sponsors' complex data and analytic challenges.


  • Data Engineering - knowledge of data sources and data manipulation technologies, combined with software design and development, to collect, clean, translate, transform, and store data for use by analytic processes and computational systems. Data collected from systems ranging from field sensors to commercial vendor data services are often messy and incomplete and require specialized software to convert them into usable form.
  • Semantic Technologies - enable computers to represent, reason over, and communicate using encoded meanings of data. Semantic technologies represent meaning separately from data or code so that knowledge is more explicit, inspectable, and computationally usable. Doing so enables more appropriate use of data, allows automated mechanisms to discover new uses for data, and lets computational systems communicate using more familiar, domain-relevant vocabulary.
  • Human Language Technologies - the interdisciplinary domains of computational linguistics, natural language processing, computer science, artificial intelligence, psychology, philosophy, mathematics, and statistics. These disciplines are applied to develop novel insights through understanding the variety and breadth of human communication channels.
  • Machine Learning - application of machine learning techniques to diverse problem sets, including forecasting diseases, understanding social media, and helping to improve disaster response. We develop and adapt cutting-edge tools to create analytics that address national security challenges, leveraging our expertise in numerical analysis, linguistics, and behavioral modeling.
  • Data Architectures - the design, development, and deployment of systems to collect, transport, store, and provide data for use in computational systems and/or by users while adhering to policies and standards defining what data can be collected and how it can be used. Our experience spans local data access to highly distributed data collection, and traditional storage systems to high-performance warehousing technologies and cloud-based solutions.
  • Systems Integration - the process of connecting hardware and software products through communication protocols and programming interfaces to create functional computational systems. Integration makes possible production applications that range in capability from reliable data collection across distributed sensor networks to the amalgamation of scientific instruments, data archives, data indexing and retrieval strategies, and web applications to deliver open science data.

For more information, call (509) 372-6311.

