February 1, 2025
Report

Mega AI: Scaling AI for Science and Security

Abstract

State-of-the-art large-scale language models, and multimodal foundation models incorporating a text modality, are trained on large collections of pretraining data drawn largely from general-purpose language and/or vision sources. However, performance often degrades when foundation models trained on general-purpose datasets are applied to science and security domains, for example because of the vocabulary shift between general language and domain knowledge in areas like molecular chemistry and climate. By leveraging a large collection of scientific literature, the Mega AI project focused on developing next-generation foundation models addressing science and security missions. The project explored the tradeoffs between development choices (pretraining from scratch, fine-tuning off-the-shelf base models, and targeted fine-tuning and/or task prompts) and model performance to support: on-premise model use, mission-informed training/tuning of usable LLMs, and traceable model development and evaluation.


Citation

Glenski, M.F., R.J. Cosbey, S. Sharma, M. Subramanian, A. Acharya, and E.M. Ayton. 2025. Mega AI: Scaling AI for Science and Security. Richland, WA: Pacific Northwest National Laboratory.