January 8, 2026
Report

Ranking Biological Features in Soil-Based Microbial Multi-Omics Data with Integration Modeling

Abstract

Identifying the most important features (e.g., proteins and metabolites) that distinguish groups (e.g., control and treatment) is a critical challenge in feature-rich multi-omics experiments, especially those involving soil data. Traditional feature identification and ranking approaches, such as differential expression, are built for a single omics type and thus do not translate directly to multi-omics experiments. Here, five multi-omics integration models (DIABLO, JACA, MOFA, MultiMLP, and SLIDE), none of which was explicitly built for soil data applications, were tested on a soil-based multi-omics experiment. The data came from an autoclaved soil system inoculated with eight bacteria, with chitin as the carbon source, and sampled at 0 (control), 4, 8, and 12 weeks post-inoculation. The omics data included metaproteomics, 16S rRNA sequencing, and LC-MS/MS metabolomics (in positive and negative modes). Each multi-omics integration model was implemented, and its top features were compared to those from univariate differential statistics per omics type, demonstrating that integration approaches cut the potential number of top features from the 2957 identified by differential statistics to 13-224 (a 92.4% to 99.6% reduction). Interestingly, most top features were not shared across integration models, although scaled and averaged ranks across models showed similar patterns. This work highlights the usefulness of multi-omics integration models in soil-based microbial studies and the power of using multiple integration models together to interpret results.
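
As a rough illustration of the rank scaling and averaging mentioned in the abstract, the sketch below combines per-model feature rankings into one consensus score and reproduces the stated reduction percentages. The abstract does not specify the scaling method, so the min-max scaling to [0, 1], the score of 0 for features a model did not rank, and all function, model, and feature names are assumptions made for illustration only.

```python
from statistics import mean


def scale_ranks(ranked_features):
    """Map an ordered list (best first) to scores in [0, 1]; 1.0 is the top rank.

    Assumption: min-max scaling of rank position; the report may scale differently.
    """
    n = len(ranked_features)
    if n == 1:
        return {ranked_features[0]: 1.0}
    return {f: 1.0 - i / (n - 1) for i, f in enumerate(ranked_features)}


def average_scaled_ranks(model_rankings):
    """Average scaled ranks across models.

    Assumption: a feature missing from a model's list scores 0.0 for that model.
    """
    scaled = {m: scale_ranks(r) for m, r in model_rankings.items()}
    features = set().union(*(s.keys() for s in scaled.values()))
    return {f: mean(s.get(f, 0.0) for s in scaled.values()) for f in features}


def percent_reduction(baseline, kept):
    """Percentage of baseline features removed when only `kept` remain."""
    return 100.0 * (1 - kept / baseline)


# Hypothetical toy rankings; not data from the report.
rankings = {
    "model_a": ["feat1", "feat2", "feat3"],
    "model_b": ["feat2", "feat4", "feat1"],
}
consensus = average_scaled_ranks(rankings)
# Sort by descending consensus score, breaking ties alphabetically.
top = sorted(consensus, key=lambda f: (-consensus[f], f))
print(top)

# Feature counts taken from the abstract: 2957 reduced to 13-224.
print(round(percent_reduction(2957, 13), 1), round(percent_reduction(2957, 224), 1))
```

Averaging scaled ranks rather than raw ranks keeps models that return lists of different lengths on a comparable footing, which matters here because the models yielded between 13 and 224 top features.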

Published: January 8, 2026

Citation

Degnan D.J., R.S. McClure, D.M. Claborne, L.M. Bramer, and J.E. Flores. 2025. Ranking Biological Features in Soil-Based Microbial Multi-Omics Data with Integration Modeling. Richland, WA: Pacific Northwest National Laboratory.