Abstract

Tandem mass spectrometry (MS/MS) is a primary tool for identifying small molecules and metabolites; the resulting spectra are most commonly identified by matching them against spectra in MS/MS reference libraries. The high variability in MS/MS acquisition techniques and parameters makes it difficult to build standardized reference libraries. Here we present a method that improves the usefulness of existing MS/MS libraries by augmenting the available experimental spectra with statistically interpolated spectra at unreported collision energies. We find that highly accurate spectral approximations can be interpolated from as few as three experimental spectra, and that the interpolated spectra are consistent with true spectra acquired on the same instrument as the experimental spectra. Supplementing existing spectral databases with interpolated spectra yields consistent improvements in identification accuracy across a range of instruments and precursor types. On large datasets (2,000 to 10,000 spectra), the method yields significant improvements (around 10% more spectra correctly identified), indicating that it is a quick yet effective tool for improving spectral matching where the available reference libraries are not yet sufficient. We also find improvements when matching spectra across instrument types (between an Agilent Q-TOF and an Orbitrap Elite), at high collision energies (50 to 90 eV), and with the smaller datasets available through MassBank.
Published: September 21, 2022
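
To make the idea of interpolating a spectrum at an unreported collision energy concrete, the following is a minimal sketch, not the method used in the paper. It assumes spectra are binned by m/z and that each bin's intensity is linearly interpolated between the two nearest reported collision energies; the abstract says only that the interpolation is statistical, so the linear scheme, the function names `bin_spectrum` and `interpolate_spectrum`, the `bin_width` parameter, and the example peak values are all illustrative assumptions.

```python
from bisect import bisect_left


def bin_spectrum(peaks, bin_width=0.01):
    """Sum (m/z, intensity) peaks into integer-indexed m/z bins."""
    binned = {}
    for mz, intensity in peaks:
        key = round(mz / bin_width)
        binned[key] = binned.get(key, 0.0) + intensity
    return binned


def interpolate_spectrum(reference, target_energy, bin_width=0.01):
    """Estimate a spectrum at target_energy (eV) by linear interpolation.

    reference maps each reported collision energy to a list of
    (m/z, intensity) peaks. This is an assumed, simplified scheme:
    a bin missing from one neighbor is treated as intensity 0 there.
    """
    energies = sorted(reference)
    if len(energies) < 2:
        raise ValueError("need spectra at two or more collision energies")
    if not energies[0] <= target_energy <= energies[-1]:
        raise ValueError("target energy is outside the reported range")

    # Locate the reported energies that bracket the target.
    i = max(bisect_left(energies, target_energy), 1)
    low, high = energies[i - 1], energies[i]
    t = (target_energy - low) / (high - low)

    low_bins = bin_spectrum(reference[low], bin_width)
    high_bins = bin_spectrum(reference[high], bin_width)

    result = []
    for key in sorted(set(low_bins) | set(high_bins)):
        intensity = (1 - t) * low_bins.get(key, 0.0) + t * high_bins.get(key, 0.0)
        if intensity > 0:
            result.append((round(key * bin_width, 6), intensity))
    return result


# Made-up peaks for illustration only; these are not measured spectra.
example_reference = {
    10.0: [(150.0, 100.0), (105.0, 10.0)],
    20.0: [(150.0, 60.0), (105.0, 50.0)],
    40.0: [(150.0, 10.0), (105.0, 100.0), (77.0, 40.0)],
}

if __name__ == "__main__":
    for mz, intensity in interpolate_spectrum(example_reference, 30.0):
        print(f"{mz:.2f}\t{intensity:.1f}")
```

The sketch uses three reference spectra to mirror the minimum mentioned in the abstract. A library augmented this way would store the interpolated peak lists alongside the experimental ones, so that a query acquired at an intermediate collision energy has a closer candidate to match against.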