November 6, 2020
Journal Article

Better Understanding and Prediction of Antiviral Peptides through Primary and Secondary Structure Feature Importance

Abstract

The emergence of viral epidemics throughout the world is of particular concern because few effective antiviral therapeutics are available. The discovery of new antiviral therapies is imperative to address this challenge, and antiviral peptides (AVPs) represent a valuable resource for the development of novel therapies to combat viral infection. We present a new machine learning model to distinguish AVPs from non-AVPs using the most informative features derived from the physicochemical and structural properties of their amino acid sequences. To focus on the features most likely to contribute to antiviral performance, we filter potential features by performing correlation and mean decrease of Gini index (MDGI) analyses. A recursive feature elimination (RFE) technique with a support vector machine (SVM) is then applied to rank the features by their importance for classification. These RFE analyses suggest that secondary structure is the most important peptide sequence feature for predicting AVPs. Our Feature-Informed Reduced Machine Learning for Antiviral Peptide Prediction (FIRM-AVP) approach achieves an accuracy of 92.38% and a Matthews correlation coefficient (MCC) of 0.84, both higher than the current state of the art. The FIRM-AVP code and standalone software package are available at https://github.com/pmartR/FIRM-AVP with an accompanying web application at https://msc-viz.emsl.pnnl.gov/AVPR.
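
For readers less familiar with the two metrics reported in the abstract, the following is a minimal sketch of how accuracy and MCC can be computed from binary predictions. It assumes labels coded as 1 for AVP and 0 for non-AVP. The function names and the toy labels are hypothetical and are not taken from the FIRM-AVP code base or its data.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels coded 1 (AVP) and 0 (non-AVP)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient, ranging from -1 to 1.

    Returns 0.0 when the denominator is zero (a common convention,
    adopted here as an assumption).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical toy labels, for illustration only (not data from the paper).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]

counts = confusion_counts(y_true, y_pred)
print("accuracy:", accuracy(*counts))
print("MCC:", mcc(*counts))
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so it stays informative when the two classes are unequal in size.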

Revised: December 21, 2020 | Published: November 6, 2020

Citation

Chowdhury A.S., S.M. Reehl, K. Kehn-Hall, B. Bishop, and B.M. Webb-Robertson. 2020. Better Understanding and Prediction of Antiviral Peptides through Primary and Secondary Structure Feature Importance. Scientific Reports 10, no. 1:Article No. 19260. PNNL-SA-154696. doi:10.1038/s41598-020-76161-8