Abstract
This code supports a paper in which we developed a novel method for representing protein sequences for machine learning approaches to protein function. The innovation is the use of arbitrary mappings to reduce the complexity of the protein sequence and allow flexible identification of common sequence features across disparate proteins. The code also includes support for cross-validation and the other analyses that went into the paper.
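The idea of mapping a protein sequence onto a reduced alphabet can be illustrated with a minimal sketch. This is not the code from the package itself: the four-class grouping, the names `EXAMPLE_GROUPS`, `build_mapping`, `reduce_sequence`, and `kmer_counts`, and the choice of k-mer counts as features are all assumptions made for illustration. Any other assignment of residues to symbols could be substituted.

```python
from collections import Counter

# Hypothetical grouping of the 20 standard amino acids into 4 classes.
# This particular grouping is an assumption for illustration only; the
# mappings described in the abstract are arbitrary and may differ.
EXAMPLE_GROUPS = {
    "h": "AVLIMFWC",  # hydrophobic
    "p": "STNQYGP",   # polar or small
    "+": "KRH",       # positively charged
    "-": "DE",        # negatively charged
}


def build_mapping(groups):
    """Invert {symbol: residues} into a per-residue lookup {residue: symbol}."""
    return {aa: symbol for symbol, members in groups.items() for aa in members}


def reduce_sequence(seq, mapping, unknown="x"):
    """Rewrite a sequence in the reduced alphabet; unmapped residues become `unknown`."""
    return "".join(mapping.get(aa, unknown) for aa in seq.upper())


def kmer_counts(reduced, k=3):
    """Count overlapping k-mers in a reduced sequence, as simple ML features."""
    return Counter(reduced[i:i + k] for i in range(len(reduced) - k + 1))


if __name__ == "__main__":
    mapping = build_mapping(EXAMPLE_GROUPS)
    for seq in ("AVKD", "LIRE"):
        reduced = reduce_sequence(seq, mapping)
        print(seq, reduced, dict(kmer_counts(reduced, k=2)))
```

Under this example mapping, the two distinct sequences `AVKD` and `LIRE` collapse to the same reduced string, which is the sense in which a mapping can expose features shared by disparate proteins.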
Exploratory License
Eligible for exploratory license
Market Sector
Biological Sciences and Omics