Background: Understanding biological processes requires knowing not only which proteins exist in a given organism or cell type but also how these proteins interact with each other. However, determining protein-protein interaction (PPI) networks is a daunting task that has been the subject of extensive research. Despite the development of reasonably successful methods, serious technical difficulties remain, as is evident from the small overlap between the high-throughput experimental approaches.
Results: In this manuscript we present DomainGA, a genetic-algorithm-based method that optimizes domain-domain interaction scores, which can then be used to predict PPIs. Using yeast PPI data, we show that DomainGA is robust and insensitive to the choice of parameter sets, score ranges, and detection rules. In a two-fold cross-validation study, the DomainGA optimization achieves an explanation ratio of 99%, and the cross-validation test results are 97% and 88% for the positive and negative PPIs, respectively. We discuss how DomainGA significantly improves on random predictions, particularly for predicting non-interacting protein pairs. Based on our cross-verification tests on human PPI data, a comparison of the optimized scores with the structurally observed domain interactions in the iPFAM database, and a sensitivity and specificity analysis, we conclude that DomainGA shows great promise for application across multiple organisms.
Conclusions: We envision DomainGA as the first step of a multi-tier approach to constructing PPIs. Because it is based on fundamental structural information, DomainGA can be used to generate potential PPIs, and the accuracy of the resulting interaction template can be improved later using complementary methods such as literature searches or other prediction methods. The explanation ratios obtained in the reported test cases indicate that the false prediction rates of the resulting templates would be reasonably low and could be lowered further with additional secondary tests.
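As a rough illustration of how a set of domain-domain scores could be turned into PPI predictions and then assessed by sensitivity and specificity, the Python sketch below assumes a maximum-score detection rule with a fixed threshold. The rule, the threshold value, the function names, and the toy proteins and scores are all assumptions made for illustration; they are not taken from DomainGA, and the genetic algorithm that optimizes the scores is not shown.

```python
from itertools import product


def predict_interaction(domains_a, domains_b, scores, threshold=0.5):
    """Hypothetical detection rule: predict an interaction when the best-scoring
    domain pair between the two proteins reaches the threshold."""
    best = max(
        (scores.get(frozenset((da, db)), 0.0)
         for da, db in product(domains_a, domains_b)),
        default=0.0,  # a protein without annotated domains yields no evidence
    )
    return best >= threshold


def sensitivity_specificity(pairs, labels, proteins, scores, threshold=0.5):
    """Compare predictions with known labels (True = interacting pair)."""
    tp = fn = tn = fp = 0
    for (a, b), interacts in zip(pairs, labels):
        predicted = predict_interaction(proteins[a], proteins[b], scores, threshold)
        if interacts and predicted:
            tp += 1
        elif interacts and not predicted:
            fn += 1
        elif not interacts and not predicted:
            tn += 1
        else:
            fp += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


# Toy data (made up): protein -> domain list, and unordered domain-pair scores.
proteins = {"P1": ["D1", "D2"], "P2": ["D3"], "P3": ["D4"], "P4": ["D2"]}
scores = {frozenset(("D1", "D3")): 0.9, frozenset(("D2", "D4")): 0.2}
pairs = [("P1", "P2"), ("P1", "P3"), ("P3", "P4"), ("P2", "P3")]
labels = [True, False, False, True]

sens, spec = sensitivity_specificity(pairs, labels, proteins, scores)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

In this toy setup, one of the two known interacting pairs is missed because no scored domain pair links the two proteins, which is the kind of gap that complementary secondary tests would be expected to address.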
Revised: July 20, 2007
Published: June 13, 2007
Citation
Singhal M., and H. Resat. 2007. A Domain-Based Approach to Predict Protein-Protein Interactions. BMC Bioinformatics 8. PNNL-SA-51945. doi:10.1186/1471-2105-8-199