Microarray gene expression data provide a unique information resource for learning biological networks using "reverse engineering" methods. However, there are many cases in which we know which genes are involved in a given pathology of interest but lack enough experimental evidence to support fully supervised, reverse-engineering learning methods. In this paper, we explore a novel semi-supervised approach in which biological networks are learned from a reference list of genes and a partial set of links for these genes extracted automatically from PubMed abstracts, using a knowledge-driven bootstrapping algorithm. We show how new relevant links between genes can be iteratively derived using a gene similarity measure based on the Gene Ontology that is optimized on the input network at each iteration. We describe an application of this approach to the TGFB pathway as a case study and show that the results demonstrate the feasibility of the approach as an alternative or complementary technique to fully supervised methods.
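The iterative idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the similarity measure or the optimization step, so the sketch assumes a plain Jaccard overlap of annotation terms as a stand-in for the GO-based similarity, and assumes that "optimizing on the input network" means re-choosing a similarity cutoff that best reproduces the current links (by F1) at each iteration. The gene names, term labels, and function names are hypothetical toy values.

```python
from itertools import combinations


def jaccard(a, b):
    """Overlap of two annotation-term sets (assumed stand-in for a GO-based similarity)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_threshold(pairs, sim, links):
    """Pick the similarity cutoff that maximizes F1 against the current links."""
    best_t, best_f1 = 1.0, -1.0
    for t in sorted({sim[p] for p in pairs}, reverse=True):
        predicted = {p for p in pairs if sim[p] >= t}
        tp = len(predicted & links)
        if tp == 0:
            continue  # cutoff recovers no known link; F1 is zero
        precision = tp / len(predicted)
        recall = tp / len(links)
        f1 = 2 * precision * recall / (precision + recall)
        if f1 > best_f1:  # ties keep the higher (more conservative) cutoff
            best_t, best_f1 = t, f1
    return best_t


def bootstrap(terms, seed_links, max_iters=10):
    """Grow a link set from seed links, re-optimizing the cutoff each iteration."""
    pairs = [frozenset(p) for p in combinations(sorted(terms), 2)]
    sim = {p: jaccard(*(terms[g] for g in p)) for p in pairs}
    links = {frozenset(link) for link in seed_links}
    for _ in range(max_iters):
        t = best_threshold(pairs, sim, links)
        new_links = {p for p in pairs if sim[p] >= t} - links
        if not new_links:
            break  # no further links proposed; stop
        links |= new_links
    return links


# Hypothetical toy annotations; these are not real GO terms or gene annotations.
toy_terms = {
    "geneA": {"term1", "term2", "term3"},
    "geneB": {"term1", "term2", "term4"},
    "geneC": {"term1", "term2", "term4", "term5"},
    "geneD": {"term9"},
}
learned = bootstrap(toy_terms, seed_links=[("geneA", "geneB")])
for link in sorted(sorted(l) for l in learned):
    print(" - ".join(link))
```

In this toy run the single seed link between geneA and geneB sets the cutoff, which then admits the more similar geneB and geneC pair; the second iteration proposes nothing new and the loop stops. A real system would replace the Jaccard overlap with an ontology-aware similarity and would need a more careful optimization criterion, since unobserved pairs in a partial network are not confirmed non-links.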
Revised: September 24, 2010
Published: August 2, 2010
Citation
Taylor R.C., A.P. Sanfilippo, J.E. McDermott, R.L. Baddeley, R.M. Riensche, R.S. Jensen, and M. Verhagen. 2010. Learning Biological Networks via Bootstrapping with Optimized GO-based Gene Similarity. In Proceedings of the First ACM International Conference on Bioinformatics and Computational Biology, 515-519. New York, New York: Association for Computing Machinery. PNNL-SA-71923. doi:10.1145/1854776.1854875