July 19, 2004
Conference Paper

Identifying DNA Binding Motifs by Combining Data from Different Sources

Abstract

A transcription factor regulates the expression of its target genes by binding to their operator regions, where it affects the interaction between RNA polymerase and each gene's promoter. Many transcription factors find their targets by recognizing a specific DNA sequence pattern, referred to as a consensus sequence or motif. Because it can offset the biases of any single data source, combining biological data from different sources can be expected to improve the quality of the information extracted. We analyzed microarray gene expression data jointly with the organism's genome sequence to determine transcription factor recognition sequences more accurately. Using this data integration approach, we investigated the regulation of the photosynthesis genes of the purple non-sulphur photosynthetic bacterium Rhodobacter sphaeroides. The photosynthesis genes in this organism are tightly regulated as a function of environmental growth conditions by three major regulatory systems: PrrB/PrrA, AppA/PpsR, and FnrL. In this study, we detected a previously undefined PrrA consensus sequence, improved the previously known DNA-binding motif of PpsR, and confirmed the consensus sequence of the global regulator FnrL.
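
As an illustration of what a consensus sequence is, the following minimal Python sketch derives one from a set of aligned binding sites and then scans a sequence for approximate matches. It is not the method used in the paper, which this abstract does not describe in detail. The function names `consensus` and `find_hits`, the mismatch threshold, and all sequences are hypothetical toy values chosen only for illustration.

```python
from collections import Counter


def consensus(sites):
    """Return the most frequent base in each column of aligned, equal-length sites."""
    if not sites:
        raise ValueError("need at least one site")
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("sites must all have the same length")
    bases = []
    for column in zip(*sites):
        counts = Counter(column)
        # Break ties alphabetically so the result is deterministic.
        bases.append(min(counts, key=lambda b: (-counts[b], b)))
    return "".join(bases)


def find_hits(sequence, motif, max_mismatches=1):
    """Return start positions where `sequence` matches `motif` within the mismatch limit."""
    width = len(motif)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        mismatches = sum(1 for a, b in zip(window, motif) if a != b)
        if mismatches <= max_mismatches:
            hits.append(start)
    return hits


# Toy aligned sites (hypothetical, not taken from the paper).
toy_sites = ["ACGTTA", "ACGATA", "ACGTTA", "TCGTTA"]
toy_motif = consensus(toy_sites)

# Toy upstream region (hypothetical) scanned for the derived motif.
toy_upstream = "GGACGTTAGGACGATACC"
toy_hits = find_hits(toy_upstream, toy_motif, max_mismatches=1)
print(toy_motif, toy_hits)
```

In a joint analysis of the kind the abstract describes, the sequences scanned would plausibly be the upstream regions of genes that share an expression pattern in the microarray data, but that pairing is an assumption of this sketch.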

Revised: December 7, 2004 | Published: July 19, 2004

Citation

Mao L., and H. Resat. 2004. Identifying DNA Binding Motifs by Combining Data from Different Sources. In Eighth World Multi-conference on Systemics, Cybernetics, and Informatics. July 18-21, 2004. Orlando, Florida. Applications of Informatics and Cybernetics in Science and Engineering, edited by Nagib Callaos; Katsuhisa Horimoto; Jake Chen; Amy Sze Chan, VII, 172-176. Orlando, Florida: International Institute of Informatics and Systemics. PNNL-SA-41361.