Programming Research Group Research Report RR-03-21

Finding transcription factor binding sites in DNA Sequences: A template based approach

Sumedha Gunewardena, Peter Jeavons

Revised October 2003, 11pp.

Abstract

A problem faced by many algorithms for finding transcription factor binding sites is the high number of false positive hits that results as the sensitivity of their predictions increases. A main contributing factor is the short and degenerate nature of these sites, which results in a low signal-to-noise ratio. To counter this problem, one needs to look beyond the base independence assumption. We propose a model based on templates designed to capture not only the vertical consensus but also the correlation of individual bases with the other bases of the site.
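
To illustrate what the base independence assumption can miss, the following minimal sketch scores sequences with a column-wise (position-independent) log-odds model and compares the result with a simple joint count over two positions. It is not the template model proposed in the report. The toy sites, the uniform background, the pseudocount, and all function names are assumptions made for illustration only.

```python
from collections import Counter
from math import log2

# Made-up toy binding sites (NOT data from the report). Positions 0 and 3
# are perfectly correlated: the sites read A...T or T...A, never A...A.
SITES = ["ACGT", "ACGT", "TCGA", "TCGA"]
BACKGROUND = 0.25   # assumed uniform background base frequency
PSEUDO = 0.5        # assumed pseudocount to avoid log(0)


def column_frequencies(sites, pseudo=PSEUDO):
    """Per-position base frequencies: the 'vertical consensus' only."""
    width = len(sites[0])
    total = len(sites) + 4 * pseudo
    freqs = []
    for i in range(width):
        counts = Counter(s[i] for s in sites)
        freqs.append({b: (counts[b] + pseudo) / total for b in "ACGT"})
    return freqs


def independent_score(seq, freqs):
    """Log-odds score that treats every position as independent."""
    return sum(log2(freqs[i][b] / BACKGROUND) for i, b in enumerate(seq))


def pair_frequency(sites, i, j, base_i, base_j):
    """Fraction of sites with base_i at position i AND base_j at position j."""
    hits = sum(1 for s in sites if s[i] == base_i and s[j] == base_j)
    return hits / len(sites)


if __name__ == "__main__":
    freqs = column_frequencies(SITES)
    for candidate in ("ACGT", "ACGA"):
        print(candidate, round(independent_score(candidate, freqs), 3))
    print("joint A..T:", pair_frequency(SITES, 0, 3, "A", "T"))
    print("joint A..A:", pair_frequency(SITES, 0, 3, "A", "A"))
```

In this toy set the independent model gives "ACGA" the same score as the observed site "ACGT", although the combination of A at the first position with A at the last never occurs among the sites. The joint count over the two positions exposes the difference, which is the kind of inter-base correlation a column-wise model cannot represent.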


This paper is available as a 326,650-byte PostScript file.