Computational methods for the characterization and detection of protein-binding sequences through information theory
Detecting regulatory sequences is critical to understanding the mechanisms by which the cell coordinates its response to stimuli. Protein synthesis involves the binding of a transcription factor to specific sequences, a process related to the initiation of gene expression. A characteristic of this binding process is that the same factor binds to different sequences located throughout the genome. Consequently, computational approaches face considerable difficulty because of the variability observed among binding sequences. This work proposes detecting transcription factor binding sites using a parametric uncertainty measure, the Rényi entropy. The detection algorithm evaluates the variation in the total Rényi entropy of a set of sequences when a candidate sequence is assumed to be a true binding site belonging to the set.
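The detection idea can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions that the text does not state: the sequences are aligned and of equal length, the alphabet is restricted to A, C, G and T, the total entropy is taken as the sum of per-position Rényi entropies, the order is alpha = 2, and logarithms are base 2. The function names `renyi_entropy`, `total_renyi_entropy` and `entropy_variation` are hypothetical.

```python
import math
from collections import Counter

ALPHABET = "ACGT"  # assumption: DNA sequences without ambiguity codes


def renyi_entropy(probs, alpha=2.0):
    """Renyi entropy of order alpha, in bits, of a discrete distribution."""
    if math.isclose(alpha, 1.0):
        # The limit alpha -> 1 is the Shannon entropy.
        return -sum(p * math.log2(p) for p in probs if p > 0)
    return math.log2(sum(p ** alpha for p in probs if p > 0)) / (1.0 - alpha)


def total_renyi_entropy(seqs, alpha=2.0):
    """Sum of per-position Renyi entropies over a set of aligned sequences.

    Assumption: summing over positions is one plausible reading of
    'total Renyi entropy'; the source does not define it.
    """
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same length")
    total = 0.0
    for pos in range(length):
        counts = Counter(s[pos] for s in seqs)
        probs = [counts[base] / len(seqs) for base in ALPHABET]
        total += renyi_entropy(probs, alpha)
    return total


def entropy_variation(known_sites, candidate, alpha=2.0):
    """Change in total entropy when the candidate is added to the set."""
    before = total_renyi_entropy(known_sites, alpha)
    after = total_renyi_entropy(known_sites + [candidate], alpha)
    return after - before


# Illustrative toy data, not taken from the source.
known_sites = ["TATAAT", "TATAAT", "TATGAT", "TACAAT"]
similar = entropy_variation(known_sites, "TATAAT")
dissimilar = entropy_variation(known_sites, "GCGCGC")
```

Under these assumptions, a candidate that resembles the known sites leaves the total entropy nearly unchanged or lowers it, whereas an unrelated candidate raises it, so a small variation would count in favor of the candidate being a binding site. A decision threshold on the variation would have to be chosen separately; none is implied here.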