Harlan Robins, Institute for Advanced Study: “Extending the Principle of Maximum Entropy to Find Motifs in the Genome”

Seminar Abstract

“Extending the Principle of Maximum Entropy to Find Motifs in the Genome”

The degeneracy of codons allows a multitude of possible sequences to code for the same protein. Hidden within the particular choice of sequence for each organism are over a hundred previously undiscovered, biologically significant short (length 2-7) oligonucleotides. We present an information-theoretic algorithm that finds these novel signals. Applying this algorithm to the 209 sequenced bacterial genomes in the NCBI database, we determine a set of oligonucleotides for each bacterium that uniquely characterizes the organism. Additionally, applying the algorithm to the human and HIV genomes, we find evidence for a particular binding site that is involved in the HIV life cycle. The methods developed here can be readily extended to other problems in bioinformatics.
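
To make the starting point of the abstract concrete, the following is a minimal Python sketch of codon degeneracy and of raw oligonucleotide counting. It is not the speaker's algorithm, which the abstract does not spell out. It only shows how many distinct DNA sequences can encode one peptide under the standard genetic code, and how short oligonucleotides in a sequence can be tallied. The names `synonymous_sequence_count` and `oligo_counts`, and the example sequences, are hypothetical and chosen for illustration.

```python
from collections import Counter
from itertools import product
from math import prod

# Standard genetic code, listed in the conventional TCAG order.
# '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Number of codons that encode each amino acid (its degeneracy).
DEGENERACY = Counter(CODON_TABLE.values())


def synonymous_sequence_count(protein: str) -> int:
    """Count the distinct DNA sequences that encode the given peptide."""
    return prod(DEGENERACY[aa] for aa in protein)


def oligo_counts(sequence: str, k: int) -> Counter:
    """Count every overlapping oligonucleotide of length k in a sequence."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


if __name__ == "__main__":
    # Met (1 codon) x Lys (2 codons) x Leu (6 codons) = 12 encodings.
    print(synonymous_sequence_count("MKL"))
    # Dinucleotide counts in one particular encoding of that peptide.
    print(oligo_counts("ATGAAATTG", 2))
```

Because the number of synonymous encodings grows multiplicatively with protein length, the particular sequence an organism uses leaves a great deal of room for signals of the kind the abstract describes.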

 

JHU - Institute for Computational Medicine