Tuesday, 12 July 2016

Locating Restriction Sites

In this problem we are asked to find the reverse palindromes in a given DNA sequence. A piece of DNA is said to be a reverse palindrome when it is equal to its reverse complement.
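To make the definition concrete, here is a minimal sketch in plain Python (no Biopython needed). The names `COMPLEMENT`, `reverse_complement` and `is_reverse_palindrome` are made up for this illustration, and the sketch assumes the sequence contains only the upper-case letters A, C, G and T.

```python
# Translation table that swaps each base for its complement.
# Assumes upper-case A, C, G, T only; other characters pass through unchanged.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Complement every base, then reverse the result."""
    return dna.translate(COMPLEMENT)[::-1]


def is_reverse_palindrome(dna):
    """True when the sequence equals its own reverse complement."""
    return dna == reverse_complement(dna)


print(reverse_complement("GCATGC"))      # GCATGC
print(is_reverse_palindrome("GCATGC"))   # True
print(is_reverse_palindrome("GATTACA"))  # False
```

GCATGC reads the same on both strands (its complement is CGTACG, which reversed gives GCATGC again), whereas GATTACA does not.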

This problem was quite similar to the previous problem "Finding a shared motif" so I adapted that code slightly and ended up with this:

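from Bio import SeqIO

# Read the single FASTA record and keep both strands as plain strings
record = SeqIO.read('sampledata.fasta', 'fasta')
frw_seq = str(record.seq)
rev_seq = str(record.seq.complement())  # complement only, not yet reversed

# Try every substring that starts at i and ends at j (inclusive)
for i in range(len(frw_seq)):
    for j in range(i, len(frw_seq)):
        m = frw_seq[i:j + 1]
        rev_m = rev_seq[i:j + 1]
        # Only pieces between 4 and 12 bases long are of interest
        if 4 <= len(m) <= 12:
            # Reversing the complement piece gives the reverse complement of m
            if m == rev_m[::-1]:
                print(i + 1, len(m))  # 1-based position, then length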

The program works through the forward string and its complementary string in parallel and picks out every piece that is at least 4 and at most 12 bases long. It then reverses the complementary piece, which gives the reverse complement of the forward piece, and if the two are equal it prints the 1-based position and the length of the palindrome. There are probably much more efficient ways to do this, since the loops build every possible substring before checking its length, but it works and I managed to write the program a lot quicker than I thought I would!
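One possible way to avoid building substrings that are too long is to loop over the allowed lengths directly. The following is only a sketch, not the program I submitted: the function name `find_reverse_palindromes` and its `min_len`/`max_len` parameters are invented for this example, it works on a plain string rather than a FASTA file, and it assumes the sequence contains only upper-case A, C, G and T.

```python
# Translation table that swaps each base for its complement.
# Assumes upper-case A, C, G, T only.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def find_reverse_palindromes(dna, min_len=4, max_len=12):
    """Return (1-based position, length) for every reverse palindrome found."""
    hits = []
    for start in range(len(dna) - min_len + 1):
        # Never look past the end of the sequence or beyond max_len
        longest = min(max_len, len(dna) - start)
        for length in range(min_len, longest + 1):
            piece = dna[start:start + length]
            # Complement the piece and reverse it to get its reverse complement
            if piece == piece.translate(COMPLEMENT)[::-1]:
                hits.append((start + 1, length))
    return hits


for position, length in find_reverse_palindromes("TCAATGCATGCGGGTCTATATGCAT"):
    print(position, length)
```

Because the inner loop only ever runs over the nine allowed lengths, the amount of work grows roughly in proportion to the length of the sequence instead of with its square.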
