Friday, 24 June 2016

Finding a Motif in DNA

This time we are asked to find all positions of a given motif in a given DNA sequence. Having just read about Seq objects in Biopython, I had noticed a method called find, which returns the position of a motif in a sequence. However, a single call to find only returns the position of the first occurrence. Instead I decided to go with Biopython's motifs module (have a look at chapter 14.6). Here is my code:

from Bio import motifs
from Bio.Seq import Seq
data = [line.strip('\n') for line in open('sampledata.txt')]  # line 1: sequence, line 2: motif
instances = [Seq(data[1])]                    # the motif we are looking for
m = motifs.create(instances)
sequence = Seq(data[0])                       # the DNA sequence to search in
positions = ''
for pos, seq in m.instances.search(sequence):  # yields (0-based position, match)
    positions += str(pos+1) + ' '
print(positions)

The search returns the positions of all occurrences of the motif, but Biopython counts positions from 0, whereas Rosalind counts them from 1. To get the correct result I had to add 1 to every position, as you can see in line 9 of the code. In the same step I also appended a blank space to each position to get the output format Rosalind expects.
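
As an aside, find can still be made to work if it is called repeatedly, each time resuming one character after the previous hit. Below is a minimal sketch of that idea using plain Python strings, so it runs without Biopython; as far as I can tell, Seq.find accepts the same start argument as str.find, but I have not checked that here. The helper name find_all and the example sequence are my own choices for illustration and are not part of Biopython.

```python
def find_all(sequence, motif):
    """Return the 1-based start positions of every occurrence of motif,
    including overlapping ones."""
    positions = []
    start = sequence.find(motif)                 # 0-based index, or -1 if not found
    while start != -1:
        positions.append(start + 1)              # convert to 1-based counting
        start = sequence.find(motif, start + 1)  # resume one character later
    return positions


hits = find_all("GATATATGCATATACTT", "ATAT")
print(' '.join(str(p) for p in hits))
```

Resuming at start + 1 rather than start + len(motif) is what lets overlapping occurrences be found, and joining the numbers at the end avoids the trailing blank space.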
