Thursday, 24 November 2016

Global Alignment with Scoring Matrix

In this problem we are asked to find the alignment score of two protein sequences using the scoring matrix BLOSUM62 and a linear gap penalty of 5. I decided to use Biopython's pairwise2 for this, as you can see in the code below and on my GitHub.

from Bio import pairwise2
from Bio import SeqIO
from Bio.SubsMat.MatrixInfo import blosum62

seqs = []
with open('rosalind_glob.fasta', 'r') as f:
    for record in SeqIO.parse(f, 'fasta'):
        seqs.append(record.seq)
s = seqs[0]
t = seqs[1]
alignments = pairwise2.align.globalds(s, t, blosum62, -5, -5)
print(pairwise2.format_alignment(*alignments[0]))

Note that the gap open penalty and the gap extension penalty are both set to -5, as we are told in the problem description to use a linear gap penalty of 5. 

No comments:

Post a Comment