Tuesday, 12 July 2016

Calculating Protein Mass

In this problem we are asked to calculate the molecular weight of a peptide. The peptide is assumed to come from the middle of a protein, so it consists entirely of amino acid residues. That means we don't have to account for the extra weight of the water molecule that is present when the two ends (termini) of the protein are included.

I wrote two programs for this problem: the first uses Biopython and the second doesn't. Here is the Biopython version:

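from Bio.SeqUtils import molecular_weight

# The input file holds a single protein string.
with open('sampledata.txt', 'r') as f:
    prot_seq = f.read().strip()

# seq_type='protein' is passed explicitly, so a plain string is enough
# and Bio.Alphabet (removed in Biopython 1.78) is not needed.
# molecular_weight() counts one extra water molecule for the free ends,
# so its monoisotopic mass is subtracted again.
print('%0.3f' % (molecular_weight(
    prot_seq, seq_type='protein',
    monoisotopic=True) - 18.01056))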

This program uses the Biopython function molecular_weight from SeqUtils. The function defaults to monoisotopic=False, but the problem asks for monoisotopic weights, so we need to set monoisotopic=True. The function also includes the weight of one extra water molecule, so we need to remove that manually (hence the - 18.01056). Come to think of it, there is an option called circular that states that the sequence has no ends. Maybe that would work as well?
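To make the water correction concrete, here is a small sketch that doesn't need Biopython. The names residue_sum, free_peptide_mass and WATER_MONOISOTOPIC are made up for this illustration, the three residue masses are copied from the table further down in this post, and 'GAS' is just an arbitrary toy peptide. It only shows the arithmetic, not how Biopython works internally.

```python
# Monoisotopic residue masses for three amino acids
# (same values as the full table later in this post).
RESIDUE_MASS = {'G': 57.02146, 'A': 71.03711, 'S': 87.03203}

# Approximate monoisotopic mass of one water molecule,
# the same constant that is subtracted in the program above.
WATER_MONOISOTOPIC = 18.01056


def residue_sum(seq):
    """Mass of a peptide taken from the middle of a protein (residues only)."""
    return sum(RESIDUE_MASS[aa] for aa in seq)


def free_peptide_mass(seq):
    """Mass of the same peptide with free ends: residues plus one water."""
    return residue_sum(seq) + WATER_MONOISOTOPIC


seq = 'GAS'  # hypothetical toy peptide
print('%0.3f' % residue_sum(seq))        # what the problem asks for
print('%0.3f' % free_peptide_mass(seq))  # includes the extra water
# Subtracting the water again recovers the residue-only mass.
print('%0.3f' % (free_peptide_mass(seq) - WATER_MONOISOTOPIC))
```

The first and last printed values are the same, which is exactly the correction the - 18.01056 applies.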

Anyhow, I thought I'd try to make this program without the use of Biopython as well, and the following is what I came up with:

weights = {'A': 71.03711,             
           'C': 103.00919,            
           'D': 115.02694,            
           'E': 129.04259,            
           'F': 147.06841,            
           'G': 57.02146,             
           'H': 137.05891,            
           'I': 113.08406,            
           'K': 128.09496,            
           'L': 113.08406,            
           'M': 131.04049,            
           'N': 114.04293,            
           'P': 97.05276,             
           'Q': 128.05858,            
           'R': 156.10111,            
           'S': 87.03203,             
           'T': 101.04768,            
           'V': 99.06841,             
           'W': 186.07931,            
           'Y': 163.06333}            

# The input file holds a single protein string.
with open('sampledata.txt', 'r') as f:
    prot_seq = f.read().strip()

# Add up the residue masses; no water is added for the ends.
weight = sum(weights[aa] for aa in prot_seq)

print('%0.3f' % weight)               
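If you want to reuse the lookup, one possible variation is to wrap it in a function that also complains about unexpected characters instead of failing with a bare KeyError. This is only a sketch: protein_mass and MONOISOTOPIC_MASS are names I made up here, and 'SKADYEK' is an arbitrary example peptide rather than the contents of sampledata.txt.

```python
# Monoisotopic residue masses in daltons (same values as the table above).
MONOISOTOPIC_MASS = {
    'A': 71.03711, 'C': 103.00919, 'D': 115.02694, 'E': 129.04259,
    'F': 147.06841, 'G': 57.02146, 'H': 137.05891, 'I': 113.08406,
    'K': 128.09496, 'L': 113.08406, 'M': 131.04049, 'N': 114.04293,
    'P': 97.05276, 'Q': 128.05858, 'R': 156.10111, 'S': 87.03203,
    'T': 101.04768, 'V': 99.06841, 'W': 186.07931, 'Y': 163.06333,
}


def protein_mass(seq):
    """Sum the monoisotopic residue masses of seq (no water added)."""
    total = 0.0
    for position, aa in enumerate(seq):
        if aa not in MONOISOTOPIC_MASS:
            # Report which character is wrong and where it sits.
            raise ValueError(
                'unknown amino acid %r at position %d' % (aa, position))
        total += MONOISOTOPIC_MASS[aa]
    return total


print('%0.3f' % protein_mass('SKADYEK'))  # prints 821.392
```

Raising a ValueError with the offending position makes it easier to spot a stray newline or a non-standard letter in the input file.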
