Wednesday 30 November 2016

Matching a Spectrum to a Protein

In this problem we are given a set of n protein strings and a multiset of weights corresponding to the complete spectrum of some unknown protein string. We are asked to find the maximum multiplicity and the protein string that attains it. My solution to the problem can be found on my GitHub. For each protein string, it calculates the weight of every prefix and suffix and checks whether each one corresponds to any of the given weights of the protein spectrum. Each time one does, a counter for that specific protein string is increased by one. Finally, the highest count is printed together with the corresponding string.
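The counting idea described above can be sketched roughly as follows. This is a minimal illustration, not the code from my GitHub: the function names (`fragment_weights`, `count_matches`, `best_match`) and the matching tolerance `tol` are assumptions made for the example, and the full string is counted once as a prefix rather than also as a suffix. The mass table is the standard monoisotopic residue mass table.

```python
# Standard monoisotopic residue masses (in daltons).
MASS = {
    "A": 71.03711, "C": 103.00919, "D": 115.02694, "E": 129.04259,
    "F": 147.06841, "G": 57.02146, "H": 137.05891, "I": 113.08406,
    "K": 128.09496, "L": 113.08406, "M": 131.04049, "N": 114.04293,
    "P": 97.05276, "Q": 128.05858, "R": 156.10111, "S": 87.03203,
    "T": 101.04768, "V": 99.06841, "W": 186.07931, "Y": 163.06333,
}


def fragment_weights(protein):
    """Return the weights of every prefix and every proper suffix."""
    masses = [MASS[aa] for aa in protein]
    weights = []
    total = 0.0
    for m in masses:  # prefixes of length 1..n
        total += m
        weights.append(total)
    total = 0.0
    for m in reversed(masses[1:]):  # suffixes of length 1..n-1
        total += m
        weights.append(total)
    return weights


def count_matches(protein, spectrum, tol=1e-3):
    """Count fragments whose weight is within tol of some spectrum weight.

    The tolerance is an assumed value, chosen only for illustration.
    """
    return sum(
        1
        for w in fragment_weights(protein)
        if any(abs(w - r) <= tol for r in spectrum)
    )


def best_match(proteins, spectrum, tol=1e-3):
    """Return (count, protein) for the protein with the most matches."""
    best = max(proteins, key=lambda p: count_matches(p, spectrum, tol))
    return count_matches(best, spectrum, tol), best
```

For example, `best_match(["AA", "GS"], [57.02146, 87.03203])` picks `"GS"`, because both its prefix G and its suffix S appear among the given weights.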

Sample Dataset
4
GSDMQS
VWICN
IASWMQS
PVSMGAD
445.17838
115.02694
186.07931
314.13789
317.1198
215.09061

Sample Output
3
IASWMQS

I didn't actually manage to reproduce the sample output above, which is the example given in the problem description. However, when I ran my program on a larger dataset and uploaded my answer to Rosalind, it was accepted as correct, so I must have done something right!
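One possible explanation for the mismatch on the sample, offered as a sketch rather than a diagnosis of my own code: as far as I can tell, the multiplicity in the problem is taken over the differences between the given weights and the weights in a protein's complete spectrum (the Minkowski difference), so matches are allowed to share a common shift instead of being exact. The names `complete_spectrum` and `max_multiplicity`, and the rounding to 5 decimal places, are assumptions made for this example.

```python
from collections import Counter

# Standard monoisotopic residue masses (in daltons).
MASS = {
    "A": 71.03711, "C": 103.00919, "D": 115.02694, "E": 129.04259,
    "F": 147.06841, "G": 57.02146, "H": 137.05891, "I": 113.08406,
    "K": 128.09496, "L": 113.08406, "M": 131.04049, "N": 114.04293,
    "P": 97.05276, "Q": 128.05858, "R": 156.10111, "S": 87.03203,
    "T": 101.04768, "V": 99.06841, "W": 186.07931, "Y": 163.06333,
}


def complete_spectrum(protein):
    """Weights of every prefix and every proper suffix of the protein."""
    masses = [MASS[aa] for aa in protein]
    weights = []
    total = 0.0
    for m in masses:  # prefixes of length 1..n
        total += m
        weights.append(total)
    total = 0.0
    for m in reversed(masses[1:]):  # suffixes of length 1..n-1
        total += m
        weights.append(total)
    return weights


def max_multiplicity(protein, spectrum, digits=5):
    """Largest number of (given weight, fragment weight) pairs that share
    the same difference, after rounding to `digits` decimal places."""
    diffs = Counter(
        round(r - s, digits)
        for r in spectrum
        for s in complete_spectrum(protein)
    )
    return max(diffs.values())


proteins = ["GSDMQS", "VWICN", "IASWMQS", "PVSMGAD"]
spectrum = [445.17838, 115.02694, 186.07931, 314.13789, 317.1198, 215.09061]

for p in proteins:
    print(p, max_multiplicity(p, spectrum))
```

On the sample dataset this gives a multiplicity of 3 for IASWMQS, which agrees with the sample output: the suffixes S, QS and MQS each differ from one of the given weights by 99.04728. GSDMQS ends in the same three suffixes, so in this sketch it ties at 3.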
