Sample Dataset
90000 0.6
ATAGCCGA
Sample Output
0.689
To calculate the probability that a random string built with GC content x equals s, we can use the same equation as in "Introduction to Random Strings": each G or C contributes a factor of x/2 and each A or T contributes a factor of (1 - x)/2. To go from this probability (let's call it s_prob) to the probability that at least one of N independently generated random strings equals s, we use the following equation:
P(at least 1 match of s) = 1 - P(no matches out of N strings) = 1 - (1 - s_prob)^N
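As a check on the equation, here is the arithmetic for the sample dataset worked by hand. It assumes the counts of 4 A/T and 4 G/C nucleotides in ATAGCCGA and uses the approximation (1 - p)^N ≈ e^(-Np), which is only reasonable because s_prob is very small here:

```latex
\begin{align*}
s_{\text{prob}} &= \left(\frac{1 - 0.6}{2}\right)^{4} \left(\frac{0.6}{2}\right)^{4}
                 = 0.2^{4} \cdot 0.3^{4} = 1.296 \times 10^{-5} \\
P(\text{at least 1 match}) &= 1 - \left(1 - 1.296 \times 10^{-5}\right)^{90000}
                 \approx 1 - e^{-1.1664} \approx 0.689
\end{align*}
```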
The following program makes the above calculations and prints the answer rounded to three decimal places:
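N = 90000  # number of random strings
x = 0.6    # GC content
s = 'ATAGCCGA'

# Count the A/T and G/C nucleotides in s
AT = 0
GC = 0
for nt in s:
    if nt == 'A' or nt == 'T':
        AT += 1
    elif nt == 'G' or nt == 'C':
        GC += 1

# Probability that a single random string equals s
s_prob = (((1 - x) / 2)**AT) * ((x / 2)**GC)
# Probability that at least one of the N strings equals s
prob = 1 - (1 - s_prob)**N
print('%0.3f' % prob)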
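The same calculation can also be wrapped in a function, which makes it easier to try other inputs. The version below is only a sketch: the name prob_at_least_one_match and its parameters are made up for illustration, and it swaps the direct power for math.log1p and math.expm1, which keeps more precision when s_prob is very small (for long motifs, for example). It assumes s contains only A, C, G and T.

```python
import math

def prob_at_least_one_match(n, gc, s):
    """Probability that at least one of n random strings equals s.

    Hypothetical helper: n is the number of strings, gc the GC content,
    s the motif (assumed to contain only A, C, G and T).
    """
    # Probability that a single random string equals s
    s_prob = 1.0
    for nt in s:
        s_prob *= gc / 2 if nt in 'GC' else (1 - gc) / 2

    # Guard the edge cases where log1p(-s_prob) would be undefined or trivial
    if s_prob >= 1.0:
        return 1.0
    if s_prob <= 0.0:
        return 0.0

    # 1 - (1 - s_prob)**n, rewritten as -expm1(n * log1p(-s_prob))
    return -math.expm1(n * math.log1p(-s_prob))

print('%0.3f' % prob_at_least_one_match(90000, 0.6, 'ATAGCCGA'))
```

For the sample dataset this prints 0.689, matching the sample output.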