Sara does Bioinformatics: Independent Alleles

We're back in probability again. This time we are looking at independent events, in our case two alleles that are inherited independently (Mendel's 2nd law). In the problem we are to start with an organism with the genotype Aa Bb for traits A and B. He mates with an organism of genotype Aa Bb and gets two offspring. These in turn both mate with organisms of genotype Aa Bb and get two offspring each. This continues for k generations, always mating with organisms of genotype Aa Bb and getting two offspring. We are given the task to calculate the probability that at least N organisms in generation k (excluding the mates) have the genotype Aa Bb.

The key to this problem is realizing that no matter what genotype the organism that mates with someone of genotype Aa Bb has, the probability that the offspring is Aa Bb is always 0.25 (if you want to you can draw the 9 different 4x4 Punnett squares to convince yourself). When you have realized this you just need to figure out how to calculate the probability that N or more of the population in generation k have the correct genotype, which leads us to binomial distribution.

For the final population P, we want to know the probability that N to P organisms have the correct allele. This means that we want the sum of all the separate probabilities [N, N+1, N+2, ..., P]. We can use the formula for the binomial distribution to calculate all the probabilities separately, and then sum them up to get the overall probability that we are after. For a population of P = k², where k is the number of generations, and when N is the least number of the population with genotype Aa Bb we are looking for, we get the following formula for the overall probability (derived from the general formula of the binomial distribution):

Turning this formula into Python code we get:

import math

k = 5

N = 8

P = 2**k

probability = 0

for i in range(N, P + 1):

prob = (math.factorial(P) /

(math.factorial(i) * math.factorial(P - i))) * (0.25**i) * (0.75**(

P - i))

probability += prob

print(probability)

k and N are the variables given by Rosalind.

Sara does Bioinformatics

Tuesday, 5 July 2016

Independent Alleles

1 comment: