Sara does Bioinformatics: Longest Increasing Subsequence

This problem had me going for quite a while, and I ended up getting a lot of help from various places on the internet. But I have now managed to solve it and I will try to explain my solution here. Let's start with the description.

In this problem we are given a permutation of length n (where n is smaller than or equal to 10 000) and we are asked to find the longest increasing and decreasing subsequence of the permutation. A subsequence of a permutation is a collection of elements from the permutation in the order that they appear (note that they don't have to come directly after one another). The subsequence is said to be increasing if the elements increase (from left to right) and it is said to be decreasing if the elements decrease (again, left to right). For example, in the permutation (5, 1, 4, 2, 3) the subsequence (1, 2, 3) is increasing and the subsequence (5, 4, 2) is decreasing.

The fact that the subsequence can 'skip' numbers messed up my initial solution for this problem, but after spending some time trying to figure out how the algorithm described in this Wikipedia page works and looking through a very good explanation of it on Stack Overflow (as well as working through the algorithm with pen and paper to figure out how it operates. I know, I'm a tad bit old-school), I managed to adapt the code on Stack Overflow to also find the longest decreasing subsequence. The code below is what I ended up with. The function for finding the decreasing subsequence is talmost he same as for finding the increasing subsequence, so I have highlighted the differences. I would highly recommend having a look at the thread on Stack Overflow if you would like a good explanation of the algorithm.

data = [] #This first bit reads the file which
with open('sampledata.txt', 'r') as f: #contains the the length of the
for line in f: #permutation and then the permutation,
for nr in line.split(): #separated by spaces. The numbers are
data.append(int(nr)) #appended to a list as integers.
perm = data[1:]

def increasing(seq):
P = [None] * len(seq)
M = [None] * len(seq)

L = 1
M[0] = 0
for i in range(1, len(seq)):
lo = 0
hi = L
if seq[M[hi - 1]] < seq[i]:
j = hi
else:
while hi - lo > 1:
mid = (hi + lo) // 2
if seq[M[mid - 1]] < seq[i]:
lo = mid
else:
hi = mid

j = lo
P[i] = M[j - 1]
if j == L or seq[i] < seq[M[j]]:
M[j] = i
L = max(L, j + 1)

result = []
pos = M[L - 1]
for k in range(L):
result.append(seq[pos])
pos = P[pos]

return (result[::-1])

def decreasing(seq):
  P = [None] * len(seq)
  M = [None] * len(seq)

  L = 1
  M[0] = 0
  for i in range(1, len(seq)):
  lo = 0
  hi = L
  if seq[M[hi - 1]] > seq[i]:
  j = hi
  else:
  while hi - lo > 1:
  mid = (hi + lo) // 2
  if seq[M[mid - 1]] > seq[i]:
  lo = mid
  else:
  hi = mid

  j = lo
  P[i] = M[j - 1]
  if j == L or seq[i] > seq[M[j]]:
  M[j] = i
  L = max(L, j + 1)

  result = []
  pos = M[L - 1]
  for k in range(L):
result.append(seq[pos])
  pos = P[pos]

  return (result[::-1])

incr = increasing(perm)
decr = decreasing(perm)

print(*incr)
print(*decr)

Sara does Bioinformatics

Monday, 18 July 2016

Longest Increasing Subsequence

No comments:

Post a Comment