Longest Increasing Subsequence

Problem

Given a sequence of integers (say, an array), output the length of an increasing subsequence (not necessarily contiguous) of maximum length.

For example, if the input sequence was $\{8,3,6,4,5,7,7\}$ , the answer would be $4$ since the LIS would be $\{3,4,5,7\}$ .

Attempt 1 : Graph Modelling

Think of the elements in the sequence as nodes. Draw a directed edge between the nodes $(u,v)$ only if $v > u$ and $v$ occurs after $u$ in the sequence. This forms a DAG. Notice that in this case the topological ordering is the order in which the elements occur in the sequence. The problem is now essentially to find the longest path in the DAG. We can run our DAG SSSP from every node with the edges reversed.

Our DAG SSSP takes $O(V+E)$ time to run once. In the worst case, $E = O(V^2)$ (think of an increasing sequence) and so, each SSSP takes $O(n^2)$ where $n$ is the length of our input sequence. We need to run this $n$ times, and so our algorithm takes $O(n^3)$ .

Observe that we are recomputing a lot here! In particular, we relax each edge at most $n$ times. Can we do better?

In fact, we can do much better. Let us try to think in terms of DP and break our problem into smaller subproblems.

The trivial case is when the sequence is of length $1$ . Then, we know that the LIS is of length 1.

For each outgoing edge (in reverse topological order of nodes), find the maximum LIS of the destination node and add 1. That gives you the LIS starting at that node.

Formally, define $S[i] = LIS(A[i \dots n])$ starting at $A[i]$ where $A[1, \dots n]$ is the input array. In terms of our table view, at each index $i$ of our table, we store the length of the longest increasing subsequence starting at index $i$ of the array. Our problem is to find the maximum entry in the table.

Our DP recurrence relation is:

Base Case: $S[n] = 1$ (length of LIS in an array of length 1 is 1)
Recurrence: $S[i] = (max_{(i,j) \in E} S[j]) + 1$ .

Some pseudocode is as follows:

The running time of the above algorithm is essentially $O(V+E) = O(n^2)$ (Be sure to be able to prove this)

Attempt 2 : Dynamic Programming

Let’s stop thinking about this as a graph now.

Define $S[i] = LIS(A[1\dots i])$ ending at $A[i]$

So, each index $i$ in the dp table stores the length of the LIS using the first $i$ elements.

Then,

$S[1] = 1$ (length of LIS ending at first index)
$S[i] = (max_{(j < i, \ A[j] < A[i])} S[j]) + 1$ (Look at everything to your left which is smaller than you. Find the maximum length of LIS ending at all those indices. Add 1 to get your maximum length (since you are essentially extending the sequence). If there is nothing to the left of you that is smaller than you, you cannot extend any of those LISs and so, you need to start a new LIS and your value will be 1.

Note that “ending at index $i$ ” necessarily means that $i$ is a part of the LIS. It does not include those LIS which come before it but do not contain this element itself (as then you wouldn’t know whether to extend this LIS or not since you don’t know the maximum element of the LIS) - To be able to extend an LIS, all you need to know is the current maximum element in the LIS. Here, index $i$ stores the maximum element of the LIS (since it ends at index $i$ ). This is also why to determine the LIS of the entire array, you need to look at every element in the end again. We cannot be sure that the LIS contains the last element and so, we cannot simply look at the last index.

Some pseudocode:

Again, this algorithm is $O(n^2)$ , which should be fairly obvious by the 2 nested for-loops.

Another way of looking at it would be to realise that there are $n$ subproblems (where a subproblem is defined as the length of the LIS ending at a particular index). A subproblem at index $i$ takes $O(i)$ (since you need to look at all previously solved subproblems, and take the maximum). Then, $\sum_{i = 1}^n O(i) = O(n^2)$ .

The key invariant that we are maintaing is that at every iteration of the outer-loop, $S[i]$ has the correct value of the maximum length of LIS ending at index $i$ .

Faster, Faster, Faster

It is possible to solve LIS in $O(n\log n)$ using binary search to solve subproblems faster. The key observation that allows us to use binary search is that as we extend the sequence, the length of the LIS can only increase or stay the same - hence, there is a monotonic behaviour of LIS.

PreviousAPSP Floyd-Warshall Next0-1 Knapsack

Last updated 10 months ago

Was this helpful?