Longest Common Subsequence

This is a very common DP problem.

Given two strings string1\small \tt string1 and string2\small \tt string2, return the length of their longest common subsequence. A common subsequence is made by deleting characters until the strings are the same.


string1 = "abcde"string2 = "acze"=> returns 3\notag \begin{align*} & \texttt{string1 = "abcde"}\\ & \texttt{string2 = "acze"} \\[8bp] & \texttt{=> returns 3} \end{align*}

Explanation: The longest common subsequence is "ace", with a length of 3. You get it by deleting "b" and "d" from string1\small \tt string1, and deleting "z" from string2\small \tt string2.

First Few Test Cases:

In order to solve this problem, you need to understand how to find the longest common subsequence of two strings.

To find a longest common subsequence, you only need to consider the starting letter of each string. If both starting letters are the same, you should keep that letter and move on. If both starting letters are different, you need to delete one of the starting letters. For example,

1. Recursion

To write the recursion for the length of the LCS, you use the same exact idea.

2. Base case

The recursion will keep breaking the problem down into lenLCS on smaller and smaller strings. The recursion should stop when it reaches the empty string, since it can't keep breaking the string down. The longest common subsequence in this case is just the empty string, with has a length of 0.

3. Code

To code this up, we should use Tabulation to avoid making redundant function calls.

To do this, you should realize that lenLCS(i,j) depends on the three other function calls lenLCS(i+1,j), lenLCS(i,j+1), lenLCS(i+1,j+1).

Here's a picture of this - the blue arrows in the picture indicate everything you need to compute before you compute lenLCS(i, j). The base cases all have a value of 0.

From this diagram, you can see that you can compute all the results by starting at the bottom right square, and moving leftwards row-by-row, or upwards column-by-column, computing values as you go. When you do this, you follow the blue arrows in the correct order. Eventually, you will get to the top left square lenLCS[0][0], which is the solution to the problem.

Time Complexity O(mn)O(\tt m \cdot n), where m\tt m is the length of string1\tt string1 and n\tt n is the length of string2\tt string2.

Space Complexity O(mn)O(\tt m \cdot n), since LCS[i][j]\tt LCS[i][j] is an m×n\tt m \times n matrix.

You could optimize this solution using "Tabulation+". This is because you only have to remeber the previous row, or the previous column, but not both of them, which means you can get the memory complexity down to O(min(m,n))O(\min(m,n)) memory. This will probably never show up in an interview though.

Mark as Completed:
Test your code to get an output here!
Test your code to get an output here!