1143. Longest Common Subsequence

Problem Statement in English

You’re given $2$ strings - text1, and text2. Your task is to return the longest common subsequence in those $2$ strings.

How does a subsequence differ form a substring? You can skip characters in between. So for example valid subsequences in the string “beagle” are “bea”, “beg”, “eagle”, “ale” etc. So long as the order of occurence in the subsequence matches the original string, you’re good.

Approach

The solution to this actually isn’t intuitive, and I feel that unless you’ve seen this before, it’s not very likely that you’ll figure it out.

You need to arrange the $2$ strings in a grid. The first forms the topmost row, and the second, the leftmost column.

Let’s work our way through an example.

Here’s how the grid first starts out. I’m directly adding $0$ as the value in the cells so that we can update the values as we go, rather than having to add them to the grid.

I just think it looks and feels nicer this way, but hey, you do you.

	a	b	c	d	e
a	$0$	$0$	$0$	$0$	$0$
c	$0$	$0$	$0$	$0$	$0$
e	$0$	$0$	$0$	$0$	$0$

In our first pass we go over the first row and see if the row character (“a” in this case) matches any column character. If it does, we increment the local longest common subsequence count by $1$.

In case it doesn’t, we use the $\text{max}$ value of the cell above, or on the left - essentially using the previously computed best length up until that point.

	a	b	c	d	e
a	$1$	$1$	$1$	$1$	$1$
c	$0$	$0$	$0$	$0$	$0$
e	$0$	$0$	$0$	$0$	$0$

When we see that the row and column char are the same we add $1$ to the diagonally left element on the top, and use that as our local LCS answer.

	a	b	c	d	e
a	$1$	$1$	$1$	$1$	$1$
c	$0$	$0$	$2$	$2$	$2$
e	$0$	$0$	$0$	$0$	$0$

And here’s the last pass.

	a	b	c	d	e
a	$1$	$1$	$1$	$1$	$1$
c	$0$	$0$	$2$	$2$	$2$
e	$0$	$0$	$2$	$2$	$3$

The final answer will always be the last cell in the last row, $3$, in this case, since “ace” is the longest common subsequence.

Solution in Python

The simple approach that just gets the job done.


class Solution:
    def longestCommonSubsequence(self, text1: str, text2: str) -> int:
        grid = []

        for i in range(len(text1)):
            # pre-populate grid row as required
            grid.append([0 for _ in range(len(text2))])

            for j in range(len(text2)):
                # create some local vars
                localLCS = 0
                cellAbove = 0
                cellOnLeft = 0

                if text1[i] != text2[j]:
                    # then the lcs so far,
                    # is the max of cell on the right and on the top
                    if i >= 1:
                        cellAbove = grid[i-1][j]
                    if j>= 1:
                        cellOnLeft = grid[i][j-1]
                    localLCS = max(cellAbove, cellOnLeft)
                else:
                    # then the lcs so far,
                    # is the cell on the top left diagonally + 1
                    if i >= 1 and j>=1:
                        localLCS = grid[i-1][j-1] + 1
                    else:
                        localLCS = 1
                # save the localLCS to this cell
                grid[i][j] = localLCS

        # return the answer
        return grid[-1][-1]

The space optimised approach. We ensure that we only store 2 rows at most. This row, and the previous one.


from collections import deque


class Solution:
    def longestCommonSubsequence(self, text1: str, text2: str) -> int:
        # we're going to guarantee that we have 2 rows atmost in the queue
        grid = deque()

        for i in range(len(text1)):
            if len(grid) == 2:
                grid.popleft()

            # pre-populate grid row as required
            grid.append([0 for _ in range(len(text2))])

            for j in range(len(text2)):
                # create some local vars
                localLCS = 0
                cellAbove = 0
                cellOnLeft = 0

                if text1[i] != text2[j]:
                    # then the lcs so far,
                    # is the max of cell on the left and on the top
                    if len(grid) > 1:
                        cellAbove = grid[-2][j]
                    if j >= 1:
                        cellOnLeft = grid[-1][j - 1]

                    localLCS = max(cellAbove, cellOnLeft)
                else:
                    # then the lcs so far,
                    # is the cell on the top left diagonally + 1
                    if len(grid) > 1 and j >= 1:
                        localLCS = grid[-2][j - 1] + 1
                    else:
                        localLCS = 1

                # save the `localLCS` to this cell
                grid[-1][j] = localLCS

        # return the answer
        return grid[-1][-1]

Complexity

Time: $O(m\times{n})$
Here, $m$ is the number of characters that forms the first column of the grid, and $n$ is the number of characters that forms the first row of the grid.

This is because we iterate over atmost $m \times n$ cells.

- Space: $O(n)$
Here, $n$ is the number of characters in the string that forms the row of the grid. (assuming we're using the space optimised version of the solution), since we only store $2n$ cells.

Mistakes I Made

I bungled up a whole bunch with the space optimised version’s implementation. Sigh.

And we are done.

Problem Statement in English#

Approach#

Solution in Python#

Complexity#

Mistakes I Made#

Problem Statement in English

Approach

Solution in Python

Complexity

Mistakes I Made