# Longest common subsequence (dynamic programming)

Posted by provision on Mon, 03 Jan 2022 21:23:07 +0100

### 1. Longest common subsequence (discontinuous)

A subsequence is what remains after deleting zero or more characters from a given sequence while keeping the remaining characters in their original order. The deleted characters do not have to be adjacent, and deleting nothing is also allowed.

For example, for the following two sequences
a: abcbdb
b: acbbabdbb
Their longest common subsequence: acbdb, length 5.
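
To make the definition concrete, here is a minimal sketch that checks whether one string is a subsequence of another with a single left-to-right scan. The class name `SubsequenceCheck` and the method `isSubsequence` are hypothetical names chosen for this illustration; they are not part of the solution below.

```java
class SubsequenceCheck {
    // Returns true if every character of sub appears in s in the same order
    // (not necessarily adjacent). An empty sub is a subsequence of anything.
    static boolean isSubsequence(String sub, String s) {
        int k = 0; // index of the next character of sub still to be matched
        for (int i = 0; i < s.length() && k < sub.length(); i++) {
            if (s.charAt(i) == sub.charAt(k)) {
                k++;
            }
        }
        return k == sub.length();
    }

    public static void main(String[] args) {
        // "acbdb" should be a subsequence of both example sequences.
        System.out.println(isSubsequence("acbdb", "abcbdb"));
        System.out.println(isSubsequence("acbdb", "acbbabdbb"));
    }
}
```

This only verifies that a candidate is common to both sequences; it does not show that no longer common subsequence exists, which is what the dynamic programming solution is for.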

It is solved by dynamic programming. Define $f[i, j]$ as the length of the longest common subsequence of $(a_0, a_1, \dots, a_{i-1})$ and $(b_0, b_1, \dots, b_{j-1})$.

Therefore, according to the definition, the search process is divided into the following three cases:

1. When $i = 0$ or $j = 0$, $f[i, j] = 0$. This is the boundary condition.
2. When $a_{i-1} = b_{j-1}$, the problem reduces to one subproblem: the longest common subsequence of $(a_0, a_1, \dots, a_{i-2})$ and $(b_0, b_1, \dots, b_{j-2})$. The state transition equation is $f[i, j] = f[i-1, j-1] + 1$.
3. When $a_{i-1} \ne b_{j-1}$, the problem splits into two subproblems: the longest common subsequence of $(a_0, a_1, \dots, a_{i-1})$ and $(b_0, b_1, \dots, b_{j-2})$, and that of $(a_0, a_1, \dots, a_{i-2})$ and $(b_0, b_1, \dots, b_{j-1})$. Take the larger of the two. The state transition equation is $f[i, j] = \max(f[i, j-1], f[i-1, j])$.
```java
public class LCS {
    char[] a;       // Sequence a
    char[] b;       // Sequence b
    int[][] dp;     // dp[i][j]: LCS length of the first i chars of a and the first j chars of b

    public LCS(String str1, String str2) {
        a = str1.toCharArray();
        b = str2.toCharArray();
        // Row 0 and column 0 stay 0, which is the boundary condition
        dp = new int[a.length + 1][b.length + 1];
    }

    // Fill the dp table and return the maximum length
    public int getLenth() {
        for (int i = 1; i <= a.length; i++) {
            for (int j = 1; j <= b.length; j++) {
                if (a[i - 1] == b[j - 1]) {
                    dp[i][j] = dp[i - 1][j - 1] + 1;
                } else {
                    dp[i][j] = Math.max(dp[i][j - 1], dp[i - 1][j]);
                }
            }
        }
        return dp[a.length][b.length];
    }

    // Recover one longest common subsequence by walking the dp table backwards,
    // reversing the steps that produced the length. Call getLenth() first so
    // that the table is filled.
    public StringBuilder getSubSequence() {
        int i = a.length;
        int j = b.length;
        int len = dp[i][j];

        StringBuilder subs = new StringBuilder();
        while (len > 0) {
            if (dp[i][j] == dp[i - 1][j]) {
                i--;
            } else if (dp[i][j] == dp[i][j - 1]) {
                j--;
            } else {
                // Neither condition holds, so dp[i][j] = dp[i-1][j-1] + 1,
                // which corresponds to a[i-1] == b[j-1]
                subs.append(a[i - 1]);
                i--;
                j--;
                len--;
            }
        }
        // Characters were collected from back to front
        return subs.reverse();
    }

    public static void main(String[] args) {
        String str1 = "abcbdb";
        String str2 = "acbbabdbb";
        LCS ls = new LCS(str1, str2);
        System.out.println("Length:" + ls.getLenth());
        System.out.println("Longest common subsequence:" + ls.getSubSequence());
    }
}
```

[Figure: the dp array during the solving process]
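
The table is easiest to follow when seen in full, so here is a minimal sketch that rebuilds it for the two example sequences and prints it row by row. It uses the same recurrence as the solution above; the class name `LcsTable` and the method `build` are hypothetical names introduced only for this illustration.

```java
class LcsTable {
    // Fills and returns the (m+1) x (n+1) table, where dp[i][j] is the LCS
    // length of the first i characters of s and the first j characters of t.
    static int[][] build(String s, String t) {
        int[][] dp = new int[s.length() + 1][t.length() + 1];
        for (int i = 1; i <= s.length(); i++) {
            for (int j = 1; j <= t.length(); j++) {
                if (s.charAt(i - 1) == t.charAt(j - 1)) {
                    dp[i][j] = dp[i - 1][j - 1] + 1;
                } else {
                    dp[i][j] = Math.max(dp[i][j - 1], dp[i - 1][j]);
                }
            }
        }
        return dp;
    }

    public static void main(String[] args) {
        int[][] dp = build("abcbdb", "acbbabdbb");
        // One line per row i; columns correspond to j = 0..n
        for (int[] row : dp) {
            StringBuilder line = new StringBuilder();
            for (int cell : row) {
                line.append(cell).append(' ');
            }
            System.out.println(line.toString().trim());
        }
    }
}
```

The bottom-right cell of the printed table is the answer, and the backward walk in `getSubSequence` starts from that cell.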

Because the solution uses two nested for loops and allocates a two-dimensional array, both the time complexity and the space complexity for two sequences of lengths $m$ and $n$ are $O(mn)$.
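
As a side note on the space bound: each row of the table depends only on the row above it, so when only the length is needed (not the subsequence itself), two rows are enough. The following is a minimal sketch of that variation, not the approach used in this post; the class name `LcsLengthTwoRows` and the method `lcsLength` are hypothetical.

```java
class LcsLengthTwoRows {
    // Computes only the LCS length, keeping two rows instead of the full table.
    // Time is still O(mn); extra space drops to O(n), where n = t.length().
    static int lcsLength(String s, String t) {
        int[] prev = new int[t.length() + 1]; // row i - 1
        int[] curr = new int[t.length() + 1]; // row i
        for (int i = 1; i <= s.length(); i++) {
            for (int j = 1; j <= t.length(); j++) {
                if (s.charAt(i - 1) == t.charAt(j - 1)) {
                    curr[j] = prev[j - 1] + 1;
                } else {
                    curr[j] = Math.max(curr[j - 1], prev[j]);
                }
            }
            // The row just computed becomes "previous" for the next iteration.
            // Index 0 is never written, so it stays 0 in both arrays.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[t.length()];
    }

    public static void main(String[] args) {
        System.out.println(lcsLength("abcbdb", "acbbabdbb"));
    }
}
```

The trade-off is that the backward walk used to recover the subsequence is no longer possible, because the earlier rows have been overwritten.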

### 2. Longest common subsequence (continuous)

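Here the common subsequence must be continuous, which is usually called the longest common substring. Take the same two sequences:
a: abcbdb
b: acbbabdbb
Their longest common continuous subsequence: bdb, length 3.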

This time, define $f[i, j]$ as the length of the longest common continuous subsequence of $(a_0, a_1, \dots, a_{i-1})$ and $(b_0, b_1, \dots, b_{j-1})$ whose last character is both $a_{i-1}$ and $b_{j-1}$. In other words, the continuous subsequence must be a suffix of both of these prefixes at the same time. The final answer is the maximum of $f[i, j]$ over all $i$ and $j$.

This can be divided into the following three cases:

1. Boundary case: $f[i, j] = 0$ when $i = 0$ or $j = 0$.
2. $a_{i-1} = b_{j-1}$: continue looking backwards from the previous pair of characters, so $f[i, j] = f[i-1, j-1] + 1$.
3. $a_{i-1} \ne b_{j-1}$: by definition, the continuous subsequence must end with both $a_{i-1}$ and $b_{j-1}$. No such subsequence exists, so $f[i, j] = 0$.
```java
public class LCS {
    char[] a;       // Sequence a
    char[] b;       // Sequence b
    int[][] dp;     // dp[i][j]: length of the common run ending at a[i-1] and b[j-1]
    int len;        // Maximum length found so far
    int index;      // Starting index in a of the longest common continuous subsequence

    public LCS(String str1, String str2) {
        a = str1.toCharArray();
        b = str2.toCharArray();
        dp = new int[a.length + 1][b.length + 1];
    }

    // Fill the dp table and return the maximum length
    public int getLenth() {
        for (int i = 1; i <= a.length; i++) {
            for (int j = 1; j <= b.length; j++) {
                if (a[i - 1] == b[j - 1]) {
                    dp[i][j] = dp[i - 1][j - 1] + 1;
                }
                // When the characters differ, dp[i][j] keeps its default value 0

                if (dp[i][j] > len) {
                    len = dp[i][j];
                    index = i - len;
                }
            }
        }
        return len;
    }

    // Copy the longest run out of a. Call getLenth() first so that
    // len and index are set.
    public StringBuilder getSubSequence() {
        StringBuilder subs = new StringBuilder();
        for (int i = 0; i < len; i++) {
            subs.append(a[index + i]);
        }
        return subs;
    }

    public static void main(String[] args) {
        String str1 = "abcbdb";
        String str2 = "acbbabdbb";
        LCS ls = new LCS(str1, str2);
        System.out.println("Length:" + ls.getLenth());
        System.out.println("Longest continuous common subsequence:" + ls.getSubSequence());
    }
}
```

[Figure: the dp array during the solving process]
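
For the continuous case the table looks quite different: most cells are 0, and the non-zero cells form diagonal runs. A minimal sketch that rebuilds and prints the table for the two example sequences is given below; the class name `CommonSubstringTable` and the methods `build` and `max` are hypothetical names used only here.

```java
class CommonSubstringTable {
    // dp[i][j] is the length of the longest common run that ends exactly at
    // s[i-1] and t[j-1]; it is 0 when those two characters differ.
    static int[][] build(String s, String t) {
        int[][] dp = new int[s.length() + 1][t.length() + 1];
        for (int i = 1; i <= s.length(); i++) {
            for (int j = 1; j <= t.length(); j++) {
                if (s.charAt(i - 1) == t.charAt(j - 1)) {
                    dp[i][j] = dp[i - 1][j - 1] + 1;
                }
            }
        }
        return dp;
    }

    // The answer is the largest cell anywhere in the table, not the last cell.
    static int max(int[][] dp) {
        int best = 0;
        for (int[] row : dp) {
            for (int cell : row) {
                best = Math.max(best, cell);
            }
        }
        return best;
    }

    public static void main(String[] args) {
        int[][] dp = build("abcbdb", "acbbabdbb");
        for (int[] row : dp) {
            StringBuilder line = new StringBuilder();
            for (int cell : row) {
                line.append(cell).append(' ');
            }
            System.out.println(line.toString().trim());
        }
        System.out.println("Maximum cell: " + max(dp));
    }
}
```

Unlike the discontinuous case, the answer has to be tracked as a running maximum, which is what the `len` and `index` fields do in the solution above.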

The space and time complexity are the same as before: $O(mn)$.

Topics: OJ