Learn dynamic programming for edit distance from scratch (with C++ source code)

Posted by social_experiment on Thu, 10 Mar 2022 21:51:02 +0100

Contents

Question 1: judging a subsequence

Question 2: distinct subsequences

Question 3: delete operations on two strings

Question 4: edit distance

Today we continue with dynamic programming, this time with a series of problems that build up to edit distance. Let's summarize them, starting with the list of problems above.

Question 1: judging a subsequence

Given the strings s and t, determine whether s is a subsequence of t.

A subsequence of a string is a new string formed by deleting some (or none) of the characters from the original string without changing the relative order of the remaining characters. (For example, "ace" is a subsequence of "abcde" and "aec" is not.)

Example 1: input: s = "abc", t = "ahbgdc" output: true

Example 2: input: s = "axc", t = "ahbgdc" output: false

Analysis: let dp[i][j] be the number of trailing characters of s[0..i-1] that can be matched, in order, inside t[0..j-1]. Then s is a subsequence of t exactly when dp[s.size()][t.size()] equals s.size(). There are two cases:

if(s[i-1]==t[j-1])

The character t[j-1] also appears in s, so the matched length grows by one;

if(s[i-1]!=t[j-1])

This is equivalent to deleting t[j-1] from t and continuing to match, so the recurrence is:

if(s[i-1]==t[j-1])
dp[i][j]=dp[i-1][j-1]+1;
else
dp[i][j]=dp[i][j-1];//Skip t[j-1], i.e. delete it from t

The code is therefore:

#include <iostream>
#include <string>
#include <vector>
using namespace std;
//Returns true if s is a subsequence of t.
bool isSubsequence(const string &s,const string &t)
{
    //dp[i][j]: how many trailing characters of s[0..i-1] match inside t[0..j-1]
    vector<vector<int>>dp(s.size()+1,vector<int>(t.size()+1,0));
    for(size_t i=1;i<=s.size();++i)
    {
        for(size_t j=1;j<=t.size();++j)
        {
            if(s[i-1]==t[j-1])
            {
                dp[i][j]=dp[i-1][j-1]+1;
            }
            else
            {
                dp[i][j]=dp[i][j-1];//Skip t[j-1]
            }
        }
    }
    //Every character of s was matched
    return dp[s.size()][t.size()]==static_cast<int>(s.size());
}
int main()
{
    string s="ace";
    string t="abcse";
    bool val=isSubsequence(s,t);
    cout<<"val="<<val<<endl;
    return 0;
}
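
As a cross-check for the DP above, the same question can also be answered with a single left-to-right scan. This is a different technique (two pointers), not the DP from this post, and the function name isSubsequenceTwoPointers is made up for illustration. A minimal sketch:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not part of the original post: scans t once and
// advances a cursor into s every time the current characters match.
bool isSubsequenceTwoPointers(const std::string &s, const std::string &t)
{
    std::size_t i = 0; // number of characters of s matched so far
    for (std::size_t j = 0; j < t.size() && i < s.size(); ++j)
    {
        if (s[i] == t[j])
            ++i; // s[i] found in t, move on to the next character of s
    }
    return i == s.size(); // true only if every character of s was matched
}
```

Both versions should agree on any input; the DP form is kept in this post because the later questions reuse the same table layout.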

Question 2: distinct subsequences

Given a string s and a string t, count how many subsequences of s are equal to t.

A subsequence of a string refers to a new string formed by deleting some (or no) characters without disturbing the relative position of the remaining characters. (for example, "ACE" is a subsequence of "ABCDE" and "AEC" is not)

Analysis: this question is more complicated than the previous one. Let dp[i][j] be the number of ways the first j characters of t appear as a subsequence of the first i characters of s. When s[i - 1] is equal to t[j - 1], dp[i][j] is made up of two parts.

One part uses s[i - 1] to match t[j - 1], and the count is dp[i - 1][j - 1].

The other part does not use s[i - 1] to match, and the count is dp[i - 1][j].

So when s[i - 1] is equal to t[j - 1], dp[i][j] = dp[i - 1][j - 1] + dp[i - 1][j];

When s[i - 1] is not equal to t[j - 1], dp[i][j] has only one part: s[i - 1] cannot be used to match, so the count is dp[i - 1][j].

So in that case the recurrence is: dp[i][j] = dp[i - 1][j];

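#include <iostream>
#include <string>
#include <vector>
using namespace std;
//Counts how many subsequences of s are equal to t.
int numDistinct(const string &s,const string &t)
{
    //dp[i][j]: ways the first j characters of t appear in the first i characters of s
    vector<vector<int>>dp(s.size()+1,vector<int>(t.size()+1,0));
    for(size_t i=0;i<=s.size();++i)
    dp[i][0]=1;//An empty t is matched exactly once by any prefix of s
    for(size_t j=1;j<=t.size();++j)
    dp[0][j]=0;//An empty s cannot produce a non-empty t
    for(size_t i=1;i<=s.size();++i)
    {
        for(size_t j=1;j<=t.size();++j)
        {
            if(s[i-1]==t[j-1])
            dp[i][j]=dp[i-1][j-1]+dp[i-1][j];//Use s[i-1] to match, or skip it
            else
            dp[i][j]=dp[i-1][j];//s[i-1] cannot match, so skip it
        }
    }
    return dp[s.size()][t.size()];
}
int main()
{
    string s="bagg";
    string t="bag";
    int ret=numDistinct(s,t);
    cout<<"ret="<<ret<<endl;
    return 0;
}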
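
Because row i of the table only reads row i - 1, the same recurrence can be kept in a single row. The sketch below is one possible variant, not the author's code: the name numDistinctRolling is hypothetical, and it assumes the counts fit in an unsigned 64-bit integer.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical one-row variant of the recurrence above.
// dp[j] = number of ways the first j characters of t appear as a
// subsequence of the part of s processed so far.
unsigned long long numDistinctRolling(const std::string &s, const std::string &t)
{
    std::vector<unsigned long long> dp(t.size() + 1, 0);
    dp[0] = 1; // an empty t is matched exactly once
    for (std::size_t i = 1; i <= s.size(); ++i)
    {
        // Walk j from right to left so that dp[j - 1] still holds the
        // value from the previous row (i - 1) when it is read.
        for (std::size_t j = t.size(); j >= 1; --j)
        {
            if (s[i - 1] == t[j - 1])
                dp[j] += dp[j - 1]; // use s[i-1] to match, plus the "skip" count already in dp[j]
        }
    }
    return dp[t.size()];
}
```

When the characters differ, dp[j] is simply left unchanged, which is the dp[i][j] = dp[i - 1][j] case.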

Question 3: delete operations on two strings

Given two words word1 and word2, find the minimum number of steps required to make word1 and word2 the same. In each step, one character may be deleted from either string.

Example:

Input: "sea", "eat"
Output: 2 explanation: the first step is to change "sea" to "ea", and the second step is to change "eat" to "ea"

Analysis: this question looks more complicated than the previous two. Why? Because the first two questions only deleted characters from one of the strings, while here characters may be deleted from both strings.

However, we are looking for the minimum number of steps, and the characters worth keeping are exactly the longest common subsequence of the two strings. So: length of word1 + length of word2 - 2 * (length of the longest common subsequence) = the minimum number of steps. For "sea" and "eat" the longest common subsequence is "ea", which gives 3 + 3 - 2 * 2 = 2.

The code below uses s and t in place of word1 and word2, which makes it easier to write.

#include <algorithm>
#include <iostream>
#include <string>
#include <vector>
using namespace std;
int minDistance(const string &s,const string &t)
{
    //dp[i][j]: length of the longest common subsequence of s[0..i-1] and t[0..j-1]
    vector<vector<int>>dp(s.size()+1,vector<int>(t.size()+1,0));
    for(size_t i=1;i<=s.size();++i)
    {
        for(size_t j=1;j<=t.size();++j)
        {
            if(s[i-1]==t[j-1])
            dp[i][j]=dp[i-1][j-1]+1;
            else
            dp[i][j]=max(dp[i-1][j],dp[i][j-1]);
        }
    }
    //Total length of both strings minus twice the longest common subsequence
    //is the minimum number of deletions.
    return static_cast<int>(s.size()+t.size())-dp[s.size()][t.size()]*2;
}
int main()
{
    string s="sea";
    string t="eat";
    int val=minDistance(s,t);
    cout<<"val="<<val<<endl;
    return 0;
}
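
The same answer can also be computed without going through the longest common subsequence, by letting the table count deletions directly. This is an alternative formulation offered as a sketch, not the approach used above; the name minDeletionsDirect is made up for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical alternative: dp[i][j] = fewest deletions that make the first
// i characters of s equal to the first j characters of t.
int minDeletionsDirect(const std::string &s, const std::string &t)
{
    std::vector<std::vector<int>> dp(s.size() + 1, std::vector<int>(t.size() + 1, 0));
    for (std::size_t i = 0; i <= s.size(); ++i)
        dp[i][0] = static_cast<int>(i); // t is empty: delete all i characters of s
    for (std::size_t j = 0; j <= t.size(); ++j)
        dp[0][j] = static_cast<int>(j); // s is empty: delete all j characters of t
    for (std::size_t i = 1; i <= s.size(); ++i)
    {
        for (std::size_t j = 1; j <= t.size(); ++j)
        {
            if (s[i - 1] == t[j - 1])
                dp[i][j] = dp[i - 1][j - 1]; // keep both characters
            else
                dp[i][j] = std::min(dp[i - 1][j], dp[i][j - 1]) + 1; // delete from s or from t
        }
    }
    return dp[s.size()][t.size()];
}
```

This layout (first row and column initialised to 0, 1, 2, ...) is the same one that Question 4 uses, with replacement added as a third option.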

Question 4: edit distance

Given two words word1 and word2, calculate the minimum number of operations needed to convert word1 into word2.

You can perform the following three operations on a word:

  • Insert a character
  • Delete a character
  • Replace one character

Example 1: input: word1 = "horse", word2 = "ros" output: 3 explanation: horse -> rorse (replace 'h' with 'r') rorse -> rose (delete 'r') rose -> ros (delete 'e').

Example 2: input: word1 = "intention", word2 = "execution" output: 5 explanation: intention -> inention (delete 't') inention -> enention (replace 'i' with 'e') enention -> exention (replace 'n' with 'x') exention -> exection (replace 'n' with 'c') exection -> execution (insert 'u').

Analysis:

  • if (word1[i - 1] == word2[j - 1])
    • No operation
  • if (word1[i - 1] != word2[j - 1])
    • Insert
    • Delete
    • Replace

That is, there are four cases in total.

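Let dp[i][j] be the minimum number of operations needed to convert the first i characters of word1 into the first j characters of word2. if (word1[i - 1] == word2[j - 1]), no edit is needed, so dp[i][j] takes the value of dp[i - 1][j - 1], that is, dp[i][j] = dp[i - 1][j - 1];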

if (word1[i - 1] != word2[j - 1])

Operation 1: delete the element word1[i - 1]. What remains is the edit distance between word1 ending at index i-2 and word2 ending at index j-1, plus one delete operation.

Namely dp[i][j] = dp[i - 1][j] + 1;

Operation 2: insert an element at the end of word1 so that it matches word2[j - 1] (this has the same cost as deleting word2[j - 1] from word2). What remains is the edit distance between word1 ending at index i-1 and word2 ending at index j-2, plus one insert operation.

Namely dp[i][j] = dp[i][j - 1] + 1;

Operation 3: replace an element. Replace word1[i - 1] so that it is the same as word2[j - 1]; nothing needs to be inserted or deleted. What remains is the edit distance between word1 ending at index i-2 and word2 ending at index j-2, plus one replace operation.

Namely dp[i][j] = dp[i - 1][j - 1] + 1;

To sum up, when word1[i - 1] != word2[j - 1], take the smallest of the three, that is, dp[i][j] = min({dp[i - 1][j - 1], dp[i - 1][j], dp[i][j - 1]}) + 1;

So the code is as follows:

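#include <algorithm>
#include <iostream>
#include <string>
#include <vector>
using namespace std;
int minDistance(const string &word1,const string &word2)
{
    //dp[i][j]: edit distance between word1[0..i-1] and word2[0..j-1]
    vector<vector<int>>dp(word1.size()+1,vector<int>(word2.size()+1,0));
    for(size_t i=0;i<=word1.size();++i)
    dp[i][0]=static_cast<int>(i);//Delete all i characters of word1
    for(size_t j=0;j<=word2.size();++j)
    dp[0][j]=static_cast<int>(j);//Insert all j characters of word2
    for(size_t i=1;i<=word1.size();++i)
    {
        for(size_t j=1;j<=word2.size();++j)
        {
            if(word1[i-1]==word2[j-1])
            {
                dp[i][j]=dp[i-1][j-1];//No operation needed
            }
            else
            {
                //Delete, insert or replace, whichever is cheapest
                dp[i][j]=min({dp[i-1][j],dp[i][j-1],dp[i-1][j-1]})+1;
            }
        }
    }
    return dp[word1.size()][word2.size()];
}
int main()
{
    string s="horse";
    string t="ros";
    int val=minDistance(s,t);
    cout<<"val="<<val<<endl;
    return 0;
}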
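
Since each row of the table depends only on the row above it and on the cells to its left, the full table is not strictly needed. The following is a sketch of a two-row variant of the same recurrence; the name editDistanceTwoRows is hypothetical and this is not the author's code.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical two-row variant: prev is row i - 1 of the table, cur is row i.
int editDistanceTwoRows(const std::string &word1, const std::string &word2)
{
    const std::size_t m = word1.size();
    const std::size_t n = word2.size();
    std::vector<int> prev(n + 1), cur(n + 1);
    for (std::size_t j = 0; j <= n; ++j)
        prev[j] = static_cast<int>(j); // empty word1 prefix: insert j characters
    for (std::size_t i = 1; i <= m; ++i)
    {
        cur[0] = static_cast<int>(i); // empty word2 prefix: delete i characters
        for (std::size_t j = 1; j <= n; ++j)
        {
            if (word1[i - 1] == word2[j - 1])
                cur[j] = prev[j - 1]; // characters match, no operation
            else
                cur[j] = std::min({prev[j - 1], prev[j], cur[j - 1]}) + 1; // replace, delete, insert
        }
        std::swap(prev, cur); // the row just filled becomes the previous row
    }
    return prev[n]; // after the final swap, prev holds the last row
}
```

Calling it with the two examples from the problem statement ("horse" to "ros" and "intention" to "execution") is a quick way to compare it against the full-table version.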

Once you understand these, write the code yourself and try it on your own machine.

Topics: C++ Algorithm Dynamic Programming