This blog post records what I learned about the prefix function, the KMP algorithm and the z function, along with some practice problems I worked through.
Prefix function
The prefix function is a very important idea in string algorithms.
\(\pi[i]\) is defined as the length of the longest proper prefix of \(s[:i]\) that is also a suffix of \(s[:i]\), that is: \(s[0:\pi[i]] == s[i-\pi[i]:i]\).
In many problems, this equality can be used to reduce the problem to a smaller instance of itself, often in combination with a DP state transition.
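To make the definition concrete, here is a minimal sketch of how the prefix function can be computed. It follows the indexing used in this post (pi[i] belongs to the prefix of length i, and pi[0] = -1 is a sentinel); the function name `prefix_function` and the use of `std::vector` are my own choices for the example, not something fixed by the problems below.

```cpp
#include <string>
#include <vector>

// Sketch of the prefix function in the convention of this post:
// pi[i] is the longest border length of the prefix s[0:i], pi[0] = -1 is a sentinel.
// The name prefix_function is an assumption made for this example.
std::vector<int> prefix_function(const std::string& s) {
    const int n = static_cast<int>(s.size());
    std::vector<int> pi(n + 1, 0);
    pi[0] = -1;
    int j = -1;  // border length carried over from the previous prefix (-1 = none yet)
    for (int i = 0; i < n; ++i) {
        // Fall back to shorter borders until s[i] can extend one of them.
        while (j != -1 && s[i] != s[j]) j = pi[j];
        ++j;
        pi[i + 1] = j;
    }
    return pi;
}
```

For example, `prefix_function("abab")` gives `{-1, 0, 0, 1, 2}`: the prefix "abab" has the border "ab", so pi[4] = 2 and s[0:2] == s[2:4].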
KMP algorithm
The idea of the KMP algorithm is to jump along the prefix function during matching, which keeps the information already gained from earlier comparisons and reduces the cost of subsequent matching.
During matching, the variable \(j\) increases at most once for each \(i\), and the number of times it decreases cannot exceed the number of times it increases, so the algorithm is linear.
The search phase of the KMP algorithm is very similar to the construction of the prefix function: both traverse a string with the same loop.
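The similarity between construction and search is easiest to see side by side. The sketch below is only an illustration under my own assumptions: the name `kmp_search`, the choice to return 0-based start positions, and the decision to report no matches for an empty pattern are all made up for this example.

```cpp
#include <string>
#include <vector>

// Returns the 0-based start positions of all occurrences of pattern in text.
// kmp_search is a hypothetical helper name used only in this example.
std::vector<int> kmp_search(const std::string& text, const std::string& pattern) {
    const int n = static_cast<int>(text.size());
    const int m = static_cast<int>(pattern.size());
    std::vector<int> hits;
    if (m == 0) return hits;  // assumption: an empty pattern yields no matches

    // Build pi for the pattern (pi[0] = -1 sentinel, pi[k] for the prefix of length k).
    std::vector<int> pi(m + 1, 0);
    pi[0] = -1;
    for (int i = 0, j = -1; i < m; ++i) {
        while (j != -1 && pattern[i] != pattern[j]) j = pi[j];
        pi[i + 1] = ++j;
    }

    // Scan the text with the same loop shape; j is the matched length so far.
    for (int i = 0, j = 0; i < n; ++i) {
        while (j != -1 && text[i] != pattern[j]) j = pi[j];
        ++j;
        if (j == m) {
            hits.push_back(i - m + 1);
            j = pi[j];  // fall back to the longest border so overlapping matches are kept
        }
    }
    return hits;
}
```

With this sketch, `kmp_search("abababa", "aba")` returns `{0, 2, 4}`. In both loops j only grows by one per step of i, which is the counting argument for linearity given above.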
[P2375 Zoo]
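KMP with an extra restriction, a semi-template problem. With my weak coding skills I struggled with it for a while.
The rough idea is to follow the vanilla KMP algorithm to keep the complexity down, and to count the answers subject to the constraint (the matched prefix and suffix must not overlap, so the border length is at most half of the current prefix length).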
    #include <bits/stdc++.h>
    using namespace std;
    typedef long long ll;
    typedef long double ldb;
    #define pb push_back

    int n, N;
    const int maxn = 1e6 + 5;
    const ll m = 1e9 + 7;
    int pi[maxn];
    ll cnt[maxn]; // cnt[i]: how many nxt jumps are possible for s[0..i] (its number of borders)

    void make(char *s) {
        memset(pi, -1, sizeof(pi));
        memset(cnt, 0, sizeof(cnt));
        for (int i = 0, j = -1; s[i];) {
            while (j != -1 && s[i] != s[j]) j = pi[j];
            if (j != -1) cnt[i] = cnt[j] + 1;
            pi[++i] = ++j;
        }
    }

    ll solve(char *s) {
        make(s);
        ll ans = 1;
        for (int i = 1, j = 0; i < n; ++i) {
            while (j != -1 && s[i] != s[j]) j = pi[j];
            j++;
            // keep only borders that fit in half of the prefix s[0..i]
            while (j * 2 > i + 1) j = pi[j];
            ans = ans * (j == 0 ? 1 : cnt[j - 1] + 2) % m;
        }
        return ans;
    }

    int main() {
    #ifndef ONLINE_JUDGE
        freopen("P2375_1.in", "r", stdin);
    #endif
        scanf("%d", &N);
        char s[maxn];
        for (int i = 0; i < N; ++i) {
            scanf("%s", s);
            n = strlen(s);
            printf("%lld", solve(s));
    #ifdef ONLINE_JUDGE
            if (i != N - 1)
    #endif
                puts("");
        }
        return 0;
    }
[P3426 [POI2005]SZA-Template]
The stamp problem, a very interesting one; the idea is KMP + DP.
I have not gotten it accepted (AC) yet.
z function
The z function is also called the extended KMP algorithm, and its idea is very similar to that of KMP.
For a string p, z[i] is defined as the length of the longest common prefix of p[i:] and p[0:].
Take the sample data of Luogu P5410 as an example:
a="aaaabaa", b="aaaaa"
Matching b against a from position 0 mismatches at i=4; here ex[3]=1 and z[4]=1.
z[4]=1 means that matching should jump back to position 1 of b and continue from there, because the substring of b of length 1 starting at position 4 equals the prefix of b of length 1.
Using this property, we can jump in a way similar to the KMP algorithm and complete the solution in \(O(n)\) time.
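The values quoted for the sample can be checked directly from the definitions. The sketch below is deliberately the brute-force version, not the linear algorithm: it compares characters one by one, so it is only suitable for small inputs. The helper names `naive_lcp`, `naive_z` and `naive_ex` are my own inventions for this check.

```cpp
#include <string>
#include <vector>

// Length of the longest common prefix of x[from:] and y, by direct comparison.
int naive_lcp(const std::string& x, int from, const std::string& y) {
    int k = 0;
    while (from + k < static_cast<int>(x.size()) &&
           k < static_cast<int>(y.size()) &&
           x[from + k] == y[k]) {
        ++k;
    }
    return k;
}

// z[i] = LCP of b[i:] and b, straight from the definition (quadratic time).
std::vector<int> naive_z(const std::string& b) {
    std::vector<int> z(b.size());
    for (int i = 0; i < static_cast<int>(b.size()); ++i) z[i] = naive_lcp(b, i, b);
    return z;
}

// ex[i] = LCP of a[i:] and b, straight from the definition (quadratic time).
std::vector<int> naive_ex(const std::string& a, const std::string& b) {
    std::vector<int> ex(a.size());
    for (int i = 0; i < static_cast<int>(a.size()); ++i) ex[i] = naive_lcp(a, i, b);
    return ex;
}
```

For the sample, `naive_z("aaaaa")` is `{5, 4, 3, 2, 1}` and `naive_ex("aaaabaa", "aaaaa")` is `{4, 3, 2, 1, 0, 2, 1}`, which agrees with ex[3]=1 and z[4]=1 above. A brute-force checker like this is also handy for stress-testing a linear implementation.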
z function template problem
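It includes two applications: matching the pattern string against itself (the z array) and matching the pattern string against the target string (the extend array).
The details were already covered in my study notes on the z function, so I will not repeat them here.
Note that position 0 is handled differently in the two cases: when matching the pattern against itself, z[0] is simply set to the length of the pattern, whereas when matching against the target string, ex[0] has to be computed by actual comparison.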
    #include <bits/stdc++.h>
    using namespace std;
    typedef long long ll;
    typedef long double ldb;
    #define pb push_back

    const int maxn = 2e7 + 5;
    char a[maxn], b[maxn];
    int z[maxn];  // z[i]: longest common prefix of the pattern p[i:] and p itself
    int ex[maxn]; // ex[i]: longest common prefix of the suffix a[i:] and b

    void single(char *p) {
        int m = strlen(p);
        z[0] = m; // the pattern matches itself completely
        for (int i = 1, l = 0, r = 0; i < m; i++) {
            if (i >= r || i + z[i - l] >= r) {
                if (i >= r) r = i;
                while (r < m && p[r] == p[r - i]) r++;
                z[i] = max(0, r - i);
                l = i;
            } else
                z[i] = z[i - l];
        }
    }

    void dual(char *s, char *p) {
        int n = strlen(s), m = strlen(p);
        single(p);
        for (int i = 0, l = 0, r = 0; i < n; i++) {
            if (i >= r || i + z[i - l] >= r) {
                if (i >= r) r = i;
                while (r < n && r - i < m && s[r] == p[r - i]) r++;
                ex[i] = max(0, r - i);
                l = i;
            } else
                ex[i] = z[i - l];
        }
    }

    void solve(char *s, char *p) {
        dual(s, p);
        ll ans = 0;
        for (int i = 0; p[i]; ++i) ans = ans ^ ((ll)(i + 1) * (z[i] + 1));
        printf("%lld\n", ans);
        ans = 0;
        for (int i = 0; s[i]; ++i) ans = ans ^ ((ll)(i + 1) * (ex[i] + 1));
        printf("%lld\n", ans);
    }

    int main() {
    #ifndef ONLINE_JUDGE
        freopen("P5410_1.in", "r", stdin);
    #endif
        scanf("%s%s", a, b);
        solve(a, b);
        return 0;
    }
Password
A Codeforces Div. 1 problem. Roughly: in a given string s, find the longest possible string t that satisfies all of the following conditions at the same time:
- t is a prefix of s
- t is a suffix of s
- t also occurs in the middle of s (as an occurrence that is neither a prefix nor a suffix)
It's easy to get the answer from the idea of the z function.
After computing the z function of the string, find its maximum value, then walk the next array of KMP from the longest border of s downwards; the first border whose length does not exceed that maximum is the answer.
One thing needs care when computing the z function: since t must not be a suffix when it appears in the middle, the z function must never match through to the last character. Simply cut the match short during the computation.
Thinking about it carefully, it seems this problem can be solved with the next array of the KMP algorithm alone. How embarrassing.
Sure enough, I have to write up a solution before I find out where I was being stupid.
    #include <bits/stdc++.h>
    using namespace std;
    typedef long long ll;
    typedef long double ldb;
    #define pb push_back

    int n;
    const int maxn = 1e6 + 5;
    char s[maxn];
    int z[maxn], pi[maxn];

    void build(char *s) {
        for (int i = 0, j = pi[0] = -1; s[i]; pi[++i] = ++j)
            while (j != -1 && s[i] != s[j]) j = pi[j];
    }

    void single(char *p) {
        for (int i = 1, l = 0, r = 0; i < n; ++i) {
            if (i <= r && z[i - l] < r - i + 1)
                z[i] = z[i - l];
            else {
                z[i] = max(0, r - i + 1);
                while (i + z[i] < n - 1 && p[z[i]] == p[i + z[i]]) ++z[i];
            }
            if (i + z[i] - 1 > r) l = i, r = i + z[i] - 1;
        }
    }

    void solve(char *s) {
        int idx, m = 0;
        for (int i = 0; i < n; ++i)
            if (z[i] > m && z[i] + i < n) m = z[i], idx = i;
        for (int j = pi[n]; j != -1; j = pi[j]) {
            if (j && m >= j) {
                printf("%s\n", s + n - j);
                return;
            }
        }
        puts("Just a legend");
    }

    int main() {
    #ifndef ONLINE_JUDGE
        freopen("bin", "r", stdin);
    #endif
        scanf("%s", s);
        n = strlen(s);
        single(s);
        build(s);
        solve(s);
        return 0;
    }
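The remark above that the next array alone should be enough can be sketched as follows. This is my own reconstruction, not the accepted submission: the function name `password_by_borders` and the choice to return the answer as a `std::string` are assumptions made for the example. The idea is that the largest pi value among the proper prefixes of s plays the role of the maximum of the truncated z function.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Sketch: solve "Password" with the prefix function only.
// password_by_borders is a hypothetical name used for this example.
std::string password_by_borders(const std::string& s) {
    const int n = static_cast<int>(s.size());
    // pi[k] = longest border of the prefix of length k, pi[0] = -1 sentinel.
    std::vector<int> pi(n + 1, 0);
    pi[0] = -1;
    for (int i = 0, j = -1; i < n; ++i) {
        while (j != -1 && s[i] != s[j]) j = pi[j];
        pi[i + 1] = ++j;
    }

    // Longest border of any proper prefix of s: such a border ends before the
    // last character and starts after position 0, so it is a "middle" occurrence.
    int mx = 0;
    for (int k = 1; k < n; ++k) mx = std::max(mx, pi[k]);

    // Walk the borders of the whole string from longest to shortest and take
    // the first one that is short enough to fit inside a middle occurrence.
    for (int k = pi[n]; k > 0; k = pi[k]) {
        if (k <= mx) return s.substr(0, k);
    }
    return "Just a legend";
}
```

On the two samples from the problem statement, `password_by_borders("fixprefixsuffix")` returns `"fix"` and `password_by_borders("abcdabc")` returns `"Just a legend"`.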