AC automaton (Virus Invasion Continues)

Posted by JC99 on Thu, 30 Dec 2021 19:19:19 +0100

This is another template problem, very similar to the previous one: the difference is that several patterns are counted and each count is stored separately. The previous blog post explained the details. The lesson from this problem is about maxn. Always add 5 or 10 when choosing an array size, especially when typing the size by hand, as in ans[1000], where it is easy to forget and the mistake costs you. I also got an MLE even though my algorithm was right, which should be because I got the numerical range wrong: I had made it 10 times too large. In my experience an OJ allows roughly 10 times the space a correct algorithm needs, which here means 32MB. With the size turned up ten times, the memory usage was very unstable and jumped between 34 and 54MB. Later, I found that I had turned it up.....
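To make the memory remark concrete, here is a small sketch of the arithmetic for a table shaped like trie[maxn][26] in the solution below. It assumes a 4-byte int, and the helper names trieBytes and toMiB are made up for this illustration. The post does not say which array was enlarged, so this only shows that a ten-times-larger trie table alone would be consistent with the MLE described above.

```cpp
#include <cstddef>

// Bytes used by an int table with `nodes` rows and 26 columns,
// i.e. the same shape as trie[maxn][26] in the solution.
constexpr std::size_t trieBytes(std::size_t nodes) {
    return nodes * 26 * sizeof(int);
}

// The same figure in MiB (1 MiB = 1024 * 1024 bytes).
constexpr double toMiB(std::size_t bytes) {
    return static_cast<double>(bytes) / (1024.0 * 1024.0);
}
```

With maxn = 5e4+10 the table takes about 5 MiB, which fits easily in 32MB. With ten times as many rows it takes about 50 MiB, which does not.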

Virus invasion continues

Problem Description

Little t is very grateful to you for helping him solve the last problem. However, the virus invasion continues. Through his unremitting efforts, Little t has found the "source of all evil" on the Internet: a huge virus website. It hosts a great many viruses, but they are peculiar: their signature codes are very short and contain only uppercase English letters. Of course Little t wants to rid the people of this menace, but he never fights an unprepared battle. Know yourself and know your enemy, and you will win every battle. So the first thing Little t needs is to learn the characteristics of the virus website: which different viruses it contains and how many times each virus appears. Can you help him again?

Input

The first line contains an integer N (1 <= N <= 1000), the number of virus signature codes.
Each of the next N lines contains one virus signature. A signature is 1 to 50 characters long and contains only uppercase English letters. No two virus signatures are exactly the same.
The next line is the source code of the "source of all evil" website. Its length is at most 2000000, and every character in it is a visible ASCII character (no carriage return).

Output

Output the number of occurrences of each virus in the following format, one per line. Viruses that do not appear are not output.
Virus signature: number of occurrences
There is one space after the colon, and the lines follow the input order of the virus signatures.

Sample Input

3
AA
BB
CC
ooxxCC%dAAAoen....END

Sample Output

AA: 2
CC: 1

Hint

All situations not mentioned in the problem statement should be considered. For example, two virus signatures may contain each other or have overlapping segments. The counting strategy can also be inferred from the sample to some extent.
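The sample makes the counting rule visible: AA is reported twice for the run AAA, so overlapping occurrences are counted. A brute-force sketch of that rule for a single pattern follows. The function name countOverlapping is made up for this illustration, and it is only a checker for small inputs, not the AC automaton solution.

```cpp
#include <cstddef>
#include <string>

// Counts occurrences of `pattern` in `text`, allowing overlaps:
// after a match at position p, the search resumes at p + 1.
int countOverlapping(const std::string& text, const std::string& pattern) {
    if (pattern.empty()) return 0;  // assumption: empty patterns are not counted
    int count = 0;
    std::size_t pos = text.find(pattern);
    while (pos != std::string::npos) {
        ++count;
        pos = text.find(pattern, pos + 1);
    }
    return count;
}
```

Calling countOverlapping("ooxxCC%dAAAoen....END", "AA") gives 2, which matches the sample output.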

Source: HDU 3065

#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <string.h>
#include<iostream>
#include<algorithm>
using namespace std;
typedef long long ll;
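const int maxn=5e4+10;//at most 1000 words * 50 letters of trie nodes, plus the root and a margin
const int maxntext=2e6+10;//maximum text length plus a margin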
int trie[maxn][26],fail[maxn],queue[maxn],front,rear,cnt,idx[maxn],ans[1005];
char text[maxntext],word[1005][55];
void buildtrie(char *w,int n)
{
    int  len,c,u=0;
    len=strlen(w);
    for(int i=0;i<len;i++)
    {
        c=w[i]-'A';
        if(trie[u][c]==0)
            trie[u][c]=++cnt;
        u=trie[u][c];
    }
    idx[u]=n;//Record which input word (1-based index) ends at this node
}
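void buildfail()
{
    int u=0;
    //Children of the root keep fail=0; they only need to enter the queue
    for(int i=0;i<26;i++)
        if(trie[u][i])
            queue[rear++]=trie[u][i];
    while(front<rear)
    {
        u=queue[front++];
        for(int i=0;i<26;i++)
        {
            if(trie[u][i])
            {
                //Real child: its fail link is the same letter taken from fail[u]
                fail[trie[u][i]]=trie[fail[u]][i];
                queue[rear++]=trie[u][i];
            }
            else
                //Missing child: borrow the transition from fail[u]
                trie[u][i]=trie[fail[u]][i];
        }
    }
}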
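void query(char *t)
{
    int len,c,u=0;
    len=strlen(t);
    for(int i=0;i<len;i++)
    {
        c=t[i]-'A';
        if(c<0||c>=26)//Not an uppercase letter (valid indices are 0..25): restart from the root
        {
            u=0;
            continue;
        }
        u=trie[u][c];
        //Walk the fail chain to count every word that ends at this position
        for(int j=u;j;j=fail[j])
        {
            if(idx[j])
                ans[idx[j]]++;//Signatures are distinct, so each end node maps to exactly one word
        }
    }
}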
int main()
{
    int n;
    while(scanf("%d",&n)!=EOF)
    {
        memset(trie,0,sizeof(trie));
        memset(fail,0,sizeof(fail));
        memset(ans,0,sizeof(ans));
        memset(idx,0,sizeof(idx));
        cnt=front=rear=0;
        getchar();
        for(int i=1;i<=n;i++)
        {
            scanf("%s",word[i]);
            buildtrie(word[i],i);
        }
        buildfail();
        scanf("%s",text);
        query(text);
        for(int i=1;i<=n;i++)
        {
            if(ans[i]!=0)
                printf("%s: %d\n",word[i],ans[i]);
        }
    }
    return 0;
}
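
The query above walks the whole fail chain at every position of the text. A common alternative, sketched here and not taken from this post, only counts how often each node is visited during the scan and then pushes those counts along the fail links once, from the deepest nodes up. The function name countAll and the std::vector layout are assumptions made for this illustration, and it assumes every signature is a non-empty string of uppercase letters.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: returns, for each pattern, how many times it
// occurs in `text` (overlapping occurrences included).
std::vector<long long> countAll(const std::vector<std::string>& patterns,
                                const std::string& text) {
    std::vector<std::vector<int>> next(1, std::vector<int>(26, 0));
    std::vector<int> endNode(patterns.size(), 0);

    // 1. Insert every pattern into the trie and remember its end node.
    for (std::size_t p = 0; p < patterns.size(); ++p) {
        int u = 0;
        for (char ch : patterns[p]) {
            int c = ch - 'A';
            if (next[u][c] == 0) {
                next[u][c] = static_cast<int>(next.size());
                next.push_back(std::vector<int>(26, 0));
            }
            u = next[u][c];
        }
        endNode[p] = u;
    }

    // 2. BFS to build the fail links; `order` keeps the BFS visiting order.
    std::vector<int> fail(next.size(), 0), order;
    for (int c = 0; c < 26; ++c)
        if (next[0][c]) order.push_back(next[0][c]);
    for (std::size_t head = 0; head < order.size(); ++head) {
        int u = order[head];
        for (int c = 0; c < 26; ++c) {
            int v = next[u][c];
            if (v) {
                fail[v] = next[fail[u]][c];
                order.push_back(v);
            } else {
                next[u][c] = next[fail[u]][c];
            }
        }
    }

    // 3. Scan the text, recording only how often each node is reached.
    std::vector<long long> visits(next.size(), 0);
    int u = 0;
    for (char ch : text) {
        if (ch < 'A' || ch > 'Z') { u = 0; continue; }  // restart at the root
        u = next[u][ch - 'A'];
        ++visits[u];
    }

    // 4. Push counts along the fail links, deepest nodes first.
    for (std::size_t i = order.size(); i-- > 0;)
        visits[fail[order[i]]] += visits[order[i]];

    std::vector<long long> result(patterns.size());
    for (std::size_t p = 0; p < patterns.size(); ++p)
        result[p] = visits[endNode[p]];
    return result;
}
```

For the sample, countAll({"AA", "BB", "CC"}, "ooxxCC%dAAAoen....END") returns the counts 2, 0 and 1 in input order. With signatures of at most 50 characters the fail chains are short, so the original loop is fast enough here. The propagation step matters more when the chains can be long.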

2021.8.4

Topics: acm