7-46 Hot Topics on Sina Weibo (30 points) (idea + detailed explanation + set + map): PTA, one problem at a time, brothers

Posted by Oaquasis on Mon, 27 Dec 2021 12:47:30 +0100

1: Problem

Sina Weibo lets users embed a "topic" in a post: text written between a pair of "#" characters becomes a topic link. Clicking the link shows how many other people are discussing the same or a similar topic. Sina Weibo also keeps an up-to-date list of hot topics and places the most popular ones in a prominent position to draw everyone's attention.

This problem asks you to implement a simplified hot-topic recommendation feature: parse the topics out of a large number of English microblogs (English, because Chinese word segmentation is troublesome) and find the topic mentioned by the most microblogs.

Input format:
The input first gives a positive integer N (≤ 10^5), followed by N lines, each containing one English microblog of at most 140 characters. Any text enclosed by the nearest pair of # characters is a topic, and the input guarantees that # always appears in pairs.
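
As a rough illustration of the "nearest pair of #" rule, the sketch below pulls out the raw text between each consecutive pair of # characters. The function name extractRawTopics is made up for this example and is not part of the solution further down; it relies on the guarantee that # appears in pairs.

```cpp
#include <string>
#include <vector>

// Return the raw text found between each consecutive pair of '#' characters.
// Assumes, as the problem guarantees, that '#' always appears in pairs.
// (extractRawTopics is a hypothetical helper name used only in this sketch.)
std::vector<std::string> extractRawTopics(const std::string& line) {
    std::vector<std::string> topics;
    std::string::size_type open = line.find('#');
    while (open != std::string::npos) {
        std::string::size_type close = line.find('#', open + 1);
        if (close == std::string::npos) break;  // unpaired '#': ignore the rest
        topics.push_back(line.substr(open + 1, close - open - 1));
        open = line.find('#', close + 1);       // look for the next opening '#'
    }
    return topics;
}
```

For the microblog "This is a #Hot# #Hot# topic" this yields the two raw topics "Hot" and "Hot"; the text between the second and third # is not a topic because those two characters belong to different pairs.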

Output format:
The first line outputs the topic mentioned by the most microblogs, and the second line outputs the number of microblogs that mention it. If there is more than one such topic, output the alphabetically smallest one and print "And k more ..." on the third line, where k is the number of other hottest topics. The input guarantees that at least one topic exists.

Note: two topics are considered the same if, after removing every character that is not an English letter or a digit and ignoring case, they are the same string and they also split into exactly the same words. When printing, capitalize the first letter and otherwise keep only lowercase English letters and digits, with a single space separating the words of the original text.
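
The normalization rule can be sketched as a small function. This is only one reading of the rule, the same one the solution below follows: every character that is not a letter or a digit is treated as a word separator. The names normalizeTopic and capitalizeFirst are invented for this example.

```cpp
#include <cctype>
#include <string>

// Normalize the raw text of one topic: keep letters and digits (lowercased),
// treat every other character as a word separator (an assumption about the
// rule), and join the words with a single space.
std::string normalizeTopic(const std::string& raw) {
    std::string result;
    bool pendingSpace = false;  // a separator was seen after at least one word
    for (unsigned char c : raw) {
        if (std::isalnum(c)) {
            if (pendingSpace) result += ' ';
            pendingSpace = false;
            result += static_cast<char>(std::tolower(c));
        } else if (!result.empty()) {
            pendingSpace = true;  // leading separators are ignored
        }
    }
    return result;
}

// Capitalize the first character for output (hypothetical helper).
std::string capitalizeFirst(std::string topic) {
    if (!topic.empty())
        topic[0] = static_cast<char>(
            std::toupper(static_cast<unsigned char>(topic[0])));
    return topic;
}
```

With this sketch, "Test of topic." and "test of topic" both normalize to "test of topic", while "a b" and "ab" stay different because their word boundaries differ.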

Input example:

4
This is a #test of topic#.
Another #Test of topic.#
This is a #Hot# #Hot# topic
Another #hot!# #Hot# topic

Output example:

Hot
2
And 1 more ...

2: Approach

Idea: extract the text between each pair of #, normalize it, and count it in a map container with m[str] += 1. A key that has not been seen before starts at 0, so after its first mention m[str] == 1, and every further microblog that mentions the same topic adds one.
A map is ordered by key, not by value, so the largest count has to be found by scanning all entries. Several topics may share that maximum; because the scan visits the keys in ascending order, the first topic that has the maximum count is the alphabetically smallest one.
The maximum is the second line of the output, and the number of other topics with the same count is the k of the third line.
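
A minimal sketch of the selection step, assuming the counts are already in a map from normalized topic to number of microblogs (all values at least 1). The struct Hottest and the function pickHottest are names made up for this example; the solution below does the same job with separate loops.

```cpp
#include <map>
#include <string>

// Result of the selection (hypothetical type used only in this sketch).
struct Hottest {
    std::string topic;  // alphabetically smallest topic with the highest count
    int count = 0;      // number of microblogs that mention it
    int others = 0;     // how many other topics share that count
};

// Assumes every value in counts is at least 1.
Hottest pickHottest(const std::map<std::string, int>& counts) {
    Hottest best;
    for (const auto& entry : counts) {      // keys are visited in ascending order
        if (entry.second > best.count) {    // strictly greater: earlier key wins ties
            best.topic = entry.first;
            best.count = entry.second;
            best.others = 0;
        } else if (entry.second == best.count) {
            ++best.others;
        }
    }
    return best;
}
```

For the sample input the map holds "hot" with 2 and "test of topic" with 2, so the sketch reports the topic "hot", the count 2, and 1 other topic.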

I use a set container to collect the topics of each microblog.
Note that the problem does not count how many times a topic is mentioned in total; it counts how many microblogs mention it. However many times a topic appears inside one microblog, it counts only once for that microblog.
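
The "once per microblog" rule can be illustrated as follows. The sketch assumes the topics have already been extracted and normalized, and the name countMicroblogs is hypothetical. It uses a temporary set for each microblog rather than a fixed-size array of sets, which is a small variation on the solution below.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// topicsPerBlog[i] holds the already-normalized topics of microblog i,
// possibly with repeats. Returns how many microblogs mention each topic.
std::map<std::string, int> countMicroblogs(
        const std::vector<std::vector<std::string>>& topicsPerBlog) {
    std::map<std::string, int> counts;
    for (const auto& topics : topicsPerBlog) {
        // A set keeps one copy of each topic, so repeats inside one microblog vanish.
        std::set<std::string> unique(topics.begin(), topics.end());
        for (const auto& topic : unique)
            ++counts[topic];  // operator[] creates a missing key with value 0
    }
    return counts;
}
```

In the sample, the third and fourth microblogs each mention "hot" twice, yet "hot" ends up with a count of 2, not 4.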

3: Code

/**
  Idea: extract the text between each pair of #, normalize it, and count it in a map with m[str] += 1.
        A key that has not been seen before starts at 0, so its first mention gives m[str] == 1.
        The map is ordered by key, not by value, so every entry is scanned to find the largest count.
        The first topic with that count is the alphabetically smallest; the others with the same count give k.

        A set per microblog keeps each topic only once for that microblog.
*/

#include<bits/stdc++.h>
using namespace std;

set<string>s[100010]; 
set<string>:: iterator st; 

//Extract every topic written between a pair of '#' in str and store its normalized form in s[x]
void deal(string str,int x){

	bool inside = false;//true while we are between an opening '#' and its closing '#'
	string word = "";//The word currently being built
	string topic = "";//The normalized topic built so far (words joined by one space)

	for( int i = 0; i < (int)str.size(); i++ ){
		unsigned char c = str[i];

		if( c == '#' ){
			if( inside ){//Closing '#': flush the last word and store the topic
				if( !word.empty() ){
					if( !topic.empty() )
						topic += " ";
					topic += word;
				}
				if( !topic.empty() )//A topic without any letter or digit is skipped (my assumption)
					s[x].insert(topic);
			}
			inside = !inside;
			word.clear();
			topic.clear();
		}else if( inside ){
			if( isalnum(c) ){//Keep letters (lowercased) and digits
				word += (char)tolower(c);
			}else if( !word.empty() ){//Any other character ends the current word
				if( !topic.empty() )
					topic += " ";
				topic += word;
				word.clear();
			}
		}
	}

}
 

int main(){
	
	int N;
	map<string,int>m2;
	map<string,int>::iterator t;   

	cin >> N; 
	
	getchar();
	
	for( int i = 0; i < N; i++ ){
		string str;
		getline(cin,str);
		
		deal(str,i);
		
		for( st = s[i].begin(); st != s[i].end(); st++ ){
			   
			   m2[*st] += 1; 
		}
		 
		
	} 
	
//	cout << "******************"<< endl;
	
	int max = 0; 
	
	for( t = m2.begin(); t != m2.end(); t++ ){
		
		if( t->second > max ){
			max = t->second;
		}		
	}
	
//	cout << max << endl;
	
	int num = max;//Number of microblogs that mention the hottest topic
	
	for( t = m2.begin(); t != m2.end(); t++ ){
	
		if( t->second == max ){
			string str = t->first;
			str[0] = toupper(str[0]);
			 
			cout << str << endl;
			
			cout << t->second << endl;
			num = t->second;
			
			break;
		}			
	}
	
	int count = -1;//Eliminate yourself first 
	for( t = m2.begin(); t != m2.end(); t++ ){
		if(t->second == num )
		count++;
	}
	
	//cout << m2.size() << endl;
	
	if( count > 0)
	cout << "And " << count << " more ...";
	            
	
} 

//4
//This is a #test of topic#.
//

//This is a #Hot# #Hot# topic
//Another #hot!# #Hot# topic

//1
//This is a #test of topic#.

//1
//Another #hot!# #Hot# topic


//Another #Hot# topic



//4
//This is a #test of topic#.
//Another #Test of topic.# #Hot#
//This is a #Hot# #Hot# topic
//Another #hot!# #Hot# topic

//
//Another #Test1 of topic.#
//This is a #test of topic#.


//Another #hot!# #Hot# topic

4: Summary

One last note: my first submission failed because the array of sets declared above was too small. N can be as large as 10^5, so the array needs a size on the order of 10^5. Keep going!

5: Quick reference (if you are not familiar with the map and set containers, start here)

Basic usage of map
Basic usage of set
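
For readers new to these containers, here is a minimal sketch of the two behaviors this solution relies on: a set keeps one sorted copy of each element, and a map's operator[] starts a new int value at 0. The function names sortedUnique and wordFrequency are invented for this example.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// std::set keeps one copy of each element, sorted in ascending order.
std::vector<std::string> sortedUnique(const std::vector<std::string>& words) {
    std::set<std::string> seen(words.begin(), words.end());
    return std::vector<std::string>(seen.begin(), seen.end());
}

// std::map stores key/value pairs sorted by key; operator[] inserts a
// value-initialized entry (0 for int) the first time a key is used.
std::map<std::string, int> wordFrequency(const std::vector<std::string>& words) {
    std::map<std::string, int> freq;
    for (const auto& w : words)
        ++freq[w];
    return freq;
}
```

Given the words "test", "hot", "hot", sortedUnique returns "hot", "test", and wordFrequency maps "hot" to 2 and "test" to 1, with "hot" as the first key when iterating.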

Topics: data structure string set map
