The second assignment of software engineering practice -- personal practice

Posted by thom2002 on Thu, 03 Mar 2022 12:29:52 +0100

Which course does this assignment belong to: Software Engineering Practice, Spring 2022, Class F
What are the requirements for this assignment: the second assignment of Software Engineering Practice, personal practice
The goal of this assignment: crawl the event data of the Winter Olympic Games (for teaching only) and implement a console program that outputs the national ranking and the medal counts.
Other references: CSDN, Blog Park

1, Gitcode project address

Project address: Project address of xuxx

2, PSP table

PSP | Personal Software Process Stages | Estimated time (minutes) | Actual time (minutes)
Planning | Plan | |
• Estimate | • Estimate how long this task will take | 20 | 20
Development | Development | |
• Analysis | • Needs analysis (including learning new technologies) | 30 | 30
• Design Spec | • Generate design documents | 30 | 30
• Design Review | • Design review | 20 | 20
• Coding Standard | • Code specifications (develop appropriate specifications for current development) | 25 | 45
• Design | • Specific design | 30 | 30
• Coding | • Specific coding | 180 | 240
• Code Review | • Code review | 60 | 60
• Test | • Test (self-test, modify code, submit modification) | 200 | 300
Reporting | Report | |
• Test Report | • Test report | 10 | 10
• Size Measurement | • Calculate workload | 20 | 20
• Postmortem & Process Improvement Plan | • Summarize afterwards and put forward a process improvement plan | 45 | 60
Total | | 670 | 835

3, Description of problem solving ideas

1. Problem 1: data acquisition (for teaching only)

The data required for this project has two parts: the medal data (total.json) and the schedule data (xxxx.json), i.e. the daily competition data. The schedule data for 0202 to 0215 was provided by the teacher. The remaining files were obtained by searching by keyword for the corresponding JSON responses in the Network panel of the Chrome developer tools.

2. Problem 2: JSON parsing

The JSON data is parsed with rapidjson, a third-party C++ library. (At first I used jsoncpp, but the version I used was probably too old: it leaked memory and froze the computer.) The data is parsed with the functions that the rapidjson library provides.

3. Problem 3: file reading and output

Files are read and written with C++ stream operations, using ifstream and ofstream.
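The post declares a ReadFile() helper in Lib.h but does not show its body. As a minimal sketch of the stream operations described above, the version below is an assumption about how such a helper could look, not the author's code; WriteFile() is a hypothetical companion added only so the example can be tried end to end.

```cpp
#include <fstream>
#include <sstream>
#include <string>

// Hypothetical stand-in for the ReadFile() declared in Lib.h.
// Returns the whole file as one string, or an empty string if it cannot be opened.
std::string ReadFile(const std::string& filename)
{
    std::ifstream in(filename, std::ios::in | std::ios::binary);
    if (!in.is_open())
        return "";
    std::ostringstream buffer;
    buffer << in.rdbuf();   // copy the entire stream in one operation
    return buffer.str();
}

// Hypothetical companion: writes text to a file, replacing previous content.
// Returns false if the file cannot be opened or the write fails.
bool WriteFile(const std::string& filename, const std::string& text)
{
    std::ofstream out(filename, std::ios::out | std::ios::trunc | std::ios::binary);
    if (!out.is_open())
        return false;
    out << text;
    return out.good();
}
```

Returning an empty string on failure keeps the signature simple, but the caller then cannot tell a missing file from an empty one; a real program may prefer to report the error separately.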

4. Problem 4: function implementation

There are two main functions: one outputs the total medal data and the other outputs the schedule data for a specified date. Each is wrapped in its own member function.

5. Reference materials consulted

Sources: CSDN, Blog Park, Baidu

3, Interface design and implementation process

The code is divided into two parts: the tool code (Lib.h and Lib.cpp) and the main program (OlympicSearch.cpp). After requirements analysis, I decided to use the simple factory design pattern for this assignment, because the program decides which output to produce according to the instruction it receives, which matches the idea behind a simple factory. As a result there is only one class in Lib: the two functional modules are wrapped as two member functions, and the caller only needs to call them on an object of the class.
Lib.h is as follows:

#include <string>
using std::string;

//Main program class
class OlympicProgram {
private:
	string input;//Name of the input (instruction) file
	string output;//Name of the output file
	string totalout;//Saves the output for the instruction total
	string scheduleOut[19];//Saves the output for the instructions schedule 0202 to schedule 0220
public:
	OlympicProgram(string input,string output);
	void resolveInput();//Decides the output according to each input instruction
	string outputTotal();//Gets the total medal information
	string outputSchedule(string date);//Gets the schedule information for the specified date
	string outputWrong(string wrong);//Builds the output for an error message
};
string solveOrder(string order);//Processes an instruction
bool isTrueDate(string date);//Checks whether the date is valid
string ReadFile(string filename);//Reads a file into a string
int trimSpace(string& order);//Removes whitespace from an instruction

Program flow chart:

Explanation of the program flow:
1. The main function of the main program OlympicSearch passes the command-line parameters directly to the constructor of an OlympicProgram object and then calls its resolveInput() method.
2. When the object is created, the constructor assigns values to four members: input and output come directly from the parameters passed in, totalout is built from the data file by calling outputTotal(), and scheduleOut is an array that is filled by calling outputSchedule() in a for loop.
3. The resolveInput() method opens the input file as a stream, reads the instructions line by line with getline in a for loop, calls solveOrder() to process each instruction, and decides what to output according to its return value.
4. The solveOrder() function processes an instruction: it calls trimSpace() and isTrueDate() and returns a value that tells the caller how to respond.
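The post does not show isTrueDate(). The sketch below is one possible version, written under the assumption that a valid date is exactly four digits in the range 0202 to 0220, which is the range the constructor loads into scheduleOut; the author's actual rules may differ.

```cpp
#include <cctype>
#include <string>

// Hypothetical version of isTrueDate(): accepts exactly four digits "MMDD"
// whose numeric value lies between 202 and 220 inclusive.
bool isTrueDate(const std::string& date)
{
    if (date.size() != 4)
        return false;
    for (unsigned char c : date)
        if (!std::isdigit(c))
            return false;          // reject letters, signs, spaces
    int value = std::stoi(date);   // safe: four decimal digits always fit in int
    return value >= 202 && value <= 220;
}
```

Checking the length and the digits before converting keeps std::stoi from throwing on empty or non-numeric input.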

4, Key code display

1.
//Input: none (the input file name is taken from this->input)
//Return value: none
//Function: calls different functions according to each instruction in the input file
void OlympicProgram::resolveInput()
{
	ifstream myfile(this->input,ios::in);
	string temp;
	if (!myfile.is_open())
	{
		cout<< "The file was not opened successfully" << endl;
		return;
	}
	ofstream outfile(this->output, ios::out | ios::trunc);//Open output.txt once and empty any previous content
	string outBuf;//Output buffer
	for (int i = 0; getline(myfile, temp); i++)
	{
		string order = solveOrder(temp);
		if (order == "space")//Blank line: skip it without counting it
		{
			i--;
			continue;
		}
		if (i != 0)
		{
			outBuf += "\n";
		}
		if (order == "total")
		{
			outBuf += this->totalout;
		}
		else if (order == "N/A" || order == "ERROR")
		{
			outBuf += outputWrong(order);
		}
		else
		{
			int datenum = atoi(order.c_str());//A valid date such as 0202 becomes 202
			outBuf += this->scheduleOut[datenum - 202];
		}
		if (outBuf.size() > (1024 * 1024 * 5))//Write to the file once the buffer exceeds 5 MB
		{
			outfile << outBuf;
			outBuf = "";
		}
	}
	if (outBuf != "")
		outfile << outBuf;
	myfile.close();
	outfile.close();
}

Explanation and notes: the input file input.txt is opened first, the instructions are read line by line with getline, and solveOrder is called to process each one; its return value decides what is output. Output goes through outBuf: the text to be written is appended to outBuf, and whenever outBuf exceeds 5 MB it is written to the output file and cleared.
2.

//Input: none
//Return value: string
//Function: builds the output for the instruction total
string OlympicProgram::outputTotal()
{
	Document d;		//Document tree 
	if (d.Parse(ReadFile("./data/total.json").c_str()).HasParseError())
	{
		cout << "Parsing error\n" ;
		return "";		//Without a valid document the lookups below would fail
	}
	const rapidjson::Value& data = d["data"];		//data member 
	const rapidjson::Value& medalsList = data["medalsList"];		//medalsList member 
	string os;
	string backtotal;
	for (unsigned int i = 0; i < medalsList.Size(); i++)
	{
		//Iterators to the fields of the current entry 
		rapidjson::Value::ConstMemberIterator rank = medalsList[i].FindMember("rank");
		rapidjson::Value::ConstMemberIterator countryid = medalsList[i].FindMember("countryid");
		rapidjson::Value::ConstMemberIterator gold = medalsList[i].FindMember("gold");
		rapidjson::Value::ConstMemberIterator silver = medalsList[i].FindMember("silver");
		rapidjson::Value::ConstMemberIterator bronze = medalsList[i].FindMember("bronze");
		rapidjson::Value::ConstMemberIterator count = medalsList[i].FindMember("count");
		
		os = string("rank") + rank->value.GetString()+ ':' + countryid->value.GetString() + '\n'
			+ "gold:" + gold->value.GetString() + '\n'
			+ "silver:" + silver->value.GetString() + '\n'
			+ "bronze:" + bronze->value.GetString() + '\n'
			+ "total:" + count->value.GetString() + '\n'
			+ "-----";
		if (i != medalsList.Size() - 1)
			os += '\n';
		backtotal += os;
	}
	return backtotal;
}

Explanation and notes: the total.json file is read directly and parsed with the rapidjson methods; the for loop appends each entry to the backtotal variable, which is returned at the end.
3.

//Input: competition date
//Return value: string
//Function: builds the output for an instruction that names a specific competition date
string OlympicProgram::outputSchedule(string date)
{
	Document d;		//Document tree 
	if (d.Parse(ReadFile("./data/" + date + ".json").c_str()).HasParseError())
	{
		cout << "Parsing error\n" ;
		return "";		//Without a valid document the lookups below would fail
	}
	const rapidjson::Value& data = d["data"];		//data member 
	const rapidjson::Value& matchList = data["matchList"];		//matchList member 
	string os;
	string backSchedule;
	for (unsigned int i = 0; i < matchList.Size(); i++)
	{
		//Iterators to the fields of the current match
		rapidjson::Value::ConstMemberIterator startdatecn = matchList[i].FindMember("startdatecn");
		rapidjson::Value::ConstMemberIterator itemcodename = matchList[i].FindMember("itemcodename");
		rapidjson::Value::ConstMemberIterator title = matchList[i].FindMember("title");
		rapidjson::Value::ConstMemberIterator venuename = matchList[i].FindMember("venuename");
		rapidjson::Value::ConstMemberIterator t1 = matchList[i].FindMember("homename");
		rapidjson::Value::ConstMemberIterator t2 = matchList[i].FindMember("awayname");
		string homename = t1->value.GetString();
		string awayname = t2->value.GetString();
		string time = startdatecn->value.GetString();
		os = string("time:") + time.substr(11,5 )+ '\n'		//Characters 11 to 15 hold the hour and minute
			+ "sport:" + itemcodename->value.GetString() + '\n';
		if (homename.empty() && awayname.empty())
			os += string("name:") + title->value.GetString() + '\n';
		else
			os += string("name:") + title->value.GetString() + " " + homename + "vs" + awayname + '\n';
		os += string("venue:") + venuename->value.GetString() + '\n' + "-----";
		if (i != matchList.Size() - 1)
			os += '\n';
		backSchedule += os;
	}
	return backSchedule;
}

Explanation and notes: the JSON data file that corresponds to the given date is read directly and parsed with the rapidjson methods; the for loop appends each match to the variable backSchedule, which is returned at the end.
4.

//Input: error message
//Return value: string
//Function: builds the output for an invalid instruction
string OlympicProgram::outputWrong(string wrong)
{
	string os = wrong + '\n' + "-----";
	return os;
}

Explanation and notes: the error message is returned directly, followed by the separator line.

5.
//Input: the instruction that was read
//Return value: the code that tells the caller how to respond to the instruction
//Function: processes an input instruction
string solveOrder(string order)
{
	string t = order;
	int size= trimSpace(t);//t now has no whitespace; size is the number of words in the original line
	if (t == "total" && size==1)
	{
		return "total";
	}
	else if (t.substr(0, 8) == "schedule" && size==2)
	{
		string date = t.substr(8);//Everything after "schedule"
		if (!isTrueDate(date))
		{
			return "N/A";
		}
		return date;
	}
	else if (t.empty())
	{
		return "space";
	}
	else
	{
		return "ERROR";
	}
}

Explanation and notes: the trimSpace function is called on the instruction first (it removes all whitespace and counts how many words the original line contained), and the return value is then chosen according to the kind of instruction.
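The body of trimSpace() is not shown in the post. A minimal sketch that matches the description above (strip all whitespace in place, return the number of whitespace-separated words) could look like the following; it is an assumption for illustration, not the author's implementation.

```cpp
#include <sstream>
#include <string>

// Hypothetical version of trimSpace(): removes all whitespace (spaces, tabs)
// from `order` in place and returns how many whitespace-separated words it had.
int trimSpace(std::string& order)
{
    std::istringstream words(order);
    std::string word, joined;
    int count = 0;
    while (words >> word) {   // operator>> skips any run of whitespace
        joined += word;
        ++count;
    }
    order = joined;
    return count;
}
```

With this version, "  schedule \t 0205 " becomes "schedule0205" with a count of 2, while "to tal" becomes "total" with a count of 2, so solveOrder() can reject it because the count is not 1.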
6.

//Input: input file name and output file name
//Return value: none
//Function: initializes the object
OlympicProgram::OlympicProgram(string input, string output)
{
	this->input = input;
	this->output = output;

	this->totalout=outputTotal();
	for (int i = 202; i <= 220; i++)
	{
		string date = "0" + to_string(i);//0202, 0203, ..., 0220
		this->scheduleOut[i - 202] = outputSchedule(date);
	}
}

Explanation and notes: the constructor calls outputSchedule and outputTotal directly, so the output for every instruction is stored in the program up front. When testing, totalout or scheduleOut can also be printed directly for verification.

5, Performance improvement

The test uses 20000 correct instructions (total and schedule 0202 to schedule 0220, repeated 1000 times). The input file input.txt is 286 KB and the output file output.txt is 74233 KB.
Improvement 1: replace jsoncpp with rapidjson (reduces JSON parsing time and memory usage)
Before improvement: the computer froze; memory usage kept growing and was never released
After improvement: the run takes 13 minutes 14 seconds, memory usage stays within about 2.4 MB, and IO output accounts for the largest share of CPU time

Improvement 2: the first time a json file is read, the resulting output is kept in the program; when the same instruction appears again, the stored output is written directly (this reduces file reads and optimizes IO)
After improvement: the run takes 1 minute 08 seconds, and memory usage stays within about 1.0 MB
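Improvement 2 is a form of caching. The sketch below illustrates the idea with a hypothetical OutputCache class (not part of the post's code): the loader, which stands in for the file read plus JSON parsing, runs only the first time a key is requested.

```cpp
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical helper: remembers the text produced for each instruction key
// so that the expensive loader runs at most once per key.
class OutputCache {
public:
    explicit OutputCache(std::function<std::string(const std::string&)> loader)
        : loader_(std::move(loader)) {}

    const std::string& get(const std::string& key)
    {
        auto it = cache_.find(key);
        if (it == cache_.end())
            it = cache_.emplace(key, loader_(key)).first;   // first request: load and remember
        return it->second;
    }

private:
    std::function<std::string(const std::string&)> loader_;
    std::unordered_map<std::string, std::string> cache_;
};
```

In the post's program the keys would be "total" and the dates 0202 to 0220, so the cache never holds more than 20 entries.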

Improvement 3: building on improvement 2, the constructor now stores the output for every possible instruction in the program at startup. Later output no longer has to check whether the result for an instruction is already stored, so some if statements can be dropped. With only a few instructions this may be slightly slower, but the difference is negligible; with a large number of instructions it gives a modest speedup.
After improvement: the run takes 42 s, and memory usage stays within about 1.2 MB

Improvement 4: the CPU and GPU performance analysis report for improvement 3 showed that file operations account for a large share of the time, so the unnecessary file operations in the code were removed. Originally the output file stream was opened and closed again for every output. Now it is opened only once and closed after all instructions have been output, which reduces the number of file operations.
After improvement: the run takes 18 s, and memory usage stays within about 1.2 MB

Improvement 5: the CPU and GPU performance analysis report for improvement 4 reminded me of the idea of trading space for time from the operating systems course, so I added a buffer variable that accumulates the pending output. The buffer is written to the file only when it exceeds a certain size (neither too large nor too small; I write it out every 5 MB here).
After improvement: the run takes 2 s, and memory usage stays within about 7.9 MB
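The buffering in improvement 5 can also be isolated in a small class. The sketch below uses a hypothetical BufferedWriter (the post keeps the buffer inline in resolveInput()) to show the same space-for-time trade on any std::ostream: text accumulates in memory and reaches the stream only when the buffer exceeds a limit or when the writer is destroyed.

```cpp
#include <cstddef>
#include <ostream>
#include <sstream>   // std::ostringstream is handy for trying the class without touching disk
#include <string>

// Hypothetical helper: collects output in memory and writes it to the
// underlying stream only when the buffer grows past `limit` bytes,
// when flush() is called, or when the writer goes out of scope.
class BufferedWriter {
public:
    BufferedWriter(std::ostream& out, std::size_t limit) : out_(out), limit_(limit) {}
    ~BufferedWriter() { flush(); }

    void write(const std::string& text)
    {
        buffer_ += text;
        if (buffer_.size() > limit_)
            flush();
    }

    void flush()
    {
        if (!buffer_.empty()) {
            out_ << buffer_;
            buffer_.clear();
            ++flushes_;
        }
    }

    int flushes() const { return flushes_; }   // number of real writes so far

private:
    std::ostream& out_;
    std::size_t limit_;
    std::string buffer_;
    int flushes_ = 0;
};
```

With a limit of 5 * 1024 * 1024 and an std::ofstream as the target, this reproduces the 5 MB threshold used in the post; the memory figure reported above is consistent with holding a buffer of that size.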

6, Unit test

Program (exe file) coverage:

Unit test:
1. Comparison of the program's output file (output.txt) with the expected output file
2. Test of the date validation function (isTrueDate())
3. Test of the instruction processing function (solveOrder())
Example:

Evaluation of the tests: my tests are probably not complete and may not cover every special case, but the most likely errors are covered and pass. The output file is tested by code that directly compares whether the two files are identical.
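The comparison code itself is not included in the post. A minimal sketch of such a check, with the hypothetical name filesEqual, might compare the two files byte by byte:

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper: returns true when both files can be opened
// and contain exactly the same bytes.
bool filesEqual(const std::string& pathA, const std::string& pathB)
{
    std::ifstream a(pathA, std::ios::binary);
    std::ifstream b(pathB, std::ios::binary);
    if (!a.is_open() || !b.is_open())
        return false;
    std::istreambuf_iterator<char> itA(a), itB(b), end;
    while (itA != end && itB != end) {
        if (*itA != *itB)
            return false;
        ++itA;
        ++itB;
    }
    return itA == end && itB == end;   // both files must run out together
}
```

Opening both files in binary mode means a difference in line endings or a missing final newline counts as a mismatch, which is usually what an exact-output test wants.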

7, Exception handling

1. File reading failed

2. JSON parsing error

3. Command line parameter error

4. Fault tolerance
An instruction may contain any number of spaces or tabs
Blank lines between instructions are skipped
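For the command-line parameter check in item 3, the post does not show the code. The sketch below assumes the program is started with exactly two parameters, the input file name and the output file name, as the constructor signature suggests; Args and parseArgs are hypothetical names.

```cpp
#include <string>

// Hypothetical result type: ok is false when the parameters are unusable.
struct Args {
    bool ok;
    std::string input;
    std::string output;
};

// Assumes the invocation form: OlympicSearch <input file> <output file>
Args parseArgs(int argc, const char* const argv[])
{
    if (argc != 3)                      // program name plus two parameters
        return {false, "", ""};
    return {true, argv[1], argv[2]};
}
```

A main function could call parseArgs(argc, argv) first, print a usage message and exit when ok is false, and only then construct the OlympicProgram object.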

8, Experience

1. I reviewed C++ syntax (I had not written C++ for a long time, which is why I chose C++ for this assignment).
2. Git is great! I had not used Git before. Sometimes I wanted to get deleted code back and found that I could not. With Git I can return to an earlier version of a file at any time, which is really convenient.
3. Performance optimization and unit testing took the most time in this whole practice. Performance optimization taught me a lot, such as how to optimize IO input and output. During the optimization, drawing on what I learned about operating systems, I gained a deeper understanding of trading space for time (although I only applied it to the output). This was my first contact with unit testing. My earlier tests while writing code were simple input and output checks, which were not as strict as the ones here.
4. I learned how to use jsoncpp and rapidjson, and I also learned not to pick an old version of a third-party library just because it is convenient. The old version of jsoncpp is very easy to use, but it leaks memory (the computer almost crashed), so I replaced it with rapidjson.
5. I became more proficient with VS2019. Previously I wrote both C++ and C in Dev-C++, but it has fewer features; VS2019 is the better tool.