1 Purpose and content of the experiment
1.1 Experimental purpose
(1) Through hands-on programming practice, deepen the understanding of the principle of syntax-directed translation, and master the semantic translation method that turns the syntactic categories recognized by syntax analysis into an intermediate code.
(2) Master the commonly used semantic analysis method: syntax-directed translation.
(3) Given the PL/0 grammar specification, add semantic processing to the syntax analyzer so that it outputs the intermediate code of every syntactically correct expression and, for syntactically correct arithmetic expressions, also outputs the calculated value.
1.2 Experimental content
The PL/0 grammar is given. Semantic processing is added to the expression parser of Experiment 2 or Experiment 3 so that it outputs the intermediate code of the expression, represented as a sequence of quadruples.
1.3 Experimental requirements
(1) Semantic analysis operates on the syntactic categories that syntax analysis has found to be correct. The focus of this experiment is the semantic subroutines.
(2) Add semantic processing for PL/0 "expressions" to the parser of Experiment 2 or Experiment 3, output the intermediate code of the expression, and calculate the semantic value of the expression.
(3) The intermediate code is represented as a sequence of quadruples.
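As a minimal sketch of the quadruple form required here, the following C++ fragment stores one quadruple as four strings and renders it in the (op,arg1,arg2,result) layout that appears in the test data. The names Quad, emit, and format_quad are illustrative assumptions and are not taken from the source program of this report.

```cpp
#include <string>
#include <vector>

// One intermediate-code instruction: (op, arg1, arg2, result).
// All names in this sketch are hypothetical.
struct Quad {
    std::string op;
    std::string arg1;
    std::string arg2;
    std::string result;
};

// Append a quadruple to the code sequence and return the name of the
// temporary that holds its result (t1, t2, ...).
std::string emit(std::vector<Quad>& code, const std::string& op,
                 const std::string& arg1, const std::string& arg2) {
    std::string result = "t" + std::to_string(code.size() + 1);
    code.push_back({op, arg1, arg2, result});
    return result;
}

// Render a quadruple as "(op,arg1,arg2,result)".
std::string format_quad(const Quad& q) {
    return "(" + q.op + "," + q.arg1 + "," + q.arg2 + "," + q.result + ")";
}
```

Under these assumptions, emitting b+c and then a times the returned temporary produces two quadruples whose results are named t1 and t2, in the order in which the operations must be executed.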
2 Design idea
2.1 Semantic rules
Attribute evaluation is the process of semantic processing. Each production of the grammar is associated with a set of attribute evaluation rules, which are called semantic rules.
(1) A terminal has only synthesized attributes, whose values are provided by the lexical analyzer.
(2) A nonterminal can have both synthesized attributes and inherited attributes. All inherited attributes of the grammar's start symbol must be given initial values before attribute evaluation begins.
(3) Each production must provide an evaluation rule for every inherited attribute of the symbols on its right side and for every synthesized attribute of the symbol on its left side.
(4) The inherited attributes of the symbol on the left side of a production and the synthesized attributes of the symbols on its right side are not computed by that production; they are computed by the semantic rules of other productions.
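To make the idea of a synthesized attribute concrete, here is a small C++ sketch that evaluates an attribute val bottom-up on a parse tree, assuming productions of the form E -> E1 op T with the semantic rule E.val = E1.val op T.val, and number leaves whose value lexval comes from the lexical analyzer. The Node structure and the helpers leaf, binary, and val are hypothetical and only illustrate the rules above.

```cpp
#include <memory>
#include <utility>

// Parse-tree node (hypothetical layout for this sketch).
struct Node {
    char op;                      // '+', '-', '*', '/', or 'n' for a number leaf
    int lexval;                   // synthesized attribute of a terminal (from the lexer)
    std::unique_ptr<Node> left;
    std::unique_ptr<Node> right;
};

// Build a number leaf.
std::unique_ptr<Node> leaf(int lexval) {
    return std::unique_ptr<Node>(new Node{'n', lexval, nullptr, nullptr});
}

// Build an interior node for a binary operator.
std::unique_ptr<Node> binary(char op, std::unique_ptr<Node> l, std::unique_ptr<Node> r) {
    return std::unique_ptr<Node>(new Node{op, 0, std::move(l), std::move(r)});
}

// Synthesized attribute val: computed only from the attributes of the children.
int val(const Node& n) {
    if (n.op == 'n') return n.lexval;      // rule: F.val = number.lexval
    int a = val(*n.left);
    int b = val(*n.right);
    switch (n.op) {
        case '+': return a + b;            // rule: E.val = E1.val + T.val
        case '-': return a - b;
        case '*': return a * b;
        default:  return a / b;            // '/'; this sketch assumes b != 0
    }
}
```

For the tree of 2+3*5, the attribute of the multiplication node is computed first and then used by the addition node, so the root receives the value 17.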
2.2 Recursive descent translator
Recursive descent analysis uses mutually recursive function calls to simulate the top-down construction of the syntax tree: starting from the root node, it builds the tree from top to bottom while looking for a leftmost derivation that matches the input string. Each nonterminal is implemented as a function whose formal parameters are the inherited attributes of the nonterminal and whose return value is its synthesized attribute. A terminal has no inherited attributes; its synthesized attribute is the value supplied by the lexical analyzer. During the analysis, the function of a nonterminal decides which candidate production to use according to the current input symbol.
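The following C++ sketch illustrates the convention described here on a deliberately tiny grammar that is not the PL/0 grammar: E -> T R, R -> + T R | - T R | epsilon, T -> digit. The function for R takes its inherited attribute (the value of everything to its left) as a parameter and returns its synthesized attribute. The Parser structure and the evaluate helper are assumptions made for this illustration.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical recursive descent translator over single-digit operands.
struct Parser {
    std::string input;
    std::size_t pos = 0;

    // Current input symbol, or '\0' at the end of the input.
    char sym() const { return pos < input.size() ? input[pos] : '\0'; }

    // T has only a synthesized attribute, returned as the function result.
    int T() {
        if (!std::isdigit(static_cast<unsigned char>(sym())))
            throw std::runtime_error("digit expected");
        return input[pos++] - '0';
    }

    // R receives its inherited attribute as a parameter and chooses the
    // production from the current input symbol.
    int R(int inherited) {
        if (sym() == '+') { ++pos; int t = T(); return R(inherited + t); }
        if (sym() == '-') { ++pos; int t = T(); return R(inherited - t); }
        return inherited;                  // R -> epsilon
    }

    int E() { int t = T(); return R(t); }
};

// Parse the whole text and return the value of the expression.
int evaluate(const std::string& text) {
    Parser p;
    p.input = text;
    int v = p.E();
    if (p.pos != text.size()) throw std::runtime_error("unexpected input");
    return v;
}
```

Passing the running value down as a parameter is what keeps subtraction left-associative here: for 9-5+2 the call chain computes (9-5)+2.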
2.3 Pseudocode of the recursive descent subroutines
(1) Expression
function expression: string;
  string s1, s2, op, result;
BEGIN
  IF SYM = '+' OR SYM = '-' THEN ADVANCE;
  IF SYM in FIRST(term) THEN s1 := term
  ELSE ERROR;
  WHILE SYM = '+' OR SYM = '-' DO
  BEGIN
    op := SYM;
    ADVANCE;
    s2 := term;
    result := newtemp();
    emit(op, s1, s2, result);
    s1 := result;
  END;
  Return s1;
END;
(2) Term
function term: string;
  string s1, s2, op, result;
BEGIN
  IF SYM in FIRST(factor) THEN s1 := factor
  ELSE ERROR;
  WHILE SYM = '*' OR SYM = '/' DO
  BEGIN
    op := SYM;
    ADVANCE;
    IF SYM in FIRST(factor) THEN s2 := factor
    ELSE ERROR;
    result := newtemp();
    emit(op, s1, s2, result);
    s1 := result;
  END;
  Return s1;
END;
(3) Factor
function factor: string;
  string s;
BEGIN
  IF SYM = '(' THEN
  BEGIN
    ADVANCE;
    s := expression;
    IF SYM = ')' THEN ADVANCE
    ELSE ERROR;
  END
  ELSE IF SYM is an identifier or a number THEN
  BEGIN
    s := SYM;
    ADVANCE;
  END
  ELSE ERROR;
  Return s;
END;
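The three subroutines above can be sketched in C++ roughly as follows. This is a simplified illustration, not the source program of this report: operands are single letters or digits, the input contains no spaces, the optional leading sign is omitted, and the names Translator and translate are assumptions.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical translator producing quadruples as formatted strings.
struct Translator {
    std::string input;
    std::size_t pos = 0;
    int temps = 0;
    std::vector<std::string> quads;

    char sym() const { return pos < input.size() ? input[pos] : '\0'; }

    std::string newtemp() { return "t" + std::to_string(++temps); }

    void emit(char op, const std::string& a, const std::string& b, const std::string& r) {
        quads.push_back(std::string("(") + op + "," + a + "," + b + "," + r + ")");
    }

    std::string factor() {
        if (sym() == '(') {
            ++pos;
            std::string s = expression();
            if (sym() != ')') throw std::runtime_error("')' expected");
            ++pos;
            return s;
        }
        if (std::isalnum(static_cast<unsigned char>(sym())))
            return std::string(1, input[pos++]);
        throw std::runtime_error("operand expected");
    }

    std::string term() {
        std::string s1 = factor();
        while (sym() == '*' || sym() == '/') {
            char op = input[pos++];        // remember the operator before advancing
            std::string s2 = factor();
            std::string r = newtemp();
            emit(op, s1, s2, r);
            s1 = r;
        }
        return s1;
    }

    std::string expression() {
        std::string s1 = term();
        while (sym() == '+' || sym() == '-') {
            char op = input[pos++];
            std::string s2 = term();
            std::string r = newtemp();
            emit(op, s1, s2, r);
            s1 = r;
        }
        return s1;
    }
};

// Translate a whole expression into its quadruple sequence.
std::vector<std::string> translate(const std::string& text) {
    Translator tr;
    tr.input = text;
    tr.expression();
    if (tr.pos != text.size()) throw std::runtime_error("unexpected input");
    return tr.quads;
}
```

Saving the operator before advancing matters: the quadruple is emitted only after the right operand has been parsed, by which time the current symbol has moved on.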
3 Algorithm flow
The algorithm proceeds as follows. First, the expression is read and lexical analysis is performed, with the tokens stored in a structure. The expression subroutine of the recursive descent parser is then called to analyze the tokens, and the resulting quadruples are stored in their own structure. Next, the program checks the kind of expression: if it is an arithmetic expression (it contains only numbers), its value is calculated and output; otherwise no calculation is done and the quadruples are output directly. Finally, the program checks whether the input has ended. If not, the next expression is read and the steps above are repeated; if so, the program exits.
Fig. 1 Algorithm flow chart
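The "calculate the value" step of this flow can be sketched as a single pass over the quadruple sequence. The C++ fragment below is an illustration under stated assumptions: it uses a map from temporary names to values instead of the array used in the source program, and the names Quad, operand_value, and evaluate are hypothetical.

```cpp
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

struct Quad {
    std::string op, arg1, arg2, result;
};

// An operand is either a temporary computed earlier or an integer literal.
int operand_value(const std::map<std::string, int>& temps, const std::string& name) {
    auto it = temps.find(name);
    if (it != temps.end()) return it->second;
    return std::stoi(name);                // throws if name is not a number
}

// Execute the quadruples in order and return the value of the last one.
int evaluate(const std::vector<Quad>& code) {
    std::map<std::string, int> temps;
    int last = 0;
    for (const Quad& q : code) {
        int a = operand_value(temps, q.arg1);
        int b = operand_value(temps, q.arg2);
        if (q.op == "+") last = a + b;
        else if (q.op == "-") last = a - b;
        else if (q.op == "*") last = a * b;
        else if (q.op == "/") {
            if (b == 0) throw std::runtime_error("division by zero");
            last = a / b;
        }
        else throw std::runtime_error("unknown operator");
        temps[q.result] = last;            // make the result available to later quadruples
    }
    return last;
}
```

Because every quadruple only refers to literals or to temporaries defined by earlier quadruples, one forward pass is enough.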
4 Source program
#include <cctype>
#include <cstdlib>
#include <cstring>
#include <iostream>
#include <string>
using namespace std;

// Result of lexical analysis for one token
struct cf_tv {
    string t;   // token type
    string v;   // token text
};

// One quadruple of intermediate code
struct qua {
    string symbal;  // operator
    string op_a;    // first operand
    string op_b;    // second operand
    string result;  // result
};

int cnt;             // current position in the input word
int k = 0;           // number of tokens stored in result[]
int ljw = 0;         // counter for temporary names
cf_tv result[200];   // tokens
qua output[200];     // quadruples
int x = 0;           // number of quadruples stored in output[]
int ans = 0;         // index of the current token during parsing
bool error = true;   // becomes false once an error is found
int is_letter = 0;   // 1 if the expression contains an identifier
int t[1001];         // values of the temporaries t1, t2, ...

// Operators and delimiters
const char fh[13] = {'+','-','*','/','=','<','>',':','(',')',',',';','.'};

string item();
string factor();

// Generate a new temporary name: t1, t2, ...
string new_temp() {
    ljw++;
    return "t" + to_string(ljw);
}

// True if the two strings are equal
bool judge(string a, string s) {
    return a == s;
}

// True if the two strings start with the same character
bool judge1(string a, string s) {
    return !a.empty() && a[0] == s[0];
}

// Classify a word that is not an operator or delimiter:
// keyword, identifier, or number
void not_fh(string p) {
    const char *keywords[13] = {"begin", "call", "const", "do", "end", "if", "odd",
                                "procedure", "read", "var", "then", "write", "while"};
    for (int i = 0; i < 13; i++) {
        if (judge(p, keywords[i])) {
            result[k].t = string(keywords[i]) + "sym";
            result[k].v = p;
            k++;
            return;
        }
    }
    // A word that contains any non-digit character is an identifier
    for (unsigned int i = 0; i < p.length(); i++) {
        if (!isdigit((unsigned char)p[i])) {
            result[k].t = "ident";
            result[k].v = p;
            k++;
            return;
        }
    }
    // Otherwise it is a number
    result[k].t = "number";
    result[k].v = p;
    k++;
}

// Position of str[cnt] in fh[]
int fh_pos(string str, int cnt) {
    int y = 0;
    for (int i = 0; i < 13; i++) {
        if (str[cnt] == fh[i]) y = i;
    }
    return y;
}

// If the operator at str[cnt] has two characters, return the index of its
// second character; otherwise return cnt unchanged
int change(string str, int cnt) {
    int y = fh_pos(str, cnt);
    if (y == 5 && (str[cnt+1] == '>' || str[cnt+1] == '=')) return cnt + 1;  // <> and <=
    if (y == 6 && str[cnt+1] == '=') return cnt + 1;                          // >=
    if (y == 7 && str[cnt+1] == '=') return cnt + 1;                          // :=
    return cnt;
}

// Store the operator or delimiter that starts at str[cnt]
void fh_1(string str, int cnt) {
    const char *names[13] = {"plus", "minus", "times", "slash", "eql", "lss", "gtr",
                             "becomes", "lparen", "rparen", "comma", "semicolon", "period"};
    int y = fh_pos(str, cnt);
    string type = names[y];
    string value(1, fh[y]);
    if (y == 5 && str[cnt+1] == '>') { type = "neq"; value = "<>"; }
    else if (y == 5 && str[cnt+1] == '=') { type = "leq"; value = "<="; }
    else if (y == 6 && str[cnt+1] == '=') { type = "geq"; value = ">="; }
    else if (y == 7) { value = ":="; }
    result[k].t = type;
    result[k].v = value;
    k++;
}

// Lexical analysis: read words until end of input and store their tokens
void cifa() {
    string str;
    while (cin >> str) {
        cnt = 0;
        const char *d = " +-*/=<>:(),;.";
        char buf[1001];
        if (str.length() >= sizeof(buf)) continue;   // word too long for the buffer
        strcpy(buf, str.c_str());
        // Split the word at operators and delimiters
        char *p = strtok(buf, d);
        while (p) {
            // Store the operators and delimiters that precede this piece
            while (str[cnt] != p[0]) {
                fh_1(str, cnt);
                cnt = change(str, cnt);
                cnt = cnt + 1;
            }
            not_fh(p);
            cnt = cnt + strlen(p);
            p = strtok(NULL, d);
        }
        // Store the operators and delimiters after the last piece
        for (unsigned int i = cnt; i < str.length(); i++) {
            fh_1(str, i);
            i = change(str, i);
        }
    }
}

// Decide whether the expression contains identifiers
void judge_type() {
    for (int i = 0; i < k; i++) {
        if (judge(result[i].t, "ident")) {
            is_letter = 1;
            break;
        }
    }
}

// Recursive descent analysis function of expression
string bds() {
    string s1, s2;
    if (ans > k) return "";
    if (judge(result[ans].v, "+") || judge(result[ans].v, "-")) {
        // Leading sign: consumed, but not applied to the value
        ans++;
        if (ans >= k) {
            cout << 1 << endl;   // error: nothing follows the sign
            error = false;
        }
        s1 = item();
    }
    else if (judge(result[ans].v, "(") || judge(result[ans].t, "ident")
             || judge(result[ans].t, "number")) {
        // The current token is in the FIRST set of term
        s1 = item();
    }
    else {
        cout << 2 << endl;       // error: an expression cannot start here
        error = false;
    }
    while (judge(result[ans].v, "+") || judge(result[ans].v, "-")) {
        int ans_temp = ans;
        ans++;
        if (ans >= k) {
            cout << 3 << endl;   // error: nothing follows the operator
            error = false;
        }
        s2 = item();
        output[x].symbal = result[ans_temp].v;
        output[x].op_a = s1;
        output[x].op_b = s2;
        output[x].result = new_temp();
        s1 = output[x].result;
        x++;
    }
    return s1;
}

// Recursive descent analysis function of term
string item() {
    string s1, s2;
    if (ans > k) return "";
    s1 = factor();
    while (judge(result[ans].v, "*") || judge(result[ans].v, "/")) {
        int ans_temp = ans;
        ans++;
        if (ans >= k) {
            cout << 4 << endl;   // error: nothing follows the operator
            error = false;
        }
        s2 = factor();
        output[x].symbal = result[ans_temp].v;
        output[x].op_a = s1;
        output[x].op_b = s2;
        output[x].result = new_temp();
        s1 = output[x].result;
        x++;
    }
    return s1;
}

// Recursive descent analysis function of factor
string factor() {
    string s;
    if (ans >= k) return "";     // end of input; the caller has reported the error
    if (judge(result[ans].t, "ident") || judge(result[ans].t, "number")) {
        s = result[ans].v;
        ans++;
    }
    else if (judge(result[ans].v, "(")) {
        ans++;
        s = bds();
        if (judge(result[ans].v, ")")) {
            ans++;
        }
        else {
            cout << 6 << endl;   // error: missing right parenthesis
            error = false;
        }
    }
    else {
        cout << 7 << endl;       // error: a factor cannot start here
        error = false;
    }
    return s;
}

// Delete the first character
string del(string s) {
    return s.empty() ? s : s.substr(1);
}

// Value of an operand: a temporary tN is looked up in t[],
// anything else is converted as a number
int value(string s) {
    char *end;
    if (judge1(s, "t")) return t[strtol(del(s).c_str(), &end, 10)];
    return static_cast<int>(strtol(s.c_str(), &end, 10));
}

// Calculate quadruple i; its result is the temporary t(i+1)
void js(int i) {
    int a = value(output[i].op_a);
    int b = value(output[i].op_b);
    if (judge(output[i].symbal, "*")) t[i+1] = a * b;
    else if (judge(output[i].symbal, "+")) t[i+1] = a + b;
    else if (judge(output[i].symbal, "-")) t[i+1] = a - b;
    else if (judge(output[i].symbal, "/")) {
        if (b == 0) error = false;   // division by zero
        else t[i+1] = a / b;
    }
}

int main() {
    // Lexical analysis
    cifa();
    // Decide the kind of expression
    judge_type();
    // Syntax analysis and semantic analysis
    string s = bds();
    // Stop on a syntax error or on tokens left after the expression
    if (!error || ans < k) {
        cout << "error" << endl;
        return 1;
    }
    if (is_letter == 1) {
        // Output the quadruples
        for (int i = 0; i < x; i++) {
            cout << "(" << output[i].symbal << "," << output[i].op_a << ","
                 << output[i].op_b << "," << output[i].result << ")" << endl;
        }
    }
    else {
        // Calculate and output the value
        for (int i = 0; i < x; i++) {
            js(i);
        }
        if (!error) {
            cout << "error" << endl;
            return 1;
        }
        cout << value(s) << endl;
    }
    return 0;
}
5 Test data
5.1 Test example 1
[Sample input]
2+3*5
[Sample output]
17
The results of example 1 are as follows:
Fig. 2 Test results of example 1
5.2 Test example 2
[Sample input]
2+3*5+7
[Sample output]
24
The results of example 2 are as follows:
Fig. 3 Test results of example 2
5.3 Test example 3
[Sample input]
a*(b+c)
[Sample output]
(+,b,c,t1)
(*,a,t1,t2)
The results of example 3 are as follows:
Fig. 4 Test results of example 3
5.4 Test example 4
[Sample input]
a*(b+c)+d
[Sample output]
(+,b,c,t1)
(*,a,t1,t2)
(+,t2,d,t3)
The results of example 4 are as follows:
Fig. 5 Test results of example 4
6 Experimental debugging and experience
6.1 Experimental debugging
All four test examples in the previous section produce the expected output, which indicates that the program handles these cases correctly. Error handling is also included in the code to deal with other situations.
6.2 Experimental experience
I prepared for this experiment before class. Before writing the code, it is necessary to write the pseudocode of the recursive descent translator. The key is to work out which attributes of each nonterminal are inherited attributes and which are synthesized attributes. The inherited attributes are then used as parameters and the synthesized attribute is used as the return value.
The code builds on the code of Experiment 1 and Experiment 2. When I wrote Experiment 1, I did not consider that it would be reused later, so it printed its results directly without saving them. For this experiment, the results of lexical analysis therefore had to be stored in a user-defined structure that holds the two pieces of information about each token: its value and its type. The parser reads the contents of this structure directly, and the quadruples are placed in a separate structure that records the four fields of each quadruple for easy output. If the expression consists only of numbers, a simple simulated calculator evaluates the quadruples: an array holds the values of the temporary variables, a helper function decides whether an operand is a number or a temporary variable, and the operation is carried out according to the operator.
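The token structure described here can be sketched as follows. This is a simplified C++ illustration rather than the lexer of the source program: keywords and two-character operators are left out, every other character becomes a one-character "symbol" token, and the names Token and tokenize are assumptions.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// The two pieces of information kept for each token (hypothetical names).
struct Token {
    std::string type;    // "ident", "number", or "symbol"
    std::string value;   // the text of the token
};

// Split the text into tokens and keep them for the parser instead of printing them.
std::vector<Token> tokenize(const std::string& text) {
    std::vector<Token> tokens;
    std::size_t i = 0;
    while (i < text.size()) {
        unsigned char c = static_cast<unsigned char>(text[i]);
        if (std::isspace(c)) { ++i; continue; }
        std::size_t start = i;
        if (std::isdigit(c)) {
            while (i < text.size() && std::isdigit(static_cast<unsigned char>(text[i]))) ++i;
            tokens.push_back({"number", text.substr(start, i - start)});
        } else if (std::isalpha(c)) {
            while (i < text.size() && std::isalnum(static_cast<unsigned char>(text[i]))) ++i;
            tokens.push_back({"ident", text.substr(start, i - start)});
        } else {
            ++i;
            tokens.push_back({"symbol", text.substr(start, 1)});
        }
    }
    return tokens;
}
```

Returning the tokens as a sequence lets the parser look at the type to choose a production and at the value to fill in the operands of a quadruple.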
This experiment gave me a general review of the material from lexical analysis through syntax analysis to semantic analysis, with a focus on what each stage takes as input and produces as output, how this information is stored, and which algorithm is used to compute it. The code can still be improved. For example, lexical analysis and syntax analysis could be combined so that the input is processed in a single pass, which would improve execution efficiency.
Through these four experiments, I have gained a clear understanding of the compiler principles course. I may have understood the theory in the lectures, but without practice I might have forgotten it after a while. Studying compiler principles draws on the way of thinking learned in data structures and algorithms, and many concepts have to be understood and remembered, which is the difficulty of this course. From this I have learned to pay more attention to mastering the fundamental subjects and to keep strengthening and broadening my computational thinking.
Finally, I would like to thank Mr. Liu Shanmei for a semester of careful guidance. I will continue to work hard in every professional course and live up to the teacher's high expectations!