Program environment and preprocessing

Posted by ScotDiddle on Sun, 27 Feb 2022 09:37:26 +0100

1. Translation environment and execution environment of the program

1.1 In any implementation of ANSI C, there are two different environments

The first is the translation environment, in which the source code is converted into executable machine instructions.
The second is the execution environment, which is used to actually execute code.

1.2 General process of program operation

2. Translation environment

2.1 Translation consists of compilation and linking

2.2 Compilation consists of preprocessing (precompilation), compilation proper, and assembly

3. Execution environment

Process of program execution:

  1. The program must be loaded into memory. In an environment with an operating system, this is usually done by the operating system. In a freestanding environment, loading must be arranged manually, or it may be accomplished by placing the executable code in read-only memory.
  2. Execution of the program begins, and the main function is called.
  3. The program code executes. At this point the program uses a run-time stack to store the local variables and return addresses of functions. The program can also use static memory; variables stored in static memory retain their values throughout the execution of the program.
  4. The program terminates. Normally this happens when main returns, but the program may also terminate abnormally.

Note:
For more on this topic, see the book "Self-Cultivation of Programmers".

4. Detailed explanation of preprocessing

4.1 Predefined symbols

__FILE__    // Name of the source file being compiled
__LINE__    // Current line number in the file
__DATE__    // Date the file was compiled
__TIME__    // Time the file was compiled
__STDC__    // 1 if the compiler conforms to ANSI C; otherwise undefined

These predefined symbols are built into the language.
For example:

printf("file:%s line:%d\n", __FILE__, __LINE__);

4.2 #define

4.2.1 #define identifier

#define MAX 1000
#define reg register          // Create a short name for the keyword register
#define do_forever for(;;)    // Replace an implementation with a more vivid symbol
#define CASE break;case       // Automatically write break when writing case statements
// If the replacement text is too long, it can be split across several lines. Every line except the last ends with a backslash (the line-continuation character).
#define DEBUG_PRINT printf("file:%s\tline:%d\t \
                          date:%s\ttime:%s\n" ,\
                          __FILE__,__LINE__ ,       \
                          __DATE__,__TIME__ )

When defining an identifier with #define, do not add a semicolon (;) at the end.

4.2.2 #define macro

The #define mechanism includes a provision that allows arguments to be substituted into the replacement text. This is usually called a macro or a defined macro.

Here is how macros are declared:

#define name( parameter-list ) stuff
The parameter-list is a comma-separated list of symbols, which may appear in stuff.

Note:
The left parenthesis of the parameter list must be immediately adjacent to name.
If there is any whitespace between the two, the parameter list will be interpreted as part of stuff.

Example 1:

#define SQUARE( x ) x * x

This macro takes one parameter, x.
If, after the definition above, you write SQUARE( 5 ) in the program, the preprocessor replaces it with the expression 5 * 5.

Warning:
There is a problem with this macro.
Consider the following code snippet:

int a = 5;
printf("%d\n" ,SQUARE( a + 1) );

At first glance, you might expect this code to print 36.
In fact, it prints 11.
Why?

When the text is substituted, the parameter x is replaced with a + 1, so the statement actually becomes:
printf ("%d\n",a + 1 * a + 1 );

This makes it clear that the expression produced by the substitution is not evaluated in the expected order.
Adding two pairs of parentheses to the macro definition easily solves this problem:

#define SQUARE(x) (x) * (x)

After preprocessing, this produces the expected result:

printf ("%d\n",(a + 1) * (a + 1) );

Example 2:

#define DOUBLE(x) (x) + (x)

We use parentheses in the definition to avoid the previous problem, but this macro can produce a new error.

int a = 5;
printf("%d\n" ,10 * DOUBLE(a));

What value will this print?
Warning:
It looks as though it should print 100, but in fact it prints 55.
After replacement, the statement becomes:

printf ("%d\n",10 * (a) + (a));

The multiplication is performed before the addition introduced by the macro, so the result is 55.
The solution is to add a pair of parentheses around the entire expression in the macro definition:

#define DOUBLE( x)   ( ( x ) + ( x ) )

Tip:

Macro definitions that evaluate numerical expressions should therefore always be parenthesized in this way, to avoid unexpected interactions between the operators in the arguments or the operators adjacent to the macro use.

4.2.3 #define replacement rules

Expanding #define symbols and macros in a program involves several steps:

  1. When a macro is invoked, its arguments are first checked for any symbols defined by #define. If there are any, they are replaced first.
  2. The replacement text is then inserted in place of the original text. For macros, parameter names are replaced by their argument values.
  3. Finally, the resulting text is scanned again to see whether it contains any symbols defined by #define. If so, the process above is repeated.

Note:

  1. Symbols defined by other #define directives may appear in macro arguments and in #define definitions. However, a macro cannot be recursive.
  2. When the preprocessor searches for #define symbols, the contents of string constants are not searched.

4.3 The preprocessing operators # and ##

4.3.1 The role of #

How do we insert a parameter into a string?

First, let's look at this code:

char* p = "hello ""bit\n";
printf("hello"" bit\n");
printf("%s", p);

Does this print hello bit?
The answer is yes.
Adjacent string literals are automatically concatenated.

  1. So we can write code like this:
#define PRINT(FORMAT, VALUE)\
 printf("the value is "FORMAT"\n", VALUE)
...
PRINT("%d", 10);

This technique works only when a string literal is passed as the macro argument.

  2. Another technique is:
    Use # to convert a macro argument into the corresponding string literal.
    For example:
int i = 10;
#define PRINT(FORMAT, VALUE)\
 printf("the value of " #VALUE " is "FORMAT "\n", VALUE)
...
PRINT("%d", i+3);// What does this produce?

The #VALUE in the code is turned by the preprocessor into a string literal containing the text of the argument, here:
"i+3" .
The final output is:

the value of i+3 is 13

4.3.2 The role of ##

## combines the symbols on either side of it into a single symbol.
It allows the macro definition to create identifiers from separate pieces of text.

#define ADD_TO_SUM(num, value) \
 sum##num += value;
...
ADD_TO_SUM(5, 10);//Function: increase sum5 by 10

Note:
Such a connection must produce a legal identifier. Otherwise, the result is undefined.

4.4 Macro parameters with side effects

When a macro parameter appears more than once in the macro definition and the argument has side effects, using the macro can be dangerous and lead to unpredictable results. A side effect is a permanent effect that occurs when an expression is evaluated.

For example:

x+1;//No side effects
x++;//With side effects

The MAX macro demonstrates the problems caused by arguments with side effects:

#define MAX(a, b) ( (a) > (b) ? (a) : (b) )
...
x = 5; y = 8; z = MAX(x++, y++);
printf("x=%d y=%d z=%d\n", x, y, z);//What is the output?

To answer this, look at what the preprocessor produces:

z = ( (x++) > (y++) ? (x++) : (y++));

So the output result is:

x=6 y=10 z=9

4.5 Macros compared with functions

Macros are usually used to perform simple operations, for example finding the larger of two numbers.
#define MAX(a, b) ((a)>(b)?(a):(b))

Then why not use a function for this task?
There are two reasons:

  1. The code needed to call and return from a function may take more time than the small computation itself. So for such small operations a macro can be better than a function in terms of program size and speed.
  2. More importantly, the parameters of a function must be declared with a specific type, so a function can only be used on expressions of the appropriate type. In contrast, this macro can be applied to int, long, float, and any other type that can be compared with >. Macros are type independent.

Of course, compared with functions, macros also have disadvantages:

  1. Every time a macro is used, a copy of the macro's code is inserted into the program. Unless the macro is short, this may significantly increase the size of the program.
  2. Macros cannot be debugged.
  3. Macros are less rigorous because they are type independent.
  4. Macros can introduce operator-precedence problems, which makes programs error-prone.

Macros can sometimes do things that functions cannot. For example, a macro parameter can be a type, which a function parameter cannot.

#define MALLOC(num, type)\
 (type *)malloc((num) * sizeof(type))
...
// Usage
MALLOC(10, int);// A type as an argument
// After preprocessor replacement:
(int *)malloc((10) * sizeof(int));

Naming convention
Generally speaking, functions and macros are used with very similar syntax, so the language itself cannot help us tell the two apart.
A common convention is therefore:

Write macro names in all capital letters
Do not write function names in all capital letters

4.6 #undef

This directive removes a macro definition.

#undef NAME
//If an existing name needs to be redefined, its old definition must first be removed.

4.7 Command-line definitions

Many C compilers provide the ability to define symbols on the command line, which are then in effect when the compilation starts.
This feature is useful, for example, when we need to compile different versions of a program from the same source file. (Suppose a program declares an array of a certain length. If one machine has limited memory, we need a small array, but if another machine has more memory, the array can be larger.)

#include <stdio.h>
int main()
{
    int array[ARRAY_SIZE];
    int i = 0;
    for (i = 0; i < ARRAY_SIZE; i++)
    {
        array[i] = i;
    }
    for (i = 0; i < ARRAY_SIZE; i++)
    {
        printf("%d ", array[i]);
    }
    printf("\n");
    return 0;
}

Compile command:

gcc -D ARRAY_SIZE=10 programe.c

4.8 Conditional compilation

Conditional compilation directives make it convenient to include or exclude a statement (or a group of statements) when compiling a program.
For instance:

Debugging code is a pity to delete but gets in the way if it always stays in, so we can compile it selectively.
#include<stdio.h>
int  main()
{
	int i = 0;
	int arr[10] = { 0 };
	for (int i = 0; i < 10; i++)
	{
		arr[i] = i;
#if 1 // This condition decides whether the following code is compiled: 0 is false, non-zero is true
		printf("%d\n",arr[i]);// Check whether the array was assigned successfully
#endif 
#if 0 // The condition is 0, so the following code is not compiled
		printf("%d\n", arr[i]);// Check whether the array was assigned successfully
#endif 
	}
    return 0;
}

Common conditional compilation directives:

//1. Single condition
#if constant expression
//...
#endif

//2. Conditional compilation of multiple branches
#if constant expression
//...
#elif constant expression
//...
#else
//...
#endif

//3. Testing whether a symbol is defined (the two forms in each pair are equivalent)
#if defined(symbol)
#ifdef symbol

#if !defined(symbol)
#ifndef symbol

//4. Nested directives
#if defined(OS_UNIX)
	#ifdef OPTION1
	unix_version_option1();
	#endif 
	#ifdef OPTION2
	unix_version_option2();
	#endif 
#elif defined(OS_MSDOS)
	#ifdef OPTION2
	msdos_version_option2();
	#endif 
#endif

4.9 The preprocessing directive #include

4.9.1 Nested file inclusion

We already know that the #include directive causes another file to be compiled, just as if its contents actually appeared at the position of the #include directive.
The replacement is simple:
The preprocessor first deletes the directive and replaces it with the contents of the included file.
If such a file is included 10 times, it is actually compiled 10 times.
For example:

comm.h and comm.c form a common module.
test1.h and test1.c use the common module.
test2.h and test2.c use the common module.
test.h and test.c use the test1 module and the test2 module.
As a result, two copies of comm.h appear in the final program, which duplicates the file's contents.

How can this problem be solved?
Answer: conditional compilation.
Wrap the contents of each header file like this:

#ifndef __TEST_H__
#define __TEST_H__
//Contents of header file
#endif   //__TEST_H__

or use the non-standard but widely supported directive

#pragma once

Either form prevents the header file from being included repeatedly.

4.9.2 How header files are included

Local file inclusion:

#include "filename"

Search strategy: the preprocessor first searches the directory where the source file is located. If the header file is not found there, it searches the standard locations, just as it does for library header files.
If the file still cannot be found, a compilation error is reported.
Library file inclusion:

#include <filename.h>

The header file is searched for directly in the standard locations. If it cannot be found, a compilation error is reported.
Does this mean that library header files can also be included using the "" form?
The answer is yes.
However, the search is less efficient this way, and it also becomes harder to tell whether a file is a library file or a local file.
