C++ static_cast, dynamic_cast, const_cast and reinterpret_cast (the four type conversion operators)

Posted by randydg on Sun, 13 Feb 2022 02:53:03 +0100

Implicit type conversions are generally safe, while explicit conversions carry risk. C requires explicit cast syntax for the latter precisely to highlight that risk and make programmers aware of what they are doing.

However, this way of flagging risk is coarse-grained: it does not say what kind of risk is involved or how serious it is. Moreover, every C-style cast uses the same ( ) syntax, and parentheses appear everywhere in code, so text search tools (such as Ctrl+F on Windows, grep on Linux and Command+F on Mac) cannot easily locate the casts.

To make the potential risks more explicit, problems easier to trace and the notation more uniform, C++ classifies type conversions into four kinds and adds four keywords to support them:

Keyword | Description
static_cast | Used for benign conversions, which generally cause no surprises and carry very low risk.
const_cast | Used for conversions between const and non-const, and between volatile and non-volatile.
reinterpret_cast | A highly dangerous conversion. It merely reinterprets the binary bits and does not adjust the data using the existing conversion rules, but it provides the most flexible C++ type conversion.
dynamic_cast | Uses RTTI to perform type-safe downcasting.

The syntax format of these four keywords is the same, specifically:

xxx_cast<newType>(data)

newType is the target type, and data is the expression to convert. For example, the old C-style conversion from double to int is written as follows:

double scores = 95.5;
int n = (int)scores;

The new C++ style is:

double scores = 95.5;
int n = static_cast<int>(scores);
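
To make the effect of this particular conversion concrete, here is a minimal sketch. The helper name to_whole_points is made up for illustration and does not come from the original example. Converting double to int discards the fractional part (truncation toward zero) instead of rounding.

```cpp
// Hypothetical helper: convert a floating-point score to an int.
// static_cast<int> drops the fractional part (truncates toward zero);
// it does not round to the nearest integer.
int to_whole_points(double score) {
    return static_cast<int>(score);
}
```

With this helper, to_whole_points(95.5) yields 95 and to_whole_points(-1.7) yields -1.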

static_cast keyword

static_cast can only be used for benign conversions that carry little risk and generally cause no surprises, for example:

Conversions that could also happen implicitly, such as short to int, int to double, non-const to const, and upcasting;

Conversions between void pointers and typed pointers, such as void * to int * and char * to void *;

Conversions between a class that has a converting constructor or a type conversion function and other types, such as double to Complex (calls the converting constructor) and Complex to double (calls the type conversion function).

Note that static_cast cannot be used for conversions between unrelated types because these conversions are risky, for example:

Conversions between two typed pointers, such as int * to double * or Student * to int *. Different types have different storage formats and sizes. If a pointer of type A is made to point to data of type B, the data is processed as if it were of type A: a read may yield meaningless values, and a write may corrupt the type B data, so that later reads as type B also yield meaningless values.

Conversions between int and pointers. Assigning a specific address to a pointer variable is very dangerous, because the memory at that address may not be allocated or may not be readable and writable. It is unlikely that the address happens to refer to usable memory.

Nor can static_cast be used to remove const or volatile qualifiers from an expression. In other words, a const/volatile type cannot be converted to a non-const/non-volatile type.

static_cast means "static conversion", that is, conversion at compile time. If the conversion is not allowed, compilation fails with an error.
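
Because the check happens at compile time, it can be probed without running any conversion. The following is a rough sketch using the expression SFINAE technique; the trait name can_static_cast and the helper make_void are invented for this illustration and are not part of the standard library. The trait is true exactly when the expression static_cast<To>(value of type From) compiles.

```cpp
#include <type_traits>
#include <utility>

// Helper that maps any list of types to void (std::void_t does this from C++17 on).
template <typename...> struct make_void { typedef void type; };

// Primary template: assume static_cast<To>(From) does not compile.
template <typename To, typename From, typename = void>
struct can_static_cast : std::false_type {};

// This specialization is chosen only when the static_cast expression is well-formed.
template <typename To, typename From>
struct can_static_cast<To, From,
    typename make_void<decltype(static_cast<To>(std::declval<From>()))>::type>
    : std::true_type {};

// The rules described above, checked by the compiler itself.
static_assert(can_static_cast<long, int>::value, "int -> long is allowed");
static_assert(can_static_cast<int*, void*>::value, "void* -> int* is allowed");
static_assert(!can_static_cast<float*, int*>::value, "int* -> float* is rejected");
static_assert(!can_static_cast<int*, const int*>::value, "const cannot be dropped");
static_assert(!can_static_cast<int*, int>::value, "int -> int* is rejected");
```

If any of these assumptions about static_cast were wrong, the file would simply fail to compile, which is the same mechanism that reports a bad static_cast in ordinary code.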

The following code demonstrates correct and incorrect usage of static_cast:

#include <iostream>
#include <cstdlib>
using namespace std;
class Complex{
public:
    Complex(double real = 0.0, double imag = 0.0): m_real(real), m_imag(imag){ }
public:
    operator double() const { return m_real; }  //Type conversion function
private:
    double m_real;
    double m_imag;
};
int main(){
    //Here is the correct usage
    int m = 100;
    Complex c(12.5, 23.8);
    long n = static_cast<long>(m);  //Wide conversion, no information loss
    char ch = static_cast<char>(m);  //Narrow conversion, information may be lost
    int *p1 = static_cast<int*>( malloc(10 * sizeof(int)) );  //Convert void pointer to concrete type pointer
    void *p2 = static_cast<void*>(p1);  //Convert a specific type pointer to a void pointer
    double real= static_cast<double>(c);  //Call type conversion function
   
    //The following usage is wrong: each line is rejected at compile time
    float *p3 = static_cast<float*>(p1);  //Error: cannot convert between two unrelated typed pointers
    p3 = static_cast<float*>(0X2DF9);  //Error: cannot convert an integer to a pointer type
    return 0;
}

const_cast keyword

const_cast is easy to understand. It is used to remove the const or volatile qualifier from an expression. In other words, const_cast converts a const/volatile type to a non-const/non-volatile type.

Let's take const as an example to illustrate the usage of const_cast:

#include <iostream>
using namespace std;
int main(){
    const int n = 100;
    int *p = const_cast<int*>(&n);
    *p = 234;
    cout<<"n = "<<n<<endl;
    cout<<"*p = "<<*p<<endl;
    return 0;
}

Typical output:

n = 100
*p = 234

&n takes the address of n, and its type is const int *, so it must be converted to int * with const_cast before it can be assigned to p. Since p points to n, and n lives on the stack in writable memory, the value stored in n can in practice be modified through p.

Some readers may ask why the values printed through n and *p differ. The reason is that the compiler treats the constant much like a #define: it substitutes the value. Every use of n in the code is replaced with 100 at compile time. In other words, the statement that prints n is effectively compiled as:

cout<<"n = "<<100<<endl;

In this way, even if the program modifies the memory of n at run time, the cout statement is not affected.

Strictly speaking, modifying an object that was declared const is undefined behavior in C++, so the output above is what compilers typically produce, not something the language guarantees. const_cast can break through the const restriction and modify a constant, so it is dangerous; but a programmer who writes it is usually well aware of what they are doing, which provides a certain degree of safety.
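
For contrast, here is a small sketch of two uses of const_cast that stay within defined behavior. All function names (legacy_length, length_of, add_one) are hypothetical and exist only for this illustration. The first case calls an older-style function whose signature lacks const even though it only reads; the second writes through the pointer, which is valid only when the object it refers to was not declared const.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical legacy function: it only reads its argument, but its
// signature was written without const.
std::size_t legacy_length(char* text) {
    return std::strlen(text);
}

// const_cast bridges the signature mismatch. This is safe only because
// legacy_length never writes through the pointer.
std::size_t length_of(const std::string& s) {
    return legacy_length(const_cast<char*>(s.c_str()));
}

// Writing through the result is well-defined only when the object the
// pointer refers to was not declared const in the first place.
int add_one(const int* value) {
    int* writable = const_cast<int*>(value);
    *writable += 1;
    return *writable;
}
```

Calling add_one on the address of an ordinary int variable is fine; calling it on the address of a const int would run into the same problem as the example above.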

reinterpret_cast keyword

Reinterpret means "to interpret again". As the name suggests, reinterpret_cast merely reinterprets the binary bits and does not adjust the data using the existing conversion rules. It is simple and crude, so the risk is very high.

reinterpret_cast can be regarded as a supplement to static_cast: some conversions that static_cast cannot perform can be done with reinterpret_cast, such as conversions between two typed pointers and conversions between integers and pointers (a pointer can be converted to an integer only if the integer type is large enough to hold it).

The following code demonstrates the use of reinterpret_cast:

#include <iostream>
using namespace std;
class A{
public:
    A(int a = 0, int b = 0): m_a(a), m_b(b){}
private:
    int m_a;
    int m_b;
};
int main(){
    //Convert char * to float*
    char str[]="http://c.biancheng.net";
    float *p1 = reinterpret_cast<float*>(str);
    cout<<*p1<<endl;
    //Convert int to int*
    int *p = reinterpret_cast<int*>(100);
    //Convert A * to int*
    p = reinterpret_cast<int*>(new A(25, 96));
    cout<<*p<<endl;
   
    return 0;
}

Typical output:

3.0262e+29
25

You can imagine how absurd and dangerous it is to operate on a char array through a float pointer. Such conversions should not be used unless absolutely necessary. Converting A* to int* and using the pointer to access private members directly breaks the encapsulation of the class. A better approach is to have the class provide get/set functions for indirect access to its member variables.
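
When the goal really is to look at the raw bits of a value, copying the bytes is a commonly used alternative to reading through a reinterpret_cast'ed pointer. The sketch below is only an illustration: float_bits and round_trip are hypothetical names, the code assumes a 32-bit float, and it assumes the platform provides std::uintptr_t (an optional type, though mainstream platforms have it).

```cpp
#include <cstdint>
#include <cstring>

// Copy the bytes of a float into a 32-bit integer. Unlike reading through a
// reinterpret_cast'ed pointer, std::memcpy does not break the aliasing rules.
// Assumption: float is 32 bits wide (checked at compile time below).
std::uint32_t float_bits(float value) {
    static_assert(sizeof(float) == sizeof(std::uint32_t), "float must be 32 bits");
    std::uint32_t bits = 0;
    std::memcpy(&bits, &value, sizeof bits);
    return bits;
}

// Round trip a pointer through an integer. std::uintptr_t is wide enough
// to hold a pointer, so converting back recovers the original pointer.
int* round_trip(int* p) {
    std::uintptr_t address = reinterpret_cast<std::uintptr_t>(p);
    return reinterpret_cast<int*>(address);
}
```

On a platform that uses IEEE 754 single precision for float, float_bits(1.0f) is 0x3F800000.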

dynamic_cast keyword

dynamic_cast is used for type conversion within a class inheritance hierarchy. It allows both upcasting and downcasting. Upcasting is unconditional and unchecked, so it always succeeds; downcasting must be safe, which is checked with the help of RTTI, so only some downcasts succeed.

dynamic_cast is the counterpart of static_cast: dynamic_cast means "dynamic conversion" and static_cast means "static conversion". dynamic_cast performs the conversion at run time with the help of RTTI, which requires the base class to contain at least one virtual function; static_cast completes the conversion at compile time, so errors are found earlier.

The syntax of dynamic_cast is:

dynamic_cast <newType> (expression)

newType and expression must both be pointer types or both be reference types. In other words, dynamic_cast can only convert pointers and references, not other types (int, double, array, class, struct, etc.).

For pointers, a null pointer is returned if the conversion fails; for references, a std::bad_cast exception is thrown if the conversion fails.
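
The two failure modes can be sketched as follows. The classes Shape, Circle and Square and the two helper functions are hypothetical and serve only to illustrate the rule stated above.

```cpp
#include <typeinfo>  // std::bad_cast

// Hypothetical classes for illustration only.
struct Shape {
    virtual ~Shape() {}  // a virtual function makes Shape polymorphic
};
struct Circle : Shape {};
struct Square : Shape {};

// Pointer form: a failed conversion yields a null pointer.
bool is_circle_ptr(Shape* shape) {
    return dynamic_cast<Circle*>(shape) != nullptr;
}

// Reference form: a failed conversion throws std::bad_cast,
// because there is no such thing as a null reference.
bool is_circle_ref(Shape& shape) {
    try {
        Circle& circle = dynamic_cast<Circle&>(shape);
        (void)circle;  // only the success of the cast matters here
        return true;
    } catch (const std::bad_cast&) {
        return false;
    }
}
```

The pointer form suits cases where a mismatch is an expected outcome that the caller tests for; the reference form suits cases where a mismatch is an error.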

1) Upcasting

During upcasting, the conversion always succeeds as long as the target type is a base class of the source type (this can be determined at compile time). Because upcasting is always safe, dynamic_cast performs no run-time check in this case, and it behaves no differently from static_cast.

Skipping run-time checks during upcasting improves efficiency, but it also leaves a safety gap. Consider the following code:

#include <iostream>
#include <iomanip>
using namespace std;
class Base{
public:
    Base(int a = 0): m_a(a){ }
    int get_a() const{ return m_a; }
    virtual void func() const { }
protected:
    int m_a;
};
class Derived: public Base{
public:
    Derived(int a = 0, int b = 0): Base(a), m_b(b){ }
    int get_b() const { return m_b; }
private:
    int m_b;
};
int main(){
    //Situation ①
    Derived *pd1 = new Derived(35, 78);
    Base *pb1 = dynamic_cast<Base*>(pd1);
    cout<<"pd1 = "<<pd1<<", pb1 = "<<pb1<<endl;
    cout<<pb1->get_a()<<endl;
    pb1->func();
    //Situation ②
    int n = 100;
    Derived *pd2 = reinterpret_cast<Derived*>(&n);
    Base *pb2 = dynamic_cast<Base*>(pd2);
    cout<<"pd2 = "<<pd2<<", pb2 = "<<pb2<<endl;
    cout<<pb2->get_a()<<endl;  //Output a garbage value
    pb2->func();  //Memory error
    return 0;
}

Case ① is correct and causes no problems. In case ②, pd2 points to the integer variable n rather than to an object of class Derived. dynamic_cast does not check this during the conversion; it simply assigns the value of pd2 to pb2 (no offset adjustment is needed here), so pb2 also points to n. Because pb2 does not point to a real object, get_a() cannot obtain a valid m_a (it actually returns a garbage value), and pb2->func() cannot obtain the correct address of func().

pb2->func() cannot obtain the correct address of func() because pb2 points to a fake "object" that has no virtual function table pointer, while func() is a virtual function whose address can only be found through the virtual function table.
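
The remark about offset adjustment can be made concrete with a class that has two base classes. This is a sketch under assumptions: Left, Right and Both are hypothetical classes, and multiple inheritance is used only because it is the common situation in which an upcast has to adjust the pointer.

```cpp
// Hypothetical classes for illustration only.
struct Left  { virtual ~Left() {}  int left_value = 1; };
struct Right { virtual ~Right() {} int right_value = 2; };
struct Both : Left, Right {};

// For an upcast, static_cast and dynamic_cast produce the same pointer.
bool upcasts_agree(Both* object) {
    Right* via_static  = static_cast<Right*>(object);
    Right* via_dynamic = dynamic_cast<Right*>(object);
    return via_static == via_dynamic;
}

// The two base subobjects occupy different storage, so the two upcasts
// cannot both return the same address: at least one adjusts the pointer.
bool bases_are_distinct(Both* object) {
    const void* left  = static_cast<Left*>(object);
    const void* right = static_cast<Right*>(object);
    return left != right;
}
```

With single inheritance, as in the Base/Derived example above, the base subobject typically sits at the start of the derived object, which is why no adjustment is needed there.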

2) Downcasting

Downcasting is risky, so dynamic_cast uses RTTI to check the conversion at run time: it succeeds only if it is safe, and fails otherwise. Which downcasts are safe and which are not? The following example illustrates:

#include <iostream>
using namespace std;
class A{
public:
    virtual void func() const { cout<<"Class A"<<endl; }
private:
    int m_a;
};
class B: public A{
public:
    virtual void func() const { cout<<"Class B"<<endl; }
private:
    int m_b;
};
class C: public B{
public:
    virtual void func() const { cout<<"Class C"<<endl; }
private:
    int m_c;
};
class D: public C{
public:
    virtual void func() const { cout<<"Class D"<<endl; }
private:
    int m_d;
};
int main(){
    A *pa = new A();
    B *pb;
    C *pc;
   
    //Situation ①
    pb = dynamic_cast<B*>(pa);  //Downward transformation failed
    if(pb == NULL){
        cout<<"Downcasting failed: A* to B*"<<endl;
    }else{
        cout<<"Downcasting successfully: A* to B*"<<endl;
        pb -> func();
    }
    pc = dynamic_cast<C*>(pa);  //Downward transformation failed
    if(pc == NULL){
        cout<<"Downcasting failed: A* to C*"<<endl;
    }else{
        cout<<"Downcasting successfully: A* to C*"<<endl;
        pc -> func();
    }
   
    cout<<"-------------------------"<<endl;
   
    //Situation ②
    pa = new D();  //Upward transformation is allowed
    pb = dynamic_cast<B*>(pa);  //Successful downward transformation
    if(pb == NULL){
        cout<<"Downcasting failed: A* to B*"<<endl;
    }else{
        cout<<"Downcasting successfully: A* to B*"<<endl;
        pb -> func();
    }
    pc = dynamic_cast<C*>(pa);  //Successful downward transformation
    if(pc == NULL){
        cout<<"Downcasting failed: A* to C*"<<endl;
    }else{
        cout<<"Downcasting successfully: A* to C*"<<endl;
        pc -> func();
    }
   
    return 0;
}

Output:

Downcasting failed: A* to B*
Downcasting failed: A* to C*
-------------------------
Downcasting successfully: A* to B*
Class D
Downcasting successfully: A* to C*
Class D

The inheritance order of the classes in this code is A --> B --> C --> D. pa is a pointer of type A*. When pa points to an object of type A, the downcast fails, and pa cannot be converted to B* or C*. When pa points to an object of type D, the downcast succeeds, and pa can be converted to B* or C*. Both are downcasts, so why does the result depend so much on the object pa points to?

An earlier discussion covered the memory model of objects with virtual functions: each class keeps a copy of its type information in memory, and the compiler links the type information of classes related by inheritance with pointers, forming an inheritance chain. For the classes above, the chain runs from D up through C and B to A.

When dynamic_cast converts a pointer, the program first finds the object the pointer points to, then uses the object to find the type information of its class, and traverses the inheritance chain upward from that node. If the target type is found, the conversion is safe and succeeds; if the target type is not found, the conversion is considered unsafe and fails.

In case ① of this example, pa points to an object of class A, so the type information found is that of A. Traversing upward from this node, the program finds no type B or C above A (in fact, there is no type above A at all), so the conversion fails. In case ②, pa points to an object of class D, so the type information found is that of D. Traversing upward from this node, the program encounters C and B, so the conversion succeeds.

In summary, dynamic_cast traverses the inheritance chain at run time. If the target type is encountered along the way, the conversion succeeds; if the top of the chain (the top-level base class) is reached without encountering it, the conversion fails. For the same pointer (such as pa), pointing to different objects changes the starting point of the traversal and therefore the types that can be matched along the way, which is why the same conversion can produce different results.
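
The traversal described here can be mimicked with a toy model. This is not how any particular compiler implements RTTI; TypeNode, the type_A to type_D objects and cast_would_succeed are all invented for illustration, and the model covers single inheritance only.

```cpp
// Toy stand-in for per-class type information. Real implementations store
// richer data and handle multiple inheritance; this models a single chain.
struct TypeNode {
    const char* name;
    const TypeNode* base;  // nullptr for the top-level base class
};

// Chain for the example classes: D -> C -> B -> A.
const TypeNode type_A = {"A", nullptr};
const TypeNode type_B = {"B", &type_A};
const TypeNode type_C = {"C", &type_B};
const TypeNode type_D = {"D", &type_C};

// Start at the actual type of the object and walk upward.
// The cast is allowed only if the target type is met along the way.
bool cast_would_succeed(const TypeNode* actual_type, const TypeNode* target) {
    for (const TypeNode* node = actual_type; node != nullptr; node = node->base) {
        if (node == target) {
            return true;
        }
    }
    return false;
}
```

In this model, starting from type_A never reaches type_B or type_C, which mirrors case ①, while starting from type_D passes through both, which mirrors case ②.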

On the surface, dynamic_cast does perform downcasts, and this example demonstrates it: B and C are both derived classes of A, and pa was successfully converted from A* to B* and C*. In essence, however, dynamic_cast still only permits upcasts, because it only traverses the inheritance chain upward. The illusion arises because a derived class object can always be safely pointed to by a pointer to any of its base classes. In case ② of this example, the object pa points to is of type D, and A, B and C are all base classes of D, so pa, pb and pc can all point to it; dynamic_cast merely makes different base class pointers point to the same derived class object.

Topics: C++ Back-end