New features of C++11/14: rvalue references, move semantics and perfect forwarding

Posted by Stalingrad on Sat, 15 Jan 2022 13:50:53 +0100


1. Rvalue references

C++11 introduces rvalue references and move semantics, which avoid unnecessary copies and improve program performance.


(1) Lvalues and rvalues

Every expression in C++ is either an lvalue or an rvalue.

Lvalue: refers to a persistent object that still exists after the expression ends.
Rvalue: refers to a temporary object that no longer exists once the expression ends.

Every named variable or object is an lvalue, while rvalues have no name.
A convenient way to tell the two apart: check whether you can take the address of the expression. If you can, it is an lvalue; otherwise it is an rvalue.

Rvalues are further divided into xvalues (expiring values) and prvalues (pure rvalues).

  • Prvalue: this is the rvalue of the C++98 standard, for example the temporary returned by a function that returns by value; the result of an arithmetic expression such as 1 + 2; and literals not associated with an object, such as 2, 'c' and true. None of these can have their address taken. (String literals such as "hello" are an exception: they are lvalues.)
  • Xvalue: a new kind of expression introduced in C++11 together with rvalue references. It usually denotes an object that is about to be moved from, such as the return value of a function returning T&&, or the result of std::move().

int i=0;// i is an lvalue and 0 is an rvalue

class A {
  public:
    int a;
};
A getTemp()
{
    return A();
}
A a = getTemp();   // a is an lvalue, and the return value of getTemp() is an rvalue (a temporary)



(2) Lvalue references and rvalue references

References are very common in C++98: a reference gives a variable an alias. Because C++11 adds rvalue references, the references of C++98 are now called lvalue references.

int a = 10;
int &refA = a; 	// refA is an alias of a, so modifying refA modifies a. a is an lvalue, and refA is an lvalue reference

int &b = 1;		// Compilation error! 1 is an rvalue and cannot bind to a non-const lvalue reference

An rvalue reference in C++11 is written with &&, for example:

int &&a = 1;  	// In essence, this gives a name (alias) to an unnamed temporary
int b = 1;
int &&c = b;	// Compilation error! An lvalue cannot bind to an rvalue reference

class A {
  public:
    int a;
};
A getTemp()
{
    return A();
}
A && a = getTemp();   // The return value of getTemp() is an rvalue (a temporary)

The temporary returned by getTemp() would normally end its life when the statement finishes. Binding it to an rvalue reference gives it a new lease of life: its lifetime becomes the same as that of the rvalue reference variable a. As long as a is alive, the temporary stays alive. In effect, a is a name for that temporary.

Note: the type of a here is an rvalue reference type (A&&), but as an expression a is an lvalue, because it has a name and its address can be taken. It is a named rvalue reference.

Therefore, an lvalue reference can only bind an lvalue, and an rvalue reference can only bind an rvalue. Binding the wrong kind is a compilation error.

The const lvalue reference, however, is a special case. It can be regarded as a "catch-all" reference type: it can bind non-const lvalues, const lvalues and rvalues. When it binds an rvalue, it extends the temporary's lifetime just as an rvalue reference does. The drawback is that the object can only be read, not modified.

const int & a = 1; // A const lvalue reference binds an rvalue, and no error is reported

class A {
  public:
    int a;
};
A getTemp()
{
    return A();
}
const A &a = getTemp();   // No error, whereas A &a = getTemp() would be a compilation error

In fact, we often rely on const lvalue references without noticing it, as in the following example:

#include <iostream>
using namespace std;

class Copyable {
public:
    Copyable(){}
    Copyable(const Copyable &o) {
        cout << "Copied" << endl;
    }
};

Copyable ReturnRvalue() {
    return Copyable(); //Returns a temporary object
}

void AcceptVal(Copyable a) {
}

void AcceptRef(const Copyable& a) {
}

int main()
{
    cout << "pass by value: " << endl;
    AcceptVal(ReturnRvalue()); // Is the copy constructor called twice?
    cout << "pass by reference: " << endl;
    AcceptRef(ReturnRvalue()); // Is the copy constructor called only once?
}

After typing in the source code above, I found that the result was completely different from what I expected. I expected:

  • AcceptVal(ReturnRvalue()) calls the copy constructor twice. The Copyable object is constructed inside ReturnRvalue(); returning it calls the copy constructor to create a temporary; and calling AcceptVal() copies that temporary into the parameter a. That makes two copy constructor calls in total.
  • AcceptRef() differs in that its parameter is a const lvalue reference, which can bind the rvalue without copying it, so only the copy made on return remains.

What do you think the program prints? Is the actual output what you expected?

[Figure: output of the program]

In fact, the copy constructor is not called even once in either case!

Reason: the compiler enables return value optimization by default (RVO, Return Value Optimization, and NRVO, Named Return Value Optimization). The compiler notices that an object is created inside ReturnRvalue() and that returning it would require copy-constructing a temporary, so it constructs the object directly in the temporary's place and skips the copy. Copying that temporary into the function parameter is just as unnecessary, so in the end the three objects are merged into one and the copy constructor is never called.


Summary, where T is a concrete type:
1) An lvalue reference, written T&, can only bind lvalues.
2) An rvalue reference, written T&&, can only bind rvalues.
3) A const lvalue reference, written const T&, can bind both lvalues and rvalues.
4) A named rvalue reference is treated by the compiler as an lvalue.
5) The compiler performs return value optimization, but don't rely on it too much.


2. Move construction and move assignment

Recall how to implement a string class MyString in C++. MyString internally manages a C-style char* array. In that case you generally need to implement the copy constructor and the copy assignment operator, because the default copy is a shallow copy and the pointer resource must not be shared: otherwise, once one object is destructed, the other is left with a dangling pointer.

Let's start with some code:

#include <iostream>
#include <cstring>
#include <vector>
using namespace std;

class MyString
{
public:
	static size_t CCtor; // Counts the number of calls to the copy constructor
public:
	// Constructor
	MyString(const char* cstr = 0)
	{
		if (cstr)
		{
			m_data = new char[strlen(cstr) + 1];
			strcpy(m_data, cstr);
		}
		else
		{
			m_data = new char[1];
			*m_data = '\0';
		}
	}

	// copy constructor 
	MyString(const MyString& str)
	{
		CCtor++;
		m_data = new char[strlen(str.m_data) + 1];
		strcpy(m_data, str.m_data);
	}

	// Copy assignment operator (overloads =)
	MyString& operator=(const MyString& str)
	{
		if (this == &str) // Avoid self assignment!!
			return *this;

		delete[] m_data;
		m_data = new char[strlen(str.m_data) + 1];
		strcpy(m_data, str.m_data);
		return *this;
	}

	~MyString()
	{
		delete[] m_data;
	}

	char* get_c_str() const { return m_data; }
private:
	char* m_data;
};
size_t MyString::CCtor = 0;

int main(void)
{
	vector<MyString> vecStr;
	vecStr.reserve(1000); 			// Reserve space for 1000 elements first. Otherwise reallocation adds extra copies and the count may be much greater than 1000
	for (int i = 0;i < 1000;i++)
	{
		vecStr.push_back(MyString("hello"));
	}
	cout << MyString::CCtor << endl;

	return 0;
}

Do you see anything wrong with the code above?

The problem:
In the for loop, the copy constructor is called 1000 times. If the string built by MyString("hello") were very long, constructing it would already be expensive, and then it would have to be copied once more. MyString("hello") is only a temporary object that is useless after the copy, so the extra allocation and release of resources is pointless. If we could directly reuse the resources the temporary has already acquired, we would save both the resources and the time spent acquiring and releasing them.

The move semantics added in C++11 make this possible.
To support move semantics, a class needs two more functions: a move constructor and a move assignment operator.

#include <iostream>
#include <cstring>
#include <vector>
using namespace std;

class MyString
{
public:
    static size_t CCtor; //Counts the number of calls to the copy constructor
    static size_t MCtor; //Count the number of calls to the move constructor -- New
    static size_t CAsgn; //Count the number of calls to the copy assignment function
    static size_t MAsgn; //Count the number of calls to the move assignment function -- New

public:
    // Constructor
   MyString(const char* cstr=0){
       if (cstr) {
          m_data = new char[strlen(cstr)+1];
          strcpy(m_data, cstr);
       }
       else {
          m_data = new char[1];
          *m_data = '\0';
       }
   }

   // copy constructor 
   MyString(const MyString& str) {
       CCtor ++;
       m_data = new char[ strlen(str.m_data) + 1 ];
       strcpy(m_data, str.m_data);
   }
   // Move constructor -- New
   MyString(MyString&& str) noexcept
       :m_data(str.m_data) {
       MCtor ++;
       str.m_data = nullptr; //No longer point to previous resources
   }

   // Copy assignment operator (overloads =)
   MyString& operator=(const MyString& str){
       CAsgn ++;
       if (this == &str) // Avoid self assignment!!
          return *this;

       delete[] m_data;
       m_data = new char[ strlen(str.m_data) + 1 ];
       strcpy(m_data, str.m_data);
       return *this;
   }

   // Move assignment operator (overloads =) -- New
   MyString& operator=(MyString&& str) noexcept{
       MAsgn ++;
       if (this == &str) // Avoid self assignment!!
          return *this;

       delete[] m_data;
       m_data = str.m_data;
       str.m_data = nullptr; //No longer point to previous resources
       return *this;
   }

   ~MyString() {
       delete[] m_data;
   }

   char* get_c_str() const { return m_data; }
private:
   char* m_data;
};

size_t MyString::CCtor = 0;
size_t MyString::MCtor = 0;
size_t MyString::CAsgn = 0;
size_t MyString::MAsgn = 0;

int main()
{
    vector<MyString> vecStr;
    vecStr.reserve(1000); //Allocate 1000 spaces first
    for(int i=0;i<1000;i++){
        vecStr.push_back(MyString("hello"));
    }
    cout << "CCtor = " << MyString::CCtor << endl;
    cout << "MCtor = " << MyString::MCtor << endl;
    cout << "CAsgn = " << MyString::CAsgn << endl;
    cout << "MAsgn = " << MyString::MAsgn << endl;
}

Result output:

You can see:

  • The difference between the move constructor and the copy constructor is the parameter: the copy constructor takes const MyString& str, a const lvalue reference, while the move constructor takes MyString&& str, an rvalue reference.
  • MyString("hello") is a temporary object, an rvalue, so overload resolution prefers the move constructor over the copy constructor.
  • Unlike the copy constructor, the move constructor does not allocate new space and copy the source object into it. Instead it "steals": it points its own pointer at the other object's resource and then sets the other object's pointer to nullptr. This last step is very important. If the other object's pointer is not set to null, the resource is released when the temporary is destructed, and the "stealing" was in vain.

[Figure: the difference between copy and move]

Don't feel guilty about taking another object's resources. A temporary has a very short lifetime and is destroyed as soon as the expression has been evaluated, so leaving its resources unused would be a waste. Making full use of them is what makes the code efficient.

For an lvalue, the copy constructor is called. But some lvalues are local variables with a short lifetime. Can they be moved instead of copied?
To solve this problem, C++11 provides std::move(), which converts an lvalue into an rvalue so that move semantics can be applied. As I see it, it tells the compiler: although I am an lvalue, don't use the copy constructor for me, use the move constructor instead.

int main()
{
    vector<MyString> vecStr;
    vecStr.reserve(1000); //Allocate 1000 spaces first
    for(int i=0;i<1000;i++){
        MyString tmp("hello");
        vecStr.push_back(tmp); //The copy constructor is called
    }
    cout << "CCtor = " << MyString::CCtor << endl;
    cout << "MCtor = " << MyString::MCtor << endl;
    cout << "CAsgn = " << MyString::CAsgn << endl;
    cout << "MAsgn = " << MyString::MAsgn << endl;

    cout << endl;
    MyString::CCtor = 0;
    MyString::MCtor = 0;
    MyString::CAsgn = 0;
    MyString::MAsgn = 0;
    vector<MyString> vecStr2;
    vecStr2.reserve(1000); //Allocate 1000 spaces first
    for(int i=0;i<1000;i++){
        MyString tmp("hello");
        vecStr2.push_back(std::move(tmp)); //The move constructor is called
    }
    cout << "CCtor = " << MyString::CCtor << endl;
    cout << "MCtor = " << MyString::MCtor << endl;
    cout << "CAsgn = " << MyString::CAsgn << endl;
    cout << "MAsgn = " << MyString::MAsgn << endl;
}

Output:

Here are a few more examples:

MyString str1("hello"); //Call constructor
MyString str2("world"); //Call constructor
MyString str3(str1); //Call copy constructor
MyString str4(std::move(str1)); // Call the move constructor

// cout << str1.get_c_str() << endl; // The internal pointer of str1 is now null! Do not use it
// Note: although str1's m_data has been set to null, str1 itself is still alive. It is destructed when it leaves its scope, not immediately after the move
MyString str5;
str5 = str2; //Call copy assignment function
MyString str6;
str6 = std::move(str2); // The contents of str2 are also invalid and should not be used again

Note the following points about the examples above:

  • str6 = std::move(str2): although str2's resources have been handed to str6, str2 is not destructed immediately. It is destructed only when it leaves its scope. If you keep using str2's m_data in the meantime, unexpected errors may occur.
  • If we provide only a copy constructor and no move constructor, std::move() has no effect, but no error occurs either: since the compiler cannot find a move constructor, it falls back to the copy constructor. This works because the copy constructor's parameter is a const lvalue reference, const T&, which can also bind rvalues.
  • All containers in C++11 support move semantics. std::move only transfers ownership of resources. In essence it casts an lvalue to an rvalue so that the move constructor or move assignment operator is selected, avoiding unnecessary copies of objects that hold resources. Moving pays off for objects whose members own resources such as memory or file handles. If you use std::move on basic types such as int or a char[10] array, a copy still takes place (because there is no corresponding move constructor). Move is therefore most meaningful for objects that hold resources.


3. Universal references

Things get more complicated when rvalue references are combined with templates.
T&& does not necessarily denote an rvalue reference; it may also be an lvalue reference.

For example:

template<typename T>
void f( T&& param){
}

f(10);  //10 is the right value
int x = 10; //
f(x); //x is an lvalue

If the function template above took an rvalue reference, it could not accept an lvalue, yet in fact it can. The && here is a reference whose kind is not yet determined, called a universal reference. It must be initialized, and whether it becomes an lvalue reference or an rvalue reference depends on its initializer: initialized with an lvalue, it is an lvalue reference; initialized with an rvalue, it is an rvalue reference.

Note: && is a universal reference only when type deduction takes place (for example template argument deduction for a function template, or the auto keyword).

For example:

template<typename T>
void f( T&& param); // T has to be deduced here, so && is a universal reference

template<typename T>
class Test {
  Test(Test&& rhs); // Test is a concrete type and needs no deduction, so && is an rvalue reference
};

void f(Test<int>&& param); // rvalue reference

// A little more complicated
template<typename T>
void f(std::vector<T>&& param); // T is deduced, but the parameter has the form std::vector<T>&&
// rather than exactly T&&, so it is an rvalue reference

template<typename T>
void f(const T&& param); // rvalue reference
// A universal reference requires exactly the form T&&; any extra qualifier such as const disables it

So ultimately it depends on what T is deduced as. If T is deduced as string, then T&& is string&&, an rvalue reference. If T is deduced as string&, we get something like string& &&. For this case C++11 adds the reference collapsing rules, summarized as follows:
1) An rvalue reference to an rvalue reference collapses into an rvalue reference.
2) Every other combination of references collapses into an lvalue reference.

For example, the string& && above collapses into string&, which is an lvalue reference.

#include <iostream>
#include <type_traits>
#include <string>
using namespace std;

template<typename T>
void f(T&& param){
    if (std::is_same<string, T>::value)
        std::cout << "string" << std::endl;
    else if (std::is_same<string&, T>::value)
        std::cout << "string&" << std::endl;
    else if (std::is_same<string&&, T>::value)
        std::cout << "string&&" << std::endl;
    else if (std::is_same<int, T>::value)
        std::cout << "int" << std::endl;
    else if (std::is_same<int&, T>::value)
        std::cout << "int&" << std::endl;
    else if (std::is_same<int&&, T>::value)
        std::cout << "int&&" << std::endl;
    else
        std::cout << "unknown" << std::endl;
}

int main()

{
    int x = 1;
    f(1); // The argument is an rvalue, T is deduced as int, so the parameter is int&& param, an rvalue reference
    f(x); // The argument is an lvalue, T is deduced as int&, so int& && param collapses into int&, an lvalue reference
    int && a = 2;
    f(a); // Although a is an rvalue reference, it is itself an lvalue, so T is deduced as int&
    string str = "hello";
    f(str); // The argument is an lvalue, so T is deduced as string&
    f(string("hello")); // The argument is an rvalue, so T is deduced as string
    f(std::move(str));// The argument is an rvalue, so T is deduced as string
}

To sum up: pass in an lvalue and you get an lvalue reference; pass in an rvalue and you get an rvalue reference.


4. Perfect forwarding

Forwarding means that a function passes its parameters on to another function for processing. The original arguments may be rvalues or lvalues. If their original value category is preserved along the way, the forwarding is perfect.

void process(int& i){
    cout << "process(int&):" << i << endl;
}

void process(int&& i){
    cout << "process(int&&):" << i << endl;
}

void myforward(int&& i){
    cout << "myforward(int&&):" << i << endl;
    process(i);
}

int main()
{
    int a = 0;
    process(a); // a is treated as an lvalue: process(int&):0
    process(1); // 1 is treated as an rvalue: process(int&&):1
    process(move(a)); // a is cast from an lvalue to an rvalue: process(int&&):0
    myforward(2);  // The rvalue is passed on to process through myforward, but it arrives as an lvalue,
    // because inside myforward the rvalue has a name: process(int&):2
    myforward(move(a));  // As above, the rvalue becomes an lvalue while being forwarded: process(int&):0
    // myforward(a); // Error: an rvalue reference cannot bind an lvalue
}

The example above forwards imperfectly. C++ provides the std::forward() function template to solve this problem. Simply rewrite the myforward() function above:

void myforward(int&& i){
    cout << "myforward(int&&):" << i << endl;
    process(std::forward<int>(i));
}

myforward(2); // process(int&&):2

Even after this change, the forwarding is still not perfect: myforward() can forward rvalues but not lvalues. The solution is to combine a universal reference with std::forward() to achieve perfect forwarding. For example:

#include <iostream>
#include <cstring>
#include <vector>
using namespace std;

void RunCode(int &&m) {
    cout << "rvalue ref" << endl;
}

void RunCode(int &m) {
    cout << "lvalue ref" << endl;
}

void RunCode(const int &&m) {
    cout << "const rvalue ref" << endl;
}

void RunCode(const int &m) {
    cout << "const lvalue ref" << endl;
}

// A universal reference is used here. With T& the function would not accept rvalues, whereas T&& accepts both lvalues and rvalues
template<typename T>
void perfectForward(T && t) {
    RunCode(forward<T> (t));
}

template<typename T>
void notPerfectForward(T && t) {
    RunCode(t);
}

int main()
{
    int a = 0;
    int b = 0;
    const int c = 0;
    const int d = 0;

    notPerfectForward(a); // lvalue ref
    notPerfectForward(move(b)); // lvalue ref
    notPerfectForward(c); // const lvalue ref
    notPerfectForward(move(d)); // const lvalue ref

    cout << endl;
    perfectForward(a); // lvalue ref
    perfectForward(move(b)); // rvalue ref
    perfectForward(c); // const lvalue ref
    perfectForward(move(d)); // const rvalue ref
}

The test results of the code above show that, with universal references and std::forward working together, all four kinds of argument are forwarded perfectly.


5. emplace_back reduces copies and moves

When using vector, we have generally used push_back(). As shown above, this easily causes unnecessary copies. One solution is to add a move constructor and a move assignment operator to your own classes, but there is a simpler way: use emplace_back() instead of push_back(), as in the following example:

#include <iostream>
#include <string>
#include <vector>
using namespace std;

class A 
{
public:
    A(int i)
    {
        // cout << "A()" << endl;
        str = to_string(i);
    }

    ~A(){}
    A(const A& other): str(other.str)
    {
        cout << "A&" << endl;
    }
public:
    string str;
};


int main()
{
    vector<A> vec;
    vec.reserve(10);
    for(int i=0;i<10;i++)
    {
        vec.push_back(A(i)); // The copy constructor is called 10 times
//        vec.emplace_back(i);  // The copy constructor is not called at all
    }

    for(int i=0;i<10;i++)
        cout << vec[i].str << endl;
}

The effect is obvious. Although the running time was not measured, the number of copies is clearly reduced. emplace_back() constructs the object in place directly from the constructor's arguments, provided that a matching constructor exists.

For map and set, you can use emplace(). Basically, emplace_back() corresponds to push_back(), and emplace() corresponds to insert().

Move semantics also have a great impact on the swap() function. Previously, implementing swap could take three memory copies. With move semantics, a high-performance swap can be written.

template <typename T>
void swap(T& a, T& b)
{
    T tmp(std::move(a));
    a = std::move(b);
    b = std::move(tmp);
}

If T is movable, the whole operation is very efficient. If T is not movable, the moves fall back to copies and the function behaves like an ordinary swap. No error occurs, so it is safe either way.


6. Summary

  • There are two value categories: lvalue and rvalue.
  • There are three kinds of reference: lvalue reference, rvalue reference and universal reference. An lvalue reference can only bind an lvalue, and an rvalue reference can only bind an rvalue. A universal reference is determined by the value category of its initializer.
  • Whether an expression is an lvalue or an rvalue is independent of its type. A variable of rvalue reference type may be an lvalue or an rvalue: once the rvalue reference has a name, it is an lvalue.
  • Reference collapsing rule: an rvalue reference to an rvalue reference is still an rvalue reference, and every other combination collapses into an lvalue reference. When T&& is the parameter of a function template, passing an lvalue makes it an lvalue reference, and passing an rvalue makes it an rvalue reference (which, having a name, is itself an lvalue).
  • Move semantics reduce unnecessary memory copies. To support move semantics, a class needs a move constructor and a move assignment operator.
  • std::move() casts an lvalue to an rvalue so that the move constructor or move assignment operator is selected. The function itself does nothing else to the lvalue.
  • std::forward() and universal references are used together to achieve perfect forwarding.
  • Replacing push_back() with emplace_back() can improve performance.

Topics: C++ C++11