[C++] Simulated implementation of the string class

Posted by alfoxy on Thu, 06 Jan 2022 00:29:14 +0100

I. Simple string class design

This part implements the resource management functions of a string class: the constructor, copy constructor, assignment operator overload, and destructor.

1. Private members

The only private member is a C-style string pointer.

class string
{
public:

private:
	char* _str;
};

2. Constructor

We design a constructor whose parameter has a default value, so it also serves as the default constructor. If no argument is passed, it stores only '\0' by default, which is an empty string.

class string
{
public:
	//Constructor (incorrect writing)
	string(char* str = '\0')
	{
		_str = str;
	}

private:
	char* _str;
};

The approach above is actually wrong. When we explicitly pass an initial value (a string literal) to the string object, the literal is a constant stored in a read-only area (often called the code or constant segment): it can be read but not written. A string object whose contents cannot be modified is pointless.

Since we cannot keep a pointer to a literal in the read-only segment, can we pass a string stored on the stack instead? No. A string on the stack lives in a character array; we can modify it, but we cannot grow it, because the size is fixed when the array is defined, which makes the string's resources awkward to manage.

To be able to modify the contents and grow the capacity at any time, the string's characters are best stored on the heap. A string in heap space can be modified just like one on the stack, and new[] can allocate a buffer of whatever size is needed.

class string
	{
	public:
		//Constructor (correct version)
		string(const char* str = "")//"" and "\0" behave the same here; the character '\0' is not a valid default
		{
			// 1. Make _str point to space allocated on the heap (one extra byte for the '\0')
			_str = new char [strlen(str) + 1];

			// 2. Copy the contents of str into _str (that is, from the read-only literal str to the heap space _str)
			strcpy(_str, str);
		}

	private:
		char* _str;
	};

3. Destructor

Use delete[] to free the space we allocated, then set _str to nullptr (to avoid leaving a dangling pointer).

class string
	{
	public:
	    //Destructor
		~string()
		{
			delete[] _str;
			_str = nullptr;
		}

	private:
		char* _str;
	};

4. Copy construction and assignment overload

The copy constructor and assignment operator overload must be defined explicitly here. The compiler-generated defaults perform a shallow copy, which causes problems (the same space gets released more than once).

4.1 What is a shallow copy?

The default copy constructor and assignment operator are implemented as shallow copies. A shallow copy, also called a value copy, copies the object member by member (for built-in types, effectively byte by byte), so a pointer member ends up holding the same address in both objects.

4.2 problems caused by shallow copy
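
The aliasing behind the problem can be sketched as follows. The struct ShallowString is hypothetical (it is not the class built in this article) and deliberately has no destructor, so that the demonstration itself stays well defined.

```cpp
#include <cstring>

// Hypothetical type for illustration only: it relies on the
// compiler-generated copy constructor, which copies the pointer value.
struct ShallowString
{
	char* _str;
};

// Returns true if a compiler-generated copy shares the original's buffer.
bool copy_shares_buffer()
{
	ShallowString a;
	a._str = new char[6];
	std::strcpy(a._str, "hello");

	ShallowString b = a;   // shallow copy: b._str holds the same address as a._str
	b._str[0] = 'J';       // writing through b also changes what a sees

	bool shared = (a._str == b._str) && (a._str[0] == 'J');

	// Only one delete[] is correct here. If each object had a destructor
	// calling delete[] _str, the same block would be released twice.
	delete[] a._str;
	return shared;
}
```

With a destructor like the one in section 3, both a and b would call delete[] on the same address when they go out of scope, which is undefined behavior.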

4.3 Copy construction and assignment overloading with deep copy

Since the problem comes from two objects pointing to the same space, we allocate a new space of the same size and use strcpy to copy the contents of the other object's space into it. This is called a deep copy.

  • Shallow copy: same space, same content
  • Deep copy: the space is different and the content is the same
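
A minimal sketch of the deep-copy idea on a raw C string. The helper names deep_copy and copy_is_independent are hypothetical and only serve the illustration.

```cpp
#include <cstring>

// Hypothetical helper: allocates a new buffer and copies the characters.
char* deep_copy(const char* src)
{
	char* dst = new char[std::strlen(src) + 1]; // +1 for the terminating '\0'
	std::strcpy(dst, src);
	return dst;
}

// Returns true if the copy has the same content but lives in a different space.
bool copy_is_independent()
{
	char* a = deep_copy("hello");
	char* b = deep_copy(a);

	bool same_content = std::strcmp(a, b) == 0;
	bool different_space = (a != b);

	b[0] = 'J';                                  // does not affect a
	bool a_unchanged = std::strcmp(a, "hello") == 0;

	delete[] a;                                  // each buffer is released exactly once
	delete[] b;
	return same_content && different_space && a_unchanged;
}
```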

Copy construction

Traditional approach:

class string
	{
	public:
		//copy construction 
		string(const string& s)
		{
			// 1. Make _str point to newly allocated space of the same size as s (plus one to store the '\0')
			_str = new char[strlen(s._str) + 1];
			// 2. Copy the contents of s._str into the space _str points to
			strcpy(_str, s._str);
		}

	private:
		char* _str;
	};

Modern approach (more concise):

class string
	{
	public:
		string(const string& s)
			: _str(nullptr)//Start from nullptr so that tmp's destructor can safely delete[] it after the swap
		{
			string tmp(s._str);
			//The swap here is the one provided by the C++ standard library
			//Exchange the buffers pointed to by _str and tmp._str
			//When tmp goes out of scope at the end of the function, its destructor runs automatically and releases what tmp now holds (this object's old nullptr)
			swap(_str, tmp._str);
		}

	private:
		char* _str;
	};

Assignment overload

Both objects involved in an assignment have already been initialized, so when copying the right operand to the left operand, the left operand's old space must be released and its pointer redirected to the new space.

Traditional approach:

class string
	{
	public:
		//Assignment overload
		string& operator=(const string& s)
		{
		    if(this!=&s)
		    {
			   // 1. Allocate new space of the same size as s (plus one to store the '\0')
			   char* newstr = new char[strlen(s._str) + 1];
			   // 2. Copy the contents of s._str into the space pointed to by newstr
			   strcpy(newstr, s._str);
			   // 3. Release the old space
			   delete[] _str;
			   // 4. Make _str point to the new space that already holds the copied value
			   _str = newstr;
			}
			// 5. Return *this on every path, including self-assignment
			return *this;
		}

	private:
		char* _str;
	};

Modern approach:

class string
	{
	public:
		string& operator=(const string& s)
		{
			if (this != &s)
			{
				string tmp(s);//Copy construct s
				swap(_str, tmp._str);
			}
			return *this;
		}

	private:
		char* _str;
	};

Notes on the assignment overload:

  • Return value: to support chained assignment (a = b = c), the function returns the string itself; because *this (the left operand) still exists after the function returns, it is returned by reference, which saves one copy construction.
  • Parameter: we only read the right operand's value and copy it to the left operand without modifying it, so the parameter is declared const.
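
The effect of returning a reference can be sketched with a cut-down class. The namespace demo and the function chained_assignment_works are hypothetical names chosen for this illustration; the class body follows the copy-and-swap version shown earlier.

```cpp
#include <cstring>
#include <utility>

namespace demo // hypothetical namespace, to avoid clashing with std::string
{
	class string
	{
	public:
		string(const char* str = "")
			: _str(new char[std::strlen(str) + 1])
		{
			std::strcpy(_str, str);
		}

		string(const string& s)
			: _str(new char[std::strlen(s._str) + 1])
		{
			std::strcpy(_str, s._str);
		}

		// Returning string& lets the result be used as the right operand
		// of another assignment without constructing a copy.
		string& operator=(const string& s)
		{
			if (this != &s)
			{
				string tmp(s);
				std::swap(_str, tmp._str);
			}
			return *this;
		}

		~string() { delete[] _str; }

		const char* c_str() const { return _str; }

	private:
		char* _str;
	};
}

// Returns true if a = b = c leaves a and b equal to c,
// and self-assignment leaves the object intact.
bool chained_assignment_works()
{
	demo::string a("one"), b("two"), c("three");
	a = b = c;                 // parsed as a = (b = c)

	demo::string& alias = a;
	a = alias;                 // self-assignment: caught by the this != &s check

	return std::strcmp(a.c_str(), "three") == 0
		&& std::strcmp(b.c_str(), "three") == 0;
}
```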

II. Simulated implementation of the string class

Private members

In addition to the character array (_str), the class has _size (the current number of valid characters), _capacity (the number of valid characters that can be stored), and the static constant npos (-1 converted to size_t, i.e. the largest size_t value).

class string
{
public:

private:
	char* _str;
	size_t _size;
	size_t _capacity;
	static const size_t npos;
};

Next, we introduce several more complex interfaces

1. Capacity operations of string objects

1.1 reserve

Prototype: void reserve(size_t n = 0);
Function: expands the capacity of the string object. If n is less than or equal to the current capacity (_capacity), nothing happens; if n is greater than the current capacity, the capacity is expanded to n.

void reserve(size_t n=0)
		{
			if (n > _capacity)
			{
				char* newstr = new char[n+1];// 1. Open a new space
				strcpy(newstr, _str);        // 2. Copy old space
				delete[] _str;               // 3. Release old space
				_str = newstr;               // 4. Point to new space
				_capacity = n;               // 5. Update capacity
			}
		}
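
A usage sketch for reserve, assuming a cut-down class that keeps only the members reserve touches. The class name mini_string and its accessors are hypothetical; reserve itself is the same as above.

```cpp
#include <cstring>
#include <cstddef>

// Hypothetical cut-down class with only the members reserve needs.
class mini_string
{
public:
	mini_string(const char* str = "")
		: _size(std::strlen(str)), _capacity(_size)
	{
		_str = new char[_capacity + 1];
		std::strcpy(_str, str);
	}
	~mini_string() { delete[] _str; }
	mini_string(const mini_string&) = delete;            // copying is out of scope here
	mini_string& operator=(const mini_string&) = delete;

	void reserve(std::size_t n = 0)
	{
		if (n > _capacity)
		{
			char* newstr = new char[n + 1]; // one extra for the '\0'
			std::strcpy(newstr, _str);
			delete[] _str;
			_str = newstr;
			_capacity = n;
		}
	}

	std::size_t size() const { return _size; }
	std::size_t capacity() const { return _capacity; }
	const char* c_str() const { return _str; }

private:
	char* _str;
	std::size_t _size;
	std::size_t _capacity;
};
```

Calling reserve(3) on a mini_string holding "hello" leaves the capacity at 5, while reserve(20) raises it to 20; in both cases the size and the contents stay the same.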

1.2 resize

Prototype: void resize(size_t n, char c = '\0');
Function: changes the number of valid characters to n. If n is larger than the current size, the extra positions are filled with the character c (the default is '\0' if no c is passed); if n is smaller than the current size, the excess valid characters are truncated.

void resize(size_t n, char c = '\0')
		{
			if (n > _size)
			{
				//Check whether capacity expansion is required
				if (n > _capacity)
				{
					reserve(n);
				}
				memset(_str + _size, c, n - _size);
				_size = n;
				_str[_size] = '\0';
			}
			else if(n<_size)
			{
				_size = n;
				_str[_size] = '\0';
			}
		}

		//After simplification, it can be written like this
		void resize(size_t n, char c = '\0')
		{
			if (n>_size)
			{
				if (n > _capacity)
				{
					reserve(n);
				}
				memset(_str + _size, c, n - _size);
			}
			_size = n;
			_str[_size] = '\0';
		}

If the requested number of valid characters is greater than the current size, we use memset to fill the extra positions. Note that memset writes byte by byte, so it is generally only suitable for setting characters.

The following example illustrates the problem: we use memset to try to set every element of a 10-element integer array arr to 3.

Every byte of each element is set to 0x03, so with a 4-byte int each element becomes 0x03030303 (50529027 in decimal) instead of 3; a calculator confirms the conversion.

Byte-by-byte setting is only appropriate for filling a character array, because a character is exactly one byte in size.
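
The memset behavior described above can be reproduced with a short sketch. The function names are hypothetical, and the int result assumes a 4-byte int.

```cpp
#include <cstring>

// Fills a 10-element int array with memset(arr, 3, ...) and returns element 0.
int memset_int_demo()
{
	int arr[10];
	std::memset(arr, 3, sizeof(arr)); // every BYTE becomes 0x03
	return arr[0];                    // 0x03030303 when int is 4 bytes, not 3
}

// Fills a char array the same way; here each element really is the value given.
char memset_char_demo()
{
	char buf[10];
	std::memset(buf, 'x', sizeof(buf));
	return buf[9];
}
```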

2. String operations of string objects

2.1 c_str

Prototype: const char* c_str() const;
Function: returns a C-style string, which is read-only

It simply returns the member variable _str, whose type is char*

const char* c_str() const
{
	return _str;
}

A C-style string differs from a string object. A C-style string is delimited by '\0': output with cout stops at the first '\0'. A string object is delimited by its number of valid characters (that is, _size), regardless of whether a '\0' appears in the middle.
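
The same distinction can be observed with the standard library's std::string, used here only as a stand-in for the class being built. The helper names c_length and embedded_null_demo are hypothetical.

```cpp
#include <cstring>
#include <cstddef>
#include <string>

// Length as a C-style string sees it: counts up to the first '\0'.
std::size_t c_length(const std::string& s)
{
	return std::strlen(s.c_str());
}

// Returns true if the object's size and the C-style length disagree as expected.
bool embedded_null_demo()
{
	std::string s("ab\0cd", 5); // 5 valid characters, one of them '\0'
	return s.size() == 5 && c_length(s) == 2;
}
```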

2.2 substr

Prototype: string substr(size_t pos = 0, size_t len = npos) const;
Function: starting at index pos, extracts len characters (or as many as remain) and returns them as a new string

string substr(size_t pos = 0, size_t len = npos) const
		{
			//Since it is a substring, the starting index must be valid
			assert(pos < _size);
			//Clamp len to the number of characters remaining after pos
			if (len > _size - pos)
			{
				len = _size-pos;
			}
			char* tmp = new char[len + 1];// 1. Allocate new space (one extra for the '\0')
			strncpy(tmp, _str + pos, len);// 2. Copy the substring into the new space
			tmp[len] = '\0';              // 3. Add the terminating '\0'
			string s_tmp(tmp);            // 4. Construct a string object from the substring buffer
			delete[] tmp;                 // 5. Release the temporary buffer
			return s_tmp;                 // 6. Return the string object by value
		}
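
The length check is the subtle part of substr: len has to be compared against the number of characters remaining after pos, not against the total size. A sketch of that check in isolation, with the hypothetical helper name clamp_len:

```cpp
#include <cstddef>

// Hypothetical helper mirroring the length check in substr: the number of
// characters actually copied is the smaller of len and what remains after pos.
// The caller is expected to guarantee pos < size.
std::size_t clamp_len(std::size_t size, std::size_t pos, std::size_t len)
{
	std::size_t remaining = size - pos;
	return len > remaining ? remaining : len;
}
```

For an 11-character string such as "hello world" with pos = 6, a request for 8 characters is clamped to 5. Comparing len against the total size (8 > 11 is false) would let the copy run past the end of the valid characters.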

3. Modification operations of string objects

3.1 insert

Prototype: string& insert(size_t pos, const char* s);
Function: inserts a string at position pos

string& insert(size_t pos, const char* str)
		{
			assert(pos <= _size);
			// 1. Check whether the capacity is sufficient; if not, expand it
			size_t len = strlen(str);
			if (_size + len > _capacity)
			{
				reserve(_size + len);
			}
			// 2. With enough space guaranteed, move the data back one character at a time (including the '\0')
			// end is unsigned, so compare as int: otherwise end-- at 0 wraps around and the loop never stops when pos is 0
			size_t end = _size;
			while ((int)pos <= (int)end)
			{
				_str[end + len] = _str[end];
				end--;
			}
			// 3. After moving, copy the new data into the gap (without a terminating '\0')
			strncpy(_str + pos, str, len);
			_size += len;
			return *this;
		}

Some notes on insert
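
One detail of the loop that is easy to miss: end is a size_t, so when pos is 0 the final end-- wraps around to a very large value instead of becoming -1, which is presumably why the loop condition compares both values as int. A sketch of an alternative that avoids the cast by shifting the tail with memmove (the function name shift_insert is hypothetical, and the buffer is assumed to be large enough):

```cpp
#include <cstring>
#include <cstddef>

// Hypothetical free-function version of insert working on a raw buffer.
// Assumes buf has room for size + strlen(str) + 1 characters and pos <= size.
// Returns the new length.
std::size_t shift_insert(char* buf, std::size_t size, std::size_t pos, const char* str)
{
	std::size_t len = std::strlen(str);
	// Move the tail, including the terminating '\0', len places to the right.
	// memmove is defined for overlapping ranges, and no index goes below zero.
	std::memmove(buf + pos + len, buf + pos, size - pos + 1);
	// Copy the new characters into the gap, without a terminating '\0'.
	std::memcpy(buf + pos, str, len);
	return size + len;
}
```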

3.2 erase

Prototype: string& erase(size_t pos = 0, size_t len = npos);
Function: deletes len characters starting at index pos

string& erase(size_t pos = 0, size_t len = npos)
		{
			assert(pos < _size);
			// 1. If len is at least the number of valid characters after pos, delete everything from pos onwards
			if (len >= _size - pos)
			{
				len = _size - pos;
				resize(pos);
			}
			else// 2. Otherwise a middle section is deleted: move the tail (including the '\0') forward over it
			{
				// Source and destination overlap, so use memmove rather than strncpy
				memmove(_str + pos, _str + pos + len, _size - pos - len + 1);
				_size -= len;
			}
			return *this;
		}

Some notes on erase
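
A point worth keeping in mind for erase: when the tail is shifted left, the source and destination ranges overlap. strcpy, strncpy and memcpy are not specified for overlapping ranges, whereas memmove is. A sketch on a raw buffer, with the hypothetical function name shift_erase:

```cpp
#include <cstring>
#include <cstddef>

// Hypothetical free-function version of erase working on a raw buffer.
// Assumes pos < size. Returns the new length.
std::size_t shift_erase(char* buf, std::size_t size, std::size_t pos, std::size_t len)
{
	if (len >= size - pos)   // erase everything from pos onwards
	{
		buf[pos] = '\0';
		return pos;
	}
	// Move the tail, including the terminating '\0', over the erased section.
	std::memmove(buf + pos, buf + pos + len, size - pos - len + 1);
	return size - len;
}
```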

4. Non-member functions of the string class

4.1 operator<<

Prototype: ostream& operator<<(ostream& out, const string& s);
Function: overloads the << operator for the string class

ostream& operator<<(ostream& out, const string& s)
	{
		// Output the characters one by one, size() characters in total
		for (size_t i = 0; i < s.size(); i++)
		{
			out << s[i]; 
		}
		// Finally, return out so that << operations can be chained
		return out;
	}

4.2 operator>>

Prototype: istream& operator>>(istream& in, string& s);
Function: overloads the >> operator for the string class

Like scanf with %s in C, this operator cannot read spaces or newlines as part of the string: either one ends the input.

istream& operator>>(istream& in, string& s)
	{
		while (1)
		{
			char c = in.get();// Read from the buffer one character at a time

			//Stop at a space, a newline, or when the stream runs out of input
			if (!in || c == ' ' || c == '\n')
			{
				break;
			}
			else// Otherwise, append the character to the end of the object
			{
				s += c;
			}
		}
		return in;
	}

4.3 getline

Prototype: istream& getline(istream& is, string& str);
Function: reads one line of data into a string object (spaces are included)

Comparable to gets() in C, which also reads a whole line

istream& getline(istream& in, string& s)
	{
		while (1)
		{
			char c = in.get();
			if (!in || c == '\n')// Keep reading from the buffer; stop only at a newline or at end of input
			{
				break;
			}
			else
			{
				s += c;
			}
		}
		return in;
	}
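
The difference between operator>> and getline can be seen with the standard library versions, used here as a stand-in for the implementations above. The helper names first_word and first_line are hypothetical.

```cpp
#include <sstream>
#include <string>

// Reads the first token with operator>>.
std::string first_word(const std::string& input)
{
	std::istringstream in(input);
	std::string s;
	in >> s;               // stops at the first whitespace character
	return s;
}

// Reads the first line with getline.
std::string first_line(const std::string& input)
{
	std::istringstream in(input);
	std::string s;
	std::getline(in, s);   // stops only at '\n', which is consumed but not stored
	return s;
}
```

For the input "hello world\nnext", first_word yields "hello" and first_line yields "hello world". Note that the standard operator>> also skips leading whitespace and clears the target string first, which the simplified version in this article does not do.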

5. Iterator of the string class

For the string class, the iterator is a typedef of char* (it should be declared in the public section of the class). Since it is a char*, it can be used just like a pointer in C.

class string
{
public:
    //iterator of string class
	typedef char* iterator;

private:
	char* _str;
	size_t _size;
	size_t _capacity;
	static const size_t npos;
};

We can use the iterator to traverse a string object

void test_string()
	{
		string s("hello");
		my_string::string::iterator it = s.begin();
		while (it != s.end())
		{
			cout << *it << " ";
			it++;
		}
		cout << endl;
	}
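
The traversal above relies on begin() and end(), which are not shown in the class. A minimal sketch of how they might look, assuming begin() returns _str and end() returns _str + _size; the namespace my_string_sketch and the function traverse_both_ways are hypothetical names.

```cpp
#include <cstring>
#include <cstddef>
#include <string>

namespace my_string_sketch // hypothetical; the article's namespace is my_string
{
	class string
	{
	public:
		typedef char* iterator;

		string(const char* str = "")
			: _size(std::strlen(str))
		{
			_str = new char[_size + 1];
			std::strcpy(_str, str);
		}
		~string() { delete[] _str; }
		string(const string&) = delete;            // copying is out of scope here
		string& operator=(const string&) = delete;

		iterator begin() { return _str; }          // first valid character
		iterator end() { return _str + _size; }    // one past the last valid character

	private:
		char* _str;
		std::size_t _size;
	};
}

// Collects the characters twice: with an explicit iterator loop and with range-for.
std::string traverse_both_ways(const char* text)
{
	my_string_sketch::string s(text);
	std::string out;
	for (my_string_sketch::string::iterator it = s.begin(); it != s.end(); ++it)
	{
		out += *it;
	}
	out += '|';
	for (char c : s)   // the compiler expands this into calls to begin() and end()
	{
		out += c;
	}
	return out;
}
```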

In fact, a range-based for loop written with auto is also iterator-based underneath: the compiler rewrites it into a loop over begin() and end().

Topics: C++ string STL