[C++ from getting started to kicking the door] Chapter 7: simulating string

Posted by yuan on Fri, 04 Mar 2022 10:47:20 +0100


Use of string

First, get familiar with how string is used. The string user manual is a good reference.

A string class stores a sequence of characters, so it has to manage dynamic memory itself.
Following the string of the standard library, our class has three private member variables:

class mystring
{
    char* _str;//Points to the heap buffer where the string is stored
    size_t _size;//Actual string length, excluding '\0'
    size_t _capacity;//Number of characters the buffer can hold, excluding '\0'
};

Default member function

Constructor

A constructor with a default argument. The default argument is an empty string (which still ends with '\0'). _capacity and _size are both initialized to the actual length of the string, excluding the '\0'.

mystring(const char* str = "") : _size(strlen(str)), _capacity(_size)
	{
		//Open up space
		_str = new char[_capacity + 1];
		strcpy(_str,str);
	}

//Alternatively, split this into two constructors: a default constructor and an ordinary constructor that takes a parameter

Copy constructor

Incorrect example:

mystring(const mystring& s):_size(s._size),_capacity(s._capacity)
{
    _str = s._str;
}

This is the problem of deep versus shallow copy. The incorrect example is a shallow copy (value copy).

Shallow copy: only the values are copied. If a value is a pointer, the copied pointer and the source pointer point to the same memory.

If one object changes the string, the other object sharing that memory is affected too. Worse, when the destructors run for both objects, the same memory is released twice.

Therefore a deep copy is needed: the new object allocates its own heap space and copies the contents of the source object's heap space into it. The two objects are then independent of each other.

Correct version:

mystring(const mystring& s):_size(s._size),_capacity(s._capacity)
{
	//Allocate space, with one extra byte for '\0'
	_str = new char[_capacity + 1];
	strcpy(_str, s._str);
}
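To make the difference observable, here is a minimal sketch of a deep copy. DeepBuf and copy_is_independent are hypothetical names used only for this illustration, and the class is trimmed down to a single pointer member, so it is not the mystring class itself.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical minimal class, only to illustrate deep copy.
class DeepBuf
{
public:
    explicit DeepBuf(const char* s) : _str(nullptr)
    {
        assert(s != nullptr);
        _str = new char[std::strlen(s) + 1];   // +1 for '\0'
        std::strcpy(_str, s);
    }

    // Deep copy: allocate a separate block and copy the characters into it.
    DeepBuf(const DeepBuf& other) : _str(new char[std::strlen(other._str) + 1])
    {
        std::strcpy(_str, other._str);
    }

    DeepBuf& operator=(const DeepBuf&) = delete; // not needed for this sketch

    ~DeepBuf() { delete[] _str; }

    char* data() { return _str; }
    const char* data() const { return _str; }

private:
    char* _str;
};

// Returns true if modifying the copy leaves the source untouched.
bool copy_is_independent()
{
    DeepBuf a("hello");
    DeepBuf b(a);          // deep copy
    b.data()[0] = 'j';     // modify only the copy
    return a.data() != b.data()
        && std::strcmp(a.data(), "hello") == 0
        && std::strcmp(b.data(), "jello") == 0;
}
```

With a shallow copy, a.data() and b.data() would be the same pointer, the change would show up in both objects, and both destructors would delete the same block.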

Another way to write it: use the source object's _str pointer to construct a temporary object temp, then swap the contents of temp and *this. *this ends up owning the new copy, and temp carries the old (empty) state away when it is destroyed.

mystring(const mystring&s):_str(nullptr),_size(0),_capacity(0)
{
    mystring temp(s._str);//Construct temporary objects
    swap(temp._str,_str);
    swap(temp._size,_size);
    swap(temp._capacity,_capacity);

}

Assignment operator overload

Allocate new space, then copy the value over

mystring& operator=(const mystring& s)
{
	if (this != &s)
	{
		char* temp = new char[s._capacity + 1];//Allocate first: if new throws, *this is left untouched. Only then free the old buffer
		strcpy(temp, s._str);
		delete[] _str;
		_str = temp;
		_size = s._size;
		_capacity = s._capacity;
	}

	return *this;
}

Following the second version of the copy constructor (reusing a constructor), the assignment overload can reuse the copy constructor to build a temporary object.

To make the exchange convenient, wrap a member swap first. (Calling the library's std::swap directly on two mystring objects goes through one copy construction and two assignments, which means three deep copies.)

	void swap( mystring& s2)
	{
		std::swap(_str, s2._str);
		std::swap(_size, s2._size);
		std::swap(_capacity, s2._capacity);
	}
mystring& operator=(const mystring& s)
{
	//Reuse the copy constructor to build a temporary object, then swap *this with it
	mystring temp(s);
	swap(temp);
	return *this;
}

Going one step further, if the parameter is passed by value, the copy constructor is called automatically when the argument is passed

mystring& operator=(mystring s)
{
	swap(s);
    return *this;
}
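The pass-by-value version can be tried out in isolation with the sketch below. SwapString and check_assignment are hypothetical names for this illustration, and the class keeps only two members so that the example stays short.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <utility>

// Hypothetical stand-in for mystring, trimmed to two members.
class SwapString
{
public:
    SwapString(const char* s = "")
        : _str(new char[std::strlen(s) + 1]), _size(std::strlen(s))
    {
        std::memcpy(_str, s, _size + 1);
    }

    SwapString(const SwapString& o)
        : _str(new char[o._size + 1]), _size(o._size)
    {
        std::memcpy(_str, o._str, _size + 1);
    }

    ~SwapString() { delete[] _str; }

    // Member swap: exchanges the pointers and sizes, no allocation involved.
    void swap(SwapString& o)
    {
        std::swap(_str, o._str);
        std::swap(_size, o._size);
    }

    // The parameter is a copy made by the copy constructor.
    // After the swap it holds the old buffer and frees it on return.
    SwapString& operator=(SwapString s)
    {
        swap(s);
        return *this;
    }

    const char* c_str() const { return _str; }
    std::size_t size() const { return _size; }

private:
    char* _str;
    std::size_t _size;
};

void check_assignment()
{
    SwapString a("abc");
    SwapString b("wxyz");
    a = b;
    assert(std::strcmp(a.c_str(), "wxyz") == 0);
    assert(a.c_str() != b.c_str());      // separate buffers

    SwapString& alias = a;
    a = alias;                           // self-assignment is safe without a this != &s check
    assert(std::strcmp(a.c_str(), "wxyz") == 0);
}
```

Self-assignment needs no special case here because the copy is made before anything in *this is touched.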

Destructor

Heap space was allocated here. The default destructor will not delete it for us, so we need to free the space manually in the destructor

~mystring()
{
	delete[] _str;
	_str = nullptr;
	_capacity = 0;
	_size = 0;
}

Iterator-related functions

begin and end and rbegin and rend

Since string uses contiguous storage, a native pointer can serve as its iterator.

typedef char* iterator;
typedef const char* const_iterator;

	iterator begin()
	{
		return _str;
	}

	const_iterator begin()const
	{
		return _str;
	}

	iterator end()
	{
		return _str+_size;
	}

	const_iterator end()const
	{
		return _str+_size;
	}
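The code above covers begin and end only. One possible way to get rbegin and rend on top of pointer iterators is std::reverse_iterator, as in the sketch below. CharRange, forward_copy and backward_copy are hypothetical names, CharRange is a non-owning helper rather than the mystring class, and std::string is used only to collect the results.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <iterator>
#include <string>

// Hypothetical non-owning helper, only to demonstrate pointer-based iterators.
class CharRange
{
public:
    typedef const char* const_iterator;
    typedef std::reverse_iterator<const_iterator> const_reverse_iterator;

    explicit CharRange(const char* s) : _str(s), _size(0)
    {
        assert(s != nullptr);
        _size = std::strlen(s);
    }

    const_iterator begin() const { return _str; }
    const_iterator end() const { return _str + _size; }

    // rbegin wraps end(), rend wraps begin().
    const_reverse_iterator rbegin() const { return const_reverse_iterator(end()); }
    const_reverse_iterator rend() const { return const_reverse_iterator(begin()); }

private:
    const char* _str;
    std::size_t _size;
};

// Range-based for only needs begin() and end().
std::string forward_copy(const CharRange& r)
{
    std::string out;
    for (char ch : r)
    {
        out += ch;
    }
    return out;
}

// Walks the same characters from back to front.
std::string backward_copy(const CharRange& r)
{
    return std::string(r.rbegin(), r.rend());
}
```

Because the iterators are plain pointers, standard algorithms such as std::reverse also work on them directly.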

Capacity management related functions

size , capacity

These return the actual size and the capacity of the current object. Since they do not modify the object, they can be declared const.

⚠ Note: the this pointer itself is already top-level const (it cannot be made to point elsewhere). The const after the parameter list adds low-level const, so *this cannot be modified either.

size_t size()const
{
	return _size;
}

size_t capacity()const
{
	return _capacity;
}

reserve , resize

  • reserve: only expands capacity, never shrinks it
// reserve - capacity expansion
void reserve(size_t n)
{
	//Expand only when n > _capacity
	if (n > _capacity)
	{
		char* temp = new char[n + 1];
		//memcpy copies exactly _size + 1 bytes, including any '\0' in the middle
		memcpy(temp, _str, _size + 1);
		delete[] _str;
		_str = temp;
		_capacity = n;
	}
}

strcpy is not used here, because the string may contain '\0' in the middle. For example, "hello\0world" would only be copied as "hello", and the rest of the data would be lost. strncpy has the same problem: it also stops at the first '\0' and pads the remainder with '\0', so memcpy is used instead.
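A small check of how the copy functions treat a '\0' in the middle of the data. The function names and the kSource buffer are made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// "hello", an embedded '\0', "world", plus the terminating '\0': 12 bytes in total.
static const char kSource[] = "hello\0world";
static const std::size_t kBytes = sizeof(kSource);

// strncpy stops copying at the first '\0' and fills the rest with '\0'.
bool strncpy_preserves_all_bytes()
{
    char dst[kBytes];
    std::memset(dst, '#', kBytes);
    std::strncpy(dst, kSource, kBytes);
    return std::memcmp(dst, kSource, kBytes) == 0;
}

// memcpy copies exactly the requested number of bytes.
bool memcpy_preserves_all_bytes()
{
    char dst[kBytes];
    std::memset(dst, '#', kBytes);
    std::memcpy(dst, kSource, kBytes);
    return std::memcmp(dst, kSource, kBytes) == 0;
}

void check_copies()
{
    assert(!strncpy_preserves_all_bytes()); // "world" is lost
    assert(memcpy_preserves_all_bytes());   // every byte survives
}
```

This is why a byte-count copy is the safer choice whenever the length is already known from _size.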

  • resize
    When n > _size, the string is extended to length n and the new positions are filled with the given character (default '\0'), expanding the capacity if necessary. When n < _size, the string is truncated to n characters and the capacity is not changed.
//resize
void resize(size_t n, char c = '\0')
{
	if (n < _size)
	{
		_str[n] = '\0';
		_size = n;
	}

	else
	{
		reserve(n);
		memset(_str + _size, c, sizeof(char) * (n - _size));
		_size = n;
		_str[_size] = '\0';
	}
}

clear , empty

void clear()
{
	resize(0);
}

//Compare _size rather than using strcmp, so that a string starting with '\0' is not reported as empty
bool empty()const
{
	return _size == 0;
}

Functions for accessing and searching strings

operator[]

//Return reference for easy modification
char& operator[](size_t pos)
{
	assert(pos < _size);
	return _str[pos];
}

//const overload: read-only access for const objects
const char& operator[](size_t pos)const
{
	assert(pos < _size);
	return _str[pos];
}

c_str

char* c_str()
{
    return _str;
}

const char* c_str()const
{
    return _str;
}

find

//Find single character
size_t find(const char ch,size_t pos=0)const
{
    while(pos < _size)
    {
        if (_str[pos] == ch)
        {
            return pos;
        }
        pos++;
    }
    return npos;
}

//Find substring
size_t find(const char* s, size_t pos = 0)const
{
    const char* ret = strstr(_str + pos,s);
    if (ret)
    {
        return ret - _str;
    }
    else
    {
        return npos;
    }
}
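The find functions return npos, which the snippets above use without showing its definition. A minimal sketch, assuming it is defined the same way as in the standard library (the largest size_t value), with a free function find_in as a hypothetical stand-in for the member function:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Assumed definition: the largest size_t value, used as "not found".
static const std::size_t npos = static_cast<std::size_t>(-1);

// Hypothetical free-function version of the substring find.
// str must be NUL-terminated and size is its length.
std::size_t find_in(const char* str, std::size_t size, const char* s, std::size_t pos = 0)
{
    assert(str != nullptr && s != nullptr);
    if (pos > size)
    {
        return npos;                     // never read past the end of the buffer
    }
    const char* ret = std::strstr(str + pos, s);
    return ret ? static_cast<std::size_t>(ret - str) : npos;
}
```

Callers compare the result against npos rather than against -1 or 0, since 0 is a valid match position.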

rfind

//Find character
//Reverse a copy of the parent string, reuse find on it, then convert the returned index back.
size_t rfind(const char ch, size_t pos = npos)const
{
    mystring temp(*this);
    reverse(temp.begin(), temp.end());
    if (pos >= _size)
    {
        pos = _size - 1;
    }
    size_t rpos = _size - pos - 1;
    //Searching *this backward from pos is the same as searching temp forward from rpos
    size_t ret = temp.find(ch, rpos);
    if (ret == npos)
    {
        return npos;
    }
    else
    {
        return _size - ret - 1;
    }

}

//Find substring
//Reversal would work here too, but the substring would also have to be reversed into extra space and then searched for in the reversed parent string
//Instead, walk forward with find and remember the last match that starts at or before pos
size_t rfind(const char* s, size_t pos = npos)const
{
    size_t ret = npos;
    size_t locate = find(s);//Either npos or the index of the first match (which may lie before or after pos)
    while (locate != npos && locate <= pos)
    {
        ret = locate;
        if (locate >= _size)//Only an empty s can match here; stop before searching past the end
        {
            break;
        }
        locate = find(s, locate + 1);
    }
    return ret;
}

Functions that modify the string

The difficulty of this kind of function lies in capacity expansion and shift: reusing the reserve function makes capacity expansion easy.

push_back

Add the character at position _size. Don't forget to put a '\0' after it

//push_back
void push_back(char ch)
{
    //Check expansion
    if (_size == _capacity)
    {
        reserve(_capacity == 0 ? 4 : 2 * _capacity);
    }
    _str[_size] = ch;
    _size++;
    _str[_size] = '\0';
}
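The growth rule in push_back (start at 4, then double) keeps the number of reallocations small. The sketch below only simulates the size and capacity counters, starting from an empty string with capacity 0; count_expansions and check_growth are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

// Counts how often the growth rule from push_back would call reserve
// while n characters are appended to an empty string with capacity 0.
std::size_t count_expansions(std::size_t n)
{
    std::size_t size = 0;
    std::size_t capacity = 0;
    std::size_t expansions = 0;
    for (std::size_t i = 0; i < n; ++i)
    {
        if (size == capacity)
        {
            capacity = (capacity == 0) ? 4 : 2 * capacity;
            ++expansions;
        }
        ++size;
    }
    return expansions;
}

void check_growth()
{
    assert(count_expansions(4) == 1);     // capacity 4 is enough
    assert(count_expansions(5) == 2);     // 4 -> 8
    assert(count_expansions(1000) == 9);  // 4, 8, 16, ..., 1024
}
```

Growing by a fixed amount instead would make the number of reallocations proportional to n, and each reallocation copies the whole string.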

append

The expansion has to take the length of the appended string into account, and _size has to be updated to match

//The capacity expansion of append needs to be considered according to the size of the inserted string
void append(const char* s)
{
    size_t len = strlen(s);
    if (len + _size > _capacity)
    {
        reserve(len + _size);//When _capacity is 0, _size is 0 as well, so this covers both cases
    }
    strcpy(_str + _size, s);
    _size += len;
}

+=

Reuse push_back and append

//+=
mystring& operator+=(char ch)
{
    push_back(ch);
    return *this;
}

mystring& operator+=(const char* s)
{
    append(s);
    return *this;
}

pop_back

void pop_back()
{
	assert(_size > 0);//An empty string has nothing to remove
	_str[--_size] = '\0';
}

insert

//Insert a single character
mystring& insert(size_t pos, char ch)
{
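    assert(pos <= _size);//pos == _size is allowed: insert right before the terminating '\0'
    //Consider capacity expansion
    if (_size == _capacity)
    {
        reserve(_capacity == 0 ? 4 : 2 * _capacity);//Compare with ==; a single = would overwrite _capacity
    }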
    if (pos == _size)
    {
        push_back(ch);
    }
    else
    {
        memmove(_str + pos + 1, _str + pos, _size - pos+1);
        _str[pos] = ch;
        _size += 1;
    }
    return *this;
}

//Insert string
mystring& insert(size_t pos, const char* s)
{
    assert(pos <= _size);
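    size_t len = strlen(s);
    //Consider capacity expansion
    if (_size+len> _capacity)
    {
        reserve(_size+len);
    }
    //Shift the tail (including '\0') right by len while _size is still the old length, then update _size
    memmove(_str + pos + len, _str + pos, _size - pos + 1);
    memcpy(_str + pos, s, len);
    _size += len;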
    return *this;

}
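The shifting step of insert can be tested on its own with a plain character buffer. insert_into is a hypothetical helper, and it assumes the caller has already made sure the buffer is large enough, which is the job reserve does in the class.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Inserts s at pos into the NUL-terminated text of length size stored in buf.
// Assumes buf has room for size + strlen(s) + 1 bytes. Returns the new length.
std::size_t insert_into(char* buf, std::size_t size, std::size_t pos, const char* s)
{
    assert(pos <= size);
    std::size_t len = std::strlen(s);
    // Shift the tail, including '\0', right by len. The two regions overlap,
    // so memmove is required; the byte count is computed from the OLD size.
    std::memmove(buf + pos + len, buf + pos, size - pos + 1);
    // Copy the new characters into the gap, without their '\0'.
    std::memcpy(buf + pos, s, len);
    return size + len;
}
```

For example, inserting "ll" at position 2 of "heo" moves "o" and the terminator two places to the right and yields "hello" with length 5.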

erase

mystring& erase(size_t pos = 0, size_t len = npos)
{
    assert(pos < _size);
    //Comparing len with _size - pos avoids overflow in pos + len
    if (len >= _size - pos)
    {
        _str[pos] = '\0';
        _size = pos;
    }
    else
    {
        memmove(_str + pos, _str + pos + len, _size - pos - len + 1);
        _size -= len;
    }
    return *this;
}
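The two branches of erase can likewise be exercised on a plain buffer. erase_from is a hypothetical helper for this illustration; it compares len with the number of characters left after pos, so that a very large len cannot overflow pos + len.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Removes up to len characters starting at pos from the NUL-terminated
// text of length size stored in buf. Returns the new length.
std::size_t erase_from(char* buf, std::size_t size, std::size_t pos, std::size_t len)
{
    assert(pos <= size);
    if (len >= size - pos)
    {
        // Everything from pos onward goes away: just cut the string here.
        buf[pos] = '\0';
        return pos;
    }
    // Move the tail, including '\0', left over the erased range.
    std::memmove(buf + pos, buf + pos + len, size - pos - len + 1);
    return size - len;
}
```

Passing the largest size_t value as len takes the first branch and truncates at pos, which matches the behavior of the npos default.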

Relational operator overloading

We implement these as global functions

bool operator<(const mystring& s1, const mystring& s2)
{
	return strcmp(s1.c_str(), s2.c_str()) < 0;
}

bool operator==(const mystring& s1, const mystring& s2)
{
	return strcmp(s1.c_str(), s2.c_str()) == 0;
}

bool operator>(const mystring& s1, const mystring& s2)
{
	return strcmp(s1.c_str(), s2.c_str()) > 0;
}

bool operator>=(const mystring& s1, const mystring& s2)
{
	return !(s1 < s2);
}

bool operator<=(const mystring& s1, const mystring& s2)
{
	return !(s1 > s2);
}

Stream operator overload

Because of the operand order of the stream extraction and insertion operators, it is inconvenient to make them member functions (the this pointer would always be the left operand), so they are defined as global functions here.

//Input >>
istream& operator>>(istream& in, mystring& s)
{
	//Clear the original content of the string first
	s.clear();
	char ch;
	//Stop at ' ' or '\n'. in.get(ch) also fails at end of input, which ends the loop
	while (in.get(ch) && ch != ' ' && ch != '\n')
	{
		s += ch;
	}
	return in;
}

istream& getline(istream& in, mystring& s)
{
	//Clear the original content of the string first
	s.clear();
	char ch;
	//Read one line: stop at the newline character or at end of input
	while (in.get(ch) && ch != '\n')
	{
		s += ch;
	}
	return in;
}

//Output<<
ostream& operator<<(ostream& out,const mystring&s)
{
	//out << s.c_str();// This version stops at the first '\0', so it breaks when the string contains one

	for (auto ch : s)
	{
		out << ch;
	}
	return out;
}
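The reading and writing loops can be tried without a console by using string streams. In the sketch below, read_word and write_all are hypothetical free functions, and std::string stands in for mystring so that the example is self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Reads characters until ' ', '\n' or end of input.
// in.get(ch) returns the stream, which tests false once reading fails.
std::istream& read_word(std::istream& in, std::string& s)
{
    s.clear();
    char ch;
    while (in.get(ch) && ch != ' ' && ch != '\n')
    {
        s += ch;
    }
    return in;
}

// Writes exactly size characters one by one, so an embedded '\0' is kept.
std::ostream& write_all(std::ostream& out, const char* data, std::size_t size)
{
    for (std::size_t i = 0; i < size; ++i)
    {
        out << data[i];
    }
    return out;
}

void check_streams()
{
    std::istringstream in("hello world");
    std::string word;
    read_word(in, word);
    assert(word == "hello");
    read_word(in, word);          // ends at end of input instead of looping forever
    assert(word == "world");

    std::ostringstream out;
    write_all(out, "hello\0world", 11);
    assert(out.str().size() == 11);
}
```

Unlike the standard operator>>, this loop does not skip leading whitespace, so a leading space produces an empty word.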
Green mountains do not change, green water flows long
