[C++] STL containers: string

Posted by docmattman on Wed, 02 Feb 2022 06:51:42 +0100

Preface

C++ introduces object-oriented programming. Compared with C, a class makes it easier to manage and operate on a data structure.

In C, we maintain a string with a character array and the library functions in string.h. The data is separated from the functions that operate on it, and because the programmer manages the underlying storage by hand, a careless write can run past the end of the array.

In C++, following the object-oriented idea, the string class was created to manage strings. In essence, the string class is an encapsulated character array.
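As a rough illustration of that difference, the sketch below joins two strings both ways. The function names join_c and join_cpp are made up for this example; only strlen, strcpy, strcat and std::string come from the standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// C style: the caller owns the buffer and must check that the result fits.
// Returns false instead of writing past the end of dst.
bool join_c(char* dst, std::size_t dst_size, const char* a, const char* b) {
    assert(dst != nullptr && a != nullptr && b != nullptr);
    if (std::strlen(a) + std::strlen(b) + 1 > dst_size) {
        return false; // the result (plus '\0') would cross the boundary
    }
    std::strcpy(dst, a);
    std::strcat(dst, b);
    return true;
}

// C++ style: std::string grows its own storage as needed.
std::string join_cpp(const std::string& a, const std::string& b) {
    return a + b;
}
```

With a 16-byte buffer, join_c succeeds for "hello, " + "world" but has to refuse "hello, " + "wide world!", while join_cpp handles both without any size bookkeeping by the caller.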

1. Introduction to string

When learning the STL, the documentation is our best tool. Learning to look things up in it gets twice the result with half the effort. Here are two C++ documentation websites:

Description of string:

  1. A string is a class that represents a sequence of characters.
  2. The standard string class provides support for such objects. Its interface is similar to that of a standard container of bytes, with additional features designed specifically for operating on strings of single-byte characters.
  3. The string class uses char as its character type, together with the default char_traits and allocator types (for more information about the template, see basic_string).
  4. The string class is an instance of the basic_string class template: it instantiates basic_string with char, and char_traits and allocator are the default template arguments of basic_string (for more information about the template, see basic_string).
  5. Note that this class handles bytes independently of the encoding used: if it is used to handle sequences of multi-byte or variable-length characters (such as UTF-8), all members of this class (such as length or size) and its iterators still operate in terms of bytes, not actual encoded characters.
typedef basic_string<char, char_traits<char>, allocator<char>> string;

In other words, the string we usually talk about is really the basic_string class template instantiated with the single-byte type char and given the name string by a typedef. basic_string can also be instantiated with the wide character type wchar_t (this gives wstring), which is used for characters that ASCII cannot represent, such as Chinese.
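Both points can be checked with a few lines. This is a minimal sketch: the function name utf8_bytes_of_ni_hao is invented for the example, and the two Chinese characters are written as explicit UTF-8 bytes so the result does not depend on the encoding of the source file.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>

// std::string is only another name for basic_string instantiated with char,
// and std::wstring is the same template instantiated with wchar_t.
static_assert(std::is_same<std::string, std::basic_string<char>>::value,
              "string is basic_string<char>");
static_assert(std::is_same<std::wstring, std::basic_string<wchar_t>>::value,
              "wstring is basic_string<wchar_t>");

// Two Chinese characters, three UTF-8 bytes each.
std::size_t utf8_bytes_of_ni_hao() {
    const std::string s = "\xe4\xbd\xa0\xe5\xa5\xbd";
    assert(s.length() == s.size()); // the two members always agree
    return s.size();                // counts bytes, not characters
}
```

The function returns 6, not 2, which is exactly the byte-based behaviour described in point 5.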

2. Common interfaces of string and their simulated implementation

2.1 Common constructors of string objects

| Function name | Description |
| --- | --- |
| string() | Constructs an empty string object, that is, an empty string |
| string(const char* s) | Constructs a string object from a C-style string |
| string(size_t n, char c) | Constructs a string object that contains n copies of the character c |
| string(const string& s) | Copy constructor |
| ~string() | Destructor |
| operator= | Assignment overload: assigns one string object to another |

Use of multiple constructors:

void Teststring()
{
	string s1;                // Construct an empty string object s1
	string s2("hello world"); // Construct string object s2 from a C-style string
	string s3(s2);            // Copy-construct s3 from s2
}

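Simulated implementation of the interface:

  • Constructor
// Constructor whose parameter has a default value, so it also serves as the default constructor
// (strlen and strcpy mean this only works when T is char)
basic_string(const T* s = "")
	:_size(strlen(s))
	,_capacity(strlen(s))
{
	_str = new T[_size + 1]; // one extra T holds the terminating '\0'
	strcpy(_str, s);
}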
  • Destructor
//Destructor
~basic_string() {
	delete[] _str;
	_str = nullptr;
	_size = 0;
	_capacity = 0;
}
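  • Copy construction
// Copy construction (modern style): build a temporary, then swap with it
basic_string(const basic_string& str)
	: _str(nullptr)
	, _size(0)
	, _capacity(0) // initialize everything so that swap never reads garbage
{
	basic_string tmp(str._str);
	swap(tmp);
}
  • Assignment overload
// Assignment overload (modern style): the by-value parameter is already a copy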
basic_string& operator=(basic_string str) {
	swap(str);
	return *this;
}
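To see the constructor, destructor, copy construction and assignment working together, here is a self-contained sketch of the copy-and-swap idiom. MiniString is a hypothetical, char-only stand-in for the basic_string class above, not the code from this post.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <utility>

class MiniString {
public:
    MiniString(const char* s = "")
        : _size(std::strlen(s)), _capacity(_size), _str(new char[_size + 1]) {
        std::strcpy(_str, s);
    }

    // Copy construction: build a temporary from the source buffer, then steal it.
    MiniString(const MiniString& other)
        : _size(0), _capacity(0), _str(nullptr) {
        assert(other._str != nullptr);
        MiniString tmp(other._str);
        swap(tmp);
    } // tmp now holds nullptr, and delete[] nullptr is a no-op

    // Assignment: the by-value parameter is already a copy, so just swap with it.
    MiniString& operator=(MiniString other) {
        swap(other);
        return *this;
    } // other's destructor frees the old buffer

    ~MiniString() { delete[] _str; }

    void swap(MiniString& other) {
        std::swap(_size, other._size);
        std::swap(_capacity, other._capacity);
        std::swap(_str, other._str);
    }

    const char* c_str() const { return _str; }
    std::size_t size() const { return _size; }

private:
    std::size_t _size;
    std::size_t _capacity;
    char* _str;
};
```

Because operator= takes its argument by value, self-assignment needs no special check: the copy is made before the swap, so the original buffer is never freed while it is still needed.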

2.2 Capacity operations of string objects

| Interface name | Description |
| --- | --- |
| size() | Returns the number of valid characters in the string (those before the '\0') |
| length() | Same as size() |
| capacity() | Returns the amount of space currently allocated to the string |
| empty() | Checks whether the string is empty |
| reserve(n) | Reserves space for the string by raising its capacity to at least n |
| resize(n, c) | Changes the number of valid characters to n, filling any new positions with the character c |
| clear() | Removes the valid characters, leaving an empty string |

Notes:

  1. size() and length() are implemented in exactly the same way. size() was introduced to stay consistent with the interfaces of the other containers, and in practice size() is the one generally used.
  2. clear() only removes the valid characters in the string; it does not change the size of the underlying space.
  3. Both resize(size_t n) and resize(size_t n, char c) change the number of valid characters in the string to n. The difference appears when the number of characters grows: resize(n) fills the new positions with '\0', while resize(size_t n, char c) fills them with c. Note: growing the number of elements may change the underlying capacity, whereas shrinking it leaves the total size of the underlying space unchanged.
  4. reserve(size_t res_arg = 0) reserves space for the string without changing the number of valid elements. When the argument of reserve is smaller than the current capacity of the string, reserve does not change the capacity.
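The notes above can be tried out directly on std::string. The sketch below (capacity_demo is a made-up name) only asserts what the standard guarantees; whether capacity stays the same after clear() or a shrinking resize() is typical behaviour rather than a guarantee, so it is left as a comment.

```cpp
#include <cassert>
#include <string>

void capacity_demo() {
    std::string s("hello");

    s.reserve(100);               // capacity grows, the contents are untouched
    assert(s.capacity() >= 100);
    assert(s.size() == 5);

    s.resize(8, 'x');             // grow: new positions are filled with 'x'
    assert(s == "helloxxx");

    s.resize(3);                  // shrink: only the first 3 characters remain
    assert(s == "hel");           // (implementations normally keep the capacity)

    s.resize(5);                  // grow without a fill character: pads with '\0'
    assert(s.size() == 5 && s[3] == '\0' && s[4] == '\0');

    s.clear();                    // no valid characters left
    assert(s.empty() && s.length() == s.size());
}
```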

Simulated implementation of the interface:

  • size()
//size() interface
size_t size() const{
	return _size;
}
  • length()
//length() interface
size_t length() const{
	return _size;
}
  • capacity()
//capacity() interface
size_t capacity() const {
	return _capacity;
}
  • empty()
// Check whether the string is empty
bool empty() const {
	return _size == 0;
}
  • reserve(n)
void reserve(size_t new_capacity) {
	if (new_capacity <= _capacity) {
		return; // never shrink: the old contents would no longer fit
	}
	T* tmp = new T[new_capacity + 1]; // one extra T stores the '\0'
	strcpy(tmp, _str);
	delete[] _str;
	_str = tmp;
	_capacity = new_capacity;
}
  • resize(n, c)
void resize(size_t n, T c = '\0') {
	if (n <= _size) {
		_str[n] = '\0';
		_size = n;
	}
	else {
		if (n > _capacity) {
			reserve(n);	
		}
		for (size_t i = _size; i < n; i++) {
			_str[i] = c;
		}
		_str[n] = '\0'; 
		_size = n;
	}
}
  • clear()
// Clear the valid characters
void clear() {
	_str[0] = '\0';
	_size = 0;
}

2.3 Element access and iterator interfaces of string objects

| Interface name | Description |
| --- | --- |
| operator[] | Returns the character at position pos |
| begin() | Returns an iterator to the first valid character |
| end() | Returns an iterator to the position just past the last character |
| rbegin() | Returns a reverse iterator to the last valid character |
| rend() | Returns a reverse iterator to the position just before the first character |

Note: in the simulated string class, an iterator is simply a character pointer
Note: with auto it = rbegin(), it++ moves it to the previous position, that is, toward the front of the string
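A small sketch of the second note, using std::string. The helper name reversed is invented for the example: it walks from rbegin() to rend(), and each ++ on the reverse iterator steps one position toward the front.

```cpp
#include <cassert>
#include <string>

// Builds a reversed copy of s by walking it with reverse iterators.
std::string reversed(const std::string& s) {
    std::string out;
    for (auto rit = s.rbegin(); rit != s.rend(); ++rit) {
        out.push_back(*rit); // ++rit moves toward the front of s
    }
    assert(out.size() == s.size());
    return out;
}
```

Calling reversed("hello") gives "olleh".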

Simulated implementation of the interface:

  • operator[]
// Overload [] (assert requires <cassert>)
T& operator[](size_t pos) {
	assert(pos < _size);
	return _str[pos];
}
// const version, used when the string object itself is const
const T& operator[](size_t pos) const {
	assert(pos < _size);
	return _str[pos];
}
  • begin() and end()
typedef T* iterator;
iterator begin() {
	return _str;
}
iterator end() {
	return _str + _size;
}

2.4 Modifier interfaces of string objects

| Interface name | Description |
| --- | --- |
| push_back() | Appends the character c to the end of the string |
| append() | Appends the string s to the end of the string |
| operator+= | Appends the character c or the string s to the end of the string |
| insert() | Inserts n copies of the character c, or the string s, at position pos |
| erase() | Deletes n characters starting at position pos |
| swap() | Swaps the contents of two string objects |

Simulated implementation of the interface:

  • push_back() / +=
// push_back: insert a character at the end
void push_back(const T c) {
	if (_size == _capacity) {
		size_t new_capacity = _capacity == 0 ? 4 : 2 * _capacity;
		reserve(new_capacity);
	}
	_str[_size] = c;
	_size++;
	_str[_size] = '\0';
}
// Reuse push_back to overload += for a single character
basic_string& operator+=(const T c) {
	push_back(c);
	return *this; // returning *this allows chained calls, as std::string does
}
  • append() / +=
// append: add a string at the end
void append(const T* str) {
	size_t len = strlen(str);
	if (_size + len > _capacity) {
		reserve(_size + len);
	}
	strcpy(_str + _size, str); // strcpy also copies the terminating '\0'
	_size += len;
}
// Reuse append to overload += for a string
basic_string& operator+=(const T* str) {
	append(str);
	return *this;
}
  • insert()
// Insert a character
basic_string& insert(size_t pos, const T c) {
	assert(pos <= _size);
	if (_size == _capacity) {
		size_t new_capacity = _capacity == 0 ? 4 : 2 * _capacity;
		reserve(new_capacity);
	}
	// Shift [pos, _size] (including the '\0') one slot to the right.
	// end starts just past the old '\0', so nothing is written beyond the buffer.
	size_t end = _size + 1;
	while (end > pos) {
		_str[end] = _str[end - 1];
		end--;
	}
	_str[pos] = c;
	_size++;
	return *this;
}
//Insert string
basic_string& insert(size_t pos, const T* s) {
	assert(pos <= _size);
	size_t len = strlen(s);
	if (len + _size > _capacity) {
		reserve(len + _size);
	}
	size_t end = _size + len;
	while (end - len + 1 > pos) {
		_str[end] = _str[end - len];
		end--;
	}
	_size += len;
	memcpy(begin() + pos, s, len * sizeof(T));
	return *this;
}
  • erase()
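iterator erase(size_t pos, size_t len = npos) {
	assert(pos < _size);
	// Compare len with the remaining length instead of computing pos + len,
	// which overflows when len is npos.
	if (len >= _size - pos) {
		_str[pos] = '\0';
		_size = pos;
	}
	else {
		// Source and destination overlap, so use memmove rather than strcpy;
		// the + 1 also moves the terminating '\0'.
		memmove(_str + pos, _str + pos + len, (_size - pos - len + 1) * sizeof(T));
		_size -= len;
	}
	return begin() + pos;
}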
  • swap()
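void swap(basic_string& str) {
	// Qualify the call: an unqualified swap would find this member function.
	// std::swap is declared in <utility>.
	std::swap(_size, str._size);
	std::swap(_capacity, str._capacity);
	std::swap(_str, str._str);
}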
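For comparison, this is how the same modifier interfaces behave on std::string. The function name build_greeting is invented for the example; the comments track the contents after each call.

```cpp
#include <cassert>
#include <string>

std::string build_greeting() {
    std::string s("hello");
    s.push_back(',');       // "hello,"
    s.append(" world");     // "hello, world"
    s += '!';               // "hello, world!"
    s.insert(0, ">> ");     // ">> hello, world!"
    s.erase(0, 3);          // "hello, world!"
    s.erase(5);             // len defaults to npos: drop everything from index 5
    assert(s == "hello");

    std::string other("bye");
    s.swap(other);          // s is now "bye", other is "hello"
    assert(s == "bye");
    return other;
}
```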

2.5 String operation interfaces

| Interface name | Description |
| --- | --- |
| find() | Searches forward from position pos for the character c (or a string s) and returns its position; returns npos if it is not found |
| rfind() | Searches backward from position pos for the character c and returns its position; returns npos if it is not found |
| c_str() | Returns a const pointer to the null-terminated character array held by the string object |
| substr() | Returns a new string object holding the n characters that start at position pos; n defaults to npos |
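A short usage sketch of these interfaces on std::string. The helper extension_of and the file names are made up for illustration; find, rfind, substr, c_str and npos are the real std::string members.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Returns the extension of a file name (without the dot), or "" if there is none.
std::string extension_of(const std::string& name) {
    std::size_t dot = name.rfind('.');   // search backward for the last '.'
    if (dot == std::string::npos) {
        return "";
    }
    return name.substr(dot + 1);         // n defaults to npos: take the rest
}

void demo_find() {
    const std::string s("hello world");
    assert(s.find('o') == 4);                    // first 'o', searching forward
    assert(s.rfind('o') == 7);                   // last 'o', searching backward
    assert(s.find("world") == 6);
    assert(s.find("xyz") == std::string::npos);  // not found
    assert(std::strlen(s.c_str()) == s.size());  // c_str() is '\0'-terminated
}
```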

Simulated implementation of the interface:

  • find()
size_t find(const T c, size_t pos = 0) const {
	assert(pos < _size);
	for (size_t i = pos; i < _size; i++) {
		if (_str[i] == c) {
			return i;
		}
	}
	return npos;
}
size_t find(const T* s, size_t pos = 0) const {
	assert(pos < _size);
	const T* ret = strstr(_str + pos, s);
	if (ret == nullptr) {
		return npos;
	}
	// The distance from the start of the buffer to the match is its index.
	return ret - _str;
}
  • c_str()
const T* c_str() const {
	return _str;
}
  • substr()
basic_string substr(size_t pos, size_t n = npos) const {
	assert(pos < _size);
	// Clamp the length so that pos + n never overflows or runs past the end.
	size_t len = (n >= _size - pos) ? _size - pos : n;
	basic_string tmp;
	tmp.resize(len); // allocates len + 1 slots and writes the '\0'
	memcpy(tmp._str, _str + pos, len * sizeof(T));
	// *this is left untouched: substr only builds and returns a new object.
	return tmp;
}

2.6 Member constants

| Interface name | Description |
| --- | --- |
| npos | static const size_t npos = -1; |

Simulated implementation of the interface:

  • npos
static const size_t npos = -1;
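Assigning -1 to an unsigned type looks odd at first. Because size_t is unsigned, -1 wraps around to the largest value the type can hold, which no real index can reach. A minimal check of that, with the helper names not_found and check_npos invented for the example:

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <string>

// True when a search result means "not found".
bool not_found(std::size_t pos) {
    return pos == std::string::npos;
}

void check_npos() {
    const std::size_t max = std::numeric_limits<std::size_t>::max();
    assert(static_cast<std::size_t>(-1) == max); // -1 wraps to the maximum
    assert(std::string::npos == max);
}
```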

2.7 Non-member function overloads

| Interface name | Description |
| --- | --- |
| operator+ | Returns a newly constructed string object that is the concatenation of two strings |
| operator<< | Outputs all valid characters |
| operator>> | Reads input, stopping at a space or a newline |
| getline() | Reads input, stopping at a newline |

Simulated implementation of the interface:

  • operator+
// operator+ (written here as a const member function: it does not modify *this)
basic_string operator+(const basic_string& str) const {
	basic_string ret(*this);
	ret += str._str;
	return ret;
}
basic_string operator+(const T* s) const {
	basic_string ret(*this);
	ret += s;
	return ret;
}
  • operator<<
ostream& operator<<(ostream& out, const basic_string<char>& str) {
	for (size_t i = 0; i < str.size(); i++) {
		out << str[i];
	}
	return out;
}
  • operator>>
istream& operator>>(istream& in, basic_string<char>& str) {
	str.clear();
	char ch;
	// in.get(ch) fails at end of input, so the loop cannot spin forever on EOF
	while (in.get(ch) && ch != ' ' && ch != '\n') {
		str += ch;
	}
	return in;
}
  • getline()
istream& getline(istream& in, basic_string<char>& str) {
	str.clear();
	char ch;
	// stop at the newline or at end of input, whichever comes first
	while (in.get(ch) && ch != '\n') {
		str += ch;
	}
	return in;
}
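The difference between the two input functions is easiest to see side by side. The sketch below uses the standard operator>> and std::getline on a std::istringstream; read_both and demo_read_both are names made up for the example.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Reads one token with >> and then the rest of the same line with getline.
void read_both(const std::string& input, std::string& word, std::string& rest) {
    std::istringstream in(input);
    in >> word;              // stops at the first whitespace character
    std::getline(in, rest);  // continues up to the '\n' and consumes it
}

void demo_read_both() {
    std::string word, rest;
    read_both("hello world\nnext line", word, rest);
    assert(word == "hello");
    assert(rest == " world"); // the space was left in the stream by >>
}
```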

2.8 Comparison operator overloads for string objects

| Interface name | Description |
| --- | --- |
| < | Less than |
| == | Equal to |
| != | Not equal to |
| > | Greater than |
| >= | Greater than or equal to |
| <= | Less than or equal to |

Simulated implementation of the interface:

  • <
bool operator<(const basic_string& str) const {
	size_t end = 0;
	while (end < _size && end < str._size) {
		if (_str[end] > str._str[end]) {
			return false;
		}
		else if(_str[end] < str._str[end]){
			return true;
		}
		end++;
	}
	if (end == str._size) {
		return false;
	}
	else {
		return true;
	}
}
  • ==
bool operator==(const basic_string& str) const {
	if (_size != str._size) {
		return false;
	}
	size_t end = 0;
	while (end < _size && end < str._size) {
		if (_str[end] != str._str[end]) {
			return false;
		}
		end++;
	}
	return true;
}
  • !=
bool operator!=(const basic_string& str) const {
	return !(*this == str);
}
  • >
bool operator>(const basic_string& str) const {
	return !((*this < str) || (*this == str));
}
  • >=
bool operator>=(const basic_string& str) const {
	return !(*this < str);
}
  • <=
bool operator<=(const basic_string& str) const {
	return !(*this > str);
}
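The rule that all of these operators follow is lexicographic order: the first differing character decides, and if one string is a prefix of the other, the shorter one sorts first. Here is a compact sketch of that rule as a free function over std::string. The name less_than is hypothetical, and the cast to unsigned char is there to match how std::string compares char values.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

bool less_than(const std::string& a, const std::string& b) {
    std::size_t i = 0;
    while (i < a.size() && i < b.size()) {
        const unsigned char ca = static_cast<unsigned char>(a[i]);
        const unsigned char cb = static_cast<unsigned char>(b[i]);
        if (ca != cb) {
            return ca < cb;         // the first differing character decides
        }
        ++i;
    }
    return a.size() < b.size();     // a proper prefix sorts first
}

void demo_less_than() {
    assert(less_than("abc", "abd"));
    assert(less_than("ab", "abc"));
    assert(!less_than("abc", "abc"));
    assert(!less_than("b", "abc")); // 'b' > 'a', length does not matter here
}
```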

2.9 Three ways to traverse a string object

void Teststring()
{
	string s("hello world");
	// Three ways to traverse:
	// Besides reading the string, all three can also be used to modify its characters
	// (the range-based for needs auto& instead of auto to do so).
	// Of the three, the first is the one most commonly used with string.
	// 1. for + operator[]
	for(size_t i = 0; i < s.size(); ++i)
		cout<<s[i]<<endl;
	// 2. Iterators (forward, then reverse)
	string::iterator it = s.begin();
	while(it != s.end())
	{
		cout<<*it<<endl;
		++it;
	}
	string::reverse_iterator rit = s.rbegin();
	while(rit != s.rend())
	{
		cout<<*rit<<endl;
		rit++;
	}
	// 3. Range-based for
	for(auto ch : s)
		cout << ch << endl;
}
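Since all three traversal styles can also modify the string, here is a sketch that upper-cases a string in each style. The function names are invented for the example. Note the char& in the range-based version: with a plain auto ch the loop would only change a copy.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// std::toupper expects a value representable as unsigned char.
char to_upper(char c) {
    return static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
}

// 1. for + operator[]
std::string upper_by_index(std::string s) {
    for (std::size_t i = 0; i < s.size(); ++i) {
        s[i] = to_upper(s[i]);
    }
    return s;
}

// 2. Iterator
std::string upper_by_iterator(std::string s) {
    for (std::string::iterator it = s.begin(); it != s.end(); ++it) {
        *it = to_upper(*it);
    }
    return s;
}

// 3. Range-based for, by reference
std::string upper_by_range_for(std::string s) {
    for (char& ch : s) {
        ch = to_upper(ch);
    }
    return s;
}
```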

3. How to get familiar with the interfaces: practice problems

Working through practice problems is probably the best way for beginners to master the STL. Learning by solving problems gets twice the result with half the effort. Here are some string exercises:

Niuke.com:

leetcode:

4. Closing remarks

  1. Re-implementing the STL by hand is a tedious and time-consuming process, but it helps us understand pointers, data structures, and object-oriented programming in depth, and it builds logical thinking and coding skills.

  2. "STL Source Code Analysis" by Hou Jie is next on my reading list.

  3. All the code for this blog post: github, gitee

Topics: C++ data structure string Container STL